I used AVAudioEngine to gather PCM data from the microphone on iOS and it worked fine; however, when I moved the project to watchOS, I get feedback while recording. How do I stop playback through the speaker while recording?
var audioEngine = AVAudioEngine()

try AVAudioSession.sharedInstance().setCategory(.playAndRecord, mode: .default)
try AVAudioSession.sharedInstance().setActive(true)

let input = audioEngine.inputNode
let inputFormat = input.inputFormat(forBus: 0)
audioEngine.connect(input, to: audioEngine.mainMixerNode, format: inputFormat)
try! audioEngine.start()

let mixer = audioEngine.mainMixerNode
let format = mixer.outputFormat(forBus: 0)
let sampleRate = format.sampleRate
let fft_size = 2048
mixer.installTap(onBus: 0, bufferSize: UInt32(fft_size), format: format) { (buffer: AVAudioPCMBuffer, time: AVAudioTime) in
    // Processing
}
For anyone else who runs into this: I fixed it by removing the connection from the inputNode to the mainMixerNode and installing the tap directly on the inputNode. The way I was doing it before apparently creates a feedback loop where the engine plays back what it is recording. I'm not sure why this only happens on watchOS and not on iPhone; perhaps the iPhone was playing back through the ear speaker rather than the speaker next to the mic. Fixed code:
var audioEngine = AVAudioEngine()

try AVAudioSession.sharedInstance().setCategory(.playAndRecord, mode: .default)
try AVAudioSession.sharedInstance().setActive(true)

let input = audioEngine.inputNode
let format = input.inputFormat(forBus: 0)
let sampleRate = format.sampleRate
let fft_size = 2048
input.installTap(onBus: 0, bufferSize: UInt32(fft_size), format: format) { (buffer: AVAudioPCMBuffer, time: AVAudioTime) in
    // Processing
}
try! audioEngine.start()
Related
I need to play YouTube and recognize my voice at the same time. I could recognize my voice when the system speaker was selected for YouTube playback. However, voice recognition failed when a Bluetooth hands-free device such as AirPods Pro was selected for YouTube playback. If I play sound using AVPlayer or AVAudioPlayer and recognize my voice at the same time, recognition succeeds via the Bluetooth hands-free device. Please advise me how to pass my voice from the BLE hands-free device as the sound input when specifying the allowBluetooth option on AVAudioSession.
Before playing YouTubePlayerKit, the audio session was set up as follows. The AVAudioSession option .allowBluetooth worked fine for BLE and system sound input with AVPlayer and AVAudioPlayer, but not with YouTubePlayerKit.
let configuration = YouTubePlayer.Configuration(
    isUserInteractionEnabled: false,
    autoPlay: false,
    showControls: false,
    enableJavaScriptAPI: true,
    loopEnabled: false,
    playInline: true
)
do {
    audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(.playAndRecord, mode: .measurement, options: .allowBluetooth)
    // try audioSession.setCategory(.playAndRecord, mode: .measurement, options: .duckOthers)
    try AVAudioSession.sharedInstance().setActive(true, options: .notifyOthersOnDeactivation)
} catch {
    handleError(withMessage: "play youtube audio session error") // failed to record
}
youTubeID = URL(string: movieURL)!.lastPathComponent
let videoID = URL(string: movieURL)?.deletingLastPathComponent()
print("youTubeID \(String(describing: youTubeID)) videoID \(String(describing: videoID))")
if videoID!.absoluteString.contains("https://youtu.be/") {
    playYoutube = true
    DispatchQueue.main.async { [self] in
        youTubePlayer = YouTubePlayer(
            source: .video(id: youTubeID),
            configuration: configuration)
        youTubePlayerView = YouTubePlayerHostingView(player: youTubePlayer)
        youTubePlayer.pause()
        youTubePlayerView.frame = self.topView.bounds
        self.topView.addSubview(youTubePlayerView)
        self.checkYoutubeDuration(player: youTubePlayer)
    }
}
And the audioEngine is set up as follows when doing voice recognition.
guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")), recognizer.isAvailable else {
    handleError(withMessage: "voice recognition error")
    return
}
audioEngine = AVAudioEngine()
inputNode = audioEngine.inputNode
inputNode.removeTap(onBus: 0)

guard let commonFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 2, interleaved: false) else { return }
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    self.recognitionRequest?.append(buffer)
}

self.audioEngine.prepare()
do {
    try self.audioEngine.start()
    print("audioEngine.start()")
} catch {
    print("audioEngine.start failed")
}

// Create a speech recognition request.
recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
recognitionRequest!.shouldReportPartialResults = true
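One thing worth trying (untested for the YouTubePlayerKit case, so treat it as an assumption rather than a confirmed fix) is to explicitly select the Bluetooth HFP port as the session's preferred input after activating the session. A minimal sketch:

import AVFoundation

// Untested sketch: route recording input through the Bluetooth hands-free (HFP) port if present.
let session = AVAudioSession.sharedInstance()
do {
    try session.setCategory(.playAndRecord,
                            mode: .measurement,
                            options: [.allowBluetooth, .mixWithOthers])
    try session.setActive(true)

    // availableInputs lists the ports the session can currently record from.
    if let bluetoothInput = session.availableInputs?.first(where: { $0.portType == .bluetoothHFP }) {
        try session.setPreferredInput(bluetoothInput)
    } else {
        print("No Bluetooth HFP input found; using the default input")
    }
} catch {
    print("Audio session setup failed: \(error)")
}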
I am a beginner at working with sound and AVAudioEngine in iOS, and I'm developing an application that captures the audio samples as buffers and analyzes them. Furthermore, the sample rate must be 8 kHz with Int16 PCM data, but when I try to record from the inputNode and convert the data to 8 kHz, the buffer contains only 0s. However, when I set the commonFormat to .pcmFormatFloat32 it works fine.
My Code:
let inputNode = audioEngine.inputNode
let downMixer = AVAudioMixerNode()
let main = audioEngine.mainMixerNode
let format = inputNode.inputFormat(forBus: 0)
let format16KHzMono = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 8000, channels: 1, interleaved: true)

audioEngine.attach(downMixer)
downMixer.installTap(onBus: 0, bufferSize: 640, format: format16KHzMono) { (buffer, time) -> Void in
    do {
        print(buffer.description)
        if let channel1Buffer = buffer.int16ChannelData?[0] {
            // print(channel1Buffer[0])
            for i in 0 ... Int(buffer.frameLength - 1) {
                print(channel1Buffer[i]) // prints 0s :(
            }
        }
    }
}
audioEngine.connect(inputNode, to: downMixer, format: format)
audioEngine.connect(downMixer, to: main, format: format16KHzMono)
audioEngine.prepare()
try! audioEngine.start()
Thanks
I am a beginner at sound processing and AVAudioEngine in iOS, and I'm developing an application that captures audio samples as a buffer and analyzes them. Furthermore, the sample rate must be 8000 Hz (8 kHz) and the data must be encoded as 16-bit PCM, but the default inputNode in AVAudioEngine runs at 44.1 kHz.
In Android, the process is quite simple:
AudioRecord audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC,
8000, AudioFormat.CHANNEL_IN_MONO,
AudioFormat.ENCODING_PCM_16BIT, bufferSize);
and then start the reading function for the buffer.
I searched a lot, but I didn't find any similar example. Instead, all the examples I encountered capture the samples at the input node's default sample rate (44.1 kHz), like:
let input = audioEngine.inputNode
let inputFormat = input.inputFormat(forBus: 0)

input.installTap(onBus: 0, bufferSize: 640, format: inputFormat) { (buffer, time) -> Void in
    print(inputFormat)
    if let channel1Buffer = buffer.floatChannelData?[0] {
        for i in 0...Int(buffer.frameLength - 1) {
            print(channel1Buffer[i])
        }
    }
}
try! audioEngine.start()
So I would like to capture audio samples using AVAudioEngine with an 8 kHz sample rate and 16-bit PCM encoding.
Edit:
I reached a solution to transform the input to 8 kHz:
let inputNode = audioEngine.inputNode
let downMixer = AVAudioMixerNode()
let main = audioEngine.mainMixerNode
let format = inputNode.inputFormat(forBus: 0)
let format16KHzMono = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 8000, channels: 1, interleaved: true)

audioEngine.attach(downMixer)
downMixer.installTap(onBus: 0, bufferSize: 640, format: format16KHzMono) { (buffer, time) -> Void in
    do {
        print(buffer.description)
        if let channel1Buffer = buffer.int16ChannelData?[0] {
            // print(channel1Buffer[0])
            for i in 0 ... Int(buffer.frameLength - 1) {
                print(channel1Buffer[i])
            }
        }
    }
}
audioEngine.connect(inputNode, to: downMixer, format: format)
audioEngine.connect(downMixer, to: main, format: format16KHzMono)
audioEngine.prepare()
try! audioEngine.start()
However, when I use .pcmFormatInt16 it doesn't work, while with .pcmFormatFloat32 it works fine!
Have you checked with the settings parameter? For 16-bit linear PCM the settings would be:
let format16KHzMono = AVAudioFormat(settings: [
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVLinearPCMBitDepthKey: 16,
    AVLinearPCMIsFloatKey: false,
    AVNumberOfChannelsKey: 1,
    AVSampleRateKey: 8000.0
] as [String: Any])
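Another approach worth trying, sketched here under the assumption that the input runs at the hardware's native Float32 format, is to tap the input node as-is and convert each buffer to 8 kHz Int16 mono with AVAudioConverter:

import AVFoundation

let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let inputFormat = inputNode.outputFormat(forBus: 0)

// Target format: 8 kHz, mono, 16-bit integer PCM.
guard let targetFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                       sampleRate: 8000,
                                       channels: 1,
                                       interleaved: true),
      let converter = AVAudioConverter(from: inputFormat, to: targetFormat) else {
    fatalError("Could not create the target format or converter")
}

inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { buffer, _ in
    // Size the output buffer according to the sample-rate ratio (e.g. 44.1 kHz -> 8 kHz).
    let ratio = targetFormat.sampleRate / inputFormat.sampleRate
    let capacity = AVAudioFrameCount(Double(buffer.frameLength) * ratio) + 1
    guard let converted = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: capacity) else { return }

    // Feed the tapped buffer exactly once per convert call.
    var suppliedBuffer = false
    var error: NSError?
    converter.convert(to: converted, error: &error) { _, outStatus in
        if suppliedBuffer {
            outStatus.pointee = .noDataNow
            return nil
        }
        suppliedBuffer = true
        outStatus.pointee = .haveData
        return buffer
    }

    if error == nil, let samples = converted.int16ChannelData?[0] {
        // 'samples' points at Int16 values at 8 kHz; hand them to the analysis code.
        let frameCount = Int(converted.frameLength)
        print("converted \(frameCount) Int16 frames")
        _ = samples
    }
}

audioEngine.prepare()
try! audioEngine.start()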
I'm using the AVFoundation framework. Whenever the player plays the buffer, my background music stops, so I used the code below to let it keep playing regardless of the AVFoundation player.
try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers,.allowBluetooth])
try audioSession.setMode(AVAudioSessionModeDefault)
try audioSession.setActive(true)
It does work, but the problem is that the quality of the background music is dramatically affected. The music doesn't have the bass effects anymore whenever the AVPlayer plays the buffer.
I want the background music uninterrupted while using AVPlayer. Is it possible?
Update: I added the full code if anyone wants to check. You can hear the difference in the background iTunes music as soon as the app is opened or the session is activated with this code.
class ViewController: UIViewController {

    var engine = AVAudioEngine()
    let audioSession = AVAudioSession.sharedInstance()
    let player = AVAudioPlayerNode()
    let mixer = AVAudioMixerNode()

    override func viewDidLoad() {
        super.viewDidLoad()

        do {
            try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .allowBluetooth])
            try audioSession.setMode(AVAudioSessionModeDefault)
            try audioSession.setActive(true)
        } catch {
        }

        let input = engine.inputNode
        let bus = 0
        let inputFormat = input.outputFormat(forBus: bus)
        let recordingFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 11025.0, channels: 1, interleaved: false)

        engine.attach(player)
        engine.attach(mixer)
        engine.connect(input, to: mixer, format: input.outputFormat(forBus: 0))
        engine.connect(player, to: engine.mainMixerNode, format: recordingFormat)

        mixer.installTap(onBus: bus, bufferSize: AVAudioFrameCount(inputFormat.sampleRate * 0.4), format: inputFormat, block: { (buffer: AVAudioPCMBuffer, time: AVAudioTime) in
            let converter = AVAudioConverter(from: inputFormat, to: recordingFormat!)!
            let newbuffer = AVAudioPCMBuffer(pcmFormat: recordingFormat!, frameCapacity: AVAudioFrameCount((recordingFormat?.sampleRate)! * 0.4))
            let inputBlock: AVAudioConverterInputBlock = { (inNumPackets, outStatus) -> AVAudioBuffer? in
                outStatus.pointee = AVAudioConverterInputStatus.haveData
                let audioBuffer: AVAudioBuffer = buffer
                return audioBuffer
            }
            var error: NSError?
            converter.convert(to: newbuffer!, error: &error, withInputFrom: inputBlock)
            self.player.scheduleBuffer(newbuffer!)
        })

        do {
            try engine.start()
            player.play()
        } catch {
            print(error)
        }
    }
}
Unless this is some weird mixing quirk, the quality change you report may just be that recording categories change the default audio output device to the tiny, tinny receiver (because telephones, don't ask). Override this behaviour by adding .defaultToSpeaker to your setCategory() call:
try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers,.allowBluetooth, .defaultToSpeaker])
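For reference, with the newer AVAudioSession API (assuming a recent SDK; the code above uses the older string-constant names), the same call looks roughly like this:

// Keep other audio playing, allow Bluetooth, and route output to the main speaker
// instead of the receiver.
try audioSession.setCategory(.playAndRecord,
                             mode: .default,
                             options: [.mixWithOthers, .allowBluetooth, .defaultToSpeaker])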
I think you need this one:
try audioSession.setCategory(AVAudioSessionCategoryAmbient)
Documentation:
https://developer.apple.com/documentation/avfoundation/avaudiosessioncategoryambient
When you use this category, audio from other apps mixes with your audio
I'm really excited about the new AVAudioEngine. It seems like a good API wrapper around Audio Units. Unfortunately, the documentation is so far nonexistent, and I'm having problems getting a simple graph to work.
Using the following simple code to set up an audio engine graph, the tap block is never called. It mimics some of the sample code floating around the web, though those examples also did not work.
let inputNode = audioEngine.inputNode
var error: NSError?
let bus = 0

inputNode.installTapOnBus(bus, bufferSize: 2048, format: inputNode.inputFormatForBus(bus)) {
    (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
    println("sfdljk")
}

audioEngine.prepare()
if audioEngine.startAndReturnError(&error) {
    println("started audio")
} else {
    if let engineStartError = error {
        println("error starting audio: \(engineStartError.localizedDescription)")
    }
}
All I'm looking for is the raw pcm buffer for analysis. I don't need any effects or output. According to the WWDC talk "502 Audio Engine in Practice", this setup should work.
Now if you want to capture data from the input node, you can install a node tap and we've talked about that.
But what's interesting about this particular example is, if I wanted to work with just the input node, say just capture data from the microphone and maybe examine it, analyze it in real time or maybe write it out to file, I can directly install a tap on the input node.
And the tap will do the work of pulling the input node for data, stuffing it in buffers and then returning that back to the application.
Once you have that data you can do whatever you need to do with it.
Here are some links I tried:
http://hondrouthoughts.blogspot.com/2014/09/avfoundation-audio-monitoring.html
http://jamiebullock.com/post/89243252529/live-coding-audio-with-swift-playgrounds (SIGABRT in playground on startAndReturnError)
Edit: This is the implementation based on Thorsten Karrer's suggestion. It unfortunately does not work.
class AudioProcessor {
    let audioEngine = AVAudioEngine()

    init() {
        let inputNode = audioEngine.inputNode
        let bus = 0
        var error: NSError?

        inputNode.installTapOnBus(bus, bufferSize: 2048, format: inputNode.inputFormatForBus(bus)) {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            println("sfdljk")
        }

        audioEngine.prepare()
        audioEngine.startAndReturnError(nil)
        println("started audio")
    }
}
It might be the case that your AVAudioEngine is going out of scope and is released by ARC ("If you liked it then you should have put retain on it...").
The following code (engine is moved to an ivar and thus sticks around) fires the tap:
class AppDelegate: NSObject, NSApplicationDelegate {

    let audioEngine = AVAudioEngine()

    func applicationDidFinishLaunching(aNotification: NSNotification) {
        let inputNode = audioEngine.inputNode
        let bus = 0
        inputNode.installTapOnBus(bus, bufferSize: 2048, format: inputNode.inputFormatForBus(bus)) {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            println("sfdljk")
        }

        audioEngine.prepare()
        audioEngine.startAndReturnError(nil)
    }
}
(I removed the error handling for brevity)
UPDATED: I have implemented a complete working example of recording mic input, applying some effects (reverb, delay, distortion) at runtime, and saving all of these effects to an output file.
var engine = AVAudioEngine()
var distortion = AVAudioUnitDistortion()
var reverb = AVAudioUnitReverb()
var audioBuffer = AVAudioPCMBuffer()
var outputFile = AVAudioFile()
var delay = AVAudioUnitDelay()

//Initialize the audio engine
func initializeAudioEngine() {
    engine.stop()
    engine.reset()
    engine = AVAudioEngine()
    isRealTime = true

    do {
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord)
        let ioBufferDuration = 128.0 / 44100.0
        try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(ioBufferDuration)
    } catch {
        assertionFailure("AVAudioSession setup error: \(error)")
    }

    let fileUrl = URLFor("/NewRecording.caf")
    print(fileUrl)
    do {
        try outputFile = AVAudioFile(forWriting: fileUrl!, settings: engine.mainMixerNode.outputFormatForBus(0).settings)
    }
    catch {
    }

    let input = engine.inputNode!
    let format = input.inputFormatForBus(0)

    //settings for reverb
    reverb.loadFactoryPreset(.MediumChamber)
    reverb.wetDryMix = 40 //0-100 range
    engine.attachNode(reverb)

    delay.delayTime = 0.2 // 0-2 range
    engine.attachNode(delay)

    //settings for distortion
    distortion.loadFactoryPreset(.DrumsBitBrush)
    distortion.wetDryMix = 20 //0-100 range
    engine.attachNode(distortion)

    engine.connect(input, to: reverb, format: format)
    engine.connect(reverb, to: distortion, format: format)
    engine.connect(distortion, to: delay, format: format)
    engine.connect(delay, to: engine.mainMixerNode, format: format)

    assert(engine.inputNode != nil)
    isReverbOn = false
    try! engine.start()
}

//Now the recording function:
func startRecording() {
    let mixer = engine.mainMixerNode
    let format = mixer.outputFormatForBus(0)

    mixer.installTapOnBus(0, bufferSize: 1024, format: format, block:
        { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            print(NSString(string: "writing"))
            do {
                try self.outputFile.writeFromBuffer(buffer)
            }
            catch {
                print(NSString(string: "Write failed"))
            }
    })
}

func stopRecording() {
    engine.mainMixerNode.removeTapOnBus(0)
    engine.stop()
}
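A hypothetical call sequence for the functions above (assuming they live in the same class and URLFor returns a writable file URL):

initializeAudioEngine()   // build the graph: input -> reverb -> distortion -> delay -> main mixer
startRecording()          // tap the main mixer and write buffers to NewRecording.caf
// ... speak or play into the microphone ...
stopRecording()           // remove the tap and stop the engine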
I hope this might help you. Thanks!
The above answer didn't work for me but the following did. I'm installing a tap on a mixer node.
mMixerNode?.installTapOnBus(0, bufferSize: 4096, format: mMixerNode?.outputFormatForBus(0), block: {
    (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
    NSLog("tapped")
})
Nice topic.
Hi brodney, in your topic I found my solution. Here is a similar topic: Generate AVAudioPCMBuffer with AVAudioRecorder.
See the lecture WWDC 2014 502 - AVAudioEngine in Practice: capturing the microphone starts at around 20 min, and creating a buffer with a tap is at around 21:50.
Here is Swift 3 code:
@IBAction func button01Pressed(_ sender: Any) {
    let inputNode = audioEngine.inputNode
    let bus = 0
    inputNode?.installTap(onBus: bus, bufferSize: 2048, format: inputNode?.inputFormat(forBus: bus)) {
        (buffer: AVAudioPCMBuffer, time: AVAudioTime) in
        let theLength = Int(buffer.frameLength)
        print("theLength = \(theLength)")

        var samplesAsDoubles: [Double] = []
        for i in 0 ..< Int(buffer.frameLength) {
            let theSample = Double((buffer.floatChannelData?.pointee[i])!)
            samplesAsDoubles.append(theSample)
        }
        print("samplesAsDoubles.count = \(samplesAsDoubles.count)")
    }
    audioEngine.prepare()
    try! audioEngine.start()
}
To stop the audio:
func stopAudio() {
    let inputNode = audioEngine.inputNode
    let bus = 0
    inputNode?.removeTap(onBus: bus)
    self.audioEngine.stop()
}