I am trying to record audio using AVAudioEngine. The file is recorded and plays back correctly. However, I also need to send the AVAudioPCMBuffer that I receive in the tap handler to my server over a socket. I convert the AVAudioPCMBuffer to NSData and send it. The server receives the data, but the resulting file doesn't play correctly there. Am I missing something when converting the AVAudioPCMBuffer to NSData, or is my recording missing some configuration?
Any help would be appreciated. Thanks!
let audioEngine = AVAudioEngine()
let inputNode = audioEngine.inputNode
let bus = 0
try file = AVAudioFile(forWriting: URLFor("recording.wav")!, settings: audioEngine.inputNode!.inputFormatForBus(0).settings)
inputNode!.installTapOnBus(bus, bufferSize: 4096, format: inputNode!.inputFormatForBus(bus)) {
    (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
    self.file?.writeFromBuffer(buffer)
    self.socketio.send(self.toNSData(buffer))
}
do {
    audioEngine.prepare()
    try audioEngine.start()
}
catch {
    print("catch")
}
func toNSData(PCMBuffer: AVAudioPCMBuffer) -> NSData {
    let channelCount = 1 // given PCMBuffer channel count is 1
    let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: channelCount)
    let ch0Data = NSData(bytes: channels[0], length:Int(PCMBuffer.frameCapacity * PCMBuffer.format.streamDescription.memory.mBytesPerFrame))
    return ch0Data
}
I used AVAudioEngine to gather PCM data from the microphone on iOS and it worked fine. However, when I moved the project to watchOS, I started getting feedback while recording. How can I stop playback from the speakers while recording?
var audioEngine = AVAudioEngine()
try AVAudioSession.sharedInstance().setCategory(.playAndRecord, mode: .default)
try AVAudioSession.sharedInstance().setActive(true)
let input = audioEngine.inputNode
let format = input.inputFormat(forBus: 0)
audioEngine.connect(input, to: audioEngine.mainMixerNode, format: format)
try! audioEngine.start()
let mixer = audioEngine.mainMixerNode
let format = mixer.outputFormat(forBus: 0)
let sampleRate = format.sampleRate
let fft_size = 2048
mixer.installTap(onBus: 0, bufferSize: UInt32(fft_size), format: format,
    block: {(buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        // Processing
})
For anyone else who runs into this: I fixed it by removing the connection from the inputNode to the mainMixerNode and installing the tap directly on the inputNode. The way I was doing it before presumably created a feedback loop in which the engine played back what it was recording. I'm not sure why this only happens on watchOS and not on iPhone; perhaps the iPhone was playing back through the earpiece speaker rather than the one next to the mic. Fixed code:
var audioEngine = AVAudioEngine()
try AVAudioSession.sharedInstance().setCategory(.playAndRecord, mode: .default)
try AVAudioSession.sharedInstance().setActive(true)
let input = audioEngine.inputNode
// Use the input node's own format; the mixer is no longer part of the graph.
let format = input.outputFormat(forBus: 0)
let sampleRate = format.sampleRate
let fft_size = 2048
input.installTap(onBus: 0, bufferSize: UInt32(fft_size), format: format,
    block: {(buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        // Processing
})
// Start the engine once the input node has been accessed and the tap is in place.
try! audioEngine.start()
I will briefly go over all the elements of my app:
I have an application that records audio to an AVAudioPCMBuffer. This buffer is then converted to NSData and then to [UInt8]. It is then streamed over an OutputStream. On another device, this data is received using an InputStream. Then it is converted to NSData, and back to an AVAudioPCMBuffer. This buffer is then played.
The issue is that the audio is very jittery and you can't make out voices, only that the audio gets louder or quieter depending on if the other person is talking.
When scheduling the buffer:
self.peerAudioPlayer.scheduleBuffer(audioBuffer, completionHandler: nil)
I have delayed playing this audio for a few seconds and then played it, hoping that this would make the audio clearer, however it did not help. My best guess is that the buffer I'm creating is somehow cutting off some of the audio. So I will show you my relevant code:
Here is how I record audio:
localInput?.installTap(onBus: 1, bufferSize: 4096, format: localInputFormat) {
    (buffer, when) -> Void in
    let data = self.audioBufferToNSData(PCMBuffer: buffer)
    let output = self.outputStream!.write(data.bytes.assumingMemoryBound(to: UInt8.self), maxLength: data.length)
}
audioBufferToNSData is just a method which converts AVAudioPCMBuffer to NSData and here it is:
func audioBufferToNSData(PCMBuffer: AVAudioPCMBuffer) -> NSData {
    let channelCount = 1 // given PCMBuffer channel count is 1
    let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: channelCount)
    let data = NSData(bytes: channels[0], length:Int(PCMBuffer.frameCapacity * PCMBuffer.format.streamDescription.pointee.mBytesPerFrame))
    return data
}
I'm wondering if the issue could be in the method above. Possibly I am cutting off part of the audio when I calculate the length of the NSData object.
On the receiving end I have this:
case Stream.Event.hasBytesAvailable:
    DispatchQueue.global().async {
        var tempBuffer: [UInt8] = .init(repeating: 0, count: 17640)
        let length = self.inputStream!.read(&tempBuffer, maxLength: tempBuffer.count)
        self.testBufferCount += length
        self.testBuffer.append(contentsOf: tempBuffer)
        if (self.testBufferCount >= 17640) {
            let data = NSData.init(bytes: &self.testBuffer, length: self.testBufferCount)
            let audioBuffer = self.dataToPCMBuffer(data: data)
            self.peerAudioPlayer.scheduleBuffer(audioBuffer, completionHandler: nil)
            self.testBuffer.removeAll()
            self.testBufferCount = 0
        }
    }
The reason I check for 17640 is because the data being sent is exactly 17640 bytes, so I need to get all of this data before I play it.
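One thing worth checking in the receive loop above: it appends the entire 17640-byte tempBuffer to testBuffer even when read returns fewer bytes, so zero padding can end up between real samples. Below is a minimal sketch of an accumulator that keeps only the bytes actually read and hands back fixed-size packets. PacketAssembler and its method names are hypothetical helpers, not part of the original code, and this may not be the only cause of the jitter.

```swift
/// Hypothetical helper: collects bytes from arbitrary-sized reads
/// and emits fixed-size packets once enough bytes have arrived.
struct PacketAssembler {
    let packetSize: Int
    private var pending = [UInt8]()

    init(packetSize: Int) { self.packetSize = packetSize }

    /// Appends only the `count` bytes that were actually read, then
    /// returns every complete packet that is now available.
    mutating func append(_ chunk: [UInt8], count: Int) -> [[UInt8]] {
        // InputStream.read returns 0 at end of stream and -1 on error.
        guard count > 0 else { return [] }
        pending.append(contentsOf: chunk[0..<min(count, chunk.count)])

        var packets = [[UInt8]]()
        while pending.count >= packetSize {
            packets.append(Array(pending.prefix(packetSize)))
            pending.removeFirst(packetSize)
        }
        return packets
    }
}

// Small demonstration with 4-byte packets and 6-byte read buffers.
var assembler = PacketAssembler(packetSize: 4)
print(assembler.append([1, 2, 3, 0, 0, 0], count: 3).count) // 0
print(assembler.append([4, 5, 0, 0, 0, 0], count: 2))       // [[1, 2, 3, 4]]
```

In the stream handler, the packet size would be 17640 and each returned packet would be passed to dataToPCMBuffer; leftover bytes stay in the assembler for the next read instead of being discarded.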
Furthermore, the dataToPCMBuffer method just converts NSData to an AVAudioPCMBuffer so that it can be played. Here is that method:
func dataToPCMBuffer(data: NSData) -> AVAudioPCMBuffer {
    let audioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 1, interleaved: false) // given NSData audio format
    let audioBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: UInt32(data.length) / audioFormat.streamDescription.pointee.mBytesPerFrame)
    audioBuffer.frameLength = audioBuffer.frameCapacity
    let channels = UnsafeBufferPointer(start: audioBuffer.floatChannelData, count: Int(audioBuffer.format.channelCount))
    data.getBytes(UnsafeMutableRawPointer(channels[0]) , length: data.length)
    return audioBuffer
}
Thank you in advance!
I think that in audioBufferToNSData you should use frameLength instead of frameCapacity:
let data = NSData(bytes: channels[0], length:Int(PCMBuffer.frameLength * PCMBuffer.format.streamDescription.pointee.mBytesPerFrame))
PCMBuffer.frameCapacity -> how many frames the buffer can store
PCMBuffer.frameLength -> how many of those frames currently hold valid data
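To make the difference concrete, here is a minimal sketch of the conversion sized by frameLength. It assumes a mono, non-interleaved Float32 buffer (as in the question), and the function name pcmBufferToData is hypothetical.

```swift
import AVFoundation

/// Copies only the valid frames of channel 0 into a Data value.
/// Assumes a non-interleaved Float32 buffer; returns nil for other sample formats.
func pcmBufferToData(_ buffer: AVAudioPCMBuffer) -> Data? {
    guard let channelData = buffer.floatChannelData else { return nil }
    // frameLength counts the frames that actually hold audio;
    // frameCapacity is only the size of the allocation.
    let byteCount = Int(buffer.frameLength) * MemoryLayout<Float>.size
    return Data(bytes: channelData[0], count: byteCount)
}
```

For a buffer with a capacity of 8 frames of which 3 are filled, this returns 12 bytes rather than 32, so no stale or uninitialized samples are sent.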
I'm trying to get the float data of real-time mic input with AVAudioEngine, so that I can run an FFT and then a special algorithm on the FFT output.
When I run the code, I get this output on the console: 0x0000000000000000
What am I doing wrong?
Many thanks for your help.
Here is my code to get the float data:
let audioEngine = AVAudioEngine()
override func loadView() {
    super.loadView()
    let inputNode = audioEngine.inputNode
    let bus = 0
    inputNode!.installTapOnBus(bus, bufferSize: 2048, format: inputNode!.inputFormatForBus(bus)) {
        (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        print(buffer.floatChannelData[50])
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch {
        print("Error")
    }
}
floatChannelData is a pointer to a pointer, so if you want the first channel (which is all you'll get on iOS unless you plug in a stereo microphone), you can do this:
let firstChannel = buffer.floatChannelData[0]
let arr = Array(UnsafeBufferPointer(start: firstChannel, count: Int(buffer.frameLength)))
// Do something with your array of Floats
I have two classes, MicrophoneHandler and AudioPlayer. I have managed to use AVCaptureSession to tap microphone data using the approved answer here, and converted the CMSampleBuffer to NSData using this function:
func sendDataToDelegate(buffer: CMSampleBuffer!)
{
    let block = CMSampleBufferGetDataBuffer(buffer)
    var length = 0
    var data: UnsafeMutablePointer<Int8> = nil
    var status = CMBlockBufferGetDataPointer(block!, 0, nil, &length, &data) // TODO: check for errors
    let result = NSData(bytesNoCopy: data, length: length, freeWhenDone: false)
    self.delegate.handleBuffer(result)
}
I would now like to play the audio over the speaker by converting the NSData produced above to an AVAudioPCMBuffer and playing it using AVAudioEngine. My AudioPlayer class is as follows:
var engine: AVAudioEngine!
var playerNode: AVAudioPlayerNode!
var mixer: AVAudioMixerNode!
override init()
{
    super.init()
    self.setup()
    self.start()
}
func handleBuffer(data: NSData)
{
    let newBuffer = self.toPCMBuffer(data)
    print(newBuffer)
    self.playerNode.scheduleBuffer(newBuffer, completionHandler: nil)
}
func setup()
{
    self.engine = AVAudioEngine()
    self.playerNode = AVAudioPlayerNode()
    self.engine.attachNode(self.playerNode)
    self.mixer = engine.mainMixerNode
    engine.connect(self.playerNode, to: self.mixer, format: self.mixer.outputFormatForBus(0))
}
func start()
{
    do {
        try self.engine.start()
    }
    catch {
        print("error couldn't start engine")
    }
    self.playerNode.play()
}
func toPCMBuffer(data: NSData) -> AVAudioPCMBuffer
{
    let audioFormat = AVAudioFormat(commonFormat: AVAudioCommonFormat.PCMFormatFloat32, sampleRate: 8000, channels: 2, interleaved: false) // given NSData audio format
    let PCMBuffer = AVAudioPCMBuffer(PCMFormat: audioFormat, frameCapacity: UInt32(data.length) / audioFormat.streamDescription.memory.mBytesPerFrame)
    PCMBuffer.frameLength = PCMBuffer.frameCapacity
    let channels = UnsafeBufferPointer(start: PCMBuffer.floatChannelData, count: Int(PCMBuffer.format.channelCount))
    data.getBytes(UnsafeMutablePointer<Void>(channels[0]) , length: data.length)
    return PCMBuffer
}
The buffer reaches the handleBuffer:buffer function when self.delegate.handleBuffer(result) is called in the first snippet above.
I am able to print(newBuffer), and see the memory locations of the converted buffers, but nothing comes out of the speakers. I can only imagine something is not consistent between the conversions to and from NSData. Any ideas? Thanks in advance.
Skip the raw NSData format
Why not use AVAudioPlayer all the way? If you positively need NSData, you can always load such data from the soundURL below. In this example, the disk buffer is something like:
let soundURL = documentDirectory.URLByAppendingPathComponent("sound.m4a")
It makes sense to record directly to a file anyway for optimal memory and resource management. You get NSData from your recording this way:
let data = NSFileManager.defaultManager().contentsAtPath(soundURL.path!)
The code below is all you need:
Record
if !audioRecorder.recording {
    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setActive(true)
        audioRecorder.record()
    } catch {}
}
Play
if (!audioRecorder.recording){
    do {
        try audioPlayer = AVAudioPlayer(contentsOfURL: audioRecorder.url)
        audioPlayer.play()
    } catch {}
}
Setup
let audioSession = AVAudioSession.sharedInstance()
do {
    try audioSession.setCategory(AVAudioSessionCategoryPlayAndRecord)
    try audioRecorder = AVAudioRecorder(URL: self.directoryURL()!,
        settings: recordSettings)
    audioRecorder.prepareToRecord()
} catch {}
Settings
let recordSettings = [AVSampleRateKey : NSNumber(float: Float(44100.0)),
    AVFormatIDKey : NSNumber(int: Int32(kAudioFormatMPEG4AAC)),
    AVNumberOfChannelsKey : NSNumber(int: 1),
    AVEncoderAudioQualityKey : NSNumber(int: Int32(AVAudioQuality.Medium.rawValue))]
Download Xcode Project:
You can find this very example here. Download the full project, which records and plays on both simulator and device, from Swift Recipes.
I'm really excited about the new AVAudioEngine. It seems like a good API wrapper around audio unit. Unfortunately the documentation is so far nonexistent, and I'm having problems getting a simple graph to work.
Using the following simple code to set up an audio engine graph, the tap block is never called. It mimics some of the sample code floating around the web, though those also did not work.
let inputNode = audioEngine.inputNode
var error: NSError?
let bus = 0
inputNode.installTapOnBus(bus, bufferSize: 2048, format: inputNode.inputFormatForBus(bus)) {
    (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
    println("sfdljk")
}
audioEngine.prepare()
if audioEngine.startAndReturnError(&error) {
    println("started audio")
} else {
    if let engineStartError = error {
        println("error starting audio: \(engineStartError.localizedDescription)")
    }
}
All I'm looking for is the raw pcm buffer for analysis. I don't need any effects or output. According to the WWDC talk "502 Audio Engine in Practice", this setup should work.
Now if you want to capture data from the input node, you can install a node tap and we've talked about that.
But what's interesting about this particular example is, if I wanted to work with just the input node, say just capture data from the microphone and maybe examine it, analyze it in real time or maybe write it out to file, I can directly install a tap on the input node.
And the tap will do the work of pulling the input node for data, stuffing it in buffers and then returning that back to the application.
Once you have that data you can do whatever you need to do with it.
Here are some links I tried:
http://hondrouthoughts.blogspot.com/2014/09/avfoundation-audio-monitoring.html
http://jamiebullock.com/post/89243252529/live-coding-audio-with-swift-playgrounds (SIGABRT in playground on startAndReturnError)
Edit: This is the implementation based on Thorsten Karrer's suggestion. It unfortunately does not work.
class AudioProcessor {
    let audioEngine = AVAudioEngine()
    init(){
        let inputNode = audioEngine.inputNode
        let bus = 0
        var error: NSError?
        inputNode.installTapOnBus(bus, bufferSize: 2048, format:inputNode.inputFormatForBus(bus)) {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            println("sfdljk")
        }
        audioEngine.prepare()
        audioEngine.startAndReturnError(nil)
        println("started audio")
    }
}
It might be the case that your AVAudioEngine is going out of scope and is released by ARC ("If you liked it then you should have put retain on it...").
The following code (engine is moved to an ivar and thus sticks around) fires the tap:
class AppDelegate: NSObject, NSApplicationDelegate {
    let audioEngine = AVAudioEngine()
    func applicationDidFinishLaunching(aNotification: NSNotification) {
        let inputNode = audioEngine.inputNode
        let bus = 0
        inputNode.installTapOnBus(bus, bufferSize: 2048, format: inputNode.inputFormatForBus(bus)) {
            (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            println("sfdljk")
        }
        audioEngine.prepare()
        audioEngine.startAndReturnError(nil)
    }
}
(I removed the error handling for brevity)
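The lifetime point can be seen without any audio code. In this minimal sketch, FakeEngine, Counter and Owner are hypothetical stand-ins (FakeEngine plays the role of AVAudioEngine): an instance held only by a local variable is gone once its scope ends, while one stored in a property lives as long as its owner.

```swift
/// Tracks how many FakeEngine instances are currently alive.
final class Counter { var live = 0 }

/// Hypothetical stand-in for AVAudioEngine.
final class FakeEngine {
    private let counter: Counter
    init(counter: Counter) { self.counter = counter; counter.live += 1 }
    deinit { counter.live -= 1 }
}

/// Holds the engine in a stored property, like the AppDelegate above.
final class Owner {
    let engine: FakeEngine
    init(counter: Counter) { engine = FakeEngine(counter: counter) }
}

/// Creates an engine in a local variable only; ARC releases it when the scope ends.
func liveAfterLocalUse(_ counter: Counter) -> Int {
    do {
        let engine = FakeEngine(counter: counter)
        _ = engine
    }
    return counter.live
}

let counter = Counter()
print(liveAfterLocalUse(counter))                        // 0
let owner = Owner(counter: counter)
withExtendedLifetime(owner) { print(counter.live) }      // 1
```

The same reasoning applies one level up: an engine stored in a property is only kept alive if the object that owns the property is itself retained somewhere.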
UPDATED: I have implemented a complete working example that records mic input, applies some effects (reverb, delay, distortion) at runtime, and saves the processed audio to an output file.
var engine = AVAudioEngine()
var distortion = AVAudioUnitDistortion()
var reverb = AVAudioUnitReverb()
var audioBuffer = AVAudioPCMBuffer()
var outputFile = AVAudioFile()
var delay = AVAudioUnitDelay()
//Initialize the audio engine
func initializeAudioEngine() {
    engine.stop()
    engine.reset()
    engine = AVAudioEngine()
    isRealTime = true
    do {
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord)
        let ioBufferDuration = 128.0 / 44100.0
        try AVAudioSession.sharedInstance().setPreferredIOBufferDuration(ioBufferDuration)
    } catch {
        assertionFailure("AVAudioSession setup error: \(error)")
    }
    let fileUrl = URLFor("/NewRecording.caf")
    print(fileUrl)
    do {
        try outputFile = AVAudioFile(forWriting: fileUrl!, settings: engine.mainMixerNode.outputFormatForBus(0).settings)
    }
    catch {
    }
    let input = engine.inputNode!
    let format = input.inputFormatForBus(0)
    //settings for reverb
    reverb.loadFactoryPreset(.MediumChamber)
    reverb.wetDryMix = 40 //0-100 range
    engine.attachNode(reverb)
    delay.delayTime = 0.2 // 0-2 range
    engine.attachNode(delay)
    //settings for distortion
    distortion.loadFactoryPreset(.DrumsBitBrush)
    distortion.wetDryMix = 20 //0-100 range
    engine.attachNode(distortion)
    engine.connect(input, to: reverb, format: format)
    engine.connect(reverb, to: distortion, format: format)
    engine.connect(distortion, to: delay, format: format)
    engine.connect(delay, to: engine.mainMixerNode, format: format)
    assert(engine.inputNode != nil)
    isReverbOn = false
    try! engine.start()
}
//Now the recording function:
func startRecording() {
    let mixer = engine.mainMixerNode
    let format = mixer.outputFormatForBus(0)
    mixer.installTapOnBus(0, bufferSize: 1024, format: format, block:
        { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
            print(NSString(string: "writing"))
            do {
                try self.outputFile.writeFromBuffer(buffer)
            }
            catch {
                print(NSString(string: "Write failed"));
            }
    })
}
func stopRecording() {
    engine.mainMixerNode.removeTapOnBus(0)
    engine.stop()
}
I hope this might help you. Thanks!
The above answer didn't work for me but the following did. I'm installing a tap on a mixer node.
mMixerNode?.installTapOnBus(0, bufferSize: 4096, format: mMixerNode?.outputFormatForBus(0),
    {
        (buffer: AVAudioPCMBuffer!, time:AVAudioTime!) -> Void in
        NSLog("tapped")
    }
)
Nice topic.
Hi brodney,
I found my solution in your topic. Here is a similar topic: Generate AVAudioPCMBuffer with AVAudioRecorder.
See the WWDC 2014 session 502, "AVAudioEngine in Practice": capturing the microphone is covered at around 20:00, and creating a buffer with tap code at around 21:50.
Here is the Swift 3 code:
@IBAction func button01Pressed(_ sender: Any) {
    let inputNode = audioEngine.inputNode
    let bus = 0
    inputNode?.installTap(onBus: bus, bufferSize: 2048, format: inputNode?.inputFormat(forBus: bus)) {
        (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
        var theLength = Int(buffer.frameLength)
        print("theLength = \(theLength)")
        var samplesAsDoubles:[Double] = []
        for i in 0 ..< Int(buffer.frameLength)
        {
            var theSample = Double((buffer.floatChannelData?.pointee[i])!)
            samplesAsDoubles.append( theSample )
        }
        print("samplesAsDoubles.count = \(samplesAsDoubles.count)")
    }
    audioEngine.prepare()
    try! audioEngine.start()
}
To stop the audio:
func stopAudio()
{
    let inputNode = audioEngine.inputNode
    let bus = 0
    inputNode?.removeTap(onBus: bus)
    self.audioEngine.stop()
}