I'm working on adding a feature to an existing app that takes audio input from the device microphone, converts it to the frequency domain via an FFT, and sends the result to a Core ML model. I'm using a standard AVCaptureDevice:
guard let microphone = AVCaptureDevice.default(.builtInMicrophone,
                                               for: .audio,
                                               position: .unspecified),
      let microphoneInput = try? AVCaptureDeviceInput(device: microphone) else {
    fatalError("Can't create microphone.")
}
The issue is that I need a custom sample rate for the microphone. According to Apple's documentation, setPreferredSampleRate(_:) (link) should be able to set it to a value in the 8000-48000 Hz range. However, no matter which value I choose, the sample rate doesn't change and no error is thrown:
print("Microphone sample rate: ", AVAudioSession.sharedInstance().sampleRate)
do { try AVAudioSession.sharedInstance().setPreferredSampleRate(20000) }
catch { print("Unable to set microphone sampling rate!") }
print("Microphone sample rate: ", AVAudioSession.sharedInstance().sampleRate)
Output:
Microphone sample rate: 48000.0
Microphone sample rate: 48000.0
How can I set the sample rate on iOS devices?
EDIT:
Following the suggestion of using AVAudioConverter to resample the microphone input: what is the most efficient way of doing this, given that I'm using AVCaptureAudioDataOutputSampleBufferDelegate and the corresponding captureOutput method to collect the raw audio input from the microphone?
extension AudioSpectrogram: AVCaptureAudioDataOutputSampleBufferDelegate {
    public func captureOutput(_ output: AVCaptureOutput,
                              didOutput sampleBuffer: CMSampleBuffer,
                              from connection: AVCaptureConnection) {

        var audioBufferList = AudioBufferList()
        var blockBuffer: CMBlockBuffer?

        CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
            sampleBuffer,
            bufferListSizeNeededOut: nil,
            bufferListOut: &audioBufferList,
            bufferListSize: MemoryLayout.stride(ofValue: audioBufferList),
            blockBufferAllocator: nil,
            blockBufferMemoryAllocator: nil,
            flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
            blockBufferOut: &blockBuffer)

        guard let data = audioBufferList.mBuffers.mData else {
            return
        }

        // Append the incoming 16-bit samples until enough are buffered for one FFT window.
        if self.rawAudioData.count < self.sampleCount * 2 {
            let actualSampleCount = CMSampleBufferGetNumSamples(sampleBuffer)
            let ptr = data.bindMemory(to: Int16.self, capacity: actualSampleCount)
            let buf = UnsafeBufferPointer(start: ptr, count: actualSampleCount)
            rawAudioData.append(contentsOf: Array(buf))
        }

        // Process overlapping windows of `sampleCount` samples, advancing by `hopCount`.
        while self.rawAudioData.count >= self.sampleCount {
            let dataToProcess = Array(self.rawAudioData[0 ..< self.sampleCount])
            self.rawAudioData.removeFirst(self.hopCount)
            self.processData(values: dataToProcess)
        }
    }
}
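One possible direction for the resampling, sketched here purely for illustration: keep a single AVAudioConverter around and run each chunk of captured Int16 samples through it before appending to rawAudioData. The formats, the 16 kHz target rate, and the resample(_:) helper below are assumptions, not code from the original question or from Apple's sample.

import AVFoundation

// Assumptions for illustration: the capture side delivers 16-bit mono PCM at the
// device's native 48 kHz, and 16 kHz is the rate the Core ML model expects.
let nativeFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                 sampleRate: 48_000,
                                 channels: 1,
                                 interleaved: false)!
let targetFormat = AVAudioFormat(commonFormat: .pcmFormatInt16,
                                 sampleRate: 16_000,
                                 channels: 1,
                                 interleaved: false)!
// Create the converter once and reuse it across captureOutput calls.
let converter = AVAudioConverter(from: nativeFormat, to: targetFormat)!

func resample(_ samples: [Int16]) -> [Int16] {
    // Wrap the raw Int16 samples in an AVAudioPCMBuffer at the native rate.
    let inBuffer = AVAudioPCMBuffer(pcmFormat: nativeFormat,
                                    frameCapacity: AVAudioFrameCount(samples.count))!
    inBuffer.frameLength = inBuffer.frameCapacity
    for (index, sample) in samples.enumerated() {
        inBuffer.int16ChannelData![0][index] = sample
    }

    // Size the output buffer for the rate ratio, with a little headroom.
    let ratio = targetFormat.sampleRate / nativeFormat.sampleRate
    let outCapacity = AVAudioFrameCount(Double(samples.count) * ratio) + 32
    let outBuffer = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: outCapacity)!

    // Feed the single input buffer to the converter and collect the resampled output.
    var inputConsumed = false
    var conversionError: NSError?
    let status = converter.convert(to: outBuffer, error: &conversionError) { _, outStatus in
        if inputConsumed {
            outStatus.pointee = .noDataNow
            return nil
        }
        inputConsumed = true
        outStatus.pointee = .haveData
        return inBuffer
    }
    guard status != .error, conversionError == nil else { return [] }

    return Array(UnsafeBufferPointer(start: outBuffer.int16ChannelData![0],
                                     count: Int(outBuffer.frameLength)))
}

With something like this in place, captureOutput could pass Array(buf) through resample(_:) and append the result instead, leaving the windowing loop untouched (sampleCount and hopCount would then refer to samples at the new rate).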
Related
So I am using ReplayKit to try to stream my phone screen to a web browser.
override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
    //if source!.isSocketConnected {
    switch sampleBufferType {
    case RPSampleBufferType.video:
        // Handle video sample buffer
        break
    case RPSampleBufferType.audioApp:
        // Handle audio sample buffer for app audio
        break
    case RPSampleBufferType.audioMic:
        // Handle audio sample buffer for mic audio
        break
    @unknown default:
        break
    }
}
So how do we send that data to WebRTC?
In order to use WebRTC, I learned that you need a signaling server.
Is it possible to start a signaling server on the phone itself, just like an HTTP server?
Hi Sam, WebRTC has a function that can process CMSampleBuffer frames to get video frames, but it works with CVPixelBuffer. So you first have to convert your CMSampleBuffer to a CVPixelBuffer, and then add those frames to your localVideoSource with an RTCVideoCapturer. I solved a similar problem with AVCaptureVideoDataOutputSampleBufferDelegate, which produces CMSampleBuffer just like ReplayKit does. I hope the code below helps you solve your problem.
private var videoCapturer: RTCVideoCapturer?
private var localVideoSource = RTCClient.factory.videoSource()
private var localVideoTrack: RTCVideoTrack?
private var remoteVideoTrack: RTCVideoTrack?
private var peerConnection: RTCPeerConnection? = nil

public static let factory: RTCPeerConnectionFactory = {
    RTCInitializeSSL()
    let videoEncoderFactory = RTCDefaultVideoEncoderFactory()
    let videoDecoderFactory = RTCDefaultVideoDecoderFactory()
    return RTCPeerConnectionFactory(encoderFactory: videoEncoderFactory, decoderFactory: videoDecoderFactory)
}()

extension RTCClient: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        print("didOutput: \(sampleBuffer)")
        guard let imageBuffer: CVImageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        let timeStampNs: Int64 = Int64(CMTimeGetSeconds(CMSampleBufferGetPresentationTimeStamp(sampleBuffer)) * 1_000_000_000)
        let rtcPixelBuffer = RTCCVPixelBuffer(pixelBuffer: imageBuffer)
        let rtcVideoFrame = RTCVideoFrame(buffer: rtcPixelBuffer, rotation: ._90, timeStampNs: timeStampNs)
        self.localVideoSource.capturer(videoCapturer!, didCapture: rtcVideoFrame)
    }
}
You also need a configuration like this for the media senders:
func createMediaSenders() {
    let streamId = "stream"
    let videoTrack = self.createVideoTrack()
    self.localVideoTrack = videoTrack
    self.peerConnection!.add(videoTrack, streamIds: [streamId])
    self.remoteVideoTrack = self.peerConnection!.transceivers.first { $0.mediaType == .video }?.receiver.track as? RTCVideoTrack
}

private func createVideoTrack() -> RTCVideoTrack {
    // Uses the localVideoSource declared above.
    let videoTrack = RTCClient.factory.videoTrack(with: self.localVideoSource, trackId: "video0")
    return videoTrack
}
I have gone through the Apple sample code on Equalizing Audio with vDSP, where the audio file is filtered in an AVAudioSourceNode and played back.
My objective is to do exactly the same, but instead of taking the audio from an audio file, take it in real time from the microphone. Is it possible to do so with AVAudioEngine? A couple of ways to do so are based on installTap or AVAudioSinkNode, as described in the First strategy and Second strategy sections below.
So far, I got a bit closer to my objective with the following two strategies.
First strategy
// Added new class variables
private lazy var sinkNode = AVAudioSinkNode { (timestep, frames, audioBufferList) -> OSStatus in
    let ptr = audioBufferList.pointee.mBuffers.mData?.assumingMemoryBound(to: Float.self)

    var monoSamples = [Float]()
    monoSamples.append(contentsOf: UnsafeBufferPointer(start: ptr, count: Int(frames)))
    self.page = monoSamples

    for frame in 0..<frames {
        print("sink: " + String(monoSamples[Int(frame)]))
    }
    return noErr
}
// AVAudioEngine connections
engine.attach(sinkNode)

// Audio input is passed to the AVAudioSinkNode and the [Float] array is passed to the AVAudioSourceNode through the `page` variable
engine.connect(input, to: sinkNode, format: formatt)

engine.attach(srcNode)
engine.connect(srcNode,
               to: engine.mainMixerNode,
               format: format)
engine.connect(engine.mainMixerNode,
               to: engine.outputNode,
               format: format)

// The AVAudioSourceNode accesses the self.page array through the getSignalElement() function.
private func getSignalElement() -> Float {
    return page.isEmpty ? 0 : page.removeFirst()
}
This approach made it possible to play the audio through the AVAudioSourceNode, but the audio stops playing after a few seconds (even though I still successfully receive the self.page array in the AVAudioSourceNode) and the app eventually crashes.
Second strategy
In a similar approach, I used installTap:
engine.attach(srcNode)
engine.connect(srcNode,
               to: engine.mainMixerNode,
               format: format)
engine.connect(engine.mainMixerNode,
               to: engine.outputNode,
               format: format)

input.installTap(onBus: 0, bufferSize: 1024, format: formatt, block: { [weak self] buffer, when in
    let arraySize = Int(buffer.frameLength)
    let samples = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: arraySize))
    self!.page = samples
})

// The AVAudioSourceNode accesses the self.page array through the getSignalElement() function.
private func getSignalElement() -> Float {
    return page.isEmpty ? 0 : page.removeFirst()
}
The outcome of the second strategy is the same as with the first one. What could be the issues that make these approaches fail?
You can use AVAudioEngine's inputNode like the following:
let engine = AVAudioEngine()

private lazy var srcNode = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    return noErr
}

// Attach first
engine.attach(srcNode)

// Then connect the nodes
let input = engine.inputNode
engine.connect(input, to: srcNode, format: input.inputFormat(forBus: 0))
It is important to use input.inputFormat(...) as the format type.
do {
    try audioSession.setCategory(.playAndRecord, mode: .default, options: [.mixWithOthers, .defaultToSpeaker, .allowBluetoothA2DP, .allowAirPlay, .allowBluetooth])
    try audioSession.setActive(true)
} catch {
    print(error.localizedDescription)
}

engine.attach(player)

// Add this only if you want pitch shifting
let pitch = AVAudioUnitTimePitch()
// pitch.pitch = 1000 // Filtered voice
// pitch.rate = 1     // Normal rate
// engine.attach(pitch)

engine.attach(srcNode)
engine.connect(srcNode,
               to: engine.mainMixerNode,
               format: engine.inputNode.inputFormat(forBus: 0))
engine.connect(engine.mainMixerNode,
               to: engine.outputNode,
               format: engine.inputNode.inputFormat(forBus: 0))
engine.prepare()

engine.inputNode.installTap(onBus: 0, bufferSize: 512, format: engine.inputNode.inputFormat(forBus: 0)) { (buffer, time) -> Void in
    // self.player.scheduleBuffer(buffer)
    let arraySize = Int(buffer.frameLength)
    let samples = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: arraySize))
    self.page = samples
    print("samples", samples)
}
engine.mainMixerNode.outputVolume = 0.5
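What neither strategy shows is the body of the AVAudioSourceNode render block itself. Below is a minimal sketch of what it could look like if it pulls the captured samples back out of page through getSignalElement(), assuming the graph runs with a deinterleaved Float32 format; this is not code from the original question or answer. Note that page is written on the tap/sink thread and read on the audio render thread, so a real implementation would want a lock-free ring buffer rather than a plain Swift array.

private lazy var srcNode = AVAudioSourceNode { (_, _, frameCount, audioBufferList) -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        // Pull the next captured sample (0 when page is empty) and copy it to every output channel.
        let value = self.getSignalElement()
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = value
        }
    }
    return noErr
}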
We would like to use WebRTC to send an iOS device's screen capture using ReplayKit.
ReplayKit has a processSampleBuffer callback which provides a CMSampleBuffer.
But here is where we are stuck, we can’t seem to get the CMSampleBuffer to be sent to the connected peer.
We have tried to create pixelBuffer from the sampleBuffer, and then create RTCVideoFrame.
We also extracted the RTCVideoSource from RTCPeerConnectionFactory and then used an RTCVideoCapturer to stream it to the localVideoSource.
Any idea what we are doing wrong?
var peerConnectionFactory: RTCPeerConnectionFactory?

override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
    switch sampleBufferType {
    case RPSampleBufferType.video:
        // create the CVPixelBuffer
        let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!

        // create the RTCVideoFrame
        var videoFrame: RTCVideoFrame?
        let timestamp = NSDate().timeIntervalSince1970 * 1000
        videoFrame = RTCVideoFrame(pixelBuffer: pixelBuffer, rotation: RTCVideoRotation._0, timeStampNs: Int64(timestamp))

        // connect the video frames to the WebRTC
        let localVideoSource = self.peerConnectionFactory!.videoSource()
        let videoCapturer = RTCVideoCapturer()
        localVideoSource.capturer(videoCapturer, didCapture: videoFrame!)

        let videoTrack: RTCVideoTrack = self.peerConnectionFactory!.videoTrack(with: localVideoSource, trackId: "100")
        let mediaStream: RTCMediaStream = (self.peerConnectionFactory?.mediaStream(withStreamId: "1"))!
        mediaStream.addVideoTrack(videoTrack)
        self.newPeerConnection!.add(mediaStream)
        break
    default:
        break
    }
}
This is a great idea to implement. You just have to render the RTCVideoFrame in the method you used in your snippet, and initialize all the other objects outside that method; that is the better way. For a better understanding, here is a snippet.
var peerConnectionFactory: RTCPeerConnectionFactory?
var localVideoSource: RTCVideoSource?
var videoCapturer: RTCVideoCapturer?

func setupVideoCapturer() {
    // localVideoSource and videoCapturer will be used later
    localVideoSource = self.peerConnectionFactory!.videoSource()
    videoCapturer = RTCVideoCapturer()
    // localVideoSource.capturer(videoCapturer, didCapture: videoFrame!)
    let videoTrack: RTCVideoTrack = self.peerConnectionFactory!.videoTrack(with: localVideoSource!, trackId: "100")
    let mediaStream: RTCMediaStream = (self.peerConnectionFactory?.mediaStream(withStreamId: "1"))!
    mediaStream.addVideoTrack(videoTrack)
    self.newPeerConnection!.add(mediaStream)
}

override func processSampleBuffer(_ sampleBuffer: CMSampleBuffer, with sampleBufferType: RPSampleBufferType) {
    switch sampleBufferType {
    case RPSampleBufferType.video:
        // create the CVPixelBuffer
        let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!

        // create the RTCVideoFrame
        var videoFrame: RTCVideoFrame?
        let timestamp = NSDate().timeIntervalSince1970 * 1000
        videoFrame = RTCVideoFrame(pixelBuffer: pixelBuffer, rotation: RTCVideoRotation._0, timeStampNs: Int64(timestamp))

        // connect the video frames to the WebRTC
        localVideoSource!.capturer(videoCapturer!, didCapture: videoFrame!)
        break
    default:
        break
    }
}
Hope this will help you.
I have been researching this for four days, but I have not found any solution for calling over Bluetooth between two iOS devices within range.
I found that audio streaming is possible between two iOS devices using the Multipeer Connectivity framework, but that is not helpful for me. I want real-time voice chat between two devices over Bluetooth.
Is there any codec for voice over Bluetooth?
My code is:
var engine = AVAudioEngine()
var file: AVAudioFile?
var player = AVAudioPlayerNode()
var input: AVAudioInputNode?
var mixer: AVAudioMixerNode?

override func viewDidLoad() {
    super.viewDidLoad()
    mixer = engine.mainMixerNode
    input = engine.inputNode
    engine.connect(input!, to: mixer!, format: input!.inputFormat(forBus: 0))
}

@IBAction func btnStremeDidClicked(_ sender: UIButton) {
    mixer?.installTap(onBus: 0, bufferSize: 2048, format: mixer?.outputFormat(forBus: 0), block: { (buffer: AVAudioPCMBuffer, AVAudioTime) in
        let byteWritten = self.audioBufferToData(audioBuffer: buffer).withUnsafeBytes {
            self.appDelegate.mcManager.outputStream?.write($0, maxLength: self.audioBufferToData(audioBuffer: buffer).count)
        }
        print(byteWritten ?? 0)
        print("Write")
    })

    do {
        try engine.start()
    } catch {
        print(error.localizedDescription)
    }
}

func audioBufferToData(audioBuffer: AVAudioPCMBuffer) -> Data {
    let channelCount = 1
    let bufferLength = (audioBuffer.frameCapacity * audioBuffer.format.streamDescription.pointee.mBytesPerFrame)
    let channels = UnsafeBufferPointer(start: audioBuffer.floatChannelData, count: channelCount)
    let data = Data(bytes: channels[0], count: Int(bufferLength))
    return data
}
Thanks in Advance :)
Why is Multipeer Connectivity not helpful for you? It is a great way to stream audio over Bluetooth or even Wi-Fi.
When you call this:
audioEngine.installTap(onBus: 0, bufferSize: 17640, format: localInputFormat) {
(buffer, when) -> Void in
You need to use the buffer, which has type AVAudioPCMBuffer. You then need to convert that to NSData and write to the outputStream that you would've opened with the peer:
data = someConversionMethod(buffer)
_ = stream!.write(data.bytes.assumingMemoryBound(to: UInt8.self), maxLength: data.length)
Then on the other device you need to read from the stream and convert the NSData back to an AVAudioPCMBuffer, and then you can use an AVAudioPlayerNode to play back the buffer.
I have done this before with a very minimal delay.
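As a rough sketch of what those two conversions could look like, assuming mono Float32 PCM on both sides and a receive-side format that matches the tap's format; the helper names dataFromPCMBuffer and pcmBufferFromData are made up for illustration.

import AVFoundation

// Sender side: copy the valid frames of the tap buffer into a Data blob.
func dataFromPCMBuffer(_ buffer: AVAudioPCMBuffer) -> Data {
    let byteCount = Int(buffer.frameLength) * MemoryLayout<Float>.size
    return Data(bytes: buffer.floatChannelData![0], count: byteCount)
}

// Receiver side: rebuild an AVAudioPCMBuffer from the received bytes.
func pcmBufferFromData(_ data: Data, format: AVAudioFormat) -> AVAudioPCMBuffer? {
    let frameCount = AVAudioFrameCount(data.count / MemoryLayout<Float>.size)
    guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else {
        return nil
    }
    buffer.frameLength = frameCount
    _ = data.withUnsafeBytes { (raw: UnsafeRawBufferPointer) in
        memcpy(buffer.floatChannelData![0], raw.baseAddress!, data.count)
    }
    return buffer
}

The rebuilt buffer can then be scheduled on an AVAudioPlayerNode attached to a running AVAudioEngine, e.g. playerNode.scheduleBuffer(buffer, completionHandler: nil).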
I have set up EZAudio in Swift to calculate the FFT of the real-time mic input, and then I run a custom algorithm over the FFT data.
My problem is that I can only access the FFT data when I put this in the view controller, with dispatch_async (see the last func in the code):
class MasterKey: NSObject, EZMicrophoneDelegate, EZAudioFFTDelegate {

    var microphone: EZMicrophone!
    var fft: EZAudioFFTRolling!
    var tone: String = ""
    var sampleRate: Float = 0.0
    var fftWindowSize: vDSP_Length = 8192
    var keys: MKHRangeToKey!

    init(tone: String) {
        super.init()
        self.tone = tone

        /*
         * setup all dependencies for the fft analysis
         */

        // setup audio session
        let session = AVAudioSession.sharedInstance()
        do {
            try session.setCategory(AVAudioSessionCategoryPlayAndRecord)
            try session.setActive(true)
        } catch {
            print("Audio Session setup Fails")
        }

        // create a mic instance
        microphone = EZMicrophone(delegate: self, startsImmediately: true)
        self.sampleRate = Float(microphone.audioStreamBasicDescription().mSampleRate)

        // create a fft instance
        fft = EZAudioFFTRolling(windowSize: fftWindowSize, sampleRate: sampleRate, delegate: self)

        // start the mic
        microphone.startFetchingAudio()

        self.keys = MKHRangeToKey(tone: tone, sampleRate: sampleRate, fftWindowSize: Int(fftWindowSize))
    }

    // get the mic data
    func microphone(microphone: EZMicrophone!, hasAudioReceived buffer: UnsafeMutablePointer<UnsafeMutablePointer<Float>>, withBufferSize bufferSize: UInt32, withNumberOfChannels numberOfChannels: UInt32) {
        // calc the fft
        if fft != nil {
            fft.computeFFTWithBuffer(buffer[0], withBufferSize: bufferSize)
        }
    }

    // get the fft data from the last calculation
    func fft(fft: EZAudioFFT!, updatedWithFFTData fftData: UnsafeMutablePointer<Float>, bufferSize: vDSP_Length) {
        dispatch_async(dispatch_get_main_queue(), {
            print(fftData)
        })
    }
}
But how can I put this in a separate class so I can call it when I need it?
Please, please help.
You indicated you're using Swift. Why not just create a separate AudioFunctions.swift file and move the function (and anything related) there? You can call it from anywhere in your app without worrying about an include.
Important Note: A function doesn't have to belong to a class.
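As a minimal sketch of that suggestion (the file name, the function name, and its body are just placeholders), the FFT handling could live in a free function in AudioFunctions.swift and be called from MasterKey's delegate callback:

// AudioFunctions.swift
import Foundation
import Accelerate

// A free function, no class required, that any callback can hand the FFT data to.
func processFFTData(fftData: UnsafeMutablePointer<Float>, bufferSize: vDSP_Length) {
    let frame = Array(UnsafeBufferPointer(start: fftData, count: Int(bufferSize)))
    // Run the custom algorithm over `frame` here.
    print(frame.count)
}

Inside MasterKey's EZAudioFFTDelegate callback, the dispatch_async block can then call this function with fftData and bufferSize instead of just printing.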