I have a USB audio acquisition device (2 channels, one microphone per channel).
This device is connected to an iPhone.
The aim is to design a program in Swift that stores the data in a buffer and then in PCM arrays (one array per channel).
I tested the connection between the iPhone and the acquisition device using an AVAudioSession object.
The iPhone recognises the device (it is iOS compatible according to the manufacturer).
The process doesn't need to work in real time:
1) Data acquisition
2) Stop acquisition
3) Finally, process the data stored in the PCM arrays (not in a file)
But I must confess that I'm a little bit lost...
Do I have to consider two channels or one channel (one frame with one sample per channel)?
(For example: audioSession.inputNumberOfChannels = 1 or 2?)
Do I have to use Audio Queue or Audio Unit objects, or something else?
I carried out several investigations on the Internet, but I'm still lost...
I would appreciate it if you could get me on track by giving me some tips.
Thanks a lot for your support!
Best regards.
Jeff
So I have two questions:
Is there another (maybe lower-level) way to get float* samples of the audio that is currently playing?
Is it possible to do this from inside a framework? I mean, when you don't have access to the instance of AVPlayer (or AVAudioPlayerNode, AVAudioEngine, or even the low-level Core Audio classes, whatever) that owns the audio file, is there a way to subscribe to the audio samples being played through the speakers/earphones (in order to analyze them, or perhaps to modify/equalize them)?
I've tried installing a tap on audioEngine.mainMixerNode, which works, but when I set the bufferSize to more than 4096 (in order to compute a high-density FFT), the callback is called less frequently than it should be (about 3 times per second instead of 30 times or even more frequently).
mixerNode.installTap(onBus: 0,
                     bufferSize: 16384, // or 8192
                     format: mixerNode.outputFormat(forBus: 0)) { [weak self] (buffer, time) in
    // this block is being called LESS frequently than expected...
}
I know that Core Audio is very powerful and there should be something for this kind of purpose...
An iOS app can only access raw PCM samples of audio that the app itself is playing. Any visibility into samples output by other apps or processes is blocked by the iOS security sandbox. An iOS app can, however, sample audio from the device's microphone.
In an audio engine tap-on-bus, audio samples are delivered to the application's main thread, and thus limited in callback frequency and latency. In order to get the most recent few milliseconds of microphone audio samples, an app needs to use the RemoteIO Audio Unit callback API, where audio samples can be delivered in a high-priority audio context thread.
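If you go the RemoteIO route, the setup looks roughly like the sketch below, using the AUAudioUnit wrapper. This is a minimal illustration, not production code: the names are mine, audio session configuration and microphone permission are assumed to be handled elsewhere, and error handling and buffer reuse are omitted.

import AVFoundation
import AudioToolbox

func startRemoteIOCapture() throws -> AUAudioUnit {
    let desc = AudioComponentDescription(componentType: kAudioUnitType_Output,
                                         componentSubType: kAudioUnitSubType_RemoteIO,
                                         componentManufacturer: kAudioUnitManufacturer_Apple,
                                         componentFlags: 0,
                                         componentFlagsMask: 0)
    let unit = try AUAudioUnit(componentDescription: desc)

    // Bus 1 is the microphone side of RemoteIO; bus 0 is the speaker side.
    let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 1)!
    try unit.inputBusses[0].setFormat(format)
    try unit.outputBusses[1].setFormat(format)
    unit.isInputEnabled = true

    try unit.allocateRenderResources()
    let render = unit.renderBlock   // cache the render block before using it on the audio thread

    unit.inputHandler = { actionFlags, timestamp, frameCount, _ in
        // Runs on a high-priority audio thread: avoid allocation and locks here.
        var bufferList = AudioBufferList(
            mNumberBuffers: 1,
            mBuffers: AudioBuffer(mNumberChannels: 1,
                                  mDataByteSize: frameCount * 4,
                                  mData: nil))   // nil lets the unit supply its own buffer
        _ = render(actionFlags, timestamp, frameCount, 1, &bufferList, nil)
        if let data = bufferList.mBuffers.mData {
            let samples = data.assumingMemoryBound(to: Float.self)
            // samples[0..<Int(frameCount)] now holds the most recent microphone audio.
            _ = samples
        }
    }

    try unit.startHardware()
    return unit
}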
We have a VOIP app that generally transfers audio packets with a sample rate of 32 kHz. For what would seem to be a reasonable match, we've typically set the AVAudioSession's preferred sample rate to 32 kHz as well. On later iPhones (e.g. the iPhone XS) we've found that the speakerphone no longer plays, or is garbled, when using a sample rate of 32 kHz. Yet the audio session seems to happily accept (with read-back confirmation) a preferredSampleRate of 32 kHz. I've read that the iPhone 6S codec (and perhaps later devices) only supports 48 kHz sample rates... but if that is the case, why wouldn't iOS override the setPreferredSampleRate?
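For reference, the configuration described above amounts to roughly the sketch below (the category/mode choices are illustrative; preferredSampleRate is only a request, and sampleRate reports what the hardware actually granted):

import AVFoundation

do {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
    try session.setPreferredSampleRate(32_000)   // a request, not a guarantee
    try session.setActive(true)
    // preferredSampleRate echoes the request; sampleRate is what the hardware runs at.
    print("preferred:", session.preferredSampleRate, "actual:", session.sampleRate)
} catch {
    print("audio session setup failed:", error)
}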
Is it possible to have a common implementation of a Core Audio based audio driver bridge for iOS and OS X? Or is there a difference between the Core Audio API on iOS and the Core Audio API on OS X?
The audio bridge only needs to support the following methods:
Set desired sample rate
Set desired audio block size (in samples)
Start/Stop microphone stream
Start/Stop speaker stream
The application supplies 2 callback function pointers to the audio bridge and the audio bridge sets everything up so that:
The speaker callback is called on regular time intervals where it's requested to return an audio block
The microphone callback is called on regular time intervals where it receives an audio block
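In Swift terms, the bridge interface described above would amount to something like the sketch below (the names are illustrative, not taken from any existing library):

import Foundation

// One callback type for both directions: a pointer to an audio block plus its
// length in samples. The speaker callback fills the block; the microphone
// callback consumes it.
typealias AudioBlockCallback = (UnsafeMutablePointer<Float>, Int) -> Void

protocol AudioBridge {
    func setSampleRate(_ hertz: Double)
    func setBlockSize(_ samples: Int)
    func startMicrophone(onBlockCaptured: @escaping AudioBlockCallback)
    func stopMicrophone()
    func startSpeaker(onBlockRequested: @escaping AudioBlockCallback)
    func stopSpeaker()
}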
I was told that it's not possible to have a single implementation that works on both iOS and OS X, as there are differences between the iOS Core Audio API and the OS X Core Audio API.
Is this true?
There are no significant differences between the Core Audio API on OS X and on iOS. However, there are significant differences in obtaining the correct Audio Unit to use for the microphone and the speaker. There are only two such units on iOS (RemoteIO and one for VOIP), but more, and potentially many more, on a Mac, plus the user might change the selection. There are also differences in some of the Audio Unit parameters (buffer sizes, sample rates, etc.) allowed/supported by the hardware.
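In practice the platform-specific part can be isolated to choosing the I/O unit, along the lines of the sketch below; the render/callback code built on top of it can then be shared:

import AudioToolbox

func ioUnitDescription() -> AudioComponentDescription {
    #if os(iOS)
    let subType = kAudioUnitSubType_RemoteIO      // the single general-purpose I/O unit on iOS
    #else
    let subType = kAudioUnitSubType_HALOutput     // hardware (HAL) I/O unit on OS X
    #endif
    return AudioComponentDescription(componentType: kAudioUnitType_Output,
                                     componentSubType: subType,
                                     componentManufacturer: kAudioUnitManufacturer_Apple,
                                     componentFlags: 0,
                                     componentFlagsMask: 0)
}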
I'm trying to sync music sent from a host iPhone to a client iPhone. The audio is read using AVAssetReader and sent in packets to the client, which in turn feeds it into a ring buffer, which in turn populates the AudioQueue buffers and starts playing.
I was going over the AudioQueue docs and there seem to be two different concepts of a timestamp related to the AudioQueue: Audio Queue Time and Audio Queue Device Time. I'm not sure how these two are related and when one should be used rather than (or in conjunction with) the other.
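As a rough illustration of the two timelines, both can be read from a running queue; the sketch below assumes `queue` is an AudioQueueRef that has already been started:

import AudioToolbox

func logQueueTimes(_ queue: AudioQueueRef) {
    var queueTime = AudioTimeStamp()
    var deviceTime = AudioTimeStamp()

    // Audio Queue Time: the queue's own timeline, in sample frames of the
    // queue's format, counted from when the queue started.
    let queueStatus = AudioQueueGetCurrentTime(queue, nil, &queueTime, nil)

    // Audio Queue Device Time: the timeline of the hardware device the queue
    // is attached to; it keeps advancing even while the queue is stopped.
    let deviceStatus = AudioQueueDeviceGetCurrentTime(queue, &deviceTime)

    guard queueStatus == noErr, deviceStatus == noErr else { return }
    print("queue sample time:", queueTime.mSampleTime,
          "device sample time:", deviceTime.mSampleTime)
}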
While I realize that AirPlay has inherent lag/latency, I'm wondering if there's a way for a (currently hypothetical) iPhone app to detect what that latency is. If so, how precise can that latency value be? I'm more curious about whether an app can "know" its own AirPlay latency, rather than simply minimize it.
The latency does not come from network jitter, but rather is decided by the source device (your iPhone).
Long story short:
It's always precisely 2s (down to the millisecond) with Apple devices.
There is no way to tweak it with public APIs.
Audio latency needs to be very accurate so that multiple outputs can play in sync.
Some explanations about AirPlay's implementation:
The protocol starts with several RTSP commands. During this handshake, the source transmits rtpTime, the time at which the playback starts, which is also your latency. The typical value is 88200 = 2s x 44100 Hz.
AirPlay devices can sync their clocks with the source's clock via NTP to mitigate the network latency.
During playback, the source periodically sends a SYNC packet to adjust the audio latency and make sure that all devices are still in sync.
It's possible to change the latency if you use a custom implementation, but Apple usually rejects them.
Check this writeup for more information. You can also read the unofficial protocol documentation.
The short answer is: no, Apple does not provide a way to do this. Assuming you need your app to be approved in the App Store, you're sort of out of luck. If you can run your app on a jailbroken device you can search around for undocumented APIs that will let you do more.
If you need your app to be available in Apple's App Store, most things you can do network-wise are outlined in the "Reachability" sample app.
The only way I can think of to get a good guess would be to use Bonjour to identify the host (see the sample code here: https://developer.apple.com/library/ios/#samplecode/BonjourWeb/Introduction/Intro.html) and then ping the host; a small discovery sketch follows the caveats below.
However:
If there is more than 1 Airplay station you will need to guess or ask which the user is connected to, or maybe take an average.
The device may not respond to a ping at all (Apple TV and AirPort Express both respond to ping; not sure about 3rd-party devices).
The ping may not reflect the actual latency of the audio.
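If you still want to try the discovery step with a current API, a minimal sketch using the Network framework's NWBrowser would look like the following ("_raop._tcp" is the AirPlay audio receiver service type; any round-trip time you then measure to the discovered endpoint is, per the caveats above, only a rough proxy for the audio latency):

import Network

let browser = NWBrowser(for: .bonjour(type: "_raop._tcp", domain: nil),
                        using: NWParameters())
browser.browseResultsChangedHandler = { results, _ in
    for result in results {
        print("found AirPlay receiver:", result.endpoint)
    }
}
browser.start(queue: .main)
// Keep `browser` alive for as long as you want to receive results.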
Instead of spending too much time on this, you should follow Apple's guidelines for preparing your audio for AirPlay and enriching your app for AirPlay: http://developer.apple.com/library/ios/#documentation/AudioVideo/Conceptual/AirPlayGuide/PreparingYourMediaforAirPlay/PreparingYourMediaforAirPlay.html#//apple_ref/doc/uid/TP40011045-CH4-SW1
Hope this helps! :)
You can query iOS's current hardware audio latency via -[AVAudioSession outputLatency].
According to the documentation for outputLatency:
Using an AirPlay enabled device for your audio content can result in a 2-second delay. Check for this delay in game content.
And in my experience, this value changes when switching output devices, e.g.:
Speaker: ~10ms
Bluetooth: ~100+ms
AirPlay: 2s
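A minimal sketch of querying it, and re-checking whenever the route changes, looks like this:

import AVFoundation

let session = AVAudioSession.sharedInstance()
print("output latency: \(session.outputLatency * 1000) ms")

// outputLatency changes with the route (built-in speaker, Bluetooth, AirPlay...),
// so re-read it on route changes.
let observer = NotificationCenter.default.addObserver(
    forName: AVAudioSession.routeChangeNotification,
    object: session,
    queue: .main
) { _ in
    print("route changed, output latency now \(session.outputLatency * 1000) ms")
}
// Keep `observer` around (and remove it later) for as long as updates are needed.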