I need to have a series of sound samples (audio files) being played back at the touch of a button. The audio samples need to be played back simultaneously and precisely (think 4 voices in a piece of music).
I managed to do this with several instances of AVAudioPlayer but it will go out of sync.
Reading about it, due to its lack of precision it seems not to be the right choice for what I'm trying to do.
Audio Queue (is this part of Core Audio?) seems to be able to do what I want, but I can hardly find any Swift code snippets that set up what I'm trying to do, which is:
Load the audio file, prepare it to be played, then play it (I would trigger it with an NSTimer).
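For reference, here is a stripped-down sketch of what I currently have with several AVAudioPlayer instances (the file names are placeholders); starting the players one after another like this is where they seem to drift out of sync:

```swift
import AVFoundation

// What I have now: one AVAudioPlayer per voice, all started in a loop.
// The file names are placeholders for my four sample files.
var players: [AVAudioPlayer] = []

func loadVoices() throws {
    for name in ["voice1", "voice2", "voice3", "voice4"] {
        guard let url = Bundle.main.url(forResource: name, withExtension: "caf") else { continue }
        let player = try AVAudioPlayer(contentsOf: url)
        player.prepareToPlay()   // decode/prime the buffers up front
        players.append(player)
    }
}

func playAllVoices() {
    // Starting the players one after another is (I assume) where they
    // end up slightly offset from each other.
    for player in players {
        player.play()
    }
}
```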
Is this straightforward to implement with audio queue or should I look elsewhere?
If you could point me in the right direction, I would be very grateful.
Thanks a lot!
As is demonstrated in this answer, I have recently learned how to play audio files using both AVAudioPlayer and AudioToolbox. I have successfully played a single audio test file using both methods. However, I want to ask about which one I should actually use in my app (or if it even matters).
Here are the relevant characteristics of my app:
There are about 800 audio clips.
Most of the clips last less than one second.
Any of them could be chosen at random by the user to be played, but only a small subset will be used on any particular run.
No special volume control or playback options are needed.
These are my questions:
Which method for playing a sound would be better? Why?
Should I preload the sounds or just load them when they are needed? I'm guessing that preloading all 800 sounds every time is a bad idea, but if I wait to load them until they are needed, I am worried about performance (i.e., a noticeable pause before the clip is played). A rough sketch of the on-demand loading I have in mind follows below.
Do I need to play sounds on a background thread?
So my concerns in choosing which audio player to go with are memory and performance. I couldn't tell from any of the documentation that I saw which is better in this case.
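To make the preloading question concrete, this is roughly the on-demand loading and caching I have in mind (just a sketch; the file names and extension are placeholders):

```swift
import AVFoundation

// Sketch: load clips lazily and keep the ones already used in a cache,
// instead of preloading all ~800 up front. Names/extension are placeholders.
final class ClipCache {
    private var players: [String: AVAudioPlayer] = [:]

    func play(_ name: String) {
        if let cached = players[name] {
            cached.currentTime = 0
            cached.play()
            return
        }
        guard let url = Bundle.main.url(forResource: name, withExtension: "caf"),
              let player = try? AVAudioPlayer(contentsOf: url) else { return }
        player.prepareToPlay()   // primes buffers; the first play still pays the file-load cost
        players[name] = player
        player.play()
    }
}
```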
I'm looking for a way to create an audio bars visualizer similar to this in iOS.
Every white bar will move up and down depending on the audio waveform. I'm really lost because I haven't much experience dealing with audio in Objective-C.
EDIT: What I'm seeking is what Overcast's app does in its visualizer (the group of vertical orange bars on the lower part of the podcast's image).
Can anyone help?
Thanks
EDIT: Thanks to Tomer's answer I finally made it. First I did this tutorial in order to make it all clear. Then I created my own VisualizerView for my project; you can find it in this gist. Maybe it's not perfect, but it does what I needed it to do.
Generally, you have a few options if you want to get an idea of what something sounds like in iOS:
Use the simple AVAudioPlayer, and then use its [audioPlayer averagePowerForChannel:] method to get the average audio level for the current moment. Check out this tutorial. (A small metering sketch of this option follows below.)
Use the Audio Queue API, which lets you send whatever audio you want to the speaker: you would read audio from your source and fill the buffers with it every time (if you're reading from a file, use AVAssetReader). This way you always know exactly what waveform you're playing, so you can, for example, calculate its average power or process it in other ways, such as an FFT. Then you'd update the bars accordingly.
EDIT: The standard way of doing such a thing is to use the Fast Fourier Transform (FFT) - it extracts frequency information from a sound. Here's a good example of using it on iOS (Apple's guide here). But, of course, to use it you have to know exactly what waveform you're playing every time, so you'd probably want to use a lower-level API such as Audio Queue.
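Here's a minimal sketch of the metering option in Swift, assuming you already have an array of bar views to scale; the dB-to-height mapping and the per-bar jitter are rough placeholders, and a real visualizer would drive each bar from an FFT bin as noted above:

```swift
import AVFoundation
import UIKit

// Poll AVAudioPlayer's metering data and map it onto bar heights.
// The bar views, the dB range, and the per-bar jitter are placeholder choices.
final class BarsUpdater {
    private let player: AVAudioPlayer
    private let barViews: [UIView]
    private var displayLink: CADisplayLink?

    init(player: AVAudioPlayer, barViews: [UIView]) {
        self.player = player
        self.barViews = barViews
        player.isMeteringEnabled = true   // must be enabled before the power values mean anything
    }

    func start() {
        player.play()
        displayLink = CADisplayLink(target: self, selector: #selector(tick))
        displayLink?.add(to: .main, forMode: .common)
    }

    @objc private func tick() {
        player.updateMeters()
        // averagePower(forChannel:) is in dB (roughly -160...0); map to 0...1 for a bar height.
        let db = player.averagePower(forChannel: 0)
        let level = CGFloat(max(0, (db + 60) / 60))
        for bar in barViews {
            // Crude per-bar variation; a real visualizer would drive each bar from an FFT bin.
            let jitter = CGFloat.random(in: 0.8...1.0)
            bar.transform = CGAffineTransform(scaleX: 1, y: level * jitter + 0.05)
        }
    }
}
```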
I'm going to ask this at the risk of being too vague or asking too many things in one question, but I'm really just looking for a point in the right direction.
In my app I want to record audio, show a waveform while recording, and scroll through the waveform to record and playback from a specified time. For example, if I have 3 minutes of audio, I should be able to scroll back to 2:00 and start recording from there to fix a mistake.
In Voice Memos, this is accomplished instantaneously, without any delay or loading time. I'm trying to figure out how they did this, if anyone has a clue.
What I've tried:
EZAudio - This library is great, but doesn't do what I want. You can't scroll through the waveform. It deletes the waveform data at the beginning and begins appending it to the end once it reaches a certain length.
SCWaveformView - This waveform is nice, but it uses images. Once the waveform is too long, putting it in a scroll view causes really jittery scrolling. Also you can't build the waveform while recording, only afterward.
As far as appending goes, I've used this method: https://stackoverflow.com/a/11520553/1391672
But there is significant processing time, even when appending two very short clips of audio together (in my experience).
How does Voice Memos do what it does? Do you think the waveform is drawn in OpenGL or CoreGraphics? Are they using Core Audio or AVAudioRecorder? Has anyone built anything like this that can point me in the right direction?
When zoomed in, a scroll view only needs to draw the small portion of the waveform that is visible. When zoomed out, a graph view might only draw every Nth point of the audio buffer, or use some other DSP down-sampling algorithm on the data before rendering. This likely has to be done using your own custom drawing or graphics rendering code inside a UIScrollView or similar custom controller. The waveform rendering code used during and after recording doesn't have to be the same.
The recording API and the drawing API you use can be completely independent, and can be almost anything, from OpenGL to Metal to Core Graphics (on newer faster devices). On the audio end, Core Audio will help provide the lowest latency, but Audio Queues and the AVAudioEngine might also be suitable.
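As a sketch of the "every Nth point" idea, here is one way to reduce a raw sample buffer to one peak value per on-screen bucket before rendering (plain Swift; the bucket count would come from your view's visible width):

```swift
// Reduce raw samples to one value per visible pixel/bucket before drawing.
// Taking the peak (max magnitude) per bucket keeps transients visible;
// RMS per bucket is another common choice.
func downsample(_ samples: [Float], toBucketCount bucketCount: Int) -> [Float] {
    guard bucketCount > 0, !samples.isEmpty else { return [] }
    let samplesPerBucket = max(1, samples.count / bucketCount)
    var peaks: [Float] = []
    peaks.reserveCapacity(bucketCount)
    var start = 0
    while start < samples.count && peaks.count < bucketCount {
        let end = min(start + samplesPerBucket, samples.count)
        let peak = samples[start..<end].reduce(0) { max($0, abs($1)) }
        peaks.append(peak)
        start = end
    }
    return peaks
}
// Only the buckets that fall inside the scroll view's visible range actually need to be drawn.
```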
I am interested in recording media using an AVCaptureSession in iOS while playing media back using an AVPlayer (specifically, I am playing back audio and recording video, but I'm not sure it matters).
The problem is, when I play the resulting media back together later, they are out of sync. Is it possible to synchronize them, either by ensuring that playback and recording start simultaneously, or by discovering what the offset is between them? I probably need the sync to be on the order of 10 ms. It is unreasonable to assume that I can always capture audio (since the user may use headphones), so syncing via analysis of original and recorded audio is not an option.
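To make the "start simultaneously" half of the question concrete, the closest primitive I've found is anchoring the AVPlayer start to a host time (a rough sketch only; I haven't verified that it gets anywhere near 10 ms, and the 0.5 s lead time is arbitrary):

```swift
import AVFoundation
import CoreMedia

// Sketch: schedule playback to begin at a known host time, so the intended
// start of playback can later be compared against the capture timestamps.
func startPlayback(_ player: AVPlayer, leadTime: Double = 0.5) -> CMTime {
    // Required before setRate(_:time:atHostTime:) can be used.
    player.automaticallyWaitsToMinimizeStalling = false

    let hostClock = CMClockGetHostTimeClock()
    let startHostTime = CMTimeAdd(CMClockGetTime(hostClock),
                                  CMTime(seconds: leadTime, preferredTimescale: 1_000_000_000))
    // Play the item from time zero, starting exactly at startHostTime.
    player.setRate(1.0, time: .zero, atHostTime: startHostTime)
    return startHostTime   // keep this to relate it to the capture session's timestamps
}
```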
This question suggests that it's possible to end playback and recording simultaneously and determine the initial offset from the resulting lengths, but I'm unclear on how to get them to end simultaneously. I have two cases: 1) the audio playback runs out, and 2) the user hits the "stop recording" button.
This question suggests priming and then applying a fixed, but possibly device-dependent, delay, which is obviously a hack; but if it's good enough for audio, it's worth considering for video.
Is there another media layer I can use to perform the required synchronization?
Related: this question is unanswered.
If you are specifically using AVPlayer to play back audio, I would suggest you use Audio Queue Services instead. It's seamless and fast, as it reads buffer by buffer, and play/pause is faster than with AVPlayer.
There is also the possibility that you are missing the initial [avPlayer prepareToPlay] call, which might be causing extra overhead before the audio starts playing.
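For reference, prepareToPlay is an AVAudioPlayer method, so this minimal sketch assumes the audio side is (or can be) an AVAudioPlayer:

```swift
import AVFoundation

// Prime a player ahead of time so that a later play() starts with minimal latency.
func makePrimedPlayer(for url: URL) throws -> AVAudioPlayer {
    let player = try AVAudioPlayer(contentsOf: url)
    player.prepareToPlay()   // preloads buffers before the moment playback is needed
    return player
}
```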
Hope it helps you.
I'm a Unity dev and need to help out colleagues with doing this natively in Obj-C. In Unity it's no big deal:
1) Samples are stored in memory as a List of float[].
2) A helper function returns a float[] of size n for any given sample, at any given offset.
3) Another helper function fades the data if needed.
4) An AudioClip object is created with the right size to accommodate all cut samples, and is then filled at the appropriate offsets.
5) The AudioClip is assigned to a player component (AudioSource).
6) AudioSource.Play(ulong offsetInSamples) plays at a sample-accurate time in the future. Looping is also just a matter of setting the AudioSource object's loop parameter.
I would very much appreciate it if someone could point me towards the right classes to achieve similar results in Obj-C, for iOS devices. I'm pretty sure a lot of iOS audio newbies would be interested too. Many thanks in advance!
Gregzo
A good overview of the relevant audio APIs available in iOS is here.
The highest level framework that makes sense for patching together audio clips, setting their volume levels, and playing them back in your case is probably AVFoundation.
It will involve creating AVAssets, adding them to AVPlayerItems, possibly putting them into AVMutableCompositions to merge multiple items together and adjust their volumes (audioMix), and then playing them back with AVPlayer.
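A rough sketch of that pipeline (two hard-coded asset URLs and volumes as placeholders; error handling mostly omitted):

```swift
import AVFoundation

// Merge two audio assets into a composition, set per-track volumes
// with an audio mix, then wrap it in a player item for playback.
func makePlayer(urlA: URL, urlB: URL) -> AVPlayer? {
    let composition = AVMutableComposition()
    var mixParameters: [AVMutableAudioMixInputParameters] = []

    for (url, volume) in [(urlA, Float(1.0)), (urlB, Float(0.5))] {
        let asset = AVURLAsset(url: url)
        guard let sourceTrack = asset.tracks(withMediaType: .audio).first,
              let compTrack = composition.addMutableTrack(withMediaType: .audio,
                                                          preferredTrackID: kCMPersistentTrackID_Invalid)
        else { return nil }
        // Insert the whole clip at time zero (per-clip offsets would go here instead of .zero).
        try? compTrack.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration),
                                       of: sourceTrack, at: .zero)
        let params = AVMutableAudioMixInputParameters(track: compTrack)
        params.setVolume(volume, at: .zero)
        mixParameters.append(params)
    }

    let audioMix = AVMutableAudioMix()
    audioMix.inputParameters = mixParameters

    let item = AVPlayerItem(asset: composition)
    item.audioMix = audioMix
    return AVPlayer(playerItem: item)
}
```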
AVFoundation works with AVAsset; for converting between relevant formats and lower-level bytes you'll want to have a look at AudioToolbox (I can't post more than two links yet).
For a somewhat simpler API with less control, have a look at AVAudioPlayer. If you need greater control (e.g., games, real-time / low-latency audio), you might need to use OpenAL for playback.