I need to extract waveform data asynchronously from a file after applying some equalization. Using AVAudioEngine, I encounter random silences at the beginning of the samples (it's really a delay of the actual data, but unfortunately the delay can't be detected, for instance by simply rendering a larger number of samples).
I asked Apple directly through a code-level support ticket, but they don't have a solution. It looks like AVAudioEngine's offline rendering mode is bugged.
Looking for third-party solutions, I came across AudioKit, which appears to be built on AVAudioEngine. So, assuming AudioKit can asynchronously extract a waveform after EQ is applied, would it do so by bypassing AVAudioEngine's offline mode?
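For reference, my offline path looks roughly like the following (a condensed Swift sketch; the fileURL and EQ settings are placeholders, and error handling is minimal):

```swift
import AVFoundation

func renderWaveformAfterEQ(fileURL: URL) throws -> [Float] {
    let file = try AVAudioFile(forReading: fileURL)
    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    let eq = AVAudioUnitEQ(numberOfBands: 1)
    eq.bands[0].filterType = .parametric
    eq.bands[0].frequency = 1_000
    eq.bands[0].gain = 6

    engine.attach(player)
    engine.attach(eq)
    engine.connect(player, to: eq, format: file.processingFormat)
    engine.connect(eq, to: engine.mainMixerNode, format: file.processingFormat)

    // Manual rendering mode (iOS 11+): the engine renders on demand
    // instead of in real time.
    try engine.enableManualRenderingMode(.offline,
                                         format: file.processingFormat,
                                         maximumFrameCount: 4096)
    try engine.start()
    player.scheduleFile(file, at: nil)
    player.play()

    let buffer = AVAudioPCMBuffer(pcmFormat: engine.manualRenderingFormat,
                                  frameCapacity: engine.manualRenderingMaximumFrameCount)!
    var waveform: [Float] = []
    while engine.manualRenderingSampleTime < file.length {
        let remaining = AVAudioFrameCount(file.length - engine.manualRenderingSampleTime)
        let status = try engine.renderOffline(min(buffer.frameCapacity, remaining), to: buffer)
        guard status == .success else { break }
        // Collect post-EQ samples from channel 0 for the waveform.
        waveform.append(contentsOf: UnsafeBufferPointer(start: buffer.floatChannelData![0],
                                                        count: Int(buffer.frameLength)))
    }
    return waveform
}
```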
Related
My iOS project requires retrieving some audio data (e.g. frequency and decibel level) from an audio file.
Using the AudioKit framework, I can get that data from the microphone with AKFrequencyTracker; however, I'm struggling with how to get the frequency straight from the audio file without playing it, because I need the data to plot some graphs (e.g. frequency vs. time).
PS: I'm saving the recording in m4a format at the moment (the format is optional).
Thanks in advance
You can use the Accelerate framework's FFT APIs to get frequency information from an audio file.
Here is a useful library for understanding vDSP API usage:
https://github.com/tomer8007/real-time-audio-fft
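As a rough illustration of the vDSP flow, here is a minimal single-window magnitude spectrum (a sketch, assuming the samples have already been read from the file, e.g. via AVAudioFile, and that the window length is a power of two):

```swift
import Foundation
import Accelerate

// Sketch: squared-magnitude spectrum of one analysis window via vDSP.
// `samples` must contain a power-of-two number of Float values.
func magnitudeSpectrum(samples: [Float]) -> [Float] {
    let n = samples.count
    let log2n = vDSP_Length(log2(Float(n)))
    guard let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2)) else { return [] }
    defer { vDSP_destroy_fftsetup(setup) }

    var real = [Float](repeating: 0, count: n / 2)
    var imag = [Float](repeating: 0, count: n / 2)
    var magnitudes = [Float](repeating: 0, count: n / 2)

    real.withUnsafeMutableBufferPointer { realPtr in
        imag.withUnsafeMutableBufferPointer { imagPtr in
            var split = DSPSplitComplex(realp: realPtr.baseAddress!,
                                        imagp: imagPtr.baseAddress!)
            // Pack the real signal into split-complex form, run the
            // real-to-complex FFT, then compute squared magnitudes.
            samples.withUnsafeBytes {
                vDSP_ctoz($0.bindMemory(to: DSPComplex.self).baseAddress!, 2,
                          &split, 1, vDSP_Length(n / 2))
            }
            vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))
            vDSP_zvmags(&split, 1, &magnitudes, 1, vDSP_Length(n / 2))
        }
    }
    return magnitudes
}
```

Bin k of the result corresponds to frequency k * sampleRate / n, so sliding the window across the file gives the frequency-vs-time data needed for the graphs.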
I've been working with AudioKit to create a sequencer that I would like to play a perfectly quantized sequence (i.e. all subdivisions metrically perfect). However, when I add notes to a sequence, I hear fluctuations/imperfections in the timing; the subdivisions don't line up in a metrically perfect way. When I print the sequencer's current position in beats to the console during note-on events, the fluctuations show up: the notes are only consistent to about two decimal places, and beyond that they vary. In the callback I would expect, perhaps with a slight delay, something like 1.001, 2.001, 3.001; instead the output shows seemingly random digits after two decimal places.
I've created a project to demonstrate the issue here
What am I doing wrong here?
Note that in the project I've made use of AKCallbackInstrument, but the issue persists even if I plug the sampler that will play the sound directly into the sequencer. Also, in the project I've added notes to the sequencer "manually," but the issue persists even if I load a .mid file directly to the sequencer. The sampler in the demo project uses a sound font (.sf2), but the issue exists when I load a .wav or .mp3 sample as well.
I don't think you're doing anything wrong. AKSequencer is based on Apple's own MIDI sequencer; we provide AKSequencer as a wrapper around that functionality. However, there are known timing inaccuracies in Apple's sequencer that we can't address because it is closed source. We are working on a replacement for AKSequencer (which will take the AKSequencer name, with the current sequencer moving to AKAppleSequencer). This should be done in July. In the meantime, you can use AKTimeline to build your own sequencer, as was done in the MetronomeSampleSync examples in AudioKit.
Statement of Problem:
I have a collection of sound effects in my app stored as .m4a files (AAC format, 48 kHz, 16-bit) that I want to play at a variety of speeds and pitches, without having to pre-generate all the variants as separate files.
Although the .rate property of an AVAudioPlayer object can alter playback speed, it always maintains the original pitch, which is not what I want. Instead, I simply want to play the sound sample faster or slower and have the pitch go up or down to match — just like speeding up or slowing down an old-fashioned reel-to-reel tape recorder. In other words, I need some way to essentially alter the audio sample rate by amounts like +2 semitones (12% faster), –5 semitones (33% slower), +12 semitones (2x faster), etc.
Question:
Is there some way to fetch the Linear PCM audio data from an AVAudioPlayer object, apply sample rate conversion using a different iOS framework, and stuff the resulting audio data into a new AVAudioPlayer object, which can then be played normally?
Possible avenues:
I was reading up on AudioConverterConvertComplexBuffer. In particular, kAudioConverterSampleRateConverterComplexity_Mastering, kAudioConverterQuality_Max, and AudioConverterFillComplexBuffer() caught my eye. So it looks possible with the Audio Converter API. Is this an avenue I should explore further?
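As far as I can tell, setting those up would look something like this (a Swift sketch of the converter properties only; srcFormat and dstFormat stand for filled-in LPCM AudioStreamBasicDescriptions that differ in mSampleRate):

```swift
import AudioToolbox

// Create a converter, then request the mastering-grade, maximum-quality
// sample rate converter. Error checking omitted for brevity.
var converter: AudioConverterRef?
AudioConverterNew(&srcFormat, &dstFormat, &converter)

var complexity = kAudioConverterSampleRateConverterComplexity_Mastering
AudioConverterSetProperty(converter!,
                          kAudioConverterSampleRateConverterComplexity,
                          UInt32(MemoryLayout.size(ofValue: complexity)),
                          &complexity)

var quality = kAudioConverterQuality_Max
AudioConverterSetProperty(converter!,
                          kAudioConverterSampleRateConverterQuality,
                          UInt32(MemoryLayout.size(ofValue: quality)),
                          &quality)
// AudioConverterFillComplexBuffer would then pull converted packets
// through a callback that supplies the source PCM.
```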
Requirements:
I actually don't need playback to begin instantly. If sample rate conversion incurs a slight delay, that's fine. All of my samples are 4 seconds or less, so I would imagine that any on-the-fly resampling would occur quickly, on the order of 1/10 second or less. (More than 1/2 second would be too much, though.)
I'd really rather not get into heavyweight stuff like OpenAL or Core Audio if there is a simpler way to do this using a conversion framework provided by iOS. However, if there is a simple solution to this problem using OpenAL or Core Audio, I'd be happy to consider that. By "simple" I mean something that can be implemented in 50–100 lines of code and doesn't require starting up additional threads to feed data to a sound device. I'd rather just have everything taken care of automatically — which is why I'm willing to convert the audio clip prior to playing.
I want to avoid any third-party libraries here, because this isn't rocket science and I know it must be possible with native iOS frameworks somehow.
Again, I need to adjust the pitch and playback rate together, not separately. So if playback is slowed down 2x, a human voice would become very deep and slow-spoken. And if playback is sped up 2–3x, a human voice would sound like a fast-talking chipmunk. In other words, I absolutely do not want to alter the pitch while keeping the audio duration the same, because that operation results in an undesirably "tinny" sound when bending the pitch upward more than a couple semitones. I just want to speed the whole thing up and have the pitch go up as a natural side-effect, just like old-fashioned tape recorders used to do.
Needs to work in iOS 6 and up, although iOS 5 support would be a nice bonus.
The forum link Jack Wu mentions has one suggestion, which involves overwriting the AIFF header data directly. This may work, but you will need to have AIFF files, since it relies on writing into a specific range of the AIFF header. This also needs to be done before you create the AVAudioPlayer, which means that you can't modify the pitch once it is playing.
If you are willing to go the Audio Units route, a complete simple solution is probably ~200 lines (assuming the common Core Audio style where a single call spans up to 7 lines, with one parameter per line). There is a Varispeed Audio Unit, which does exactly what you want by locking pitch to rate. You would basically need to look at the API, the docs, and some sample Audio Unit code to get familiar, and then (see the Swift sketch after the steps below):
create/init the audio graph and stream format (~100 lines)
create and add to the graph a RemoteIO AudioUnit (kAudioUnitSubType_RemoteIO) (this outputs to the speaker)
create and add a varispeed unit, and connect the output of the varispeed unit (kAudioUnitSubType_Varispeed) to the input of the RemoteIO Unit
create and add to the graph an AudioFilePlayer (kAudioUnitSubType_AudioFilePlayer) unit to read the file, and connect it to the varispeed unit
start the graph to begin playback
when you want to change the pitch, do it via AudioUnitSetParameter, and the pitch and playback rate change will take effect while playing
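Condensed into Swift, the wiring might look roughly like this (a sketch: scheduling the file region on the player unit and all error checking are omitted):

```swift
import Foundation
import AudioToolbox

// Sketch: AudioFilePlayer -> Varispeed -> RemoteIO.
var graph: AUGraph?
NewAUGraph(&graph)

func desc(_ type: OSType, _ subType: OSType) -> AudioComponentDescription {
    AudioComponentDescription(componentType: type, componentSubType: subType,
                              componentManufacturer: kAudioUnitManufacturer_Apple,
                              componentFlags: 0, componentFlagsMask: 0)
}
var playerDesc = desc(kAudioUnitType_Generator, kAudioUnitSubType_AudioFilePlayer)
var variDesc   = desc(kAudioUnitType_FormatConverter, kAudioUnitSubType_Varispeed)
var outDesc    = desc(kAudioUnitType_Output, kAudioUnitSubType_RemoteIO)

var playerNode = AUNode(), variNode = AUNode(), outNode = AUNode()
AUGraphAddNode(graph!, &playerDesc, &playerNode)
AUGraphAddNode(graph!, &variDesc, &variNode)
AUGraphAddNode(graph!, &outDesc, &outNode)

// player -> varispeed -> speaker
AUGraphConnectNodeInput(graph!, playerNode, 0, variNode, 0)
AUGraphConnectNodeInput(graph!, variNode, 0, outNode, 0)
AUGraphOpen(graph!)

var variUnit: AudioUnit?
AUGraphNodeInfo(graph!, variNode, nil, &variUnit)

AUGraphInitialize(graph!)
// ...schedule the file region on the AudioFilePlayer unit here...
AUGraphStart(graph!)

// Pitch and rate move together: +2 semitones = 2^(2/12), about 1.122x.
AudioUnitSetParameter(variUnit!, kVarispeedParam_PlaybackRate,
                      kAudioUnitScope_Global, 0,
                      AudioUnitParameterValue(pow(2.0, 2.0 / 12.0)), 0)
```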
Note that there is a TimePitch audio unit which allows independent control of pitch and rate, as well.
For iOS 7 and later, you'd want to look at AVPlayerItem's time-pitch algorithm property (audioTimePitchAlgorithm) with AVAudioTimePitchAlgorithmVarispeed. Unfortunately this feature is not available on earlier systems.
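With AVPlayer that comes down to a few lines (a sketch; soundURL is a placeholder for your audio file's URL):

```swift
import AVFoundation

// Varispeed locks pitch to rate, like a tape machine: rate 2.0 plays
// twice as fast and a full octave higher.
let item = AVPlayerItem(url: soundURL)
item.audioTimePitchAlgorithm = .varispeed
let player = AVPlayer(playerItem: item)
player.rate = 1.122   // +2 semitones = 2^(2/12); a non-zero rate starts playback
```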
I see that the only effect unit on iOS is the iPod EQ. Is there any other way to change the high, mid, and low frequencies of audio on iOS?
Unfortunately, the iPhone doesn't really allow custom AudioUnits (i.e. an AudioUnit's ID cannot be registered for use by an AUGraph). What you can do is register a render callback and process the raw PCM data yourself. Sites like musicdsp.org have sample DSP code that you can use to implement any effect you can imagine.
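As a sketch, that means registering a render-notify callback on the I/O unit and touching the PCM after each render (assuming non-interleaved Float32 buffers; ioUnit stands for your existing RemoteIO unit):

```swift
import AudioToolbox

// Post-render callback: process each channel's Float32 samples in place.
let renderNotify: AURenderCallback = { _, ioActionFlags, _, _, inNumberFrames, ioData in
    guard ioActionFlags.pointee.contains(.unitRenderAction_PostRender),
          let abl = ioData else { return noErr }
    for buffer in UnsafeMutableAudioBufferListPointer(abl) {
        guard let data = buffer.mData?.assumingMemoryBound(to: Float32.self) else { continue }
        // Placeholder DSP: a plain 6 dB gain cut; swap in any filter
        // from musicdsp.org here.
        for i in 0..<Int(inNumberFrames) { data[i] *= 0.5 }
    }
    return noErr
}
AudioUnitAddRenderNotify(ioUnit, renderNotify, nil)
```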
Also, here is a similar StackOverflow question for reference: How to make a simple EQ AudioUnit
There are a bunch of built-in Audio Units, including a set of filters, delay, and even reverb. A good place to look is AUComponent.h. You will need to get their ASBDs (AudioStreamBasicDescriptions) set up properly, otherwise they throw an error or produce silence. But they do work.
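For reference, a properly set up ASBD for a mono 32-bit float stream looks like this (a sketch; filterUnit stands for whichever built-in unit you picked, e.g. a kAudioUnitSubType_LowPassFilter instance):

```swift
import AudioToolbox

// Canonical mono 32-bit float LPCM at 44.1 kHz. A mismatched ASBD is
// the usual cause of the format errors / silence mentioned above.
var asbd = AudioStreamBasicDescription(
    mSampleRate:       44_100,
    mFormatID:         kAudioFormatLinearPCM,
    mFormatFlags:      kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
    mBytesPerPacket:   4,
    mFramesPerPacket:  1,
    mBytesPerFrame:    4,
    mChannelsPerFrame: 1,
    mBitsPerChannel:   32,
    mReserved:         0)

AudioUnitSetProperty(filterUnit, kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Input, 0, &asbd,
                     UInt32(MemoryLayout<AudioStreamBasicDescription>.size))
```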
I've implemented an AUGraph similar to the one in the iOS Developer Library. In my app, however, I need to be able to play back sound at different sample rates (probably two different ones).
I've been looking through Apple's documentation and haven't found a way to set the sample rate at runtime. I've been thinking of three possible workarounds:
Re-initialize the AUGraph every time I need to change the sample rate;
Initialize a different AUGraph for each different sample rate;
Convert the sample rate of every sound before playing;
These methods all seem really clunky and heavy on the processor.
What is the best way of changing sample rate of an AUGraph at runtime?
Typically #1 for continuous audio-streaming scenarios.
Your program may have a special need, or may benefit from one of the other approaches you listed:
#2: when you need to process in a context where reinitialization is not a concern.
#3: when you need to mix and process two streams with different input sample rates at the same time, particularly if you find yourself sample-rate-converting (SRCing) the signal multiple times.
But if you just need playback with SRC and the lowest latency is not a concern, you may want to try an AudioQueue instead.
I'm pretty sure it can't be done at runtime. Solution #2 is your best bet, along with #3. For sample rate conversion, libsndfile can probably be adapted to your needs.
If you don't want latency from tearing down and rebuilding the audio graph, you may need to resample the sound data (for all but one sample rate).
You could either resample the sound data before starting to play it, or run a real-time resampler as part of the audio graph. Many iOS music apps do the latter as part of a built-in sampler-based synth unit, so the device has plenty of compute power to do so.
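On newer systems the resample-before-playing option is only a few lines with AVAudioConverter (a sketch; inputBuffer is assumed to already hold the decoded mono source audio at 48 kHz):

```swift
import AVFoundation

// Offline sample rate conversion of one PCM buffer, 48 kHz -> 44.1 kHz.
let srcFormat = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 1)!
let dstFormat = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 1)!
let converter = AVAudioConverter(from: srcFormat, to: dstFormat)!

let ratio = dstFormat.sampleRate / srcFormat.sampleRate
let outCapacity = AVAudioFrameCount((Double(inputBuffer.frameLength) * ratio).rounded(.up))
let outputBuffer = AVAudioPCMBuffer(pcmFormat: dstFormat, frameCapacity: outCapacity)!

// Feed the converter the single input buffer, then signal end of stream.
var consumed = false
_ = converter.convert(to: outputBuffer, error: nil) { _, outStatus in
    if consumed { outStatus.pointee = .endOfStream; return nil }
    consumed = true
    outStatus.pointee = .haveData
    return inputBuffer
}
// outputBuffer now holds the resampled audio, ready to schedule for playback.
```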