How can I use AVAudioPlayer to play audio faster *and* higher pitched? - ios

Statement of Problem:
I have a collection of sound effects in my app stored as .m4a files (AAC format, 48 kHz, 16-bit) that I want to play at a variety of speeds and pitches, without having to pre-generate all the variants as separate files.
Although the .rate property of an AVAudioPlayer object can alter playback speed, it always maintains the original pitch, which is not what I want. Instead, I simply want to play the sound sample faster or slower and have the pitch go up or down to match — just like speeding up or slowing down an old-fashioned reel-to-reel tape recorder. In other words, I need some way to essentially alter the audio sample rate by amounts like +2 semitones (12% faster), –5 semitones (33% slower), +12 semitones (2x faster), etc.
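(For reference, the rate multiplier for a shift of n semitones is 2^(n/12); a quick Swift sanity check of the numbers above, using a hypothetical helper function:)

    import Foundation

    // Rate multiplier for a pitch shift of n semitones: rate = 2^(n / 12).
    func rate(forSemitones semitones: Double) -> Double {
        return pow(2.0, semitones / 12.0)
    }

    print(rate(forSemitones: 2))    // ~1.12  -> about 12% faster
    print(rate(forSemitones: -5))   // ~0.749 -> playback takes ~33% longer
    print(rate(forSemitones: 12))   // 2.0    -> exactly 2x faster, one octave up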
Question:
Is there some way to fetch the Linear PCM audio data from an AVAudioPlayer object, apply sample rate conversion using a different iOS framework, and stuff the resulting audio data into a new AVAudioPlayer object, which can then be played normally?
Possible avenues:
I was reading up on Audio Converter Services (AudioConverterConvertComplexBuffer and friends). In particular, kAudioConverterSampleRateConverterComplexity_Mastering, kAudioConverterQuality_Max, and AudioConverterFillComplexBuffer() caught my eye, so it looks possible with this conversion API. Is this an avenue I should explore further?
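(For what it's worth, configuring such a converter would look roughly like this; an untested sketch in which the formats are placeholders and the input-supplying callback for AudioConverterFillComplexBuffer() is omitted:)

    import AudioToolbox

    // Untested sketch: create an AudioConverter for Linear PCM sample rate conversion.
    // `sourceFormat` and `destFormat` are placeholders; in real code they would be
    // fully filled-in Linear PCM ASBDs that differ in mSampleRate.
    var sourceFormat = AudioStreamBasicDescription()
    var destFormat = AudioStreamBasicDescription()

    var converter: AudioConverterRef?
    if AudioConverterNew(&sourceFormat, &destFormat, &converter) == noErr, let converter = converter {
        // Ask for the mastering-grade rate converter at maximum quality.
        var complexity = UInt32(kAudioConverterSampleRateConverterComplexity_Mastering)
        AudioConverterSetProperty(converter,
                                  kAudioConverterSampleRateConverterComplexity,
                                  UInt32(MemoryLayout<UInt32>.size), &complexity)

        var quality = UInt32(kAudioConverterQuality_Max)
        AudioConverterSetProperty(converter,
                                  kAudioConverterSampleRateConverterQuality,
                                  UInt32(MemoryLayout<UInt32>.size), &quality)

        // The conversion itself would then be driven by AudioConverterFillComplexBuffer(),
        // whose input callback hands the converter chunks of the source PCM data.
    }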
Requirements:
I actually don't need playback to begin instantly. If sample rate conversion incurs a slight delay, that's fine. All of my samples are 4 seconds or less, so I would imagine that any on-the-fly resampling would occur quickly, on the order of 1/10 second or less. (More than 1/2 second would be too much, though.)
I'd really rather not get into heavyweight stuff like OpenAL or Core Audio if there is a simpler way to do this using a conversion framework provided by iOS. However, if there is a simple solution to this problem using OpenAL or Core Audio, I'd be happy to consider that. By "simple" I mean something that can be implemented in 50–100 lines of code and doesn't require starting up additional threads to feed data to a sound device. I'd rather just have everything taken care of automatically — which is why I'm willing to convert the audio clip prior to playing.
I want to avoid any third-party libraries here, because this isn't rocket science and I know it must be possible with native iOS frameworks somehow.
Again, I need to adjust the pitch and playback rate together, not separately. So if playback is slowed down 2x, a human voice would become very deep and slow-spoken. And if playback is sped up 2–3x, a human voice would sound like a fast-talking chipmunk. In other words, I absolutely do not want to alter the pitch while keeping the audio duration the same, because that operation results in an undesirably "tinny" sound when bending the pitch upward more than a couple semitones. I just want to speed the whole thing up and have the pitch go up as a natural side-effect, just like old-fashioned tape recorders used to do.
Needs to work in iOS 6 and up, although iOS 5 support would be a nice bonus.

The forum link Jack Wu mentions has one suggestion, which involves overwriting the AIFF header data directly. This may work, but you will need to have AIFF files, since it relies on writing into a specific range of the AIFF header (where the sample rate is stored). This also needs to be done before you create the AVAudioPlayer, which means that you can't modify the pitch once playback is running.
If you are willing to go the AudioUnits route, a complete but simple solution is probably ~200 lines (note that this estimate assumes the usual Core Audio style in which a single function call spans up to 7 lines, one parameter per line). There is a Varispeed AudioUnit, which does exactly what you want by locking pitch to rate. You would basically need to look at the API, the docs, and some sample AudioUnit code to get familiar, and then (a rough Swift sketch follows the list below):
create/init the audio graph and stream format (~100 lines)
create and add to the graph a RemoteIO AudioUnit (kAudioUnitSubType_RemoteIO) (this outputs to the speaker)
create and add a varispeed unit, and connect the output of the varispeed unit (kAudioUnitSubType_Varispeed) to the input of the RemoteIO Unit
create and add to the graph an AudioFilePlayer (kAudioUnitSubType_AudioFilePlayer) unit to read the file, and connect it to the varispeed unit
start the graph to begin playback
when you want to change the pitch, do it via AudioUnitSetParameter, and the pitch and playback rate change will take effect while playing
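A rough, untested Swift sketch of that graph (minus the file-scheduling boilerplate for the AudioFilePlayer unit, and with all error checking omitted; note that AUGraph is deprecated on recent iOS versions but matches the era of this question):

    import AudioToolbox

    // Build: AudioFilePlayer -> Varispeed -> RemoteIO (speaker).
    var graph: AUGraph?
    NewAUGraph(&graph)

    var playerDesc = AudioComponentDescription(
        componentType: kAudioUnitType_Generator,
        componentSubType: kAudioUnitSubType_AudioFilePlayer,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)
    var varispeedDesc = AudioComponentDescription(
        componentType: kAudioUnitType_FormatConverter,
        componentSubType: kAudioUnitSubType_Varispeed,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)
    var outputDesc = AudioComponentDescription(
        componentType: kAudioUnitType_Output,
        componentSubType: kAudioUnitSubType_RemoteIO,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)

    var playerNode = AUNode(), varispeedNode = AUNode(), outputNode = AUNode()
    AUGraphAddNode(graph!, &playerDesc, &playerNode)
    AUGraphAddNode(graph!, &varispeedDesc, &varispeedNode)
    AUGraphAddNode(graph!, &outputDesc, &outputNode)

    AUGraphOpen(graph!)
    AUGraphConnectNodeInput(graph!, playerNode, 0, varispeedNode, 0)
    AUGraphConnectNodeInput(graph!, varispeedNode, 0, outputNode, 0)
    AUGraphInitialize(graph!)

    // Open the source file with AudioFileOpenURL and schedule it on the player unit
    // (kAudioUnitProperty_ScheduledFileIDs, ScheduledAudioFileRegion, etc.) -- omitted here.

    AUGraphStart(graph!)

    // Later, change pitch and playback rate together (2.0 = twice as fast, one octave up):
    var varispeedUnit: AudioUnit?
    AUGraphNodeInfo(graph!, varispeedNode, nil, &varispeedUnit)
    AudioUnitSetParameter(varispeedUnit!, kVarispeedParam_PlaybackRate,
                          kAudioUnitScope_Global, 0, 2.0, 0)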
Note that there is a TimePitch audio unit which allows independent control of pitch and rate, as well.
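On iOS that unit is kAudioUnitSubType_NewTimePitch; if you swap it in place of the Varispeed unit, rate and pitch become separate parameters (untested sketch, with the unit handle passed in as a placeholder):

    import AudioToolbox

    // Untested sketch: with a NewTimePitch unit in the graph instead of Varispeed,
    // rate and pitch can be set independently.
    func setIndependentRateAndPitch(on timePitchUnit: AudioUnit) {
        AudioUnitSetParameter(timePitchUnit, kNewTimePitchParam_Rate,
                              kAudioUnitScope_Global, 0, 1.5, 0)   // 1.5x speed, pitch unchanged
        AudioUnitSetParameter(timePitchUnit, kNewTimePitchParam_Pitch,
                              kAudioUnitScope_Global, 0, 300, 0)   // +300 cents (+3 semitones)
    }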

For iOS 7, you'd want to look at AVPlayerItem's time-pitch algorithm property (audioTimePitchAlgorithm), set to AVAudioTimePitchAlgorithmVarispeed. Unfortunately this feature is not available on earlier iOS versions.
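A minimal sketch of that approach in Swift (the file URL is just a placeholder):

    import AVFoundation

    // Varispeed ties pitch to rate, like a tape machine.
    let soundURL = URL(fileURLWithPath: "/path/to/effect.m4a")   // placeholder path
    let item = AVPlayerItem(url: soundURL)
    item.audioTimePitchAlgorithm = .varispeed   // AVAudioTimePitchAlgorithmVarispeed

    let player = AVPlayer(playerItem: item)
    player.rate = 2.0   // twice as fast, one octave higher; also starts playback once ready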

Related

IOS: Custom real time audio effect for audioEngine?

What is the best way to create a custom real-time audio effect for audioEngine in iOS?
I want to process audio at a low level; what is the right way to do it? Do I have to use an Audio Unit Extension? To put it more simply: is it possible to subclass an Audio Unit, use C code to change the audio data, and send it back into the audio unit connection chain in audioEngine?
You may find that the AudioKit framework will let you do what you want. The problem with manipulating audio at a low level is that you have to deal with a lot of complex stuff tangential to what you are trying to achieve; just changing the playback rate of a sample means you have to deal with interpolation and anti-aliasing filters. AudioKit handles all of that for you, but it may mean you have to change the way you think about what you are trying to do.
https://github.com/AudioKit/AudioKit

Modifying Low, Mid, High Frequencies Core Audio IOS

I see that the only effect unit on iOS is the iPod EQ. Is there any other way to change the high, mid, and low frequencies of an audio unit on iOS?
Unfortunately, the iPhone doesn't really allow custom AudioUnits (ie. an AudioUnit's ID cannot be registered for use by an AUGraph). What you can do is register a render callback and process the raw PCM data yourself. Sites like musicdsp.org have sample DSP code that you can utilize to implement any effect you can imagine.
Also, here is a similar StackOverflow question for reference: How to make a simple EQ AudioUnit
There are a bunch of built-in Audio Units, including a set of filters, delay, and even reverb. A good clue is to look in AUComponent.h. You will need to get their ASBDs (AudioStreamBasicDescriptions) set up properly, otherwise they throw an error or produce silence. But they do work.
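For example, assuming you have added an NBandEQ unit (kAudioUnitSubType_NBandEQ, one of the built-in units listed in AUComponent.h) to your graph and fetched its AudioUnit handle, the per-band parameters are set roughly like this (untested sketch; the unit handle is passed in as a placeholder):

    import AudioToolbox

    // Untested sketch: configure a 3-band EQ on an NBandEQ unit that is already in an AUGraph.
    // The band count can typically only be changed while the unit is uninitialized.
    func configureThreeBandEQ(_ eqUnit: AudioUnit) {
        var bands: UInt32 = 3
        AudioUnitSetProperty(eqUnit, kAUNBandEQProperty_NumberOfBands,
                             kAudioUnitScope_Global, 0,
                             &bands, UInt32(MemoryLayout<UInt32>.size))

        // Parameter IDs are "base constant + band index".
        // Band 0: low shelf at 120 Hz, +6 dB.
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_FilterType + 0, kAudioUnitScope_Global, 0,
                              AudioUnitParameterValue(kAUNBandEQFilterType_LowShelf), 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Frequency + 0, kAudioUnitScope_Global, 0, 120, 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Gain + 0, kAudioUnitScope_Global, 0, 6, 0)

        // Band 1: parametric mid at 1 kHz, -3 dB.
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_FilterType + 1, kAudioUnitScope_Global, 0,
                              AudioUnitParameterValue(kAUNBandEQFilterType_Parametric), 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Frequency + 1, kAudioUnitScope_Global, 0, 1000, 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Gain + 1, kAudioUnitScope_Global, 0, -3, 0)

        // Band 2: high shelf at 8 kHz, +4 dB.
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_FilterType + 2, kAudioUnitScope_Global, 0,
                              AudioUnitParameterValue(kAUNBandEQFilterType_HighShelf), 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Frequency + 2, kAudioUnitScope_Global, 0, 8000, 0)
        AudioUnitSetParameter(eqUnit, kAUNBandEQParam_Gain + 2, kAudioUnitScope_Global, 0, 4, 0)

        // Make sure each band's bypass flag is cleared (0 = active).
        for band in 0..<3 {
            AudioUnitSetParameter(eqUnit, kAUNBandEQParam_BypassBand + AudioUnitParameterID(band),
                                  kAudioUnitScope_Global, 0, 0, 0)
        }
    }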

Changing sample rate of an AUGraph on iOS

I've implemented an AUGraph similar to the one in the iOS Developer Library. In my app, however, I need to be able to play back sound at different sample rates (probably two different ones).
I've been looking around Apple's documentation and haven't found a way to set the sample rate at runtime. I've been thinking of three possible work-arounds:
Re-initialize the AUGraph every time I need to change the sample rate;
Initialize a different AUGraph for each different sample rate;
Convert the sample rate of every sound before playing;
These methods all seem really clunky and heavy on the processor.
What is the best way of changing sample rate of an AUGraph at runtime?
Typically #1 for continuous audio streaming scenarios.
Your program may have a special need or may benefit from one of the other approaches you listed:
#2: you need to process in a situation where reinitialization is not a concern.
#3: you need to mix and process two streams with different input sample rates together at the same time, particularly if you would otherwise find yourself SRCing the signal multiple times.
But if you just need playback with SRC and lowest latency is not a concern, you may want to try an AudioQueue instead.
I'm pretty sure that it can't be done at runtime. Solution #2 is your best bet, along with #3. For sample rate conversion, libsndfile can probably be adapted to your needs.
If you don't want latency from tearing down and rebuilding the audio graph, you may need to resample the sound data (for all but one sample rate).
You could either resample the sound data before starting to play it, or run a real-time resampler as part of the audio graph. Many iOS music apps do the latter as part of a built-in sampler-based synth unit, so the device has plenty of compute power to do so.
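For what it's worth, a minimal, untested sketch of approach #1 (stop the graph, retarget the output unit's stream format, restart) might look like this; the graph and output unit are assumed to exist already:

    import AudioToolbox

    // Untested sketch of approach #1: stop and reconfigure the graph when the rate changes.
    func switchSampleRate(graph: AUGraph, outputUnit: AudioUnit, to newRate: Double) {
        AUGraphStop(graph)
        AUGraphUninitialize(graph)

        // Read the current format on the output unit's input scope, patch the rate, write it back.
        // Depending on how your graph is wired, other units may need their formats updated too.
        var asbd = AudioStreamBasicDescription()
        var size = UInt32(MemoryLayout<AudioStreamBasicDescription>.size)
        AudioUnitGetProperty(outputUnit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Input, 0, &asbd, &size)
        asbd.mSampleRate = newRate
        AudioUnitSetProperty(outputUnit, kAudioUnitProperty_StreamFormat,
                             kAudioUnitScope_Input, 0, &asbd,
                             UInt32(MemoryLayout<AudioStreamBasicDescription>.size))

        AUGraphInitialize(graph)
        AUGraphStart(graph)
    }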

karaoke (mpeg) component for delphi 7

I am looking for karaoke (mpeg) component for delphi 7.
Do you mean a component that can simply play MPEG files, or do you want a special karaoke component that filters the voices out of the music?
Have a look at Ultrastar deluxe, an open source Singstar clone based on Pascal/Delphi.
It now uses Free Pascal for portability, but as far as I know it used Delphi originally (and maybe still does for the Win32 target).
If you are trying to filter the vocals out of an MPEG clip, then you are going to have a hard time. The issue is that you are trying to filter out a variable frequency from the audio signal, and you have no idea what that frequency is going to be over time. The closest thing you may be able to achieve relies on the fact that some recordings deliberately record the voice track out of phase between the left and right channels, in which case you can 'cancel' the vocal track out by combining the audio with the same signal inverted in phase, but I believe that MPEG compression will negate that anyway due to its spatial compression.
So no, I don't believe this can be done; you will be better off trying to find the instrumental soundtrack and combining it with the video clip, then playing that.
If you are simply trying to display text over a video clip (i.e. overlay) then you may want to look at:
Looking for a OSD component
If you also need to play video files in Delphi, you can use the built-in media player (TMediaPlayer) or another video component (such as TVideograbber, http://www.datastead.com); the latter supports overlay/text over the video.

iOS: Audio Units vs OpenAL vs Core Audio

Could someone explain to me how OpenAL fits in with the schema of sound on the iPhone?
There seem to be APIs at different levels for handling sound. The higher level ones are easy enough to understand.
But my understanding gets murky towards the bottom. There is Core Audio, Audio Units, OpenAL.
What is the connection between these? Is OpenAL the substratum, upon which rests Core Audio (which contains as one of its lower-level objects Audio Units)?
OpenAL doesn't seem to be documented by Xcode, yet I can run code that uses its functions.
This is what I have figured out:
The substratum is Core Audio. Specifically, Audio Units.
So Audio Units form the base layer, and some low-level framework has been built on top of this. And the whole caboodle is termed Core Audio.
OpenAL is a multiplatform API -- the creators are trying to mirror the portability of OpenGL. A few companies are sponsoring OpenAL, including Creative Labs and Apple!
So Apple has provided this API, basically as a thin wrapper over Core Audio. I am guessing this is to allow developers to port code over easily. Be warned, it is an incomplete implementation, so if you want OpenAL to do something that Core Audio can do, it will do it. But otherwise it won't.
Kind of counterintuitive -- just looking at the source, it looks as if OpenAL is lower level. Not so!
Core Audio covers a lot of things, such as reading and writing various file formats, converting between encodings, pulling frames out of streams, etc. Much of this functionality is collected as the "Audio Toolbox". Core Audio also offers multiple APIs for processing streams of audio, for playback, capture, or both. The lowest level one is Audio Units, which works with uncompressed (PCM) audio and has some nice stuff for applying effects, mixing, etc. Audio Queues, implemented atop Audio Units, are a lot easier because they work with compressed formats (not just PCM) and save you from some threading challenges. OpenAL is also implemented atop Audio Units; you still have to use PCM, but at least the threading isn't scary. Difference is that since it's not from Apple, its programming conventions are totally different from Core Audio and the rest of iOS (most obviously, it's a push API: if you want to stream with OpenAL, you poll your sources to see if they've exhausted their buffers and push in new ones; by contrast, Audio Queues and Audio Units are pull-based, in that you get a callback when new samples are needed for playback).
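To make the pull model concrete, here is a minimal, untested sketch of an Audio Queue output callback; the buffer-filling logic itself is left as a comment:

    import AudioToolbox

    // The queue calls us back whenever one of its buffers has been played and needs refilling.
    let outputCallback: AudioQueueOutputCallback = { _, queue, buffer in
        // Fill buffer.pointee.mAudioData with the next chunk of audio and set
        // buffer.pointee.mAudioDataByteSize, then hand the buffer back to the queue:
        AudioQueueEnqueueBuffer(queue, buffer, 0, nil)
    }

    // 16-bit stereo Linear PCM at 44.1 kHz (queues can also take compressed formats).
    var format = AudioStreamBasicDescription(
        mSampleRate: 44_100,
        mFormatID: kAudioFormatLinearPCM,
        mFormatFlags: kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
        mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4,
        mChannelsPerFrame: 2, mBitsPerChannel: 16, mReserved: 0)

    var queue: AudioQueueRef?
    AudioQueueNewOutput(&format, outputCallback, nil, nil, nil, 0, &queue)
    // Allocate a few buffers with AudioQueueAllocateBuffer, prime them via the callback,
    // then call AudioQueueStart to begin playback.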
Higher level, as you've seen, is nice stuff like Media Player and AV Foundation. These are a lot easier if you're just playing a file, but probably aren't going to give you deep enough access if you want to do some kind of effects, signal processing, etc.
