Stripping silence from AVAudioRecorder recorded audio - ios

I'm working on an iOS app that allows a user to record some audio. The audio is recorded using AVAudioRecorder then saved to a file.
I'd like to strip the silence from the beginning and the end of the recorded audio.
Any ideas?

I am currently working on a similar task, and it isn't trivial by any means, because silence is not going to be a straightforward run of zeros. There is going to be some fluctuation.
If you're guaranteed a clean signal, it would be fairly trivial to set a marker on the first sample with an absolute value greater than, say, 0.001.
You can set an end marker without having to walk backwards through the file: for every sample whose absolute value is greater than this threshold, move the end marker to that sample. After one pass it sits on the last loud sample.
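A minimal sketch of that single-pass idea, written in Python only to show the logic (it is not tied to AVAudioRecorder). It assumes the recording has already been decoded into a list of floats in the range -1.0 to 1.0. The function names are made up for illustration, and the 0.001 default is the value suggested above.

```python
def find_audio_bounds(samples, threshold=0.001):
    """Return (start, end) indices of the first and last sample whose
    absolute value exceeds threshold, or None if nothing is loud enough."""
    start = None
    end = None
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            if start is None:
                start = i  # set once, on the first loud sample
            end = i        # moved forward on every loud sample
    if start is None:
        return None
    return start, end


def trim_silence(samples, threshold=0.001):
    """Return only the samples between the two markers (inclusive)."""
    bounds = find_audio_bounds(samples, threshold)
    if bounds is None:
        return []
    start, end = bounds
    return samples[start:end + 1]
```

Quiet samples between the two markers are kept, so pauses inside the recording survive; only the leading and trailing quiet parts are dropped.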
If your input has the possibility of containing blips and squips before it starts properly, you will need a more advanced technique. Post a comment below and I will extend the answer.

When I've built audio apps for iOS, the audio eventually ends up on a server. I don't know if your app is similar in that regard. If so, you can do what I did:
use SoX in the backend to post-process the audio, removing silence using a threshold.
If you need to do it all on the phone, it's going to be harder. You should build a power level filter using OpenAL or an OpenAL wrapper library.
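As a rough illustration of what a power level filter could look like, here is a sketch in plain Python rather than OpenAL. It measures the RMS level of fixed-size windows and drops the quiet windows at both ends. The window size, the threshold, and the function names are assumptions chosen for the example, and the samples are assumed to be floats between -1.0 and 1.0.

```python
import math


def rms(window):
    """Root-mean-square level of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))


def trim_by_power(samples, window_size=512, threshold=0.01):
    """Drop leading and trailing windows whose RMS level is below threshold."""
    windows = [samples[i:i + window_size]
               for i in range(0, len(samples), window_size)]
    loud = [i for i, w in enumerate(windows) if rms(w) >= threshold]
    if not loud:
        return []
    first, last = loud[0], loud[-1]
    # Cut on window boundaries, so the result is accurate to one window.
    return samples[first * window_size:(last + 1) * window_size]
```

Because the level is averaged over a window, a single stray sample in an otherwise quiet window does not count as the start of the audio, which a per-sample threshold cannot guarantee.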

Related

What kind of drum sampling options does Audiokit have?

I'm working in AudioKit and am looking to understand how people have incorporated drums. Obviously, the sampler is an option, but I am wondering if there is a built-in option similar to some of the basic synthesis options.
There are a few options. I personally like the AppleSampler/MidiSampler like in the example, but instead of using audio files you can create an EXS Sampler instrument in Logic, where you can assign notes for different velocities. AppleSampler can also load AUPresets made in GarageBand and SoundFonts (SF2). The DunneAudioKit Sampler is an option if you are working with SFZ files, but I think that might be a work-in-progress in AudioKit 5. Loading WAV files directly into AppleSampler is also a good option if you just want one-shot sounds.
I'm assuming you're mostly talking about playback of samples, not recording.
The best built-in option I've seen (other than AppleSampler/MidiSampler) is AudioPlayer, which lets you load in a sample and play it back on demand (from an on-screen pad, etc). MIDIListener can then help you respond to external MIDI events, etc. It works (I have a pretty big branch in my app where I tried it), but I'm not sure it works well.
I wouldn't recommend DunneAudioKit Sampler for drums. There is no one-shot playback (so playing the same note in quick succession will cut off the previous note, even if you mess with the release). If you're trying to build a complex/realistic acoustic drum instrument, you'll also want round-robins so that variations of the same hit can be played, which Dunne also doesn't have. It can load SFZ files, but only a very limited subset of SFZ's opcodes (so again, it's missing things like round robins, mute groups, one-shot, etc).
Having gone down all those roads, I would suggest starting with AppleSampler, and I would build the EXS or aupreset file in Logic or Mainstage rather than trying to build something programmatically.
If your needs are really simple, the examples in AudioKit's recently released drum pad playground are a great place to start, loading single samples into a specific note on AppleSampler.

Quantized Sequence using AudioKit

I've been working with AudioKit to create a sequencer that I would like to play a perfectly quantized sequence (i.e. all subdivisions metrically perfect). However, when I add notes to a sequence I hear fluctuations/imperfections in the time; the subdivisions aren't lining up in a metrically perfect way. When I print the current position of the sequencer in beats to the console during note on events, the fluctuations are shown: the notes are only consistent to two decimal places or so, and then they show variations in the placement. In the callback, I would expect perhaps, with a slight delay: 1.001, 2.001, 3.001. But the output displays seemingly random numbers after two decimal places.
I've created a project to demonstrate the issue here
What am I doing wrong here?
Note that in the project I've made use of AKCallbackInstrument, but the issue persists even if I plug the sampler that will play the sound directly into the sequencer. Also, in the project I've added notes to the sequencer "manually," but the issue persists even if I load a .mid file directly to the sequencer. The sampler in the demo project uses a sound font (.sf2), but the issue exists when I load a .wav or .mp3 sample as well.
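To put numbers like these in perspective, the deviation in beats can be converted to milliseconds. The sketch below is plain Python, not AudioKit code. It assumes the printed position and the tempo are known, and the function name and the grid parameter are made up for illustration.

```python
def timing_error_ms(position_beats, tempo_bpm, subdivisions_per_beat=1):
    """Signed distance from a reported position to the nearest grid point,
    converted from beats to milliseconds at the given tempo."""
    grid = round(position_beats * subdivisions_per_beat) / subdivisions_per_beat
    error_beats = position_beats - grid
    seconds_per_beat = 60.0 / tempo_bpm
    return error_beats * seconds_per_beat * 1000.0
```

At 120 BPM one beat lasts 0.5 seconds, so a reported position of 2.004 corresponds to a deviation of about 2 ms from the beat.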
I don't think you're doing anything wrong. The AKSequencer is based off of Apple's own MIDI Sequencer, so we provide AKSequencer as a wrapper to that functionality. However, there are known timing inaccuracies in Apple's sequencer that we can't address because it is closed source. We are working on a replacement to AKSequencer (which will be called AKSequencer, moving the current sequencer to AKAppleSequencer). This should be done in July. In the meantime, you can use AKTimeline to build your own sequencer, as was done in the MetronomeSampleSync examples in AudioKit.

Generate a sound (not from a file)

I'm building a small game prototype, and I'd like to be able to play simple sounds whose length/tone/pitch will vary based on what the user is doing.
This is surprisingly hard to do. Closest resource I found was:
http://www.tmroyal.com/playing-sounds-in-swift-audioengine.html
But this does not actually generate any sound on my device or on the iOS simulator.
Does anyone know of any working code to play ANY procedurally generated audio? Simple Sine Wave would do.
https://gist.github.com/rgcottrell/5b876d9c5eea4c9e411c
This code on the other hand works, and it's beautifully written...
Success!
You can try AudioKit.
It's an audio framework built on top of Core Audio.
In their Continuous Control example they use a simple FM oscillator with controlled parameters.
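Independent of any framework, the sample data for a simple tone is easy to compute directly. The sketch below is plain Python and only illustrates the math; it does not use AVAudioEngine or AudioKit. The function names, the 44100 Hz default rate, and the 16-bit mono WAV output are assumptions made for the example.

```python
import math
import struct
import wave


def sine_samples(frequency_hz, duration_s, sample_rate=44100, amplitude=0.5):
    """Return floats in [-amplitude, amplitude] describing a sine tone."""
    count = int(duration_s * sample_rate)
    step = 2.0 * math.pi * frequency_hz / sample_rate  # phase advance per sample
    return [amplitude * math.sin(step * n) for n in range(count)]


def write_wav(path, samples, sample_rate=44100):
    """Write mono float samples to path as 16-bit PCM, clamping to [-1, 1]."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(frames)
```

Varying frequency_hz changes the pitch and varying duration_s changes the length, which covers the two things the game needs to control.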

How can I use AVAudioPlayer to play audio faster *and* higher pitched?

Statement of Problem:
I have a collection of sound effects in my app stored as .m4a files (AAC format, 48 kHz, 16-bit) that I want to play at a variety of speeds and pitches, without having to pre-generate all the variants as separate files.
Although the .rate property of an AVAudioPlayer object can alter playback speed, it always maintains the original pitch, which is not what I want. Instead, I simply want to play the sound sample faster or slower and have the pitch go up or down to match, just like speeding up or slowing down an old-fashioned reel-to-reel tape recorder. In other words, I need some way to essentially alter the audio sample rate by amounts like +2 semitones (12% faster), –5 semitones (25% slower, so the clip lasts about 33% longer), +12 semitones (2x faster), etc.
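The relationship between semitones and playback rate in equal temperament is rate = 2^(semitones / 12). A small Python sketch of that arithmetic follows; the function names are made up for illustration, and the 48000 Hz default matches the files described above.

```python
def rate_for_semitones(semitones):
    """Playback-rate multiplier that shifts pitch by the given number
    of equal-tempered semitones (negative values slow the clip down)."""
    return 2.0 ** (semitones / 12.0)


def shifted_sample_rate(semitones, original_rate=48000):
    """Sample rate at which the unmodified samples would have to be
    played back to get the requested shift."""
    return original_rate * rate_for_semitones(semitones)
```

For example, +12 semitones gives a rate of exactly 2.0, +2 gives roughly 1.122, and -5 gives roughly 0.749, which makes the clip last about 33% longer.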
Question:
Is there some way to fetch the Linear PCM audio data from an AVAudioPlayer object, apply sample rate conversion using a different iOS framework, and stuff the resulting audio data into a new AVAudioPlayer object, which can then be played normally?
Possible avenues:
I was reading up on AudioConverterConvertComplexBuffer. In particular kAudioConverterSampleRateConverterComplexity_Mastering, and kAudioConverterQuality_Max, and AudioConverterFillComplexBuffer() caught my eye. So it looks possible with this audio conversion framework. Is this an avenue I should explore further?
Requirements:
I actually don't need playback to begin instantly. If sample rate conversion incurs a slight delay, that's fine. All of my samples are 4 seconds or less, so I would imagine that any on-the-fly resampling would occur quickly, on the order of 1/10 second or less. (More than 1/2 second would be too much, though.)
I'd really rather not get into heavyweight stuff like OpenAL or Core Audio if there is a simpler way to do this using a conversion framework provided by iOS. However, if there is a simple solution to this problem using OpenAL or Core Audio, I'd be happy to consider that. By "simple" I mean something that can be implemented in 50–100 lines of code and doesn't require starting up additional threads to feed data to a sound device. I'd rather just have everything taken care of automatically, which is why I'm willing to convert the audio clip prior to playing.
I want to avoid any third-party libraries here, because this isn't rocket science and I know it must be possible with native iOS frameworks somehow.
Again, I need to adjust the pitch and playback rate together, not separately. So if playback is slowed down 2x, a human voice would become very deep and slow-spoken. And if playback is sped up 2–3x, a human voice would sound like a fast-talking chipmunk. In other words, I absolutely do not want to alter the pitch while keeping the audio duration the same, because that operation results in an undesirably "tinny" sound when bending the pitch upward more than a couple semitones. I just want to speed the whole thing up and have the pitch go up as a natural side-effect, just like old-fashioned tape recorders used to do.
Needs to work in iOS 6 and up, although iOS 5 support would be a nice bonus.
The forum link Jack Wu mentions has one suggestion, which involves overriding the AIFF header data directly. This may work, but you will need to have AIFF files since it relies on a specific range of the AIFF header to write into. This also needs to be done before you create the AVAudioPlayer, which means that you can't modify the pitch once it is running.
If you are willing to go the AudioUnits route, a complete simple solution is probably ~200 lines (note that this assumes the code style that has one function take up to 7 lines with one parameter on each line). There is a Varispeed AudioUnit, which does exactly what you want by locking pitch to rate. You would basically need to look at the API, docs and some sample AudioUnit code to get familiar and then:
create/init the audio graph and stream format (~100 lines)
create and add to the graph a RemoteIO AudioUnit (kAudioUnitSubType_RemoteIO) (this outputs to the speaker)
create and add a varispeed unit, and connect the output of the varispeed unit (kAudioUnitSubType_Varispeed) to the input of the RemoteIO Unit
create and add to the graph an AudioFilePlayer (kAudioUnitSubType_AudioFilePlayer) unit to read the file and connect it to the varispeed unit
start the graph to begin playback
when you want to change the pitch, do it via AudioUnitSetParameter, and the pitch and playback rate change will take effect while playing
Note that there is a TimePitch audio unit which allows independent control of pitch and rate, as well.
For iOS 7, you'd want to look at AVPlayerItem's time-pitch algorithm (audioTimePitchAlgorithm) called AVAudioTimePitchAlgorithmVarispeed. Unfortunately this feature is not available on earlier systems.
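Conceptually, tape-style speed change just means reading the source samples at a different step size. The Python sketch below shows that idea with linear interpolation. It is an illustration under simple assumptions (mono float samples, a made-up function name), not how Apple's Varispeed unit is implemented, and a production resampler would also filter the signal to avoid aliasing when speeding up.

```python
def varispeed(samples, rate):
    """Resample by stepping through the source `rate` samples at a time,
    linearly interpolating between neighbours. rate > 1 plays faster and
    higher, rate < 1 plays slower and lower."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not samples:
        return []
    out = []
    last = len(samples) - 1
    n = 0
    while n * rate <= last:
        pos = n * rate           # fractional read position in the source
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        n += 1
    return out
```

Played back at the original sample rate, the output of varispeed(samples, 2.0) is half as long and an octave higher, which is the "chipmunk" effect described in the question.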

Virtual Instrument App Recording Functionality With RemoteIO

I'm developing a virtual instrument app for iOS and am trying to implement a recording function so that the app can record and playback the music the user makes with the instrument. I'm currently using the CocosDenshion sound engine (with a few of my own hacks involving fades etc) which is based on OpenAL. From my research on the net it seems I have two options:
Keep a record of the user's inputs (ie. which notes were played at what volume) so that the app can recreate the sound (but this cannot be shared/emailed).
Hack my own low-level sound engine using AudioUnits & specifically RemoteIO so that I manually mix all the sounds and populate the final output buffer by hand and hence can save said buffer to a file. This will be able to be shared by email etc.
I have implemented a RemoteIO callback for rendering the output buffer in the hope that it would give me previously played data in the buffer but alas the buffer is always all 00.
So my question is: is there an easier way to sniff/listen to what my app is sending to the speakers than my option 2 above?
Thanks in advance for your help!
I think you should use RemoteIO. I had a similar project several months ago and wanted to avoid RemoteIO and audio units as much as possible, but in the end, after I wrote tons of code and read lots of documentation from third-party libraries (including CocosDenshion), I ended up using audio units anyway. More than that, it's not that hard to set up and work with. If, however, you are looking for a library to do most of the work for you, look for one written on top of Core Audio, not OpenAL.
You might want to take a look at the AudioCopy framework. It does a lot of what you seem to be looking for, and will save you from potentially reinventing some wheels.
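For reference, the manual mixing described in option 2 of the question comes down to summing the active sounds sample by sample and clamping the result before it is written to the output buffer or to a file. A minimal Python sketch of that step follows. It assumes mono float samples at a common sample rate, and the function name and gain parameter are made up for illustration; a real RemoteIO render callback would do the same arithmetic on its own buffers.

```python
def mix(tracks, gains=None):
    """Sum equal-rate mono tracks sample by sample, applying one gain per
    track, and clamp the result to [-1.0, 1.0]."""
    if not tracks:
        return []
    if gains is None:
        gains = [1.0] * len(tracks)
    length = max(len(t) for t in tracks)
    out = [0.0] * length             # shorter tracks are padded with silence
    for track, gain in zip(tracks, gains):
        for i, s in enumerate(track):
            out[i] += s * gain
    return [max(-1.0, min(1.0, s)) for s in out]
```

Hard clamping distorts when many loud notes overlap, so lowering the per-track gains is usually the first thing to try if the mix clips.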

Resources