I am designing an AUGraph for an iOS application and would appreciate help on the following things.
If I want to play a number of audio files at once, does each file need an audio unit?
From the Core Audio docs:
Linear PCM and IMA/ADPCM (IMA4) audio: You can play multiple linear PCM or IMA4 format sounds simultaneously in iOS without incurring CPU resource problems.
AAC, MP3, and Apple Lossless (ALAC) audio: Playback for AAC, MP3, and Apple Lossless (ALAC) sounds uses efficient hardware-based decoding on iPhone and iPod touch. You can play only one such sound at a time.
So multiple AAC or MP3 files cannot be played at the same time. What is the optimal LPCM format to play multiple sounds at once?
Does this apply to Audio Units too, as this is under the Audio Queue documentation?
Can an audio unit in an AUGraph be inactive? If an AUGraph looks like this
Speaker/output < recorder unit < mixer unit < number of audio file playing units
what happens if the recorder is not active, would it still pull, but just not write the buffers to a file?
No; you need to use the mixer audio unit. Check this:
http://developer.apple.com/library/ios/DOCUMENTATION/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/ConstructingAudioUnitApps/ConstructingAudioUnitApps.html#//apple_ref/doc/uid/TP40009492-CH16-SW1
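For reference, here is a minimal Swift sketch of that kind of graph (my names, not from the guide): a MultiChannelMixer feeding the RemoteIO output, with one mixer input bus per sound; error checking omitted.

    import AudioToolbox

    var graph: AUGraph?
    NewAUGraph(&graph)

    // Describe the two units: the hardware output and a multichannel mixer.
    var ioDesc = AudioComponentDescription(
        componentType: kAudioUnitType_Output,
        componentSubType: kAudioUnitSubType_RemoteIO,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)
    var mixerDesc = AudioComponentDescription(
        componentType: kAudioUnitType_Mixer,
        componentSubType: kAudioUnitSubType_MultiChannelMixer,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0, componentFlagsMask: 0)

    var ioNode: AUNode = 0
    var mixerNode: AUNode = 0
    AUGraphAddNode(graph!, &ioDesc, &ioNode)
    AUGraphAddNode(graph!, &mixerDesc, &mixerNode)
    AUGraphOpen(graph!)

    // Mixer output bus 0 -> RemoteIO input element 0 (the speaker path).
    AUGraphConnectNodeInput(graph!, mixerNode, 0, ioNode, 0)

    // Attach one render callback (or file player) per sound to each mixer
    // input bus here, then bring the graph up.
    AUGraphInitialize(graph!)
    AUGraphStart(graph!)

Each sound gets its own mixer input bus, so you do not need a separate output unit per file; the mixer does the summing.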
Mostly by reading the document above, wrapping the sample code in a class, and creating a pair of utility structures, I coded this 'Simple Sound Engine' from scratch:
http://nicolasmiari.com/blog/a-simple-sound-engine-for-ios-using-the-audio-unit-framework/
(Link to article in my blog containing the source code). Sorry, moved blog to Jekyll/Github and this article didn't make the cut.
...I was going to start a repo on github, but it's too much trouble. I am a visual guy, still pretty much git-phobic. Okay, that was a long time ago... Now I use git from the command line :-)
You can use it as-is, or extract the Audio Unit-related code and adapt it to your project.
I believe the Cocos Denshion 'Simple Audio Engine' does pretty much the same thing, but haven't checked the source code.
Known issues
If you have an exception breakpoint set for C++ exceptions, when debugging, the code will stop 2 or 3 times on AUGraphInitialize(). This is a 'non-crashing' exception, so you can click on continue and the code works OK.
To convert your wav files to the uncompressed .caf format, use this command on the Terminal:
% afconvert -f caff -d LEI16 mySoundFile.wav mySoundFile.caf
EDIT: So I created a GitHub repo after all:
https://github.com/nicolas-miari/Sound-Engine
Both ordinary .wav and .caf files commonly contain raw PCM audio samples, and can be played without hardware assist or DSP processing if they are already at the destination sample rate.
When there's no audio file or other synthesized data to feed an audio unit that's pulling buffers, the usual practice is to feed it buffers of silence (or perhaps a taper to zero if the previous buffer ended with non-zero amplitude).
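As an illustrative Swift sketch of that practice (names made up; the buffers here are simply zero-filled rather than tapered):

    import AudioToolbox
    import Foundation

    // Render callback used while a player has nothing to feed the graph:
    // zero the output buffers and flag them as silent so downstream units
    // can skip work.
    let silenceWhenIdleCallback: AURenderCallback = { _, ioActionFlags, _, _, _, ioData in
        guard let bufferList = ioData else { return noErr }
        for buffer in UnsafeMutableAudioBufferListPointer(bufferList) {
            memset(buffer.mData, 0, Int(buffer.mDataByteSize))
        }
        ioActionFlags.pointee.insert(.unitRenderAction_OutputIsSilence)
        return noErr
    }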
I am building an application which needs to do real-time audio recording. I am using Swift for the project, so I am unable to use the Novocaine library (as it has some Obj-C++ code).
What I need is to get small chunks of the audio recording (in real time) which I can process or send to my websocket. Is there a Swift library that I can use to achieve this?
In addition to getting the live audio from the microphone, I also need to show a real time waveform.
Start recording.
Get an event every few bytes of recorded data, so I can send these bytes to my websocket.
Show a waveform for the audio.
Let me know.
You do not need any third-party tools to get audio from the mic; it can be set up easily using AVAudioEngine. However, to minimise network traffic I suggest using LAME to compress the raw PCM audio stream into MP3.
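For reference, a minimal Swift sketch of the AVAudioEngine part (it assumes the AVAudioSession is already configured for recording and mic permission has been granted; the buffer handling is left as a stub):

    import AVFoundation

    let engine = AVAudioEngine()
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)

    // Tap the mic: each callback delivers a chunk of raw PCM that can be
    // encoded/sent over the websocket and used to update a waveform view.
    input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
        // buffer.floatChannelData?[0] points at the samples of channel 0;
        // e.g. compute a peak/RMS value here for the waveform display.
    }

    engine.prepare()
    do {
        try engine.start()
    } catch {
        print("Could not start AVAudioEngine: \(error)")
    }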
Here you can find a project with minimal functionality for getting mic input and compressing it into MP3. In this example project the MP3 is stored in the Documents folder, so you can try it and listen to make sure it works.
From this point you can take the MP3 buffer and send it via the socket. You can also play with the LAME settings to change the quality, etc.
There is another branch called no-lame where the same functionality is implemented without LAME encoding. Look here.
TL;DR
I want to convert fMP4 fragments to TS segments (for HLS) as the fragments are being written using FFmpeg on an iOS device.
Why?
I'm trying to achieve live uploading on iOS while maintaining a seamless, HD copy locally.
What I've tried
Attempt 1: Rolling AVAssetWriters, where each writes for 8 seconds, then concatenating the MP4s together via FFmpeg.
What went wrong - There are blips in the audio and video at times. I've identified 3 reasons for this.
1) Priming frames for audio written by the AAC encoder creating gaps.
2) Since video frames are 33.33ms long, and audio frames 0.022ms long, it's possible for them to not line up at the end of a file.
3) The lack of frame-accurate encoding, which is present on Mac OS but not available for iOS. Details Here
Attempt 2: FFmpeg muxing a large video-only MP4 file with raw audio into TS segments. The work was based on the Kickflip SDK.
What went wrong - Every once in a while an audio-only file would get uploaded, with no video whatsoever. We were never able to reproduce it in-house, but it was pretty upsetting to our users when they didn't record what they thought they did. There were also issues with accurate seeking on the final segments, almost as if the TS segments were incorrectly timestamped.
What I'm thinking now
Apple was pushing fMP4 at WWDC this year (2016) and I hadn't looked into it much at all before that. Since an fMP4 file can be read, and played while it's being written, I thought that it would be possible for FFmpeg to transcode the file as it's being written as well, as long as we hold off sending the bytes to FFmpeg until each fragment within the file is finished.
However, I'm not familiar enough with the FFmpeg C API; I only used it briefly within attempt 2.
What I need from you
Is this a feasible solution? Is anybody familiar enough with fMP4 to know if I can actually accomplish this?
How will I know that AVFoundation has finished writing a fragment within the file so that I can pipe it into FFmpeg?
How can I take data from a file on disk, chunk at a time, pass it into FFmpeg and have it spit out TS segments?
Strictly speaking, you don't need to transcode the fMP4 if it contains H.264 + AAC; you just need to repackage the sample data as TS (using ffmpeg -codec copy, or GPAC).
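For example (placeholder file names, assuming the input really is H.264 + AAC), the copy-and-repackage step on the Terminal looks roughly like:

    % ffmpeg -i fragment.mp4 -c copy -f mpegts fragment.ts

or, letting FFmpeg cut the TS segments and write the playlist itself:

    % ffmpeg -i recording.mp4 -c copy -f hls -hls_time 6 -hls_segment_filename 'seg_%03d.ts' playlist.m3u8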
Regarding alignment (points 1 and 2), I suppose this all depends on your encoder settings (frame rate, sample rate, and GOP size). It is certainly possible to make sure that audio and video align exactly at fragment boundaries (see for example: this table). If you're targeting iOS, I would recommend using HLS protocol version 3 (or 4), which allows timing to be represented more accurately. This also allows you to stream audio and video separately (non-multiplexed).
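To make the alignment point concrete (assuming 48 kHz AAC and 30 fps video, which may not match your settings): one AAC frame is 1024/48000 s ≈ 21.33 ms and one video frame is 1/30 s ≈ 33.33 ms, so the two only coincide every 25 audio frames = 16 video frames ≈ 0.533 s; picking a fragment duration that is a multiple of that value lets each fragment end exactly on both an audio and a video frame boundary.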
I believe FFmpeg should be capable of pushing a live fMP4 stream (i.e. using a long-running HTTP POST), but playout requires the origin software to do something meaningful with it (i.e. stream it out as HLS).
I have run through an Audio Units tutorial for a sine wave generator and done a bit of reading, and I understand basically how it works. What I would actually like to do for my app is play a short sound file in response to some external event. These sounds would be about 1-2 seconds in duration and occur at a rate of about 1-2 per second.
Basically where I am at right now is trying to figure out how to play an actual audio file using my audio unit, rather than generating a sine wave. So basically my question is, how do I get an audio unit to play an audio file?
Do I simply read bytes from the audio file into the buffer in the render callback?
(if so what class do I need to deal with to open / convert / decompress / read the audio file)
or is there some simpler method where I could maybe just hand off the entire buffer and tell it to play?
Any names of specific classes or APIs I will need to look at to accomplish this would be very helpful.
OK, check this:
http://developer.apple.com/library/ios/samplecode/MixerHost/Introduction/Intro.html
EDIT: That is a sample project. This page has detailed instructions with inline code to setup common configurations: http://developer.apple.com/library/ios/ipad/#DOCUMENTATION/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/ConstructingAudioUnitApps/ConstructingAudioUnitApps.html#//apple_ref/doc/uid/TP40009492-CH16-SW1
If you don't mind being tied to iOS 5+, you should look into AUFilePlayer. It is much easier than using the callbacks, and you don't have to worry about setting up your own ring buffer (something that you would need to do if you want to avoid loading all of your audio data into memory on start-up).
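For orientation, here is a rough Swift sketch of scheduling a file on the file player (it assumes filePlayerUnit is the AudioUnit of a kAudioUnitSubType_AudioFilePlayer node already wired into an initialized graph; error checks omitted):

    import AudioToolbox
    import Foundation

    func schedule(fileURL: URL, on filePlayerUnit: AudioUnit) {
        var audioFile: AudioFileID?
        AudioFileOpenURL(fileURL as CFURL, .readPermission, 0, &audioFile)
        guard var file = audioFile else { return }

        // Tell the file player which file it owns.
        AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFileIDs,
                             kAudioUnitScope_Global, 0,
                             &file, UInt32(MemoryLayout<AudioFileID>.size))

        // Schedule the whole file as one region, from frame 0.
        var regionTime = AudioTimeStamp()
        regionTime.mFlags = .sampleTimeValid
        regionTime.mSampleTime = 0
        var region = ScheduledAudioFileRegion(mTimeStamp: regionTime,
                                              mCompletionProc: nil,
                                              mCompletionProcUserData: nil,
                                              mAudioFile: file,
                                              mLoopCount: 0,
                                              mStartFrame: 0,
                                              mFramesToPlay: UInt32.max)
        AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFileRegion,
                             kAudioUnitScope_Global, 0,
                             &region, UInt32(MemoryLayout<ScheduledAudioFileRegion>.size))

        // Prime with default settings, then start on the next render cycle.
        var primeFrames: UInt32 = 0
        AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduledFilePrime,
                             kAudioUnitScope_Global, 0,
                             &primeFrames, UInt32(MemoryLayout<UInt32>.size))

        var startTime = AudioTimeStamp()
        startTime.mFlags = .sampleTimeValid
        startTime.mSampleTime = -1
        AudioUnitSetProperty(filePlayerUnit, kAudioUnitProperty_ScheduleStartTimeStamp,
                             kAudioUnitScope_Global, 0,
                             &startTime, UInt32(MemoryLayout<AudioTimeStamp>.size))
    }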
I'm investigating a straightforward task:
open an audio file from the iPhone's 'iPod audio library'
allow the user to select a chunk by setting two markers: start and end time
time-reverse this chunk
save it as a new file
What are my options?
I will list the results of a couple of hours of research: (forgive the mess, I will as always tidy up once I have figured it out)
http://lists.apple.com/archives/coreaudio-api/2005/May/msg00096.html <-- 'I'm currently trying to create a program that plays back audio using an AUAudioFilePlayer
AudioUnit plugin that streams the audio to an output AudioUnit'
AUFilePlayer
http://lists.apple.com/archives/coreaudio-api/2008/Dec/msg00156.html
http://zerokidz.com/audiograph/docs/audiograph.pdf <-- this possibly links to code that does it, but it says it is in beta
When reading audio file with ExtAudioFile read, is it possible to read audio floats not consecutively? <-- this leads to an OS X project that reads an audio file from disk into memory; looking through the code leads us to:
https://developer.apple.com/library/mac/#documentation/MusicAudio/Reference/ExtendedAudioFileServicesReference/Reference/reference.html
As far as I can see, the audioGraph project attempts to stream the audio from file in real time, whereas Stephan's project just exposes the audio; however, it looks like he is using obsolete API calls.
this looks like the right code ( apart from the fact that there seems to be a bug in it ): https://stackoverflow.com/questions/8533143/decoding-mp3-files-by-extaudiofileopenurl
http://cocoadev.com/forums/discussion/499/core-audio/p1
https://developer.apple.com/library/ios/#samplecode/iPhoneExtAudioFileConvertTest/Introduction/Intro.html <-- here is an official Apple sample project that could probably be modified to get what I'm after
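Putting the ExtAudioFile pieces together, here is a rough Swift sketch of what I think the read-and-reverse step could look like (mono 44.1 kHz float assumed; the start/end markers and writing the result back out with ExtAudioFileCreateWithURL are not handled yet):

    import AudioToolbox
    import Foundation

    func reversedSamples(from url: URL) -> [Float]? {
        var extFile: ExtAudioFileRef?
        guard ExtAudioFileOpenURL(url as CFURL, &extFile) == noErr, let file = extFile else { return nil }
        defer { ExtAudioFileDispose(file) }

        // Ask ExtAudioFile to hand us 32-bit float, mono, 44.1 kHz LPCM,
        // converting/decompressing from whatever the file actually contains.
        var clientFormat = AudioStreamBasicDescription(
            mSampleRate: 44_100,
            mFormatID: kAudioFormatLinearPCM,
            mFormatFlags: kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
            mBytesPerPacket: 4, mFramesPerPacket: 1, mBytesPerFrame: 4,
            mChannelsPerFrame: 1, mBitsPerChannel: 32, mReserved: 0)
        ExtAudioFileSetProperty(file, kExtAudioFileProperty_ClientDataFormat,
                                UInt32(MemoryLayout<AudioStreamBasicDescription>.size),
                                &clientFormat)

        var samples = [Float]()
        let framesPerRead: UInt32 = 4096
        let scratch = UnsafeMutablePointer<Float>.allocate(capacity: Int(framesPerRead))
        defer { scratch.deallocate() }

        while true {
            var frames = framesPerRead
            var bufferList = AudioBufferList(
                mNumberBuffers: 1,
                mBuffers: AudioBuffer(mNumberChannels: 1,
                                      mDataByteSize: framesPerRead * 4,
                                      mData: UnsafeMutableRawPointer(scratch)))
            guard ExtAudioFileRead(file, &frames, &bufferList) == noErr, frames > 0 else { break }
            samples.append(contentsOf: UnsafeBufferPointer(start: scratch, count: Int(frames)))
        }

        samples.reverse()   // time-reverse the chunk
        return samples
    }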
I believe you can use iPhoneExtAudioFileConvertTest, like you said, to get what you need. After the user marks the times he wants, the first thing you want to do is convert to PCM; next you need to find out which packets are the ones you need, write those to an audio file, and then recompress. I wrote an answer here on how to get x seconds of audio from an audio file; I've only tested it with MP3 and M4A, but it can be adapted to PCM (PCM should be easier to do since it's linear).
Daniel
In Xcode 3.2.5 I would like to play multiple audio files in sequence (50+) from a single UIButton. I've tried several code samples but they leak memory. Any suggestions? I'm still learning, so please include the header and implementation file code. My thanks in advance.
Use the interfaces in Audio Queue Services (AudioToolbox/AudioQueue.h). Create one audio queue object for each sound that you want to play. Then specify simultaneous start times for the first audio buffer in each audio queue, using the AudioQueueEnqueueBufferWithParameters function.
The following limitations pertain for simultaneous sounds in iPhone OS, depending on the audio data format:
AAC, MP3, and ALAC (Apple Lossless) audio: You may play multiple AAC, MP3, and ALAC format sounds simultaneously; playback of multiple sounds of these formats will require CPU resources for decoding.
Linear PCM and IMA/ADPCM (IMA4 audio): You can play multiple linear PCM or IMA4 format sounds simultaneously without CPU resource concerns.
Taken from play multiple sounds simultaneously
This is just conceptual, but what about (a) creating an array of the sound names you want to play (this can be done at runtime), in the proper order, then (b) writing a function where each soundHandler-type object checks to see where it is in the array; if it's not last, it constructs a soundPlayer, loads the sound, plays it, and then calls the next soundHandler in the array. (If it's last, it just constructs/loads/plays, and maybe notifies the parent that it's done.) Each soundHandler (I'm just making that up; you'll have to write it) can then dealloc itself when complete.
If you run into latency/loading issues, you could always have each soundHandler call n+2 in the array, and of course then check to see if it's penultimate instead of the end.
Is that more what you had in mind?
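Roughly, in code, the soundHandler idea above could look like this (a made-up SequentialSoundPlayer class using modern AVFoundation rather than the Xcode 3.2-era APIs; .caf file names in the bundle are assumed):

    import AVFoundation

    // Plays a list of bundled sounds back-to-back, keeping only the
    // current AVAudioPlayer alive so nothing accumulates in memory.
    final class SequentialSoundPlayer: NSObject, AVAudioPlayerDelegate {
        private var queue: [URL]
        private var current: AVAudioPlayer?

        init(fileNames: [String]) {
            queue = fileNames.compactMap { Bundle.main.url(forResource: $0, withExtension: "caf") }
            super.init()
        }

        // Hook this up to the UIButton's action.
        func play() { playNext() }

        private func playNext() {
            guard !queue.isEmpty else { current = nil; return }   // finished
            let url = queue.removeFirst()
            current = try? AVAudioPlayer(contentsOf: url)
            current?.delegate = self
            current?.play()
        }

        // When one sound ends, the player is released and the next starts.
        func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
            playNext()
        }
    }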