Audio Framework Confusion - ios

I've read quite a bit both here (Audio Framework in iPhone) and elsewhere but am still confused as to which Audio Framework to use.
I'm able to get some of the easier things done, like recording and playing back, but I'm looking to the future of the app, where I'll be doing more complex things, like managing past recordings (although maybe that's an NSURL bookmark thing) and editing audio.
Right now I'm using AVFoundation but have started reading the docs for Core Audio (and there's also AudioToolbox). I wish there was a developer doc called "Understanding the Different Audio Frameworks and How and When to use them" because, well, the docs are dense and I'm having trouble figuring out which path to go down.
Links to good docs would also be much appreciated!

I recommend you take a look at the recent Learning Core Audio book. Its purpose is to clear up the confusion around audio frameworks on Mac OS and iOS. If you want "good docs", it's well worth getting.
Depending on your requirements, you might also want to consider some of the non-Apple audio frameworks, particularly the MoMu release of STK, which in many respects will be simpler and easier to use than Apple's frameworks.

Related

iOS Multi-Channel Audio with AVFoundation and Swift

I am currently in the research and prototyping stages of a project to develop a native iOS app (Swift 3) that includes a multi-channel audio player (multiple stereo MP3 files). I have found very limited information online, particularly written in Swift 3, so thought as I continue my research I would pose a question here.
Regarding frameworks it seems clear from what I've looked at so far that AVFoundation is going to do the job. It's not too low level and has a good set of functionality. It has support for playing multiple audio files with AVAudioPlayer. I am planning to start prototyping something with this soon.
But I am new to Swift and to iOS development with its huge number of libraries, so I'm wondering if I'm missing anything and whether I'm on the right track here. Any answers with general information and thoughts on this will be up-voted. For an accepted answer, I'm looking for some sample outline code using an appropriate framework: AVFoundation or a justified alternative.
If no answer is forthcoming I will post my own code when I get there.
Specifically I need from two to ten input channels, from MP3 files within the project resources, each with their own gain that can be individually adjusted, and then all of these mixed, maintaining their stereo channels, to a single output (the device) with a master gain. Some of the tracks need to loop, others not. The tracks need to be accurately synchronised. This is just for info and outline code would be fine covering the important points.
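To make the requirement above concrete, here is a minimal, framework-independent sketch of the mixing arithmetic in Python: per-track gain, optional looping, stereo preserved, and a master gain on the summed output. It is not AVFoundation code. The function name `mix_stereo` and the track dictionary keys (`frames`, `gain`, `loop`) are assumptions made up for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def mix_stereo(tracks, master_gain, n_frames):
    """Mix several stereo tracks into one stereo output.

    Each track is a dict with hypothetical keys:
      'frames': list of (left, right) float samples in [-1.0, 1.0]
      'gain':   per-track gain factor
      'loop':   True to wrap around at the end, False to fall silent
    All tracks share one frame counter, so they stay sample-aligned.
    """
    out = []
    for i in range(n_frames):
        left = 0.0
        right = 0.0
        for track in tracks:
            frames = track["frames"]
            if not frames:
                continue
            if track["loop"]:
                l, r = frames[i % len(frames)]   # wrap around for looping tracks
            elif i < len(frames):
                l, r = frames[i]
            else:
                continue                         # one-shot track has finished
            left += track["gain"] * l
            right += track["gain"] * r
        # Apply the master gain, then clamp to avoid overflow in the output.
        left = max(-1.0, min(1.0, master_gain * left))
        right = max(-1.0, min(1.0, master_gain * right))
        out.append((left, right))
    return out


if __name__ == "__main__":
    looped = {"frames": [(0.5, 0.25), (0.0, 0.0)], "gain": 1.0, "loop": True}
    one_shot = {"frames": [(0.5, 0.5)], "gain": 0.5, "loop": False}
    print(mix_stereo([looped, one_shot], master_gain=0.5, n_frames=3))
```

Driving every track from the same frame index is what keeps them synchronised in this sketch; whichever framework is used would need to offer an equivalent shared clock or start time.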
Research Notes and Resources
Apple: AVFoundation
A collection of resources relating to AVFoundation.
Apple: AVFoundation Programming Guide
This document seems encouraging at first, but actually only deals with video. It says:
There are two facets to the AVFoundation framework—APIs related to video and APIs related just to audio. The older audio-related classes provide easy ways to deal with audio. They are described in the Multimedia Programming Guide, not in this document.
The "Multimedia Programming Guide" which is also mentioned elsewhere at Apple in relation to this, is never linked and Google results point to not found pages on the Apple site. It seems to have disappeared.
Rudi Strahl: Mixing Multiple Audio Tracks with AVFoundation
Compares using AVComposition to using multiple AVPlayers. Example code is Objective-C. Not sure how the AVPlayers are mixed in the second solution. Perhaps with AVAudioMix. Currently looking at this. The article talks a little about it but doesn't deliver any specifics.
Audio Session Programming Guide
This document looks at AVAudioSession which provides supporting functionality:
AVAudioSession gives you control of your app’s audio behavior. You can:
Select the appropriate input and output routes for your app
Determine how your app integrates audio from other apps
Handle interruptions from other apps
Automatically configure audio for the type of app you are creating
Techotopia: Playing Audio on iOS 10 using AVAudioPlayer
Some useful information on using AVAudioPlayer.
Stack Overflow: Playing a Sound with AVAudioPlayer
Basic Swift code for playing a sound. Some answers include a little extra functionality.
Hacking with Swift: How to Play Sounds Using AVAudioPlayer
Again, covers the basics.
Sweet Tutos: How To Play Sounds Files And Manage Duration Progress – AVAudioPlayer Tutorial
Updated to Swift 3. Some useful info.
Xamarin: Playing Sound with AVAudioPlayer
Written in Swift 2, I think.
Apple Video: WWDC 2013 Moving to AV Kit and AV Foundation
While not directly related, I found the first 30 minutes of this video introducing developers to AV Kit and AV Foundation in OS X 10 provides a useful overview of the technology.
I was working on the same problem. The best I could do was to transcode the media content so that it can be played using AVPlayer. Here is a draft; maybe it can help.

"Sound" Recognition in Swift?

I'm working on an application in Swift and I was thinking about a way to get non-speech sound recognition into my project.
I mean, is there a way to take in sound input, match it against some predefined sounds already included in the project, and perform a particular action if a match occurs?
Is there any way to do the above? I'm thinking of breaking up the sounds and doing the checks on the pieces, but can't seem to get any further than that.
My personal experience follows matt's comment above: this requires serious technical knowledge.
There are several ways to do this, and a typical one is as follows: extract some properties from the sound segment of interest (audio feature extraction), and classify this audio feature vector with some kind of machine learning technique. This typically requires a training phase in which the machine learning technique is given some examples of the sounds you want to recognize (your predefined sounds) so that it can build a model from that data.
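The extract-then-classify pipeline described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the two features (RMS energy and zero-crossing rate) and the nearest-centroid classifier are stand-ins chosen for brevity, not what any particular SDK uses, and the helper names `tone`, `extract_features`, `train` and `classify` are made up for the example. Real recognizers use richer features and models.

```python
import math


def tone(frequency, n_samples=800, sample_rate=8000):
    """Generate a synthetic sine tone to stand in for recorded audio."""
    return [math.sin(2.0 * math.pi * frequency * i / sample_rate)
            for i in range(n_samples)]


def extract_features(samples):
    """Audio feature extraction: return (rms_energy, zero_crossing_rate)."""
    n = len(samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0) != (b < 0))
    return (rms, crossings / (n - 1))


def train(examples):
    """Training phase: average the feature vectors of each label's clips.

    examples maps a label to a list of clips (each clip a list of samples).
    Returns a model mapping each label to its feature centroid.
    """
    model = {}
    for label, clips in examples.items():
        feats = [extract_features(clip) for clip in clips]
        model[label] = tuple(sum(f[i] for f in feats) / len(feats)
                             for i in range(2))
    return model


def classify(model, samples):
    """Return the label whose centroid is closest to the clip's features."""
    feats = extract_features(samples)
    return min(model, key=lambda label: math.dist(feats, model[label]))


if __name__ == "__main__":
    model = train({"low": [tone(100), tone(120)],
                   "high": [tone(2200), tone(2400)]})
    print(classify(model, tone(150)))
    print(classify(model, tone(1900)))
```

In an app, the matched label would then trigger whatever action is associated with that predefined sound.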
Without knowing which types of sounds you want to recognize, I can only suggest that our C/C++ SDK, available here, might do the trick for you: http://www.samplesumo.com/percussive-sound-recognition
There's a technical demo on that page that you can download and try with your sounds. It's a C/C++ library, and there is a Mac, Windows and iOS version, so you should be able to integrate it with a Swift app on iOS. Maybe this will allow you to do what you need?
If you want to develop your own technology, you may want to start by finding and reading some scientific papers using the keywords "sound classification", "audio recognition", "machine listening", "audio feature classification", ...
Matt,
We've been developing a bunch of cool tools to speed up iOS development, especially in Swift. One of these tools is what we called TLSphinx: a Swift wrapper around Pocketsphinx which can perform speech recognition without the audio leaving the device.
I assume TLSphinx can help you solve your problem since it is a totally open source library. Search for it on Github ('TLSphinx') and you can also download our iOS app ('Tryolabs Mobile Showcase') and try the module live to see how it works.
Hope it is useful!
Best!

Designing a library for Hardware-accelerated unsupported containers on iOS (and Airplay)

I'm trying to put together an open source library that allows iOS devices to play files with unsupported containers, as long as the track formats/codecs are supported, e.g. a Matroska video (MKV) file with an H264 video track and an AAC audio track. I'm making an app that could certainly use that functionality, and I bet there are many more out there that would benefit from it. Any help you can give (by commenting here or, even better, collaborating with me) is much appreciated. This is where I'm at so far:
I did a bit of research trying to find out how players like AVPlayerHD or Infuse can play non-standard containers and still have hardware acceleration. It seems like they transcode small chunks of the whole video file and play those in sequence instead.
It's a good solution. But if you want to throw that video to an Apple TV, things don't work as planned since the video is actually a bunch of smaller chunks being played as a playlist. This site has way more info, but at its core streaming to Apple TV is essentially a progressive download of the MP4/M4V file being played.
I'm thinking a sort of streaming proxy is the way to go. For the playing side of things, I've been investigating AVSampleBufferDisplayLayer (more info here) as a way of playing the video track. I haven't gotten to audio yet. Things get interesting when you think about the AirPlay side of things: by having a "container proxy", we can make any file look like it has the right container without the file size implications of transcoding.
It seems like GStreamer might be a good starting point for the proxy. I need to read up on it; I've never used it before. Does this approach sound like a good one for a library that could be used for App Store apps?
Thanks!
I finally got some extra time to go over GStreamer, and especially this article about how it has already been updated to use the hardware decoding provided by iOS 8. So there is no need to develop this; GStreamer seems to be the answer.
Thanks!
The 'chunked' solution is no longer necessary in iOS 8. You should simply set up a video decode session and pass in NALUs.
https://developer.apple.com/videos/wwdc/2014/#513

MTAudioProcessingTap on iOS - Should I use VLC or build a video player from scratch?

I'm trying to build an iOS app that plays video files and does some interesting things using MTAudioProcessingTap. I need it to be able to play all sorts of formats, including some that are not supported by Apple. I'm thinking of branching out from VLC, but I can't figure out if it uses Core Audio/Video at any point or if it's running something else completely.
If it's not, is there a library I can use to take care of the 203572964 codecs being used out there?
Thanks.
Preliminary note: I'm the developer of VLC for iOS so the following may be biased.
MobileVLCKit for iOS includes 2 different audio output modules. One of them is a high level module based on AudioQueue, which is fairly simple but a bit slow. The other is based on AudioUnit, the low level framework of CoreAudio; it is quite a bit more complex, but way faster. Depending on your current experience, either module would be a good way to start.
Regarding the one library supporting all codecs thing: basically there are two forks of the same library: libav and FFmpeg. VLC supports either flavor and abstracts the complexity and the ever-changing APIs (which are a real pain if you intend to keep maintaining your app across multiple releases of those libraries). Additionally, we include a quite well performing OpenGL ES 2 video output module which uses shaders to do chroma conversion. All you need to do is embed a UIView. MobileVLCKit handles the rest.
Speaking of MobileVLCKit: this is a thin ObjC layer on top of libvlc simplifying the use of this library in third party applications by abstracting most commonly used features.
As implicitly mentioned by HalR, libvlc does not use hardware accelerated decoding on iOS yet. We are working with the libav developers on a generic approach, but we are not quite there yet. Thus, we have to do all the decoding on the CPU, which causes the heating but allows us to play virtually anything, rather than only the H264/MP4 content that the default, accelerated API handles.
If you can't figure out how it's playing the video at its lowest level, that is perhaps a sign that you should keep working with it instead of trying to outdo it. Video processing is pretty difficult, and unsupported formats are often unsupported due to patent issues. I really haven't seen anything better than VLC that is publicly available.
VLC 2.1.x appears to use AudioToolbox and AVFoundation.
One other issue, though, is that when I was doing work with VLC, I was stunned how it turned my iPod Touch into a miniature iron, because it was working so hard to process the video. Manually processing video is very processor intensive and really is a drain. So your way or VLC could still have some additional issues.

wavetables implemented on iOS

I just saw an iPhone app which uses wavetables to generate sounds. I would like to know how this can be implemented.
I am pretty sure that Core Audio has to be used, but any pointers to where I can find more information would be appreciated.
You'll want CoreAudio or AudioUnits for a responsive program (e.g. AudioQueue's latency is a bit high).
You'll want AudioFile APIs (in AudioToolbox) for reading the tables if you save them as a common audio file format (just wave files with a new shape every cycle, which is every N samples).
Beyond that, you'll probably have to write the wavetable engine. I have done that; it's not tough if you know how wavetable synthesis works and are familiar with audio signals. It's one of the most basic synthesis types.
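For a rough idea of what such an engine does, here is a minimal single-cycle wavetable oscillator sketched in Python: a phase accumulator steps through the table at a rate set by the desired frequency, and linear interpolation fills in between table entries. This is an illustration only, not the engine mentioned above; the names `make_sine_table` and `render` are made up, and it does not cover tables whose shape changes from cycle to cycle. On iOS the same loop would live inside an audio render callback.

```python
import math


def make_sine_table(size=1024):
    """Build one cycle of a sine wave; any single-cycle shape would work."""
    return [math.sin(2.0 * math.pi * i / size) for i in range(size)]


def render(table, frequency, sample_rate, n_samples):
    """Read the table at the given frequency using linear interpolation."""
    size = len(table)
    step = frequency * size / sample_rate   # table entries advanced per sample
    phase = 0.0                             # fractional read position
    out = []
    for _ in range(n_samples):
        index = int(phase)
        frac = phase - index
        a = table[index]
        b = table[(index + 1) % size]       # wrap so the cycle repeats cleanly
        out.append(a + frac * (b - a))
        phase = (phase + step) % size       # phase accumulator with wraparound
    return out


if __name__ == "__main__":
    samples = render(make_sine_table(), frequency=440.0,
                     sample_rate=44100.0, n_samples=8)
    print(samples)
```

Changing `frequency` between calls changes the pitch without touching the table, which is the main appeal of the technique.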
musicdsp.org may have something you can use as a starting point for this.
After a lot of investigation, I found an open source project for this: http://gitorious.org/pdlib/
Audio file I/O: I found a great resource here. This guy created an excellent API for using ExtAudioFileServices.
A must read is Learning Core Audio. Chris Adamson and company have really put together a great resource. Chris's blog can also be found here
Also, sign up for the Core Audio mailing list.
Michael Tyson's blog and resources at A Tasty Pixel are great too.
Hope this helps!
Take a look at this tutorial on how to use the STK: http://arielelkin.github.io/articles/mandolin/
It is an open-source C++ library with cool synths, some with wavetables.

Resources