I want to record two voices and compare them. I think there is some Apple sample code for voice recording, but I have no idea about comparing two audio files. What is the right approach for this? Is there any framework Apple provides for this purpose, or is there any third-party framework?
Check this out: it's not in Objective-C, but it contains a fantastic explanation of how Shazam compares audio, and it includes sample code (and source for a working application) in Java.
Additionally, this question has a fantastic link about audio fingerprinting, which is essentially the same technique as the article above, but covered in more depth.
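To make the fingerprinting idea concrete, here is a minimal Swift sketch of just the matching step, assuming each recording has already been reduced to a set of hashes (e.g. derived from spectrogram peak pairs) by whatever fingerprinting code you adopt; the function name and scoring are illustrative, not part of any library:

```swift
import Foundation

// Compare two fingerprints, each a set of hashes produced elsewhere.
// The score is the fraction of hashes shared with the smaller fingerprint,
// so identical recordings score near 1.0 and unrelated ones near 0.0.
func similarity(_ fingerprintA: Set<UInt64>, _ fingerprintB: Set<UInt64>) -> Double {
    guard !fingerprintA.isEmpty, !fingerprintB.isEmpty else { return 0 }
    let shared = fingerprintA.intersection(fingerprintB).count
    return Double(shared) / Double(min(fingerprintA.count, fingerprintB.count))
}
```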
Hope this helps
I'm using ViSQOL for this purpose. If your audio files are generally no longer than about 10 seconds, it could be worth looking into. Also check out the ffmpeg library for converting the files into the required format (ViSQOL expects a specific sample rate depending on whether it is music or speech).
https://github.com/google/visqol
I use AVPlayer to play audio (streaming or a local file). For this audio I want to apply some effects: boost volume, skip silence, reduce noise, and change speed (in 0.1 increments).
I did the same thing on Android by creating my own player, decoding different audio formats into PCM data and then using some C libraries to modify it. It was quite complicated.
Is this possible with AVPlayer, or how else can I do it? Something like modifying the audio already decoded by AVPlayer. Is there some iOS API (AVAudioEngine?) or framework (AudioKit?) which can do this?
Thanks!
IMHO the best solution is to use https://github.com/audiokit/AudioKit as it is well maintained and supports most of the requirements you listed.
Another approach is to import the C library you used in the Android project and write a wrapper around it so it can easily be used from Objective-C/Swift. With this approach you will have less code to maintain, and you guarantee similar results between the two platforms. Do you care to share more about this code?
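If you'd rather stay with Apple's frameworks, AVAudioEngine (which you mention) covers some of this out of the box. Below is a minimal sketch, assuming a local file: the speed change and volume boost map directly to built-in nodes, while skipping silence and noise reduction would still need custom DSP or a third-party framework such as AudioKit. The file path and parameter values are illustrative:

```swift
import AVFoundation

let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let timePitch = AVAudioUnitTimePitch()
let eq = AVAudioUnitEQ(numberOfBands: 1)

timePitch.rate = 1.3      // playback speed multiplier, adjustable in 0.1 steps
eq.globalGain = 6         // boost overall volume by 6 dB

engine.attach(player)
engine.attach(timePitch)
engine.attach(eq)

// Hypothetical local file; streaming would need a different source node.
let file = try AVAudioFile(forReading: URL(fileURLWithPath: "/path/to/audio.m4a"))
engine.connect(player, to: timePitch, format: file.processingFormat)
engine.connect(timePitch, to: eq, format: file.processingFormat)
engine.connect(eq, to: engine.mainMixerNode, format: file.processingFormat)

try engine.start()
player.scheduleFile(file, at: nil)
player.play()
```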
I'm working on an application in Swift and I was thinking about a way to get non-speech sound recognition into my project.
I mean, is there a way I can take in sound input and match it against some predefined sounds already bundled with the project, and if a match occurs, trigger some particular action?
Is there any way to do the above? I'm thinking of breaking up the sounds and doing the checks, but I can't seem to get any further than that.
My personal experience echoes matt's comment above: this requires serious technical knowledge.
There are several ways to do this, and one typical pipeline is as follows: extract some properties from the sound segment of interest (audio feature extraction), then classify the resulting feature vector with some kind of machine learning technique. This typically requires a training phase in which the machine learning technique is given examples of the sounds you want to recognize (your predefined sounds) so that it can build a model from that data.
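For illustration only, here is a bare-bones Swift sketch of the classification step, assuming the feature vectors have already been extracted (e.g. spectral band energies or MFCCs); the type names, distance measure, and threshold are hypothetical placeholders, not part of any particular SDK:

```swift
// A predefined sound, represented by a feature vector precomputed from a reference recording.
struct SoundTemplate {
    let name: String
    let features: [Float]
}

// Plain Euclidean distance between two equal-length feature vectors.
func euclideanDistance(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "feature vectors must have the same length")
    var sum: Float = 0
    for (x, y) in zip(a, b) {
        sum += (x - y) * (x - y)
    }
    return sum.squareRoot()
}

// Nearest-neighbour classification: return the closest template,
// or nil if nothing is within the (illustrative) threshold.
func classify(_ features: [Float],
              against templates: [SoundTemplate],
              threshold: Float = 1.0) -> SoundTemplate? {
    guard let best = templates.min(by: {
        euclideanDistance(features, $0.features) < euclideanDistance(features, $1.features)
    }) else { return nil }
    return euclideanDistance(features, best.features) < threshold ? best : nil
}
```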
Without knowing what types of sounds you're aiming to recognize, our C/C++ SDK, available here, might do the trick for you: http://www.samplesumo.com/percussive-sound-recognition
There's a technical demo on that page that you can download and try with your sounds. It's a C/C++ library, and there is a Mac, Windows and iOS version, so you should be able to integrate it with a Swift app on iOS. Maybe this will allow you to do what you need?
If you want to develop your own technology, you may want to start by finding and reading some scientific papers using the keywords "sound classification", "audio recognition", "machine listening", "audio feature classification", ...
Matt,
We've been developing a bunch of cool tools to speed up iOS development, especially in Swift. One of these tools is what we call TLSphinx: a Swift wrapper around Pocketsphinx which can perform speech recognition without the audio leaving the device.
TLSphinx may help you solve your problem, since it is a fully open-source library. Search for it on GitHub ('TLSphinx'); you can also download our iOS app ('Tryolabs Mobile Showcase') and try the module live to see how it works.
Hope it is useful!
Best!
I've read quite a bit both here (Audio Framework in iPhone) and elsewhere, but am still confused as to which audio framework to use.
I'm able to get some easier things done, like recording and playback, but I'm looking to the future of the app, where I'll be doing more complex things, like managing past recordings (although maybe that's an NSURL bookmark thing) and editing audio.
Right now I'm using AVFoundation, but I have started reading the docs for Core Audio (and there's also AudioToolbox). I wish there were a developer doc called "Understanding the Different Audio Frameworks and How and When to Use Them" because, well, the docs are dense and I'm having trouble figuring out which path to go down.
Links to good docs would also be much appreciated!
I recommend you take a look at the recent Learning Core Audio book. Its purpose is to clear up the confusion around the audio frameworks on Mac OS and iOS. If you want "good docs", it's well worth getting.
Depending on your requirements, you might also want to consider some of the non-Apple audio frameworks, particularly the MoMu release of STK, which in many respects will be simpler and easier to use than Apple's frameworks.
Hi guys,
I'm working on some audio services on iOS.
I'm trying to find examples or tutorials about how an audio service or stream can read an existing audio file, process it somehow (for example, apply a filter), and then write the result to another file.
Is there anybody who can help me?
Dirac3LE (by Stephan M. Bernsee) is a great library for this job. Examples and a manual are included in the download. It is particularly intended for time and pitch manipulation, but in your case you'll be interested in its EAFRead and EAFWrite classes.
If you want to get familiar with a lower-level API that you can also use for microphone input and sound output, and that lets you get raw samples in and out, I would suggest taking a look at Audio Queue Services.
I used it in a side project to get audio from the microphone, and I also wrote some code you might find useful for doing fast, vectorized, FFT-based FIR filtering on input audio. You can find the code here: https://github.com/jamescarlson/FreeAPRS
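If you just want the basic read → process → write flow from the question, a higher-level route than Audio Queue Services is AVAudioFile. Here is a minimal, hypothetical sketch; the "filter" is just a gain change to keep it short, and the function name and parameters are illustrative:

```swift
import AVFoundation

// Read a file into a PCM buffer, scale every sample, and write the result out.
func processAudio(input: URL, output: URL, gain: Float = 0.5) throws {
    let inFile = try AVAudioFile(forReading: input)
    let format = inFile.processingFormat
    let frameCount = AVAudioFrameCount(inFile.length)

    guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else {
        return
    }
    try inFile.read(into: buffer)

    // Apply a trivial "filter": multiply every sample by `gain`.
    if let channels = buffer.floatChannelData {
        for ch in 0..<Int(format.channelCount) {
            for i in 0..<Int(buffer.frameLength) {
                channels[ch][i] *= gain
            }
        }
    }

    let outFile = try AVAudioFile(forWriting: output, settings: inFile.fileFormat.settings)
    try outFile.write(from: buffer)
}
```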
I just saw an iPhone app which uses wavetables to generate sounds. I'd like to know how this can be implemented.
I am fairly sure that Core Audio has to be used, but any other pointers on where to go for more information would be appreciated.
You'll want CoreAudio or AudioUnits for a responsive program (e.g. AudioQueue's latency is a bit high).
You'll want AudioFile APIs (in AudioToolbox) for reading the tables if you save them as a common audio file format (just wave files with a new shape every cycle, which is every N samples).
Beyond that, you'll probably have to write the wavetable engine. I have done that; it's not tough if you know how wavetable synthesis works and are familiar with audio signals. It's one of the most basic synthesis types.
musicdsp.org may have something you can use as a starting point for this.
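To give a flavor of what that engine involves, here is a bare-bones Swift sketch of a single wavetable oscillator using a phase accumulator and linear interpolation; the table size, names, and structure are illustrative choices, not any particular library's API:

```swift
import Foundation

// One oscillator reading repeatedly through a single-cycle table.
struct WavetableOscillator {
    let table: [Float]          // one cycle of the waveform, e.g. 2048 samples
    let sampleRate: Float
    var frequency: Float
    private var phase: Float = 0

    mutating func nextSample() -> Float {
        let tableSize = Float(table.count)

        // Linear interpolation between neighboring table entries.
        let index = Int(phase)
        let frac = phase - Float(index)
        let a = table[index]
        let b = table[(index + 1) % table.count]
        let sample = a + frac * (b - a)

        // Advance the phase by (frequency * tableSize / sampleRate) per sample.
        phase += frequency * tableSize / sampleRate
        if phase >= tableSize { phase -= tableSize }
        return sample
    }
}

// Example: one cycle of a sine wave as the table, rendered at 440 Hz.
let sineTable = (0..<2048).map { Float(sin(2.0 * .pi * Double($0) / 2048.0)) }
var osc = WavetableOscillator(table: sineTable, sampleRate: 44_100, frequency: 440)
let samples = (0..<512).map { _ in osc.nextSample() }
```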
After a lot of investigating, I found an open-source project related to this: http://gitorious.org/pdlib/
Audio file I/O: I found a great resource here. This guy created an excellent API for using ExtAudioFileServices.
A must-read is Learning Core Audio. Chris Adamson and company have really put together a great resource. Chris's blog can also be found here.
Also, sign up for the Core Audio mailing list.
Michael Tyson's blog and resources at A Tasty Pixel are great too.
Hope this helps!
Take a look at this tutorial on how to use the STK: http://arielelkin.github.io/articles/mandolin/
It is an open-source C++ library with cool synths, some with wavetables.