How to use vDSP in iOS to convert a sound file to an FFT

I am new to the audio frameworks, but after searching for a while I found the Accelerate framework, which iOS provides for digital signal processing. In my project I want to compute the FFT of a sound file so that I can compare two sounds by their FFTs. How should I proceed? I have gone through Apple's aurioTouch sample app, but it doesn't use the Accelerate framework. Can anybody help me compute the FFT of a sound file and then compare the results using correlation?

The FFT is a complex beast, not something that can be comprehensively discussed in a single text box (I know accomplished engineers who have taken multiple semesters of classes studying topics that boil down to Fourier Transform analysis). Because of the nature of Accelerate framework's tasks, it too is a non-trivial discussion topic.
I would suggest reading Mike Ash's Friday Q&A on FFTs, where he covers some basic use of the vDSP functions to get FFT values, as a starting place.
See this DSP Stack Exchange answer for discussion on convolution and cross-correlation.
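To make the idea concrete, here is a minimal sketch of what "getting FFT values" from a block of samples means. It is plain Python rather than vDSP, so none of it is Accelerate API; the function names and the test tone are made up for illustration. On iOS the transform itself would be done by the vDSP routines that the Friday Q&A article covers.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def magnitude_spectrum(samples):
    """Magnitudes of the first n/2 bins (the non-redundant half for real input)."""
    spectrum = fft(samples)
    return [abs(c) for c in spectrum[: len(samples) // 2]]

# Hypothetical test signal: a sine with exactly 5 cycles in 64 samples.
n = 64
tone = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
mags = magnitude_spectrum(tone)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
```

For a real recording you would split the samples into power-of-two blocks, usually apply a window to each block, and compare the resulting magnitude spectra (or cross-correlate the signals).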

Related

Compare two spectrograms in iOS

I am drawing spectrograms using the aurioTouch sample code provided by Apple. Now I want to compare two spectrograms in iOS to see if they are the same. Is it possible to compare the two spectrograms using the Accelerate framework?
If it is possible, does anyone know how to compare two spectrograms? If not, is there any other algorithm or library which can be used in iOS for comparing spectrograms?
What you're looking for is called cross-correlation. It doesn't involve the spectrograms directly, but it is based on the same math that allows the spectrograms to be drawn (the Fourier transform). There's a DSP Stack Exchange answer, "How do I implement cross-correlation to prove two audio files are similar?", that covers the basics of implementing this.
The Accelerate framework will only help you with low-level things like vector and matrix arithmetic, Fourier transforms, etc. What you need to do is figure out how to compare two spectrograms (whatever you mean by compare) using pencil and paper (or just your head if you're pro) and then implement it in code with the aid of frameworks such as Accelerate.
vDSP has all of the building blocks to do cross correlation and convolution, which is what you would need to implement this.
https://developer.apple.com/library/mac/#documentation/Accelerate/Reference/vDSPRef/Reference/reference.html
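As a rough illustration of the cross-correlation approach, here is a sketch in plain Python. It is not vDSP code, and the function names are made up for this example. It computes the correlation directly in the time domain and normalizes the peak by the signal energies, so a score near 1 means the two signals match at the reported lag. Treat the score as one possible similarity measure, not the definitive one.

```python
import math

def cross_correlate(a, b):
    """Full cross-correlation; entry i corresponds to lag i - (len(b) - 1)."""
    n, m = len(a), len(b)
    out = []
    for lag in range(-(m - 1), n):
        total = 0.0
        for j in range(m):
            i = j + lag
            if 0 <= i < n:
                total += a[i] * b[j]
        out.append(total)
    return out

def best_match(a, b):
    """Return (lag, score): the lag where b best aligns with a, score in [-1, 1]."""
    energy = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    if energy == 0.0:
        return 0, 0.0  # at least one signal is silent
    corr = cross_correlate(a, b)
    idx = max(range(len(corr)), key=lambda k: corr[k])
    return idx - (len(b) - 1), corr[idx] / energy

# Hypothetical signals: the same pulse, delayed by 3 samples in `delayed`.
pulse = [0, 1, 2, 1, 0, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 1, 0]
lag, score = best_match(delayed, pulse)
```

The direct loop costs O(n*m); for long recordings the same result is normally obtained by multiplying FFTs, which is where a library such as Accelerate comes in.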

DSP library with LPC encoder / decoder

I'm trying to create a lightweight diphone speech synthesizer. Everything seems pretty straightforward because my native language has pretty simple pronunciation and text processing rules. The only problem I've stumbled upon is pitch control.
As far as I understand, to control the pitch of the voice, most speech synthesizers are using LPC (linear predictive coding), which essentially separates the pitch information away from the recorded voice samples, and then during synthesis I can supply my own pitch as needed.
The problem is that I'm not a DSP specialist. I have used the Ooura FFT library to extract AFR information, and I know a little bit about using Hann and Hamming windows (I have implemented them in C++ myself), but mostly I treat DSP algorithms as black boxes.
I hoped to find an open-source library that is just bare LPC code with usage examples, but I couldn't find anything. Most of the available code (like the Festival engine) is tightly integrated into the synthesizer, and it would be a pretty hard task to separate it out and learn how to use it.
Is there any C/C++/C#/Java open source DSP library with a "black box" style LPC algorithm and usage examples, so I can just throw a PCM sample data at it and get the LPC coded output, and then throw the coded data and synthesize the decoded speech data?
It's not exactly what you're looking for, but maybe you can get some ideas from this quite sophisticated toolbox: Praat
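For a sense of what the analysis half of an LPC "black box" does, here is a minimal sketch of the textbook autocorrelation method with the Levinson-Durbin recursion, in plain Python. It is not taken from Praat or any other library, the function names are made up, and a real speech coder would add framing, windowing, pre-emphasis and quantization on top of it.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Return (a, err): x[n] is predicted as sum(a[k] * x[n - 1 - k]),
    and err is the remaining prediction-error energy."""
    r = autocorrelation(x, order)
    if r[0] == 0.0:
        return [0.0] * order, 0.0  # silent frame
    a = []
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            # Frame is already perfectly predicted; pad remaining taps.
            a = a + [0.0] * (order - i)
            break
        # Reflection coefficient for prediction order i + 1.
        acc = r[i + 1] - sum(a[k] * r[i - k] for k in range(i))
        k_i = acc / err
        # Levinson-Durbin update of the lower-order coefficients.
        a = [a[k] - k_i * a[i - 1 - k] for k in range(i)] + [k_i]
        err *= 1.0 - k_i * k_i
    return a, err

# Hypothetical frame: a decaying exponential, which one tap predicts well.
frame = [0.9 ** n for n in range(200)]
coeffs, residual_energy = lpc(frame, 1)
```

The coefficients describe the spectral envelope of the frame. Filtering the frame with the inverse predictor leaves the residual (the excitation), and that is the part a synthesizer replaces when it supplies its own pitch.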

Accelerate's vImage vs. vDSP

I'm trying to use the Accelerate framework on iOS to work around the fact that Core Image on iOS doesn't support custom filters/kernels. I'm developing an edge detection filter using two convolutions with a Sobel kernel, but starting with a simple Gaussian blur to get the hang of it. I know vImage is geared towards manipulating images as matrices and vDSP focuses on processing digital signals using Fourier transforms. But although I started using the vImage functions (vImageConvolve_XXXX, etc), I hear a lot of people discussing the use of vDSP's functions (vDSP_conv, vDSP_imgfir, etc) for things such as convolutions. So that leads me to the question at hand: when should I use one over the other? What are the differences between them with regard to convolution operations? I've looked everywhere but couldn't find a clear answer. Can someone shed some light on it, or point me in the right direction?
Thanks!
If vImage provides the operation you need, it is usually simplest to use that. vImage does cache blocking and threading for you, vDSP does not. vImage provides operations on interleaved and integer formats, which are often useful for image processing.
Last time I experimented, neither of these frameworks took advantage of kernel separability, which affords a huge performance boost when convolving in the spatial domain -- a far larger performance boost than vectorized instructions will ever buy you. The Sobel kernel in particular is separable, so if you're using vDSP or vImage (instead of say OpenCV), be sure to separate the kernel yourself.

What's a good resource for learning about creating software for signal processing

I'd like to programmatically do some signal processing on a live sound feed.
Specifically I'd like to be able to isolate certain bands of frequencies and play around with phase shifting.
I've not worked in this area before from a purely software perspective and a quick google search turned up very little useful information.
Does anyone know of any good information resources for this topic area?
Matlab is a good starting point. It has the necessary toolboxes and functions that will allow you to capture audio signals, run different kind of filters over them and write them to wav files. The UI is easy to navigate through and it's simple enough to generate plots and visualize results.
http://www.mathworks.com/products/signal/
If, however, you're looking to develop real-world applications, then Python can come in handy. It has toolkits like SciPy, NumPy and Audiolab that offer much of the same functionality as Matlab.
http://www.scipy.org
http://scikits.appspot.com/audiolab
In a nutshell, Matlab is good for testing ideas and prototyping, Python is good for testing as well as real-world application development. And Python is free. Matlab might cost you if you're not a student anymore.
http://www.dspguide.com/
This is an excellent reference on digital signal processing techniques in general. It's not a programming guide, per se, but it covers the techniques and the theory clearly and simply, and provides pseudocode and examples so that you can implement them in the language of your choice. You'll be hard pressed to find a more complete reference, and you can download it for free online!

what are the steps in object detection?

I'm new to image processing and I want to do a project in object detection. Please help me by suggesting a step-by-step procedure for this project. Thanks.
Object detection is a very complex problem that includes some real hardcore math and long tuning of parameters to the computation methods involved. Your best bet is to use some freely available library for that - Google will help.
There are a lot of algorithms on this topic and no single one is the best. It's usually a mixture of them that makes the best solution to the problem.
For example, for object movement detection you could look at frame differencing and mixture of Gaussians.
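Frame differencing is the simpler of the two, and a minimal sketch fits in a few lines of plain Python. The function names, the threshold value and the tiny grayscale frames are made up for illustration. A real pipeline would usually filter the frames first and use a library such as OpenCV instead of nested lists.

```python
def frame_difference(prev, curr, threshold):
    """Binary motion mask: 1 where the absolute pixel change exceeds threshold."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def bounding_box(mask):
    """Smallest (x0, y0, x1, y1) box covering all set pixels, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical 4x4 grayscale frames: a bright 2x2 object appears in `current`.
previous = [[10, 10, 10, 10],
            [10, 10, 10, 10],
            [10, 10, 10, 10],
            [10, 10, 10, 10]]
current = [[10, 10, 10, 10],
           [10, 10, 200, 200],
           [12, 10, 200, 200],
           [10, 10, 10, 10]]
mask = frame_difference(previous, current, threshold=25)
box = bounding_box(mask)
```

The threshold absorbs small changes such as sensor noise (the pixel that moved from 10 to 12 is ignored), which is why the choice of threshold depends so much on the environment.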
Also, it's very dependent on your application, the environment (i.e. noise, signal quality), the processing capacity you have available, the allowable error margin...
Besides, for it to work, most of the time it's first necessary to apply some kind of image processing to the input data, such as a median filter, Sobel filter, contrast enhancement and so on.
I think you should start by reading all you can: books, Google and, very importantly, a lot of papers on the subjects you are interested in (many are free on the internet).
And first of all, I think it's fundamental (at least it has been for me) to have a good library for testing. The one I use is OpenCV. It's very complete, implements many of the more advanced current algorithms, is very active, has a big community and it's free.
Open Computer Vision Library (OpenCV)
Good luck ;)
Take a look at AForge.NET. It's nowhere near Project Natal's levels of accuracy or usefulness, but it does give you the tools to learn the algorithms easily. It's an image processing and AI library and there are several tutorials on colored object tracking and motion detection.
Another one to look at is OpenCV from Intel. I believe it's a bit more advanced, but it's written in C.
Take a look at this. It might get you started in this complex field. The algorithm pages that it links to are interesting reading.
http://sun-valley.stanford.edu/projects/helicopters/final.html
This lecture by Jeff Hawkins will give you an idea about the state of the art in this super-difficult field.
Seems that video disappeared... but this vid should cover similar ground.
