Sound manipulation with a program: where should I start learning? - signal-processing

I want to develop an application that can take recorded sound or an audio file and manipulate its sound. Where should I start learning?

To obtain a recording from the microphone using HTML5, you can start here.
For signal processing, a good starting point is MATLAB, if you have access to it.
Python and the NumPy package offer very good open-source tools as well.
I would start by learning how to load a WAV or MP3 file into MATLAB or Python (I really recommend IPython and the IPython notebook) and how to run a Fourier analysis of the signal, then proceed to spectrograms, and then try to implement different effects.
Other interesting software to look at: Max/MSP (not free) and Pure Data (free).
If you have more specific questions, ask. Hope this helps.
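As a rough illustration of the Fourier analysis step, here is a minimal sketch in plain Python. It assumes the audio has already been loaded into a list of floats; the function names `dft_magnitudes` and `dominant_frequency` are made up for this example. It uses a slow textbook DFT so that nothing is hidden; in practice an FFT routine such as NumPy's `numpy.fft.rfft` does the same job far faster.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 up to the Nyquist bin.

    This is the naive O(n^2) definition, fine for short test signals only.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate / len(samples)
```

For example, feeding it 400 samples of a 50 Hz sine sampled at 800 Hz should report a peak at 50 Hz. A spectrogram is the same analysis repeated over short, overlapping windows of the signal.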

Before manipulating existing sound, I would first get comfortable with audio itself. Look up PCM (pulse-code modulation). Write some code that populates an array with a curve between -1 and +1, say using the sin function, then output it as a WAV file. Download the audio utility Audacity, which is a Swiss Army knife for audio processing. Do not get caught up with fancy libraries until you have had some aha moments with your hand-rolled code. To play back or record audio, take a look at the Web Audio API, OpenAL, or OpenSL, depending on your platform (web, laptop, or mobile). Welcome aboard.
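The hand-rolled exercise described above might look roughly like this in Python, using only the standard library. The function names, the 16-bit mono format, and the default sample rate are choices made for this sketch, not requirements.

```python
import math
import struct
import wave


def sine_samples(freq_hz, seconds, sample_rate=44100, amplitude=0.5):
    """Return a list of floats in [-1, 1] tracing a sine curve."""
    count = int(seconds * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(count)
    ]


def write_wav(path, samples, sample_rate=44100):
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV."""
    frames = b"".join(
        # Clamp to [-1, 1], then scale to the signed 16-bit range.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

Calling `write_wav("tone.wav", sine_samples(440.0, 1.0))` should produce a one-second 440 Hz tone that can be opened in Audacity to look at the waveform.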

Related

FM synthesis in iOS

I would like to modulate the signal from the mic input with a sine wave at 200 Hz (FM only). Does anyone know of any good tutorials or articles that will help get me started?
Any info is very welcome
Thanks
I suggest you start here: Audio File Stream Services Reference.
Here you can also find some basic tutorials: Getting Started with Audio & Video.
The SpeakHere example app could be especially interesting.
Hope that helps.
The standard way to do audio processing on iOS or OS X is Core Audio. Here's Apple's overview of the framework.
However, Core Audio has a reputation for being very difficult to learn, especially if you don't have experience with C. If you still want to learn Core Audio, then this book is the way to go: Learning Core Audio.
There are simpler ways to work with audio on iOS and OS X, one of them being AudioKit, which was developed specifically so developers can quickly prototype audio without having to deal with lower-level memory management, buffers, and pointer arithmetic.
There are examples showing both FM synthesis and audio input via the microphone, so you should have everything you need :)
Full disclosure: I am one of the developers of AudioKit.
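Independently of the iOS APIs, the arithmetic behind basic FM can be sketched in a few lines of Python. This is only an illustration of the textbook formula, not AudioKit or Core Audio code: the name `fm_tone`, the carrier frequency, and the modulation index are assumptions for the example, and it generates a tone rather than processing live microphone input.

```python
import math


def fm_tone(carrier_hz, modulator_hz, index, seconds, sample_rate=44100):
    """Return samples of sin(2*pi*fc*t + index * sin(2*pi*fm*t)).

    `modulator_hz` would be 200 for the case in the question; `index`
    controls how strongly the modulator bends the carrier's phase.
    """
    count = int(seconds * sample_rate)
    out = []
    for n in range(count):
        t = n / sample_rate
        phase = (2 * math.pi * carrier_hz * t
                 + index * math.sin(2 * math.pi * modulator_hz * t))
        out.append(math.sin(phase))
    return out
```

With `index` set to 0 the result is a plain sine at the carrier frequency; raising it adds sidebands spaced at multiples of the modulator frequency.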

Video conversion using map-reduce

I have a Ruby on Rails application where users upload videos, and I'm looking for a system to convert those uploads to FLV format.
Currently we are using FFmpeg, and since video conversion is a heavy task, it takes a lot of time and a lot of CPU resources.
We are considering whether we can use the map-reduce / Hadoop framework to implement video conversion, as it is completely distributed.
Is map-reduce a good option for video conversion in real time? If so, how can it be implemented?
Note: Each video file size is around 50 - 60 MB.
Your requirement is "real time" conversion. Keep in mind that Hadoop is a batch processing framework.
IMHO, Hadoop is a poor choice here. A better solution would be to use something like Storm:
Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.
Personally, I implemented a project similar to yours using Storm and the result was amazing.
Another option is to use a distributed Actors model, such as Akka.io or Erlang. But since you are a Ruby shop, Storm or Akka would be easier for your team.

How Do I "Mix/Superimpose" two m4a audio files together

After finally finding a way to concatenate multiple voice files into one single audio file on the iPhone, I am now trying to superimpose an audio file over the length of the voice file.
So basically I have two .m4a files:
voice.m4a which is about 10 seconds for example.
music.m4a which is about 5 seconds.
What I require is that the two files be combined so that the resulting single audio file contains the music in the background of the voice file for its whole length. In other words, the output should have the 10 seconds of voice with the 5 seconds of music repeated twice. It is absolutely important to have a single file that contains all of this.
I am trying to get all of this done in an application on the iPhone.
Can anyone please help me out with this?
If you are looking to do that programmatically, you will need to go deeper into Core Audio. For a simpler solution you could use Audio Queues, or for more fine-grained control, Audio Units and an AUGraph. The MultiChannelMixer is the Audio Unit you are looking for. Unfortunately there is no space for an elaborate tutorial here (it would take a couple of days to write just the tutorial itself), but I hope I can point you in the right direction.
If you decide to go down that path and want to do more audio programming than this one-off simple example, then I strongly suggest you buy "Learning Core Audio: A Hands-on Guide to Audio Programming for Mac and iOS" by Chris Adamson and Kevin Avila. You can find it on Amazon, in paperback or Kindle.
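Whatever API does the decoding and encoding, the mixing step itself is just sample-by-sample addition. A minimal Python sketch, assuming both files have already been decoded to lists of floats in [-1, 1] at the same sample rate (the function name and the `music_gain` parameter are made up for this example):

```python
def mix_with_looped_background(voice, music, music_gain=0.5):
    """Mix `music` under `voice`, repeating the music to cover the voice.

    The output has the same length as `voice`. Sums are hard-clipped to
    [-1, 1]; a real mixer would usually lower the gains instead.
    """
    if not music:
        return list(voice)
    mixed = []
    for i, v in enumerate(voice):
        m = music[i % len(music)] * music_gain  # wrap around to loop the music
        mixed.append(max(-1.0, min(1.0, v + m)))
    return mixed
```

With a 10-second voice track and a 5-second music track, the modulo index plays the music exactly twice, which matches the result described in the question.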

iOS Audio Service : Read & write audio files

Hi guys.
I'm working on some audio services on iOS.
I'm searching for examples or tutorials that show how an audio service or stream can read an existing audio file, process it (for example, apply a filter), and then write the result to another file.
Is there anybody who can help me?
Dirac3LE (by Stephan M. Bernsee) is a great library for this job.
There are examples and a manual included in the download.
It is particularly intended for time and pitch manipulation, but in your case you'll be interested in its EAFRead and EAFWrite classes.
If you want to get familiar with the lower level library that you can also use for microphone input/sound output, and that you can get raw samples into and out of, I would suggest taking a look at Audio Queue Services.
I used it in my side project to get audio from the microphone, and I also wrote some code you might find useful for fast, vectorized, FFT-based FIR filtering of input audio. You can find the code here: https://github.com/jamescarlson/FreeAPRS
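To make the "read, filter, write" idea concrete, here is a small Python sketch of an FIR filter. Note that it uses plain direct-form convolution, not the FFT-based approach mentioned above, and it is not taken from the linked repository; the function names are hypothetical. The FFT version computes the same output faster for long filters.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k].

    Samples before the start of the input are treated as zero.
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def moving_average_taps(length):
    """Taps for a simple moving average, a crude low-pass filter."""
    return [1.0 / length] * length
```

Running `fir_filter(samples, moving_average_taps(8))` over decoded samples smooths out high frequencies; the filtered list can then be encoded and written to a new file.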

XNA | C# : Record and Change the Voice

My aim is to code a project that records a human voice and changes it with effects.
For example, a person records their voice over a microphone (speaking for a while), and then the program makes it sound like a baby's voice.
This should run effectively and fast (the altering operation must run while recording, too).
What is the optimal way to do it?
Thanks
If you're looking for either XNA or DirectX to do this for you, I'm pretty sure you're going to be out of luck (I don't have much experience with DirectSound; maybe somebody can correct me). What it sounds like you want to do is realtime digital signal processing, which means that you're either going to need to write your own code to manipulate the raw waveform, or find somebody else who's already written the code for you.
If you don't have experience writing this sort of thing, it's probably best to use somebody else's signal processing library, because this sort of thing can quickly get complicated. Since you're developing for the PC, you're in luck; you can use any library you like using P/Invoke. You might try out some of the solutions suggested here and here.
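To give a feel for what manipulating the raw waveform means, here is a minimal Python sketch of the crudest pitch change: reading through the samples faster than they were recorded. The function name is hypothetical, and the approach is an assumption for illustration only. It also shortens the audio, which is one reason a dedicated signal processing library is the better route for realtime pitch shifting.

```python
def resample_linear(samples, speed):
    """Read through `samples` at `speed` times the original rate.

    speed > 1 raises the pitch and shortens the audio; speed < 1 lowers
    the pitch and lengthens it. Values between samples are linearly
    interpolated.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    out = []
    pos = 0.0
    while pos <= len(samples) - 1:
        i = int(pos)
        frac = pos - i
        nxt = samples[i + 1] if i + 1 < len(samples) else samples[i]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += speed
    return out
```

A `speed` of about 1.5 gives the familiar "chipmunk" effect when the result is played back at the original sample rate.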
MSDN has some info about the Audio namespace from XNA, and the audio recording introduced in version 4:
Working with Microphones
Recording Audio from a Microphone
Keep in mind that recorded data is returned in PCM format.
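Working with PCM data usually starts by turning the raw bytes into numbers. A short Python sketch, assuming 16-bit signed little-endian mono samples (check the actual format your recording API reports; the function names are made up for this example):

```python
import struct


def pcm16_to_floats(data):
    """Convert 16-bit signed little-endian PCM bytes to floats in [-1, 1)."""
    count = len(data) // 2  # ignore a trailing odd byte, if any
    ints = struct.unpack("<%dh" % count, data[:count * 2])
    return [i / 32768.0 for i in ints]


def floats_to_pcm16(samples):
    """Convert floats in [-1, 1] back to 16-bit signed little-endian PCM."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)
```

Effects are then applied to the float list, and the result is converted back to bytes for playback or saving.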

Resources