AKOscillator frequency range for theremin sound in iOS

I want to create a theremin-like sound from touch coordinates on the screen. I'm using the y axis for frequency and the x axis for amplitude.
From a little research, I believe I can create it using AKOscillator or AKFMOscillator from the AudioKit framework (please let me know if another oscillator works better in this case). I'm open to other frameworks, such as the built-in AudioToolbox (MIDINoteMessage etc.), if I can create a theremin-like sound with them.
Here it says a theremin has two oscillators: one with a fixed frequency of 260 kHz and one variable between 257 and 260 kHz. It superimposes their outputs (it takes the difference between them, I guess?) and outputs a frequency between 0 and 3 kHz.
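The "difference" guess can be checked with simple arithmetic. The sketch below is written in Python only as neutral notation; `beat_frequency` and `representable` are hypothetical helper names, and the 44.1 kHz sample rate is an assumed typical value, not something stated above. It shows the audible difference frequency for the quoted oscillator ranges, and that frequencies near 260 kHz lie far above half of a typical audio sample rate, so a digital oscillator cannot reproduce them directly.

```python
def beat_frequency(f_fixed_hz, f_variable_hz):
    """Difference frequency produced when two oscillators are mixed (heterodyned)."""
    return abs(f_fixed_hz - f_variable_hz)


def representable(freq_hz, sample_rate_hz=44100):
    """True if freq_hz is below the Nyquist limit (half the sample rate)."""
    return freq_hz < sample_rate_hz / 2


# Fixed oscillator at 260 kHz, variable oscillator swept over 257-260 kHz.
audible = [beat_frequency(260_000, f) for f in (257_000, 258_500, 260_000)]
print(audible)

# The radio-frequency oscillators themselves are not representable as audio,
# but the 0-3 kHz difference tone is.
print(representable(257_000), representable(3_000))
```

In other words, only the 0-3 kHz difference needs to be synthesized; the two high-frequency oscillators are part of the analog circuit, not of the sound.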
When I create sounds using AKFMOscillator with baseFrequency between 257 and 260 kHz, it sounds high-pitched.
When I try one oscillator with a range of 0-3 kHz, it sounds very robotic. How can I simulate the timbre of a theremin?
How can I make it sound better? Should I mix two oscillators? I tried mixing with AKMixer, but when both oscillators use the same frequency and amplitude, it makes no difference.
I tried mapping to the nearest note (auto-tune) and limiting the frequency to 3-4 octaves. It sounds better, but still not as good as a theremin.
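The two mappings mentioned above can be sketched as follows. This is a minimal illustration in Python, not AudioKit code: `y_to_frequency` and `snap_to_nearest_note` are hypothetical names, the 220 Hz lower bound is an arbitrary choice, and the note grid assumes twelve-tone equal temperament with A4 = 440 Hz. The touch position is mapped exponentially so that equal distances on screen correspond to equal musical intervals.

```python
import math


def y_to_frequency(y_norm, f_low_hz=220.0, octaves=3.0):
    """Map a normalized touch position (0.0-1.0) to a frequency spanning `octaves`."""
    return f_low_hz * 2.0 ** (octaves * y_norm)


def snap_to_nearest_note(freq_hz, a4_hz=440.0):
    """Round a frequency to the nearest equal-tempered note (MIDI note 69 = A4)."""
    midi_note = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


# Bottom, middle and top of a three-octave range starting at 220 Hz.
print([round(y_to_frequency(y), 1) for y in (0.0, 0.5, 1.0)])
print(snap_to_nearest_note(450.0))
```

Blending between the raw and the snapped frequency, instead of snapping fully, is one way to keep some of the gliding character of a theremin; that is a design choice rather than anything the frameworks mentioned here require.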
Which should I use (AKOscillator, AKFMOscillator, or OscillatorBank), and with which parameters (rampDuration, baseFrequency, modulationIndex, amplitude), to get a more theremin-like sound?
Update:
I did some more research and played with Synth One presets. Now I know I need two oscillators mixed (both set to a sawtooth wave). Changing the ADSR (envelope) values to specific ranges creates a richer sound (this gives the instrumental character), and an LFO creates the wavy (or spooky) effect. Playing notes (specific frequencies) sounds good; playing every frequency in between the note frequencies doesn't.
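The recipe from the update (two mixed sawtooth oscillators plus an LFO) can be sketched offline to see what each part contributes. This is a rough Python illustration, not a Synth One preset or AudioKit code: `render_voice` and all parameter values are hypothetical, the slight detuning of the second oscillator is an assumption added because two identical oscillators simply sum to a louder copy of one, and the ADSR envelope is left out for brevity.

```python
import math


def saw(phase_cycles):
    """Naive sawtooth in [-1, 1) from a phase measured in cycles (aliases at high pitch)."""
    return 2.0 * (phase_cycles - math.floor(phase_cycles + 0.5))


def render_voice(freq_hz, seconds, sr=44100, detune_cents=7.0,
                 lfo_hz=5.0, lfo_depth_cents=20.0):
    """Mix two sawtooth oscillators, the second slightly detuned, with vibrato from an LFO."""
    n = int(round(seconds * sr))
    detune_ratio = 2.0 ** (detune_cents / 1200.0)
    phase_1 = phase_2 = 0.0
    out = []
    for i in range(n):
        # The LFO bends the pitch up and down by lfo_depth_cents.
        lfo = math.sin(2.0 * math.pi * lfo_hz * i / sr)
        f = freq_hz * 2.0 ** (lfo_depth_cents * lfo / 1200.0)
        out.append(0.5 * (saw(phase_1) + saw(phase_2)))
        # Accumulate phase so that frequency changes never cause jumps.
        phase_1 += f / sr
        phase_2 += f * detune_ratio / sr
    return out


voice = render_voice(440.0, 0.5)
print(len(voice), max(abs(s) for s in voice) <= 1.0)
```

Accumulating phase, rather than computing each sample from the absolute time, is what keeps the waveform continuous while the touch position changes the frequency.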

Related

Measure (frequency-weighted) sound levels with AudioKit

I am trying to implement an SLM app for iOS using AudioKit. Therefore I need to determine different loudness values to a) display the current loudness (averaged over a second) and b) do further calculations (e.g. to calculate the "Equivalent Continuous Sound Level" over a longer time span). The app should be able to track frequency-weighted decibel values like dB(A) and dB(C).
I do understand that some of the issues I'm facing are related to my general lack of understanding in the field of signal and audio processing. My question is how one would approach this task with AudioKit. I will describe my current process and would like to get some input:
Create an instance of AKMicrophone and an AKFrequencyTracker on this microphone
Create a Timer instance with some interval (currently 1/48_000.0)
Inside the timer: retrieve the amplitude and frequency. Calculate a decibel value from the amplitude with 20 * log10(amplitude) + calibrationOffset (calibration offset will be determined per device model with the help of a professional SLM). Calculate offsets for the retrieved frequency according to frequency-weighting (A and C) and apply these to the initial dB value. Store dB, dB(A) and dB(C) values in an array.
Calculate the average for the arrays over the given time frame (1 second).
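The arithmetic in the steps above can be sketched as follows. This is a Python illustration of the formulas only, not AudioKit code: all function names are hypothetical, the weighting curves use the standard analog A- and C-weighting expressions (constants 20.6, 107.7, 737.9 and 12194 Hz), and the averaging is done on energy rather than on the dB values, which is how an equivalent continuous level is normally defined. Applying a weighting offset to one tracked frequency is the approach described in the question; it is only an approximation for broadband sound.

```python
import math


def a_weighting_db(freq_hz):
    """A-weighting offset in dB for a single frequency (freq_hz > 0)."""
    f2 = freq_hz * freq_hz
    r_a = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(r_a) + 2.00


def c_weighting_db(freq_hz):
    """C-weighting offset in dB for a single frequency (freq_hz > 0)."""
    f2 = freq_hz * freq_hz
    r_c = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(r_c) + 0.06


def amplitude_to_db(amplitude, calibration_offset_db=0.0, floor=1e-9):
    """20 * log10(amplitude) + offset; the floor avoids log10(0) on silence."""
    return 20.0 * math.log10(max(amplitude, floor)) + calibration_offset_db


def energy_average_db(levels_db):
    """Average dB values in the power domain (equivalent continuous level)."""
    mean_power = sum(10.0 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_power)


level = amplitude_to_db(0.1, calibration_offset_db=94.0)  # hypothetical offset
print(round(level + a_weighting_db(100.0), 1), round(level + c_weighting_db(100.0), 1))
print(round(energy_average_db([50.0, 60.0]), 1))
```

Note that the energy average of 50 dB and 60 dB is about 57.4 dB, not 55 dB, which is why the arrays should not be averaged arithmetically.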
I read somewhere else that using a Timer is not the best approach. What else could I use for the "sampling"? What exactly is the frequency of AKFrequencyTracker? Will this frequency be sufficient to determine dB(A) and dB(C) values, or will I need an AKFFTTap for this? How are the values retrieved from the AKFrequencyTracker averaged, i.e. what time frame is used for the RMS?
Possibly related questions: Get dB(a) level from AudioKit in swift, AudioKit FFT conversion to dB?

FSK demodulation with GNU Radio

I'm trying to demodulate a signal using GNU Radio Companion. The signal is FSK (Frequency-shift keying), with mark and space frequencies at 1200 and 2200 Hz, respectively.
The data in the signal is text data generated by a device called GeoStamp Audio. The device generates audio from GPS data fed into it in real time, and it can also decode that audio. I have the decoded text version of the audio for reference.
I have set up a flow graph in GNU Radio (see below), and it runs without error, but with all the variations I've tried, I still can't get the data.
The output of the flow graph should be binary (1s and 0s) that I can later convert to normal text, right?
Is it correct to feed in a wav audio file the way I am?
How can I recover the data from the demodulated signal -- am I missing something in my flow graph?
This is a FFT plot of the wav audio file before demodulation:
This is the result of the scope sink after demodulation (maybe looks promising?):
UPDATE (August 2, 2016): I'm still working on this problem (occasionally), and unfortunately still cannot retrieve the data. The result is a promising-looking string of 1's and 0's, but nothing intelligible.
If anyone has suggestions for figuring out the settings on the Polyphase Clock Sync or Clock Recovery MM blocks, or the gain on the Quad Demod block, I would greatly appreciate it.
Here is one version of an updated flow graph based on Marcus's answer (also trying other versions with polyphase clock recovery):
However, I'm still unable to recover data that makes any sense. The result is a long string of 1's and 0's, but not the right ones. I've tried tweaking nearly all the settings in all the blocks. I thought maybe the clock recovery was off, but I've tried a wide range of values with no improvement.
So, at first sight, my approach here would look something like:
What happens here is that we take the input, shift it in the frequency domain so that mark and space are at ±500 Hz, and then use quadrature demod.
"Logically", we can then just make a "sign decision". I'll share the configuration of the Xlating FIR here:
Notice that the signal is first shifted so that the center frequency (the midpoint between 2200 and 1200 Hz, i.e. 1700 Hz) ends up at 0 Hz, and then filtered by a low pass (gain = 1.0, stopband starts at 1 kHz, passband ends at 1 kHz - 400 Hz = 600 Hz). At this point, the actual bandwidth that's still present in the signal is much lower than the sample rate, so you might also just downsample without losses (set decimation to something higher, e.g. 16), but for the sake of analysis, we won't do that.
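Outside of GNU Radio, the same chain (frequency shift, low pass, quadrature demod, sign decision) can be sketched in plain Python to see what each stage does. This is an illustration under several assumptions, not a replacement for the flow graph: the function names are hypothetical, a simple moving average stands in for the low-pass taps of the Xlating FIR, the 300 baud symbol rate is made up for the test signal, mark is treated as bit 1, and the bit timing is known in advance instead of being recovered by a clock sync block.

```python
import cmath
import math


def fsk_modulate(bits, fs, baud, f_mark, f_space):
    """Phase-continuous real-valued FSK test signal (1 -> mark, 0 -> space)."""
    samples_per_bit = int(fs // baud)
    phase = 0.0
    out = []
    for bit in bits:
        freq = f_mark if bit else f_space
        for _ in range(samples_per_bit):
            out.append(math.cos(phase))
            phase += 2.0 * math.pi * freq / fs
    return out


def fsk_demod(samples, fs, f_mark, f_space, lp_len=48):
    """Shift the center frequency to 0 Hz, low-pass, then quadrature-demodulate."""
    f_center = 0.5 * (f_mark + f_space)
    shifted = [s * cmath.exp(-2j * math.pi * f_center * n / fs)
               for n, s in enumerate(samples)]
    # Moving average as a crude low pass; it suppresses the mirror image
    # that a real-valued input leaves at negative frequencies.
    filtered = []
    acc = 0j
    for n, x in enumerate(shifted):
        acc += x
        if n >= lp_len:
            acc -= shifted[n - lp_len]
        filtered.append(acc / lp_len)
    # Quadrature demod: phase difference between consecutive samples.
    demod = [0.0]
    for n in range(1, len(filtered)):
        demod.append(cmath.phase(filtered[n] * filtered[n - 1].conjugate()))
    return demod


def slice_bits(demod, fs, baud, f_mark, f_space):
    """Sign decision in the middle of each bit period (timing assumed known)."""
    samples_per_bit = int(fs // baud)
    mark_is_negative = f_mark < f_space
    bits = []
    for k in range(len(demod) // samples_per_bit):
        value = demod[k * samples_per_bit + samples_per_bit // 2]
        bits.append(1 if (value < 0) == mark_is_negative else 0)
    return bits


sent = [1, 0, 1, 1, 0, 0, 1, 0]
signal = fsk_modulate(sent, fs=48000, baud=300, f_mark=1200.0, f_space=2200.0)
received = slice_bits(fsk_demod(signal, 48000, 1200.0, 2200.0),
                      48000, 300, 1200.0, 2200.0)
print(received)
```

With mark at 1200 Hz, the mark tone lands at -500 Hz after the shift, so a negative demodulator output is read as a mark. If the recovered bits come out inverted or look random, the polarity assumption and the sampling instant are the first two things to check.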
The time sink should now show better values. Have a look at the edges; they are probably not extremely steep. For clock sync I'd hence recommend just trying the polyphase clock recovery instead of Mueller & Müller; choosing just about any "somewhat round" pulse shape could work.
For fun and giggles, I clicked together a quick demo demod (GRC here):
which shows:

How to remove Music from a song and keep Vocals

I have a movie sample with audio transcription (for blind people: a narrator explains what is going on in the movie). I want to extract the narration.
What I have tried so far:
1- I also have the sample without the transcription, so I imported both samples into Audacity, inverted one, and mixed them. But it simply doesn't work (normalization is also applied).
2- I took the sample with the audio description, split it to mono, took one channel, inverted it, and mixed again. Now I have the movie without the audio transcript. My intuition was that if I invert this result again and mix it with the original, the other sounds should cancel out and I would be left with the narrator. But that did not happen! What should I do now? Any suggestions?
I have checked the following links so far :
http://www.howtogeek.com/61250/how-to-isolate-and-save-vocals-from-music-tracks-using-audacity/
http://www.labnol.org/software/tutorials/remove-vocals-song-mp3-music-instruments/1301/
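Why the invert-and-mix approach is so fragile can be shown with synthetic signals. The sketch below is a Python illustration with made-up sine waves standing in for the music and the narration; all names are hypothetical. Inverting one track and mixing is the same as subtracting it sample by sample, which only cancels when both tracks are sample-aligned and at identical gain.

```python
import math


def sine(freq_hz, n, amplitude=1.0, sr=8000):
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sr) for i in range(n)]


def subtract(a, b):
    """Mixing `a` with an inverted `b` is a sample-by-sample subtraction."""
    return [x - y for x, y in zip(a, b)]


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


music = sine(440.0, 8000)                    # stand-in for the movie sound
narration = sine(300.0, 8000, amplitude=0.5)  # stand-in for the narrator
described = [m + v for m, v in zip(music, narration)]

# Perfectly aligned, identical gain: only the narration remains.
error_aligned = rms(subtract(subtract(described, music), narration))

# A gain change (e.g. from normalization) leaves music in the result.
normalized = [0.9 * s for s in described]
error_gain = rms(subtract(subtract(normalized, music), narration))

# A one-sample offset between the files also breaks the cancellation.
error_shift = rms(subtract(subtract(described[1:], music[:-1]), narration[1:]))

print(error_aligned, round(error_gain, 3), round(error_shift, 3))
```

The same reasoning suggests that two separately encoded or separately mixed files will rarely cancel cleanly, since any difference in level, timing, or encoding leaves a residue.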

Get peak volume of audio input on iOS

On iOS 7, how do I get the current microphone input volume in a range between 0 and 1?
I've seen several approaches like this one, but the results I get baffle me.
The return values of peakPowerForChannel: are documented to be in the range of -160 to 0 with 0 being the loudest and -160 near absolute silence.
Problem: Given a quiet room and a short but loud noise, the power goes all the way up in an instant but takes a very long time to drop back to the quiet level (way longer than the actual noise...)
What I want: Essentially I want an exact copy of the Audio Input patch of Quartz Composer with its Volume Peak output. Any tips?
To get a similar volume peak measurement, you might have to input raw audio via the iOS Audio Queue API (or the RemoteIO Audio Unit), and analyze the raw PCM waveform samples in each audio callback, looking for the maximum magnitude over your desired frame width or analysis time.
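The analysis step described in the answer can be sketched as follows. This is a Python illustration of the arithmetic only, not Audio Queue or RemoteIO code: `frame_peaks` is a hypothetical name and the samples are assumed to be signed 16-bit PCM. Each frame yields one peak value between 0 and 1, with no slow decay unless you add one yourself.

```python
def frame_peaks(pcm16_samples, frame_len):
    """Peak magnitude of each frame of signed 16-bit PCM, scaled to 0.0-1.0."""
    peaks = []
    for start in range(0, len(pcm16_samples), frame_len):
        frame = pcm16_samples[start:start + frame_len]
        peaks.append(max(abs(s) for s in frame) / 32768.0)
    return peaks


# A quiet frame followed by a loud one, then quiet again.
samples = [0, 100, -50, 20, 16384, -32768, 300, -10, 5, -5, 3, 0]
print(frame_peaks(samples, 4))
```

In a real callback the same loop would run over the buffer handed to you by the audio API, with the frame length chosen to match the desired analysis time.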

iOS: Sound generation on iPad given Hz parameter?

Is there an API in one of the iOS layers that I can use to generate a tone by just specifying its frequency in hertz? What I'm looking to do is generate a DTMF tone. This link explains how DTMF tones consist of 2 tones:
http://en.wikipedia.org/wiki/Telephone_keypad
Which basically means that I should need playback of 2 tones at the same time...
So, does something like this exist:
SomeCleverPlayerAPI(697, 1336);
I've spent the whole morning searching for this and have found a number of ways to play back a sound file, but nothing on how to generate a specific tone. Does anyone know, please...
Check out the AU (AudioUnit) API. It's pretty low-level, but it can do what you want. A good intro (that probably already gives you what you need) can be found here:
http://cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html
There is no iOS API to do this audio synthesis for you.
But you can use the Audio Queue or Audio Unit RemoteIO APIs to play raw audio samples: generate an array of samples of 2 sine waves summed (say 44100 samples for 1 second's worth), and then copy the results in the audio callback (1024 samples, or whatever the callback requests, at a time).
See Apple's aurioTouch and SpeakHere sample apps for how to use these audio APIs.
The samples can be generated by something as simple as:
sample[i] = (short int)(v1*sinf(2.0*pi*i*f1/sr) + v2*sinf(2.0*pi*i*f2/sr));
where sr is the sample rate, f1 and f2 are the 2 frequencies, and v1 and v2 sum to less than 32767.0. You can add rounding or noise dithering to this for cleaner results.
Beware of clicking if your generated waveforms don't taper to zero at the ends.
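The formula and the warning about clicks can be combined in a short offline sketch. It is written in Python purely to show the arithmetic; the playback side (Audio Queue or RemoteIO) is not covered. `dtmf_samples` is a hypothetical name, the 5 ms linear fade and the 0.45 scaling per tone are arbitrary choices, and the 697 Hz / 1336 Hz pair is the one from the question.

```python
import math


def dtmf_samples(f1_hz, f2_hz, seconds=1.0, sr=44100, peak=32767, fade_seconds=0.005):
    """Sum two sine waves into 16-bit sample values, tapered to zero at both ends."""
    n = int(round(seconds * sr))
    fade_n = max(1, int(round(fade_seconds * sr)))
    v = 0.45 * peak  # per-tone amplitude, so v1 + v2 stays below 32767
    samples = []
    for i in range(n):
        # Linear fade-in and fade-out to avoid clicks at the start and end.
        envelope = min(1.0, i / fade_n, (n - 1 - i) / fade_n)
        value = (v * math.sin(2.0 * math.pi * i * f1_hz / sr)
                 + v * math.sin(2.0 * math.pi * i * f2_hz / sr))
        samples.append(int(round(envelope * value)))
    return samples


tone = dtmf_samples(697.0, 1336.0)
print(len(tone), tone[0], tone[-1], max(abs(s) for s in tone))
```

The resulting list can then be copied into the audio callback in chunks of whatever size the callback requests.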

Resources