When using AKFrequencyTracker, I would like to add a "tempo" feature that recognizes notes according to their pace.
I tried AKPeriodicFunction and AKMetronome, but it looks like they are meant for playback rather than analysis.
tracker = AKFrequencyTracker(mic)
if tracker.amplitude > 0.1 {
    var frequency = Float(tracker.frequency)
    ...
}
How can I add the "tempo" feature to the tracker?
Thanks
AudioKit doesn't have beat detection, which would be the first step toward a tempo tracker. We recently incorporated Aubio with AudioKit for a client project:
https://github.com/aubio/aubio
There are other open source alternatives; just search Google for "tempo detection GitHub":
https://www.google.com/search?client=safari&rls=en&q=tempo+detection+github&ie=UTF-8&oe=UTF-8
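As a rough illustration of what a tempo tracker does once onsets have been found, here is a minimal C sketch. It is not Aubio's algorithm and not an AudioKit API. It assumes you have already logged onset times in seconds (for example, each time tracker.amplitude rises above your 0.1 threshold); estimate_bpm and MAX_ONSETS are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort over doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Estimate tempo in BPM from ascending onset times (in seconds).
   Uses the median gap between consecutive onsets, so one missed or
   spurious onset does not skew the result much.
   Returns 0.0 when there are fewer than 2 or more than MAX_ONSETS onsets. */
double estimate_bpm(const double *onsets, size_t n)
{
    enum { MAX_ONSETS = 256 };   /* assumed upper bound for this sketch */
    double gaps[MAX_ONSETS];
    if (n < 2 || n > MAX_ONSETS)
        return 0.0;

    size_t m = n - 1;
    for (size_t i = 0; i < m; i++)
        gaps[i] = onsets[i + 1] - onsets[i];

    qsort(gaps, m, sizeof gaps[0], cmp_double);
    double median = (m % 2) ? gaps[m / 2]
                            : 0.5 * (gaps[m / 2 - 1] + gaps[m / 2]);
    return median > 0.0 ? 60.0 / median : 0.0;
}
```

For onsets at 0.0, 0.5, 1.0 and 1.5 s the gaps are all 0.5 s, so this returns 120 BPM. It only works for a steady pulse; real beat trackers handle tempo changes and syncopation, which is why a library is the better choice.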
Related
I want to create a sound similar to a theremin using the touch coordinates on screen. I'm using the y axis as frequency and the x axis as amplitude.
From my limited research, I believe I can create it using AKOscillator or AKFMOscillator from the AudioKit framework (please let me know if any other oscillator works better in this case). I'm open to other frameworks like the built-in AudioToolbox (MIDINoteMessage etc.) if I can create a sound similar to a theremin.
Here it says a theremin has two oscillators: one at a fixed frequency of 260 kHz and one variable between 257 and 260 kHz. It superimposes their outputs (it takes their difference, I guess?) and outputs a frequency between 0 and 3 kHz.
When I create sounds using AKFMOscillator with baseFrequency between 257-260 kHz, it sounds high-pitched.
When I try one oscillator with a range of 0-3 kHz, it sounds very robotic. How can I simulate the timbre of a theremin?
How can I make it sound better? Should I mix two oscillators? I tried mixing with AKMixer, but when both oscillators use the same frequency and amplitude, it makes no difference.
I tried mapping to the nearest note (auto-tune) and limiting the frequency range to 3-4 octaves. It sounds better, but still not as good as a theremin.
Which should I use (AKOscillator, AKFMOscillator, or OscillatorBank), and with which parameters (rampDuration, baseFrequency, modulationIndex, amplitude), to get a more theremin-like sound?
Update:
I did some more research and played with the Synth One presets. Now I know I need two oscillators mixed (both set to a sawtooth wave). Setting the ADSR (envelope) values to specific ranges creates a richer sound (this gives the instrumental character), and an LFO creates the wavy (or spooky) effect. Playing notes (specific frequencies) sounds good; playing every frequency in between the note frequencies does not.
When I want to build an Oscillator with AudioKit there are different ways to go. For example you can create an AKOperation within an AKOperationGenerator like
var osc = AKOperationGenerator { parameters in
    return AKOperation.sawtoothWave(frequency: GeneratorSource.frequency)
}
but you could also create one with
var oscillator = AKOscillator(waveform: AKTable(.sawtooth))
What's the difference, and when should I choose which? Thanks!
If you just want one oscillator, it makes sense to use the AKOscillator node, but if you want to do more than one thing dynamically, operations give you a lot more flexibility. For instance, in your operation you can create two operation oscillators: one to oscillate the frequency at a low rate (an LFO) and the other to actually oscillate the audio-rate signal. There are a few playgrounds that highlight when to use operations, like this one:
http://audiokit.io/playgrounds/Synthesis/FM%20Oscillator%20Operation/
and the others listed in the Operations section of
http://audiokit.io/playgrounds/Synthesis/
I'm using OpenEars in my app to recognize some words and sentences. I followed the basic tutorial for offline speech recognition and ported it to Swift. This is the setup procedure:
self.openEarsEventsObserver = OEEventsObserver()
self.openEarsEventsObserver.delegate = self
let lmGenerator: OELanguageModelGenerator = OELanguageModelGenerator()
addWords()
let name = "LanguageModelFileStarSaver"
lmGenerator.generateLanguageModelFromArray(words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.pathToModel("AcousticModelEnglish"))
lmPath = lmGenerator.pathToSuccessfullyGeneratedLanguageModelWithRequestedName(name)
dicPath = lmGenerator.pathToSuccessfullyGeneratedDictionaryWithRequestedName(name)
The recognition works well in a quiet room for both single words and whole sentences (I would say it has a 90% hit rate). However, when I tried it in a quiet pub with light background noise, the app had serious difficulty recognising even a single word.
Is there any way to improve the speech recognition when there is background noise?
If the background noise is more or less uniform (i.e. has a regular pattern), you can try adapting the acoustic model. Otherwise it's an open problem, sometimes referred to as the cocktail party effect, which can be partly solved using DNNs.
Try these settings; they work well for me.
try? OEPocketsphinxController.sharedInstance().setActive(true)
OEPocketsphinxController.sharedInstance().secondsOfSilenceToDetect = 2
OEPocketsphinxController.sharedInstance().setSecondsOfSilence()
OEPocketsphinxController.sharedInstance().vadThreshold = 3.5
OEPocketsphinxController.sharedInstance().removingNoise = true
Or you can try the iSphinx library.
I'm trying to use the currentPlaybackRate property on MPMusicPlayerController to adjust the tempo of a music track as it plays. The property works as expected when the rate is less than 0.90 or greater than 1.13, but for the range just above and below 1, there seems to be no change in tempo. Here's what I'm trying:
UIAppDelegate.musicPlayer = [MPMusicPlayerController iPodMusicPlayer];
// ... load music player with track from library
[UIAppDelegate.musicPlayer play];
- (void)speedUp
{
    UIAppDelegate.musicPlayer.currentPlaybackRate = UIAppDelegate.musicPlayer.currentPlaybackRate + 0.03125;
}
- (void)speedDown
{
    UIAppDelegate.musicPlayer.currentPlaybackRate = UIAppDelegate.musicPlayer.currentPlaybackRate - 0.03125;
}
I can monitor the value of currentPlaybackRate and see that it's being set correctly, but there seems to be no difference in playback tempo until the 0.9 or 1.13 threshold has been reached. Does anyone have any guidance or experience on the matter?
I'm no expert, but I suspect that this phenomenon may be merely an artefact of the algorithm used to change the playback speed without raising or lowering the pitch. It's a tricky business, and here it must be done in real time without much distortion, so probably an integral multiple of the tempo is needed. You might want to read the Wikipedia article on time stretching, http://en.wikipedia.org/wiki/Audio_timescale-pitch_modification
Actually, I've found the problem: the statement myMusicPlayer.currentPlaybackRate = 1.2 must be placed after the call to .play(). If you set the rate before calling .play(), it does not work.
Is there an API in one of the iOS layers that I can use to generate a tone just by specifying its frequency in hertz? What I'm looking to do is generate a DTMF tone. This link explains how each DTMF tone consists of two tones:
http://en.wikipedia.org/wiki/Telephone_keypad
Which basically means that I need to play two tones at the same time...
So, does something like this exist:
SomeCleverPlayerAPI(697, 1336);
I've spent the whole morning searching for this and have found a number of ways to play back a sound file, but nothing on how to generate a specific tone. Does anyone know, please...
Check out the AU (AudioUnit) API. It's pretty low-level, but it can do what you want. A good intro (that probably already gives you what you need) can be found here:
http://cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html
There is no iOS API to do this audio synthesis for you.
But you can use the Audio Queue or Audio Unit RemoteIO APIs to play raw audio samples: generate an array of samples of two summed sine waves (say 44100 samples for 1 second's worth), and then copy the results out in the audio callback (1024 samples, or whatever the callback requests, at a time).
See Apple's aurioTouch and SpeakHere sample apps for how to use these audio APIs.
The samples can be generated by something as simple as:
sample[i] = (short int)(v1*sinf(2.0*pi*i*f1/sr) + v2*sinf(2.0*pi*i*f2/sr));
where sr is the sample rate, f1 and f2 are the two frequencies, and v1 + v2 sum to less than 32767.0. You can add rounding or noise dithering to this for cleaner results.
Beware of clicking if your generated waveforms don't taper to zero at the ends.