I found that the library has both AKFrequencyTracker and AKMicrophoneTracker, and both provide frequency as a property.
The questions are:
What is the difference between AKFrequencyTracker and AKMicrophoneTracker?
Which class is better to use for real-time singing detection from the microphone?
Many thanks in advance.
AKMicrophoneTracker is a standalone class that just reads from the microphone and nothing more, whereas AKFrequencyTracker is a node that can be inserted at any point in your signal chain. They both use the same frequency detection algorithm; it's just that AKMicrophoneTracker is easier to use for the common case where all you need is pitch detection and nothing else AudioKit provides.
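For reference, here is roughly what the standalone case looks like - a minimal sketch assuming AudioKit 4.x, where AKMicrophoneTracker exposes start(), frequency, and amplitude (check the version you're on, as the API has moved around):

```swift
import Foundation
import AudioKit

// Standalone pitch tracking: no signal chain needed.
let tracker = AKMicrophoneTracker()
tracker.start()

// Poll the tracker periodically (e.g. from a timer) and read the detected pitch.
Timer.scheduledTimer(withTimeInterval: 0.1, repeats: true) { _ in
    print("frequency: \(tracker.frequency) Hz, amplitude: \(tracker.amplitude)")
}
```

With AKFrequencyTracker you would instead wrap an existing node (such as a microphone node) and insert the tracker into your signal chain.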
Related
I am having trouble finding out how to read frequencies from audio input. I am trying to listen to very high frequencies (ultrasonic). I've explored several GitHub projects, but all of them were either outdated or broken.
I discovered this guide, but I am having trouble understanding it: https://developer.apple.com/documentation/accelerate/finding_the_component_frequencies_in_a_composite_sine_wave Can anyone provide guidance; has anyone done this before? Thanks
It's worth digging into this piece of sample code: https://developer.apple.com/documentation/accelerate/visualizing_sound_as_an_audio_spectrogram
The sample calculates the Nyquist frequency of the microphone input - for example, if your device samples at 44.1 kHz, the highest representable frequency is about 22 kHz. You can look at the values in each frequency-domain frame of samples and find the bin with the maximum magnitude to derive the dominant frequency.
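As an illustration (this is not code from the Apple sample), once you have one frame of FFT magnitudes you can locate the dominant bin with vDSP_maxvi and convert it to Hz; the mapping below assumes bin i corresponds to i * sampleRate / fftSize:

```swift
import Accelerate

/// Given one frame of FFT magnitudes, return the dominant frequency in Hz.
/// Assumes `magnitudes` holds the first fftSize/2 bins of a real FFT.
func dominantFrequency(magnitudes: [Float], sampleRate: Float, fftSize: Int) -> Float? {
    guard !magnitudes.isEmpty else { return nil }

    // vDSP_maxvi finds the maximum value and its index in a single pass.
    var maxValue: Float = 0
    var maxIndex: vDSP_Length = 0
    vDSP_maxvi(magnitudes, 1, &maxValue, &maxIndex, vDSP_Length(magnitudes.count))

    // Convert the bin index to a frequency.
    return Float(maxIndex) * sampleRate / Float(fftSize)
}
```

Note that for ultrasonic work the sample rate is the hard limit: you can only see frequencies below half the sample rate, and many built-in microphones roll off well before that.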
I would like to ask whether it is possible, via AudioKit or otherwise, to filter out everything except the frequency range of the human voice. I want to create an emotional analyzer based on the frequencies of a human voice, but the problem is that the microphone captures all the frequencies around me. Is there any way to remove this background?
Next, I would like to ask whether it is possible to recognize which person is currently speaking in a conversation between two people.
Thank you in advance for a possible answer.
I am building an iOS app that allows the user to play guitar sounds - e.g. plucking or strumming.
I'd like to allow the user to apply pitch shifting or wah-wah (compression) on the guitar sound being played.
Currently, I am using audio samples of the guitar sound.
I've done some basic reading on DSP and audio synthesis, but I'm no expert. I saw libraries such as Csound and STK, and it appears that the sounds they produce are synthesized (i.e. not played from audio samples). I am not sure how to apply them, or whether I can use them to apply effects such as pitch shifting or wah-wah to audio samples.
Can someone point me in the right direction for this?
You can use open-source audio processing libraries. Essentially, you are getting audio samples in, and you need to process them and send them out as samples. The processing can be done by these libraries, or you can use one of your own. Here's one DSP library (disclaimer: I wrote this). Look at the process(float,float) method of any of the classes to see how this is done.
Wah-wah and compression are two completely different effects. Wah-wah is a resonant band-pass filter whose center frequency is swept slowly, whereas compression is a way of evening out the volume. The above library has a Compressor class that you can check out.
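For illustration only (this is not the library's actual API), a wah-wah can be sketched as a state-variable band-pass filter whose center frequency is swept by a low-frequency oscillator, processed one sample at a time in the same process()-style pattern:

```swift
import Foundation

/// A minimal wah-wah sketch: a state-variable band-pass filter whose
/// center frequency is swept by an LFO. Parameter values are arbitrary.
struct WahWah {
    var sampleRate: Float = 44_100
    var lfoRate: Float = 1.5      // sweep speed in Hz
    var minFreq: Float = 400      // lowest center frequency in Hz
    var maxFreq: Float = 2_000    // highest center frequency in Hz
    var damping: Float = 0.1      // lower damping = sharper resonance

    private var lfoPhase: Float = 0
    private var low: Float = 0    // filter state
    private var band: Float = 0   // filter state

    /// Process one input sample and return the band-pass ("wah") output.
    mutating func process(_ input: Float) -> Float {
        // Sweep the center frequency with a sine LFO.
        let sweep = (sin(lfoPhase) + 1) * 0.5   // 0...1
        let centerFreq = minFreq + sweep * (maxFreq - minFreq)
        lfoPhase += 2 * Float.pi * lfoRate / sampleRate

        // One step of a Chamberlin state-variable filter tuned to centerFreq.
        let f = 2 * sin(Float.pi * centerFreq / sampleRate)
        low += f * band
        let high = input - low - damping * band
        band += f * high
        return band
    }
}
```

You would call process() on every sample coming out of your guitar-sample player and send the result to the output.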
The STK does have effect classes as well, not just synthesis classes (JCRev is one for reverb), but I would highly recommend staying away from it, as it is really hard to compile and maintain.
If you haven't seen it already, check out Julius Smith's excellent and comprehensive book Physical Audio Signal Processing.
I need to write a speech detection algorithm (not speech recognition).
At first I thought I would just have to measure the microphone power and compare it to some threshold value. But the problem gets much harder once you have to take the ambient sound level into consideration (for example, in a pub a simple power threshold is crossed immediately because other people are talking).
So in the second version I thought I would have to measure the current power spikes against the average sound level, or something like that. Coding this idea proved quite hairy for me, at which point I decided it was time to research existing solutions.
Do you know of some general algorithm description for speech detection? Existing code or library in C/C++/Objective-C is also fine, be it commercial or free.
P.S. I guess there is a difference between “speech” and “sound” recognition, with the first one only responding to frequencies close to human speech range. I’m fine with the second, simpler case.
The key phrase you need to Google for is Voice Activity Detection (VAD) – it's implemented widely in telecoms, particularly in Acoustic Echo Cancellation (AEC).
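To make the question's "power spikes against the average sound level" idea concrete, here is a very rough energy-based VAD sketch with an adaptive noise floor; real VADs (such as those used in telecoms) add band weighting, hangover logic, and more, and the threshold and adaptation rate below are arbitrary assumptions:

```swift
import Foundation

/// A toy energy-based voice activity detector: track a slowly adapting
/// estimate of the ambient (noise) energy and flag frames whose short-term
/// energy rises well above it.
struct SimpleVAD {
    var noiseFloor: Float = 1e-4  // running estimate of ambient energy
    var adaptRate: Float = 0.02   // how quickly the floor follows the signal
    var threshold: Float = 4      // "speech" if energy > threshold * floor

    mutating func isSpeech(frame: [Float]) -> Bool {
        // Short-term energy of this frame (mean of squared samples).
        let energy = frame.reduce(0) { $0 + $1 * $1 } / Float(max(frame.count, 1))

        let active = energy > threshold * noiseFloor

        // Only adapt the noise floor on frames that look like background,
        // so that speech does not drag the floor upward.
        if !active {
            noiseFloor += adaptRate * (energy - noiseFloor)
        }
        return active
    }
}
```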
How can I detect when speech starts in an audio file? I only need to detect the start and stop of speech, without recognition.
Thank you.
Check out this app
http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html
You can tinker with this sample code a little to get what you need...
Here is one more link that I have come across
http://developer.apple.com/library/ios/#samplecode/aurioTouch/Introduction/Intro.html#//apple_ref/doc/uid/DTS40007770
You could use a pitch detector to listen for the presence of harmonic tones within the range of human speech. I don't know of any pitch detectors for iOS, though. I wrote my own, and it was very hard.
Dirac does pitch detection; I don't know how accurate it is, because I don't want to spend £1000 on the licence.
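To illustrate the "listen for harmonic tones in the speech range" approach from the answer above, here is a minimal autocorrelation-based pitch-detection sketch (illustrative only: the 80–400 Hz range and the 0.5 peak threshold are assumptions, and the naive loop over all lags is much slower than an FFT-based autocorrelation):

```swift
import Foundation

/// Estimate the fundamental frequency of a buffer if a strong periodicity
/// is found within the typical human speech range, otherwise return nil.
func estimatePitch(samples: [Float], sampleRate: Float) -> Float? {
    let minFreq: Float = 80, maxFreq: Float = 400
    let minLag = Int(sampleRate / maxFreq)
    let maxLag = min(Int(sampleRate / minFreq), samples.count - 1)
    guard maxLag > minLag else { return nil }

    // Energy at lag 0, used to normalize the correlation peak.
    let energy = samples.reduce(0) { $0 + $1 * $1 }
    guard energy > 0 else { return nil }

    // Find the lag with the strongest autocorrelation.
    var bestLag = 0
    var bestCorr: Float = 0
    for lag in minLag...maxLag {
        var corr: Float = 0
        for i in 0..<(samples.count - lag) {
            corr += samples[i] * samples[i + lag]
        }
        if corr > bestCorr {
            bestCorr = corr
            bestLag = lag
        }
    }

    // Require a reasonably strong, normalized peak before claiming a voiced tone.
    guard bestLag > 0, bestCorr / energy > 0.5 else { return nil }
    return sampleRate / Float(bestLag)
}
```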