Detect sound level in a certain frequency range in iOS SDK

I am working on my iOS app and I need to detect the sound level within a certain frequency range. Here is a good tutorial for detecting sound level, but how do I restrict that to a specific frequency range in the iOS SDK?

You need to capture audio from the microphone (AVAudioEngine is a good API for that), compute its Fourier transform (the Accelerate framework will do that with blazing speed), and examine the amplitude of the frequency bin corresponding to your frequency. If it's large, you've got a match.
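A minimal sketch of that pipeline, assuming a 440 Hz target, a 4096-sample FFT, and an arbitrary magnitude threshold (all illustrative values, not a tuned implementation):

```swift
import AVFoundation
import Accelerate

// Minimal sketch: tap the microphone with AVAudioEngine, FFT each buffer
// with vDSP, and read the magnitude of the bin nearest a target frequency.
// The 440 Hz target, 4096-sample FFT, and threshold are illustrative values.
let engine = AVAudioEngine()
let input = engine.inputNode
let format = input.outputFormat(forBus: 0)
let fftSize = 4096
let log2n = vDSP_Length(log2(Float(fftSize)))
let fftSetup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2))!
let targetFrequency: Float = 440.0

input.installTap(onBus: 0, bufferSize: AVAudioFrameCount(fftSize), format: format) { buffer, _ in
    guard let samples = buffer.floatChannelData?[0],
          Int(buffer.frameLength) >= fftSize else { return }

    // Pack the real signal into split-complex form for vDSP's real FFT.
    var real = [Float](repeating: 0, count: fftSize / 2)
    var imag = [Float](repeating: 0, count: fftSize / 2)
    real.withUnsafeMutableBufferPointer { rp in
        imag.withUnsafeMutableBufferPointer { ip in
            var split = DSPSplitComplex(realp: rp.baseAddress!, imagp: ip.baseAddress!)
            samples.withMemoryRebound(to: DSPComplex.self, capacity: fftSize / 2) {
                vDSP_ctoz($0, 2, &split, 1, vDSP_Length(fftSize / 2))
            }
            vDSP_fft_zrip(fftSetup, &split, 1, log2n, FFTDirection(FFT_FORWARD))

            // Amplitude of the bin closest to the target frequency.
            var magnitudes = [Float](repeating: 0, count: fftSize / 2)
            vDSP_zvabs(&split, 1, &magnitudes, 1, vDSP_Length(fftSize / 2))
            let binWidth = Float(format.sampleRate) / Float(fftSize)
            let bin = Int(targetFrequency / binWidth)
            if magnitudes[bin] > 50 {   // arbitrary placeholder threshold
                print("target frequency present, magnitude \(magnitudes[bin])")
            }
        }
    }
}

do { try engine.start() } catch { print("could not start engine: \(error)") }
```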
A possibly simpler and more efficient technique would be a Goertzel filter, which is designed to detect the presence of a single given frequency.
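For reference, the Goertzel algorithm is only a few lines. This sketch computes the power of one target frequency in a block of samples (function name and parameters are illustrative):

```swift
import Foundation

/// Sketch of the Goertzel algorithm: power of a single target frequency in a
/// block of samples, without computing a full FFT.
func goertzelPower(samples: [Float], sampleRate: Float, targetFrequency: Float) -> Float {
    let n = samples.count
    // Nearest DFT bin to the target frequency.
    let k = Int(0.5 + Float(n) * targetFrequency / sampleRate)
    let omega = 2.0 * Float.pi * Float(k) / Float(n)
    let coeff = 2.0 * cos(omega)
    var sPrev: Float = 0
    var sPrev2: Float = 0
    for x in samples {
        let s = x + coeff * sPrev - sPrev2
        sPrev2 = sPrev
        sPrev = s
    }
    // Squared magnitude of the target bin.
    return sPrev * sPrev + sPrev2 * sPrev2 - coeff * sPrev * sPrev2
}

// Usage: feed it the float samples of each microphone buffer.
// let power = goertzelPower(samples: block, sampleRate: 44100, targetFrequency: 440)
// if power > threshold { /* frequency present */ }
```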

Without knowing the exact use case, I can imagine that code similar to what's used in a musical-instrument tuner app would work. A quick search turned up this guitar tuning app, which uses fast Fourier transforms.
Note that FFTs are implemented in the Accelerate framework, though in this app they appear to be imported from another library.

Related

Is there a way to improve the Vision framework's barcode detection rate?

My question is rather generic. I am implementing a barcode scanner with the Vision framework for iOS and I came across this problem:
Low-opacity or crumpled barcodes do not have a high success rate with the Vision framework. It is still the highest of all the barcode scanners I took a closer look at.
Now I'm wondering if there is an option to increase the black scale (contrast) on the live camera image to boost faded barcodes?
I am rather certain that won't help much with barcodes that have a crinkle across two lines so that they get merged.
Would a CGRect be an option to increase the reading rate and speed by setting boundaries?
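Regarding the CGRect idea: Vision's image-based requests accept a normalized region of interest, so limiting the scan area is possible. A minimal sketch (the rectangle values are an illustrative guess, not tuned):

```swift
import Vision
import CoreGraphics

// Sketch: restrict barcode detection to a horizontal band of the frame.
// regionOfInterest uses normalized coordinates (origin bottom-left); the
// rectangle here is an illustrative guess, not a tuned value.
let request = VNDetectBarcodesRequest { request, _ in
    guard let results = request.results as? [VNBarcodeObservation] else { return }
    for barcode in results {
        print(barcode.payloadStringValue ?? "unreadable payload")
    }
}
request.regionOfInterest = CGRect(x: 0.0, y: 0.4, width: 1.0, height: 0.2)

// `pixelBuffer` would come from the AVCaptureSession feeding the live camera:
// let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
// try handler.perform([request])
```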

How does the Google Measure app work on Android?

I can see that it can measure horizontal and vertical distances with ±5% accuracy. I have a use case in which I am trying to formulate an algorithm to detect distances between two points in an image or video. Any pointers to how it could be working would be very useful to me.
I don't think the source for the Android Measure app is available, but it is ARCore based, and I would expect it uses a combination of triangulation and knowledge it reads from the "scene" (to use the Google ARCore term) it is viewing.
Like a human estimating the distance to a point by basic triangulation between two eyes and the point being looked at, a measurement app is able to look at multiple views of the scene and to measure, using its sensors, how far the device has moved between the different views. Even a small movement allows the same triangulation techniques to be used.
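As a toy illustration of that triangulation, here is the classic two-view (stereo) depth formula, assuming the baseline between views and a point's pixel disparity are known; all numbers are illustrative:

```swift
import Foundation

/// Toy two-view triangulation: depth of a point seen from two camera
/// positions a known baseline apart, via similar triangles (z = f * b / d).
/// A real app (ARCore/ARKit) fuses many views plus IMU data instead;
/// all numbers here are illustrative.
func depth(focalLengthPixels: Double, baselineMeters: Double, disparityPixels: Double) -> Double {
    return focalLengthPixels * baselineMeters / disparityPixels
}

// A point shifted 20 px between two views taken 5 cm apart, with f = 1000 px:
// depth(focalLengthPixels: 1000, baselineMeters: 0.05, disparityPixels: 20) // 2.5 m
```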
The reason for mentioning all this is to highlight that you do not have the same tools or information available if you are analysing image or video files without any position or sensor data. Hence, the Google Measure app may not be the best template for your particular problem.

Track user translation movement on iOS using sensors for a VR game?

I'm starting to experiment with VR game development on iOS. I learned a lot from the Google Cardboard SDK. It can track the user's head orientation, but it cannot track the user's translation. This limitation means the user can only look at the virtual environment from a fixed location (I know I can add auto-walk to the game, but it's just not the same).
Searching around the internet, some say translation tracking just can't be done with sensors alone, but it seems that by combining the magnetometer you can track the user's movement path, like in this example.
I also found a different method called SLAM, which uses the camera and OpenCV to do feature tracking, then uses the feature-point information to calculate translation. Here are some examples from 13th Lab. And Google has the Tango project, which is more advanced but requires hardware support.
I'm quite new to this topic, so I'm wondering: if I want to track not only the head orientation but also the head (or body) translation in my game, which method should I choose? SLAM seems pretty good, but it's also pretty difficult, and I think it will have a big impact on the CPU.
If you are familiar with this topic, please give some advice, thanks in advance!
If high accuracy is not important, you can try using the accelerometer to detect walking movement (basically a pedometer) and multiply the step count by an average human step length. Direction can be determined by the compass/magnetometer.
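A minimal sketch of that approach using CMPedometer for step counting and the compass for heading (the 0.7 m average step length is a rough assumption, not a measured value):

```swift
import CoreMotion
import CoreLocation

// Sketch of the low-accuracy approach: count steps with CMPedometer,
// multiply by an average step length, and take direction from the compass.
let pedometer = CMPedometer()
let locationManager = CLLocationManager()
locationManager.startUpdatingHeading()

let averageStepLength = 0.7  // meters, rough adult average (assumption)

if CMPedometer.isStepCountingAvailable() {
    pedometer.startUpdates(from: Date()) { data, _ in
        guard let steps = data?.numberOfSteps else { return }
        let distance = Double(truncating: steps) * averageStepLength
        let heading = locationManager.heading?.magneticHeading ?? 0
        // Advance the virtual camera `distance` meters along `heading`.
        print("moved ~\(distance) m toward \(heading)°")
    }
}
```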
High-accuracy tracking would likely require complex algorithms such as SLAM, though many such algorithms have already been implemented in VR libraries such as Vuforia or Kudan.
Hi, I disagree with you, Zhiquiang Li.
Look at this video made with Kudan: the video is quite stable, and moreover my smartphone is quite an old phone.
https://youtu.be/_7zctFw-O0Y

How does knocktounlock work?

I am trying to figure out how knocktounlock.com is able to detect "knocks" on the iPhone. I am sure they use the accelerometer to achieve this; however, all my attempts produce false positives (if the user moves, jumps, etc., it sometimes fires).
Basically, I want to be able to detect when a user knocks/taps/smacks their phone (and be able to distinguish that from other things that also register on the accelerometer). So I am looking for sharp, high peaks. The device will be in a pocket, so it will not be moving much.
I have tried things like high-pass/low-pass filters (not sure if there is a better option).
This is a duplicate of: Detect hard taps anywhere on iPhone through accelerometer. But that question has not received any answers.
Any help/suggestions would be awesome! Thanks.
EDIT: Looking for more thoughts before I accept the answer below. I did hear back from Knocktounlock: they use the fourth derivative (jounce) to get better values to analyse. Which is interesting.
I would consider a knock on the iPhone to be exactly the same as bumping two phones together. Check out this GitHub repo:
https://github.com/joejcon1/iOS-Accelerometer-visualiser
Build & run the app on an iPhone and watch the spikes on the green line. You can see the value of each spike clearly.
Knocking the iPhone:
As you can see, the actual spike is very short when you knock the phone. The spike patterns are a little different between a hard knock and a soft knock, but they can be distinguished programmatically.
Now let's see the accelerometer pattern when the iPhone moves freely in space:
As you can see, those spikes are bell shaped, meaning it takes a little time for the spike value to return to 0.
From these patterns it should be easier to recognize a knock. Good luck.
Also, this will drain your battery, as the sensor will always be running and the iPhone needs to keep a persistent Bluetooth connection to the Mac.
P.S.: Also check this answer, https://stackoverflow.com/a/7580185/753603
I think the way to go here is pattern recognition on the accelerometer data.
You could (write and) train a classifier (e.g. k-nearest neighbors) with data you gathered and classified by hand. Neural networks are also an option. There will be many different ways to solve this problem, but there is probably no straightforward way to achieve it.
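As a rough illustration of the k-nearest-neighbor option, here is a minimal sketch; the feature layout, labels, and k = 3 are illustrative choices, not tuned values:

```swift
import Foundation

/// Minimal k-nearest-neighbor sketch for hand-labeled accelerometer windows.
/// Each sample is a fixed-length feature vector (e.g. a window of magnitudes).
struct LabeledSample {
    let features: [Double]
    let label: String   // e.g. "knock" or "noise"
}

func classify(_ features: [Double], training: [LabeledSample], k: Int = 3) -> String? {
    // Squared Euclidean distance between feature vectors.
    func distance(_ a: [Double], _ b: [Double]) -> Double {
        zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
    }
    let neighbors = training
        .sorted { distance($0.features, features) < distance($1.features, features) }
        .prefix(k)
    // Majority vote among the k closest training samples.
    let votes = Dictionary(grouping: neighbors, by: { $0.label })
    return votes.max { $0.value.count < $1.value.count }?.key
}
```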
Some papers show pattern recognition approaches to similar topics (activity, movement), for example:
http://www.math.unipd.it/~cpalazzi/papers/Palazzi-Accelerometer.pdf
(There are more, but I am not allowed to post them with my reputation count. You can search for "pattern recognition accelerometer data".)
There is also a master's thesis about gesture recognition on the iPhone:
http://klingmann.ch/msc_thesis_marco_klingmann_iphone_gestures.pdf
In general you won't achieve 100% correct classification. Depending on the time and knowledge one has, the results will vary between good-and-usable and we-could-use-random-classification-instead.
Just a thought, but it could be useful to add the output of the microphone to the mix and listen for very short, loud noises at the same time that a possible "knock" movement is detected.
I am surprised that the 4th derivative is needed; intuitively it feels to me that the 3rd ("jerk", the derivative of acceleration) should be enough. It is a big hint about what to keep an eye on, though.
It seems quite simple to me: collect accelerometer data at a high rate, plot it on a chart, and observe. Calculate the first derivative from it, plot and observe. Then rinse and repeat with the derivative of the last one. Draw conclusions. I highly doubt you will need pattern recognition per se (clustering/classifiers/what have you); I think you will see a very distinct peak on one of your charts, and may only need to tune the collection rate and smoothing.
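A minimal sketch of that derive-and-observe loop using finite differences (the 100 Hz sampling interval is an assumption):

```swift
import Foundation

// Sketch of the "derive and plot" approach: successive finite differences of
// the accelerometer magnitude give jerk, then the next derivative after that.
// dt must match your sampling interval (100 Hz is assumed here).
func derivative(_ signal: [Double], dt: Double) -> [Double] {
    guard signal.count > 1 else { return [] }
    return (1..<signal.count).map { (signal[$0] - signal[$0 - 1]) / dt }
}

let dt = 1.0 / 100.0
let magnitudes: [Double] = []  // |a| per sample, collected from Core Motion
let jerk = derivative(magnitudes, dt: dt)   // 1st derivative of acceleration
let nextOrder = derivative(jerk, dt: dt)    // 2nd derivative ("jounce" territory)
// Plot each series and look for a spike that stands out only for knocks.
```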
What is more interesting to me is how come you don't have to be running the KnockToUnlock app for this to work. And if it is running in the background, who let it run there for unlimited time? I don't think the accelerometer qualifies an app for unlimited background running. After some pondering, my guess is that the app connects to the Mac as a Bluetooth accessory, and as such gets a pass from iOS to run in the background (and drain your battery, shhht).
To solve this problem you need to select the sampling frequency. A tap (knock) has very high frequency content, so you should choose an accelerometer rate of no less than 50 Hz (perhaps even 100 Hz) for quality tap detection in the presence of noise from other movements.
Using a classifier is necessary, but to save battery you should not run the classifier very often. You should write a simple algorithm that finds only taps and tap-like events and reports that your program needs to call the classifier.
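A minimal sketch of that idea: 100 Hz sampling with a cheap magnitude-jump gate, where `gateThreshold` and the classifier call are illustrative placeholders:

```swift
import CoreMotion

// Sketch: sample the accelerometer at 100 Hz and use a cheap magnitude-jump
// gate so an expensive classifier only runs on tap-like events.
let motionManager = CMMotionManager()
motionManager.accelerometerUpdateInterval = 1.0 / 100.0  // 100 Hz

var previousMagnitude = 1.0  // ~1 g at rest
let gateThreshold = 0.5      // jump in g's; needs tuning on real data

motionManager.startAccelerometerUpdates(to: .main) { data, _ in
    guard let a = data?.acceleration else { return }
    let magnitude = sqrt(a.x * a.x + a.y * a.y + a.z * a.z)
    if abs(magnitude - previousMagnitude) > gateThreshold {
        // Candidate tap: only now hand a window of samples to the classifier.
        // runClassifier(recentWindow)   // hypothetical expensive step
    }
    previousMagnitude = magnitude
}
```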
Note the gyroscope signal: it also responds to knocks. Moreover, the gyroscope signal does not need to be separated from a constant (gravity) component, and it contains less noise.
Here is a good video about the basics of working with smartphone sensors: http://talkminer.com/viewtalk.jsp?videoid=C7JQ7Rpwn2k#.UaTTYUC-2Sp

Identify a specific sound on iOS

I'd like to be able to recognise a specific sound in an iOS application. I guess it would basically work like speech recognition, in that it's fairly fuzzy, but it would only have to handle one specific sound.
I've done some quick FFT work to identify specific frequencies over a certain threshold, and only when they're solo (i.e. not surrounded by other frequencies), so I can identify individual tones pretty easily. I'm thinking this is just an extension of that: compare against an FFT data set of a recording of the sound, say in 0.1-second chunks over the length of the audio. I would also have to account for variation in amplitude, a little in pitch, and a little in time.
Can anyone point me to any pre-existing source I could use to speed this process along? I can't seem to find anything usable. Failing that, any ideas on how to get started on something like this?
Thanks very much
From your description it is not entirely clear what you want to do.
What is the "specific" sound like? Does it have high background noise?
What's the specific recognizable feature (e.g. pitch, inharmonicity, timbre, ...)?
Against which other "sounds" do you want to compare it?
Do you simply want to match an arbitrary sound spectrum against a "template sound"?
Is your sound percussive, melodic, speech, ...? Is it long, short, ...?
What's the frequency range in which you expect the best discriminability? Are the features invariant over time?
There is no "general" solution that works for everything. Speech recognition in itself is fairly complex and won't work well for abstract sounds whose discriminable frequencies are not in, e.g., the mel bands.
So in conclusion, you are leaving too many open questions to get a useful answer.
The only suggestion I can make based on the little information given is the following:
For the template sound:
1) Extract spectral peak positions from the power spectrum
2) Measure the standard deviation around the peaks and construct a Gaussian from it
3) Save the Gaussians for later classification
For unknown sounds:
1) Extract spectral peak positions
2) Project those positions onto the saved Gaussians, which leaves you with z-scores of the peak positions
3) With the computed z-scores you should be able to classify your template sound
Note: This is a very crude method that discriminates sounds according to their most powerful frequencies. Using the Gaussians leaves room for slight shifts in those frequencies.
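A rough sketch of this crude method; the naive peak picking, the fixed sigma, and the 0.1 magnitude floor are deliberate simplifications of the steps above:

```swift
import Foundation

// `powerSpectrum` would come from an FFT of the template sound.
struct PeakModel {
    let mean: Double    // peak position (bin index)
    let sigma: Double   // spread around that peak
}

// Template steps 1)-3): find local maxima and model each with a Gaussian.
func buildModels(powerSpectrum: [Double], sigma: Double = 2.0) -> [PeakModel] {
    guard powerSpectrum.count > 2 else { return [] }
    var models: [PeakModel] = []
    for i in 1..<powerSpectrum.count - 1 where
        powerSpectrum[i] > powerSpectrum[i - 1] &&
        powerSpectrum[i] > powerSpectrum[i + 1] &&
        powerSpectrum[i] > 0.1 {
        // A fuller version of step 2) would measure sigma per peak from the
        // spectrum instead of using a fixed guess.
        models.append(PeakModel(mean: Double(i), sigma: sigma))
    }
    return models
}

// Unknown-sound steps 1)-3): z-score of each detected peak against the model.
func zScores(peakBins: [Int], models: [PeakModel]) -> [Double] {
    peakBins.map { bin in
        // Distance to the nearest template peak, in units of that peak's sigma.
        models.map { abs(Double(bin) - $0.mean) / $0.sigma }.min() ?? .infinity
    }
}
// Small z-scores for most peaks suggest the unknown sound matches the template.
```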
