How does knocktounlock work? - ios

I am trying to figure out how knocktounlock.com is able to detect "knocks" on the iPhone. I am sure they use the accelerometer to achieve this; however, all my attempts produce false positives (if the user moves, jumps, etc., it sometimes fires).
Basically, I want to be able to detect when a user knocks/taps/smacks their phone, and to distinguish that from other things that also register on the accelerometer. So I am looking for sharp, high peaks. The device will be in the pocket, so it will not move very much.
I have tried things like high-pass/low-pass filtering (not sure whether there is a better option).
This is a duplicate of this: Detect hard taps anywhere on iPhone through accelerometer, but that question has not received any answers.
Any help/suggestions would be awesome! Thanks.
EDIT: Looking for more thoughts before I accept the answer below. I did hear back from Knocktounlock, and they use the fourth derivative (jounce) to get cleaner values to analyse, which is interesting.
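For reference, jounce can be approximated from logged samples by repeated finite differencing. A minimal sketch (the sampling rate, synthetic signal and threshold are made-up values, not Knocktounlock's actual method):

```python
import numpy as np

# Minimal sketch: approximate jerk and jounce from logged accelerometer
# magnitudes by repeated finite differencing. The sample rate, synthetic
# signal and threshold below are all illustrative assumptions.
fs = 100.0                       # assumed sampling rate in Hz
dt = 1.0 / fs

# Synthetic stand-in for real data: 1 g baseline with a short, sharp spike.
accel_mag = np.full(200, 1.0)
accel_mag[100:103] = [1.8, 2.5, 1.2]

jerk = np.diff(accel_mag) / dt            # 3rd derivative of position
jounce = np.diff(jerk) / dt               # 4th derivative of position

# A knock should appear as a brief, large excursion in the higher
# derivatives; 2000 here is just a placeholder threshold.
knock_candidates = np.flatnonzero(np.abs(jounce) > 2000.0)
print(knock_candidates)
```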

I would consider a knock on the iPhone to be essentially the same as bumping two phones together. Check out this GitHub repo:
https://github.com/joejcon1/iOS-Accelerometer-visualiser
Build & run the app on an iPhone and watch the spikes on the green line. You can see the value of each spike clearly.
Knocking the iPhone:
As you can see, the duration of the actual spike is very short when you knock the phone. The spike patterns for a hard knock and a soft knock are a little different, but both can be distinguished programmatically.
Now let's look at the accelerometer pattern when the iPhone moves freely in space:
As you can see, these spikes are bell-shaped, which means it takes a little time for the value to return to 0.
From these patterns it should be much easier to detect a knock. Good luck.
Also, this will drain your battery, since the sensor is always running and the iPhone has to maintain a Bluetooth connection to the Mac.
P.S.: Also check this answer, https://stackoverflow.com/a/7580185/753603
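A rough sketch of that narrow-spike-versus-bell-shape test, assuming accelerometer magnitudes logged at a fixed rate (the rate, threshold and maximum spike width below are placeholders):

```python
import numpy as np

def looks_like_knock(accel_mag, fs=100.0, threshold=1.5, max_width_s=0.05):
    # True if some spike exceeds `threshold` but drops back below it within
    # `max_width_s` seconds - i.e. a narrow spike rather than a slow bump.
    above = np.abs(accel_mag) > threshold
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[::2], edges[1::2]          # runs above threshold
    widths_s = (ends - starts) / fs
    return bool(np.any(widths_s <= max_width_s))

# Synthetic check: a 2-sample spike (knock-like) vs. a 30-sample slow bump.
knock = np.full(100, 1.0); knock[50:52] = 2.5
bump  = np.full(100, 1.0); bump[30:60] = 2.0
print(looks_like_knock(knock), looks_like_knock(bump))   # True False
```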

I think the way to go here is using pattern recognition with accelerometer data.
You could (write and) train a classifier (e.g. k-nearest neighbours) on data you have gathered and labelled by hand. Neural networks are also an option. There are many different ways to solve this problem, but probably no straightforward one.
Some papers show pattern-recognition approaches to similar topics (activity, movement recognition), e.g.
http://www.math.unipd.it/~cpalazzi/papers/Palazzi-Accelerometer.pdf
(some more, but I am not allowed to post them with my reputation count. You can search for "pattern recognition accelerometer data")
There is also a master thesis about gesture recognition on the iPhone:
http://klingmann.ch/msc_thesis_marco_klingmann_iphone_gestures.pdf
In general you won't achieve 100% correct classification. Depending on the time and knowledge you invest, the result will vary between good-and-usable and we-could-have-used-random-classification-instead.
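For illustration, a minimal sketch of the k-nearest-neighbour route using scikit-learn; the window features and the synthetic training data are placeholders for hand-labelled recordings:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def features(window):
    # Very simple hand-crafted features per window; real features would need
    # experimentation (peak height, spike width, spectral energy, ...).
    return [np.max(np.abs(window)), np.std(window), np.max(np.abs(np.diff(window)))]

# Placeholder training data: windows labelled 1 = knock, 0 = other movement.
rng = np.random.default_rng(0)
knocks = [np.concatenate([rng.normal(0, 0.05, 45), [2.5, 1.0], rng.normal(0, 0.05, 3)])
          for _ in range(20)]
others = [rng.normal(0, 0.3, 50) for _ in range(20)]
X = np.array([features(w) for w in knocks + others])
y = np.array([1] * 20 + [0] * 20)

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

new_window = rng.normal(0, 0.3, 50)               # an unseen window
print(clf.predict([features(new_window)]))        # 0 or 1
```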

Just a thought, but it could be useful to add the microphone output to the mix, listening for really short, loud noises at the same time that a possible "knock" movement is detected.
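A tiny sketch of that fusion idea: only accept an accelerometer spike if a short, loud audio burst occurs within a small time window (the timestamps and tolerance are illustrative):

```python
def confirmed_knocks(accel_spike_times, audio_burst_times, tolerance_s=0.05):
    # Keep only accelerometer spikes that have a loud, short audio burst
    # within `tolerance_s` seconds; everything else is treated as noise.
    return [t for t in accel_spike_times
            if any(abs(t - a) <= tolerance_s for a in audio_burst_times)]

# Illustrative timestamps (seconds): two accel spikes, one backed by audio.
print(confirmed_knocks([1.20, 3.40], [1.22, 5.00]))   # [1.2]
```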

I am surprised that the 4th derivative is needed; intuitively it feels to me that the 3rd ("jerk", the derivative of acceleration) should be enough. It is a big hint about what to keep an eye on, though.
It seems quite simple to me: collect accelerometer data at a high rate, plot it on a chart, and observe. Calculate the first derivative from that, plot and observe. Then rinse and repeat with the derivative of the last result, and draw conclusions. I highly doubt you will need pattern recognition per se (clustering, classifiers, what have you); I think you will see a very distinct peak on one of your charts, and may only need to tune the collection rate and smoothing.
What is more interesting to me is how this works without the KnockToUnlock app running in the foreground. And if it runs in the background, who lets it run there for unlimited time? I don't think the accelerometer qualifies for unlimited background execution. After some pondering, my guess is that the app connects to the Mac over Bluetooth as an accessory, and as such gets a pass from iOS to run in the background (and drain your battery, shhh).

To solve this problem you need to select the sampling frequency. A tap (knock) has very high-frequency content, so you should choose an accelerometer sampling rate of no less than 50 Hz (perhaps even 100 Hz) for good tap detection in the presence of noise from other movements.
Using a classifier is necessary, but to save battery you should not call the classifier very often. Write a simple algorithm that only finds taps and tap-like events and reports when your program needs to call the classifier, as in the sketch below.
Also note the gyroscope signal: it responds to knocks as well, it does not need to have a constant (gravity) component removed, and it contains less noise.
This is a good video about the basics of working with smartphone sensors: http://talkminer.com/viewtalk.jsp?videoid=C7JQ7Rpwn2k#.UaTTYUC-2Sp
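For illustration, a sketch of that two-stage scheme, where a cheap threshold gate decides when it is worth running the (expensive) classifier; the gate values and the stand-in classifier are placeholders:

```python
import numpy as np

def cheap_gate(window, threshold=1.5):
    # Inexpensive check that runs on every window: did anything spike at all?
    return float(np.max(np.abs(window))) > threshold

def classify_knock(window):
    # Stand-in for the real (more expensive) classifier.
    return float(np.max(np.abs(np.diff(window)))) > 1.0

def process(window):
    # Only pay for the classifier when the cheap gate fires.
    return classify_knock(window) if cheap_gate(window) else False

quiet = np.full(50, 1.0)                      # nothing happening
spiky = np.full(50, 1.0); spiky[25] = 3.0     # knock-like sample
print(process(quiet), process(spiky))         # False True
```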

Related

Comparing pitches with digital audio

I am working on an application which will compare musical notes with digital audio. My first idea was to analyze a WAV file (or sound in real time) with some polyphonic pitch algorithm, get the notes and chords from the file, and then compare them with the notes in a dataset. I went through a lot of pages and it seems to be a lot of hard work, because existing implementations and algorithms mainly (or only) focus on monophonic sound.
Now I have the idea to do this the opposite way. In the dataset I have, for example, the note A4, or, a better example, the chord A4 B4 H4. My idea is to generate some waveform (or whatever, I don't know what) from this note or chord and then compare it with the piece of digital audio.
Is this a good idea? Is it an easier or a harder solution?
If so, can you recommend how to do it?
The easiest solution is to take the FFT (Fast Fourier Transform) of the waveform: all the notes (and their harmonics) will be present in the signal. You then look for the frequencies that correspond to notes, and there's your solution.
Note: in order to get decent frequency resolution you need a sufficiently long sample and a high enough sample rate. But try it and you will see.
Here are a couple of screen shots of an app called SpectraWave that I took sitting in front of my piano. The first is of middle A (f = 440 Hz as you know):
and the second is of an A-minor chord (as you can see, my middle finger is a little stronger and the C is showing up as the note with the greatest volume). The harmonics will soon make it hard to see more than just a few notes…
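A minimal sketch of that FFT-peak-to-note idea in Python/numpy (the synthetic input, window, and peak-picking thresholds are assumptions, and harmonics are ignored):

```python
import numpy as np
from scipy.signal import find_peaks

def detect_notes(samples, fs):
    # FFT of the (windowed) waveform; the spectral peaks should sit on the
    # played notes and their harmonics.
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples))))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / fs)
    # Keep reasonably strong, well-separated peaks (thresholds are guesses).
    idx, _ = find_peaks(spectrum, height=spectrum.max() * 0.2, distance=20)
    # Convert each peak frequency to the nearest MIDI note (A4 = 440 Hz = 69).
    return sorted(int(round(69 + 12 * np.log2(f / 440.0))) for f in freqs[idx])

# Synthetic A4 (440 Hz) plus a weaker C#5 (554.37 Hz), 1 s at 44.1 kHz.
fs = 44100
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 554.37 * t)
print(detect_notes(audio, fs))   # [69, 73] -> A4 and C#5
```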
Your "solution" most likely makes matching even more difficult, since you will have no idea what waveform to make for each note. Most musical instruments and voices not only produce waveforms that are significantly different from single sinewaves or any other familiar waveform, but these waveforms evolve over time. Thus guessing the proper
waveform to use for each note for a match is extremely improbable.

How to Handle Occlusion and Fragmentation

I am trying to implement a people-counting system using computer vision for a uni project. Currently, my method is:
Background subtraction using MOG2
Morphological filter to remove noise
Track blob
Count blob passing a specified region (a line)
The problem is that if people arrive as a group, my method counts them as only one person. From my reading, I believe this is what is called occlusion. Another problem is when a person looks similar to the background (wearing dark clothing and passing a black pillar/wall): the blob gets split even though it is actually one person.
From what I have read, I should implement a detector + tracker (e.g. detect humans using HOG). But my detection results are poor (e.g. 50% false positives with a 50% hit rate, using both the OpenCV human detector and my own trained detector), so I am not convinced I should use the detector as the basis for tracking. Thanks for your answers and your time reading this post!
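For concreteness, a minimal sketch of that four-step pipeline using OpenCV's Python bindings (the video file name, kernel size, area threshold and counting line are placeholder values, and the "crossing" test is deliberately naive):

```python
import cv2

cap = cv2.VideoCapture("people.mp4")            # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
count_line_y = 240                              # placeholder counting line
count = 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # 1) background subtraction, 2) morphology to remove noise
    mask = subtractor.apply(frame)
    mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]   # drop shadow pixels
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # 3) blobs as contours, 4) naive count when a blob centre sits on the line
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)      # OpenCV 4.x signature
    for c in contours:
        if cv2.contourArea(c) < 500:            # ignore small noise blobs
            continue
        x, y, w, h = cv2.boundingRect(c)
        if abs((y + h // 2) - count_line_y) < 2:   # very crude "crossing" test
            count += 1                             # real code needs per-blob tracking

cap.release()
print(count)
```

This naive counting step is exactly where the occlusion and fragmentation problems described above show up: one blob can contain several people, and one person can split into several blobs.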
Tracking people in video-surveillance sequences is still an open problem in the research community. However, particle filters (PF) (aka sequential Monte Carlo) give good results in the presence of occlusion and complex scenes. You should read this. There are also extra links to example source code after the bibliography.
An advantage of using a PF is the gain in computational time compared to tracking by detection (only).
If you go this way, feel free to ask if you want a better understanding of the maths behind the PF.
There is no single "good" answer to this as handling occlusion (and background substraction) are still open problems! There are several pointers that can be given that might help you along with your project.
You want to detect if a "blob" is one person or a group of people. There are several things you could do to handle this.
Use multiple cameras (it's unlikely that a group of people is detected as a single blob from all angles)
Try to detect parts of the human body. If you detect two heads on a single blob, there are multiple people. Same can be said for 3 legs, 5 shoulders, etc.
On the area of tracking a "lost" person (one walking behind another object), is to extrapolate it's position. You know that a person can only move so much in between frames. By holding this into account, you know that it's impossible for a user to be detected in the middle of your image and then suddenly disappear. After several frames of not seeing that person, you can discard the observation, as the person might have had enough time to move away.

Is there a more accurate way to detect ball to ball collisions with Sphero API?

I'm writing a game for sphero, the robotic ball (having issues with their forums, can't seem to ask a question). I'm trying to do ball to ball collision detection for 2 or more players.
First of all, they give a sample here:
https://github.com/orbotix/Sphero-iOS-SDK/tree/master/samples/CollisionDetection
The thresholds they supply are WAY too sensitive; on a wooden floor it triggers all the time. Forgetting that for a minute, I have to use the impact timestamp from both devices to see if they have triggered collisions at roughly the same time.
My issue is that when subtracting the timestamps, in some cases I get very wide variations, and I think the difference is quite large to begin with. I'm storing several timestamps so I don't miss the correct one, and I have tried playing with the dead time to see if lowering it would help.
Most commonly, subtracting two NSTimeIntervals I get a difference between 0.68 and 0.72 (I would have expected differences on the order of 0.01). So I'm checking whether the difference is under 0.72; three times in a row I got between 0.72 and 0.73, and several times I got 1.5, 2.6, 1.1 and even 3.8.
It doesn't seem reliable. The documentation says this time comes from the iPhone's reference clock. Both devices are set to get the time automatically, so they are as close to each other as possible.
Has anyone tried this and come up with a reliable solution, that doesn't involve keeping one ball still ?
I did a significant amount of research on the subject of ball to ball collisions when I started as a developer for Orbotix, the makers of Sphero.
This is a very complicated problem to solve. The closest I came to making this work (for an infected-zombies research game) was about 80% accuracy in detecting which ball hit which ball, with a sample size of 3. The more balls you put into the game, the lower the accuracy becomes. Hence, we decided to eliminate the issue by requiring one ball to stop moving before it was vulnerable, as in Sphero TAG.
There are a few factors that limit this capability, and it seems you have discovered them. I believe the biggest issue is that collision detection performs poorly while the ball is driving, especially on a rough surface or when the ball makes quick, jerky movements. This alone causes major problems when coupled with the dead time.
I was able to get collision timestamps to within 50 ms on average. Are you taking into consideration the wifi latency in transmitting the packets between phones?
The solution is something you probably don't want to hear, but you should tweak your game play to work within the capabilities of collision detection. That is, the ball driving really slow when it can be contacted, or even come to a stop like in TAG. Ask yourself, how can I make this fun without ball to ball collisions?
I just want to say, first, that we are moving our developer support forum here, to StackOverflow, and that's why you can't post on the forums. So, you did the right thing, Simon, by coming to StackOverflow, and you should be proud.
We just changed the forums to redirect here instead of leaving people confused.
The timestamps are generated by Sphero. But they only make sense if you're using the Poll Packet Times command to generate delay and offset values. Please refer to DID 00h, CID 50h in the API commands document.
That being said, collision detection is an ever evolving technology from our end. We employ a cleverly coded DFT frequency transform on a sliding data window real-time inside the robot. The parameters allow tuning to the surface you're running on; there are no universal settings. If you're obtaining too many false positives then please experiment. If you have ideas to improve the algorithm then contact us directly and maybe we can include it as a new filtering method. We're always open to clever ideas!
You could sync the internal timers of each Sphero at the beginning of the game. These can be matched against a synced timer within each host phone. Clocks may be different, but a millisecond is a millisecond. You could also lower the threshold of the collision detection, thus making it so that the 'event' (damage, infection, etc.) can only occur if the 'attacking' Sphero is moving at a certain speed. Or a variation thereof.
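For illustration, a sketch of the matching step once you have an estimated clock offset between the two phones (how you obtain the offset, e.g. from the Poll Packet Times delay/offset values, is not shown; the numbers are made up):

```python
def match_collisions(times_a, times_b, offset_b=0.0, tolerance_s=0.05):
    # `offset_b` is the estimated clock offset of phone B relative to phone A.
    # Accept a pair of collision timestamps only if, after correcting for the
    # offset, they fall within `tolerance_s` seconds of each other.
    matches = []
    for ta in times_a:
        for tb in times_b:
            if abs(ta - (tb - offset_b)) <= tolerance_s:
                matches.append((ta, tb))
    return matches

# Illustrative data: B's clock runs 0.70 s ahead of A's.
print(match_collisions([10.00, 12.30], [10.72, 15.00], offset_b=0.70))
# -> [(10.0, 10.72)]
```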

iOS ways or tips to improve quality of GPS tracking

As we all know, tracking the position of a moving iOS device can be very handy; we have seen that in various sport/fitness applications. Because the position determination sometimes becomes inaccurate due to poor signal or other interference, I would like to ask whether there are any common ways/tips/algorithms to achieve an almost seamless GPS track.
I would be very thankful for any references.
Cheers,
anka
This would actually seem like a problem more fit for a Kalman filter. In short: you make educated guesses based on the last known position and velocity, then update your guess when new data comes in. Uncertainty comes into play as well.
It doesn't have to be just position/velocity; you can do this with any number of scalar variables.
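For illustration, a minimal 1-D constant-velocity Kalman filter sketch in Python/numpy (the noise parameters are placeholders; a real GPS track would use 2-D position plus velocity, and the reported horizontal accuracy could feed the measurement noise):

```python
import numpy as np

def kalman_1d(measurements, dt=1.0, process_var=1e-3, meas_var=25.0):
    # State: [position, velocity]; constant-velocity motion model.
    x = np.array([measurements[0], 0.0])          # initial state guess
    P = np.eye(2) * 100.0                         # initial uncertainty
    F = np.array([[1.0, dt], [0.0, 1.0]])         # state transition
    Q = np.eye(2) * process_var                   # process noise
    H = np.array([[1.0, 0.0]])                    # we only measure position
    R = np.array([[meas_var]])                    # measurement noise

    smoothed = []
    for z in measurements:
        # Predict from the last state, then correct with the new fix.
        x = F @ x
        P = F @ P @ F.T + Q
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (np.array([z]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        smoothed.append(x[0])
    return smoothed

# Noisy "GPS" positions along a straight line.
noisy = [i + np.random.normal(0, 5) for i in range(0, 100, 2)]
print(kalman_1d(noisy)[:5])
```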
This might be a good application for a particle filter.

Identify a specific sound on iOS

I'd like to be able to recognise a specific sound in an iOS application. I guess it would basically work like speech recognition in that it's fairly fuzzy, but it would only have to be for 1 specific sound.
I've done some quick FFT work to identify specific frequencies over a certain threshold, and only when they're solo (i.e. not surrounded by other frequencies), so I can identify individual tones pretty easily. I'm thinking this is just an extension of that, but comparing against an FFT data set of a recording of the sound, say in 0.1-second chunks over the length of the audio. I would also have to account for variation in amplitude, and a little in pitch and timing.
Can anyone point me to any pre-existing source that I could use to speed this process along? I can't seem to find anything usable. Or failing that, any ideas on how to get started on something like this?
Thanks very much
From your description it is not entirely clear what you want to do.
What is the "specific" sound like? Does it have high background noise?
What's the specific recognizable feature (e.g. pitch, inharmonicity, timbre, ...)?
Against which other "sounds" do you want to compare it?
Do you simply want to match an arbitrary sound spectrum against a "template sound"?
Is your sound percussive, melodic, speech, ...? Is it long, short ...?
What's the frequency range in which you expect the best discriminability? Are the features invariant over time?
There is no "general" solution that works for everything. Speech recognition in itself is fairly complex and won't work well for abstract sounds whose discriminable frequencies are not in, e.g., the mel bands.
So, in conclusion, you are leaving too many open questions to get a useful answer.
The only suggestion I can make based on the little information given is the following:
For the template sound:
1) Extract spectral peak positions from the power spectrum
2) Measure the standard deviation around the peaks and construct a gaussian from it
3) Save the gaussians for later classification
For unknown sounds:
1) Extract spectral peak positions
2) Project those points onto the saved gaussians which leaves you with z-scores of the peak positions
3) With the computed z-scores you should be able to decide whether the unknown sound matches your template sound
Note: This is a very crude method which discriminates sounds according to their most powerful frequencies. Using the gaussians leaves room for slight shifts in the most powerful frequencies.
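A rough sketch of that recipe in Python (numpy/scipy); the peak-width-to-sigma conversion and the z-score decision rule are simplifications:

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths

def spectral_peaks(samples, fs):
    # Power spectrum, peak positions, and a gaussian (mean, std) per peak.
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples)))) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / fs)
    idx, _ = find_peaks(spectrum, height=spectrum.max() * 0.1)
    widths = peak_widths(spectrum, idx, rel_height=0.5)[0]       # FWHM in bins
    bin_hz = freqs[1] - freqs[0]
    # mean = peak frequency, std estimated from the peak width (FWHM / 2.355).
    return [(freqs[i], max(w * bin_hz / 2.355, bin_hz)) for i, w in zip(idx, widths)]

def matches_template(template_gaussians, unknown_samples, fs, max_z=3.0):
    # Project the unknown sound's peak frequencies onto the template gaussians
    # and accept only if every template peak has a nearby peak (low z-score).
    unknown_freqs = [f for f, _ in spectral_peaks(unknown_samples, fs)]
    if not unknown_freqs:
        return False
    for mean, std in template_gaussians:
        z = min(abs(f - mean) / std for f in unknown_freqs)
        if z > max_z:
            return False
    return True
```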
