Apple iPhones now have a U1 chip which is described as "Ultra Wideband technology for spatial awareness". I've heard the technology can do time-of-flight calculations to determine range, but that doesn't answer how it determines relative position. How does the positioning work?
How does ultra-wideband work?
Travelling at the speed of light
The idea is to send radio waves from one module to another and measure the time of flight (TOF), or in other words, how long it takes. Because radio waves travel at the speed of light (c = 299792458 m/s), we can simply multiply the time of flight by this speed to get the distance.
However, perhaps you've noticed that radio waves travel fast. Very fast! In a single nanosecond, which is a billionth of a second, a wave travels almost 30 cm. So if we want to perform centimetre-accurate ranging, we have to be able to measure the timing very, very accurately! So now the question is, how can we do this? How can we even measure the timing of... a wave?
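As a quick sanity check in Python (just the arithmetic from above):

# Speed of light in m/s; distance is time of flight multiplied by c.
C = 299_792_458.0

def tof_to_distance(tof_seconds):
    """Convert a one-way time of flight into a distance in metres."""
    return C * tof_seconds

print(tof_to_distance(1e-9))  # one nanosecond ~ 0.2998 m, i.e. almost 30 cm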
It's all about the bandwidth
In physics, there is something called Heisenberg's uncertainty principle. The principle states that it is impossible to know both the frequency and the timing of a signal exactly at the same time. Consider, for example, a sinusoid: a signal with a well-known frequency but very ill-determined timing; the signal has no beginning or end. However, if we combine multiple sinusoids with slightly different frequencies, we can create a 'pulse' with better-defined timing, i.e., the peak of the pulse. This can be seen in the following figure from Wikipedia, which sequentially adds sinusoids to a signal to obtain a sharper pulse:
[fig. 1: sinusoids of slightly different frequencies summed one by one, forming an increasingly sharp pulse]
The range of frequencies used for this signal is called the bandwidth Δf. Using Heisenberg's uncertainty principle we can roughly determine the width Δt of the pulse, given a certain bandwidth Δf:
Δf · Δt ≥ 1/(4π)
From this simple formula we can see that if we want a narrow pulse, which is necessary for accurate timing, we need a large bandwidth. For example, using the bandwidth Δf = 20 MHz available to wifi systems, we obtain a pulse width of at least Δt ≈ 4 ns. Using the speed of light, this corresponds to a pulse about 1.2 m 'long', which is too much for accurate ranging. Firstly, because it is hard to accurately determine the peak of such a wide pulse, and secondly, because of reflections. Reflections come from the signal bouncing off objects (walls, ceilings, closets, desks, etc.) in the surrounding environment. These reflections are also captured by the receiver and may overlap with the line-of-sight pulse, which makes it very hard to measure the true peak. With pulses 4 ns wide, any object within 1.2 m of the receiver or the transmitter will cause an overlapping pulse. Because of this, time-of-flight ranging with wifi is not suitable for indoor applications.
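Here is that arithmetic as a small Python sketch (same numbers as in the text):

import math

C = 299_792_458.0  # speed of light in m/s

def min_pulse_width(bandwidth_hz):
    """Lower bound on pulse width from the uncertainty relation Δf·Δt ≥ 1/(4π)."""
    return 1.0 / (4.0 * math.pi * bandwidth_hz)

for name, bw in [("wifi, 20 MHz", 20e6), ("UWB, 500 MHz", 500e6)]:
    dt = min_pulse_width(bw)
    print(f"{name}: pulse >= {dt * 1e9:.2f} ns, i.e. {dt * C:.2f} m 'long'")

# wifi, 20 MHz: pulse >= 3.98 ns, i.e. 1.19 m 'long'
# UWB, 500 MHz: pulse >= 0.16 ns, i.e. 0.05 m 'long'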
Ultra-wideband signals typically have a bandwidth of 500 MHz, resulting in pulses only 0.16 ns wide! This timing resolution is so fine that, at the receiver, we are able to distinguish the several reflections of the signal. Hence, accurate ranging remains possible even in places with many reflectors, such as indoor environments.
[fig. 2]
Where to put all this bandwidth?
So we need a lot of bandwidth. Unfortunately, everybody wants it: in wireless communication systems, more bandwidth means faster downloads. However, if everybody transmitted signals on the same frequencies, all the signals would interfere and nobody would be able to receive anything meaningful. Because of this, the use of the frequency spectrum is highly regulated.
So how is it possible that UWB gets 500 MHz of precious bandwidth while most other systems have to make do with a lot less? Well, UWB systems are only allowed to transmit at very low power (the power spectral density must be below -41.3 dBm/MHz). This very strict power constraint means that a single pulse cannot reach far: at the receiver, the pulse will likely be below the noise level. To solve this issue, the transmitter sends a train of pulses (typically 128 or 1024) to represent a single bit of information. At the receiver, the received pulses are accumulated, and with enough pulses the power of the 'accumulated pulse' rises above the noise level and reception is possible. Hooray!
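A toy numpy sketch of why accumulation works (the pulse shape and noise level are made up for illustration; real UWB receivers are far more sophisticated):

import numpy as np

rng = np.random.default_rng(0)

n_pulses = 1024                              # pulses per bit, as above
pulse = 0.05 * np.array([0.2, 1.0, 0.2])     # weak pulse template, below the noise
noise_sigma = 1.0

# Each received pulse is the same template plus independent noise.
received = pulse + rng.normal(0.0, noise_sigma, size=(n_pulses, pulse.size))

# Accumulating: the signal grows with N while the noise std only grows with
# sqrt(N), so the SNR improves by a factor of sqrt(N).
single_snr = pulse.max() / noise_sigma
accumulated_snr = n_pulses * pulse.max() / (noise_sigma * np.sqrt(n_pulses))
print(single_snr, accumulated_snr)  # 0.05 vs 1.6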
The IEEE 802.15.4 standard for Low-Rate Wireless Personal Area Networks has defined a number of UWB channels at least 500 MHz wide. Depending on the country you're in, some of these channels may or may not be permitted. In general, the lower-band channels (1 to 4) can be used in most countries under some limitations on update rate (using mitigation techniques). Channel 5 is accepted in most parts of the world without any limitations, with the notable exception of Japan. Purely from physics: the lower the channel's center frequency, the better the range.
A note on the received signal strength (RSS)
There exists another way to measure the distance between two points using radio waves, and that is the received signal strength. The further apart the two points are, the smaller the received signal strength. Hence, from this RSS value we should be able to derive the distance. Unfortunately, it's not that simple. In general, the received signal strength is a combination of the power of all the reflections, not only of the desired line-of-sight path. Because of this, it becomes very hard to relate the RSS value to the true distance. The figure below shows just how bad it is.
In this figure, the RSS value of a Bluetooth signal is measured at certain distances. At every distance, the error bars show how the RSS value varies. Clearly, the variation in the RSS value is very large, which makes RSS unsuitable for accurate ranging or positioning.
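For reference, when people do try to turn RSS into distance, they typically invert a log-distance path-loss model; here is a sketch in Python, where the 1 m reference power and the path-loss exponent are illustrative guesses that vary per environment:

def rss_to_distance(rss_dbm, rss_at_1m_dbm=-60.0, path_loss_exponent=2.0):
    """Invert the log-distance path-loss model to estimate distance in metres.
    Both parameters are environment-dependent guesses; indoors, reflections
    make any single choice unreliable, which is exactly the problem above."""
    return 10.0 ** ((rss_at_1m_dbm - rss_dbm) / (10.0 * path_loss_exponent))

print(rss_to_distance(-70.0))  # ~3.16 m under these free-space-like assumptions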
Related
I am running an FFT algorithm to detect the musical note played on a guitar.
The frequencies I am interested in range from 65.41 Hz (C2) to 1864.7 Hz (A#6).
If I set the sampling frequency of the input to 16 kHz, the FFT output would yield N points from 0 Hz to 16 kHz, linearly spaced. All the input I am interested in would be in roughly the first N/8 points. The other 7N/8 points are of no use to me; they are actually decreasing my resolution.
From Nyquist's theorem (https://en.wikipedia.org/wiki/Nyquist_frequency), the sampling frequency needed is just twice the maximum frequency of interest. In my case, this would be about 4 kHz.
Is 4 kHz really the ideal sampling frequency for a guitar tuning app?
Intuitively, one would feel a higher sampling frequency would give more accurate results. However, in this case, it seems having a lower sampling frequency is better for improving the resolution. Regards.
You are confusing the pitch of a guitar note with spectral frequency. A guitar generates lots of overtones and harmonics at much higher frequencies than the pitch of a played note. Those higher harmonics and overtones, more than the possibly weak fundamental in some cases, are what the human ear hears and interprets as the lower perceived pitch.
Any of the overtones and harmonics around or above 2 kHz that are not completely low pass filtered out before sampling at 4 kHz will cause aliasing and thus corruption of your sampled data and its spectrum.
If you want to create an accurate tuner, use a pitch estimation algorithm, not an FFT peak frequency bin estimator. And depending on which pitch estimation method you choose, a higher density of samples per unit time might allow finer accuracy or greater reliability under background noise or more prompt responsiveness.
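To illustrate the difference, here is a very crude autocorrelation-based pitch estimator in Python (a sketch only; real tuners use refined methods such as YIN):

import numpy as np

def estimate_pitch(x, fs, fmin=65.0, fmax=1900.0):
    """Crude autocorrelation pitch estimator: the autocorrelation of a
    periodic signal peaks at the lag equal to its period."""
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
    lag_min = int(fs / fmax)                           # shortest period of interest
    lag_max = int(fs / fmin)                           # longest period of interest
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return fs / lag

# Test: a 110 Hz (A2) tone with strong harmonics, like a plucked string.
fs = 16000
t = np.arange(4096) / fs
x = np.sin(2*np.pi*110*t) + 0.8*np.sin(2*np.pi*220*t) + 0.5*np.sin(2*np.pi*330*t)
print(estimate_pitch(x, fs))  # ~110 Hz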
Is 4 kHz really the ideal sampling frequency for a guitar tuning app?
You've been misreading Nyquist's theorem if you ask it like that.
It states that every sampling frequency above twice your maximum signal frequency will allow you to perfectly reconstruct your original signal. So there's no "ideal" frequency, just a set of frequencies that are sufficient. What is ideal hence depends on a lot of other things: mainly, what your digitizer really supports (hint: most sound cards can do 44.1 kHz, but not 4 kHz), what kind of margin you want for filters etc. to work with, and what kind of processing power you can spend (hint: modern smartphones, PCs and even pocket calculators don't really have a hard time processing a couple hundred kHz in real time).
Also note that hotpaw2 is right: the harmonics are important, and they are multiples of the base tone's frequency.
However, in this case, it seems having a lower sampling frequency is better for improving the resolution.
No. No matter where that comes from, it's wrong. Information theory's first and foremost result is that more information cannot give you worse estimates. An oversampled signal is simply more information about the same signal.
Yes, if all you are interested in is frequencies up to 2 kHz then you only need a sampling frequency of 4 kHz. This should include an anti-aliasing filter in front of the ADC or any downconverter to prevent any higher frequency components from aliasing into a lower frequency.
If all you are interested in is specific frequencies (one or two), then you may want to look at the Goertzel algorithm, which is more efficient than an FFT for a single frequency. Also, the chirp-Z transform can be used to effectively get a zoomed FFT (resulting in a higher resolution over a smaller bandwidth without the computational complexity of an FFT with the same resolution). You may want to check out this CZT tutorial.
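In case it helps, here is a bare-bones version of the Goertzel recurrence in Python (illustrative rather than optimized; the test tone and sample rate are my own choices):

import math

def goertzel_power(samples, sample_rate, target_freq):
    """Power of one DFT bin via the Goertzel recurrence, without a full FFT."""
    n = len(samples)
    k = round(n * target_freq / sample_rate)   # nearest DFT bin
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s_prev2, s_prev = s_prev, x + coeff * s_prev - s_prev2
    # Squared magnitude of the selected bin:
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

fs = 4000
tone = [math.sin(2 * math.pi * 440.0 * i / fs) for i in range(400)]
print(goertzel_power(tone, fs, 440.0))  # large here, near zero for other bins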
I want to select an optimal window for the STFT of different audio signals. For a signal with frequency content from 10 Hz to 300 Hz, what would be an appropriate window size? Similarly, for a signal with frequency content from 2000 Hz to 20000 Hz, what would be the optimal window size?
I know that a 10 ms window gives a frequency resolution of about 100 Hz. But if the frequency content of the signal lies between 100 Hz and 20000 Hz, is 10 ms still an appropriate window size? Or should we go for some other window size because of the 20000 Hz content in the signal?
I know the classic "uncertainty principle" of the Fourier transform: you can have high resolution in time or high resolution in frequency, but not both at the same time. The window length lets you trade off between the two.
Windowed analysis is designed for quasi-stationary signals: signals which change over time but can be considered stable over some short period.
One example of a quasi-stationary signal is speech. The frequency components of this signal change over time as the position of the tongue and mouth changes, but over a short period of roughly 0.01 s they can be considered stable, because the tongue does not move that fast. That 0.01 s range is determined by our biology; we just can't move the tongue faster than that.
Another example is music. When you pluck a string, you can consider it to produce a more or less stable sound for some short period of time, usually around 0.05 seconds. Within this period the sound can be considered stable.
There can be other types of signals too; for example, a signal might be centered at 10 GHz and be quasi-stationary over 1 ms.
Windowed analysis lets you capture both the stationary properties of a signal and its change over time. Here it does not matter what sample rate the signal has, what frequency resolution you need, or whether the main harmonics are near 100 Hz or near 3000 Hz. What matters is over what period of time the signal is stationary and over what period it can be considered as changing.
So for speech a 25 ms window is good simply because speech is quasi-stationary over that range. For music you usually take longer windows, because our fingers move more slowly than our mouths. You need to study your signal to decide on an optimal window length, or provide more information about it.
You need to specify your "optimality" criteria.
For a desired frequency resolution, you need a length or window size of roughly Fs/df (or from a fraction of that length to twice it or more, depending on S/N and window). However, the length also needs to be similar to or shorter than the length of time during which your signal is stationary within your desired frequency resolution bounds. This may not be possible or known, thus requiring you to specify which criterion (df vs. dt) is more important for your desired "optimality".
If multiple window lengths meet your criteria, then the shortest length that is a product of very small primes is likely to be the most computationally efficient for the subsequent FFTs within an STFT computational sequence.
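That Fs/df rule of thumb in a tiny Python sketch (the sample rate and resolutions are illustrative):

def window_samples(fs, df):
    """Roughly how many samples you need for frequency resolution df at rate fs."""
    return int(round(fs / df))

fs = 44100
for df in (10.0, 100.0):
    n = window_samples(fs, df)
    print(f"df = {df} Hz -> {n} samples ~ {1000 * n / fs:.0f} ms")

# df = 10.0 Hz -> 4410 samples ~ 100 ms
# df = 100.0 Hz -> 441 samples ~ 10 ms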
Based on the sampling theorem, the sampling frequency needs to be larger than twice the highest frequency of the signal. And from the DFT (discrete Fourier transform), we also know that the frequency resolution is the inverse of the entire signal duration, and the entire frequency span is the inverse of the time resolution. Note that frequency is simply the inverse of period, so the relationships go inversely with each other.
Frequency resolution = 1 / (overall time duration)
Frequency span = 1 / (time resolution)
Having said that, to process a 20 kHz audio signal we need to sample at 40 kHz. And if we want to get the frequency resolution down, say to 10 Hz, we need to capture a signal duration of at least 0.1 s, which is 1/(10 Hz).
This is the reason we normally see audio files described as 44.1k: the human hearing range is limited to 20 kHz, and to add some margin we use a 44.1 kHz sampling frequency instead of 40 kHz.
I think the uncertainty principle comes down to the fact that a signal more localized in one domain spreads out in the other. For example, a pulse in the time domain has a spectrum stretching from negative infinity to positive infinity, i.e., the entire stretch of the spectrum. And vice versa: a single-frequency signal in the spectrum stretches from negative infinity to positive infinity in the time domain. This is simply because we would have to observe forever to know whether a signal is a pure sinusoid or not.
But for the DFT, we can always get the frequency span we need by sampling at twice the highest frequency of the signal, and the resolution we want by capturing a long enough signal duration. So things are not as uncertain as the uncertainty principle suggests, as long as we know how many samples to take, and how fast and how long to take them.
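To make the arithmetic above concrete, a small numpy check (the 1005 Hz test tone is my own choice, picked to fall between two 10 Hz bins):

import numpy as np

fs = 40_000                     # twice the 20 kHz maximum frequency
duration = 0.1                  # seconds, giving 1/0.1 s = 10 Hz resolution
n = int(fs * duration)

t = np.arange(n) / fs
x = np.sin(2 * np.pi * 1005.0 * t)          # tone between two 10 Hz bins

freqs = np.fft.rfftfreq(n, 1.0 / fs)
spectrum = np.abs(np.fft.rfft(x))
print(freqs[1] - freqs[0])                  # 10.0 Hz bin spacing
print(freqs[np.argmax(spectrum)])           # peak snaps to 1000 or 1010 Hz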
I am trying to post-process pulse train data. It is a 0 to 5 V square wave, where the frequency of pulses corresponds to a physical measurement. During measuring, I may see anywhere from 100 to 10,000 pulses per second. The duty cycle changes.
I wrote a pulse counter function to analyze the pulse data in the time domain, but the result was very noisy. I suspect that an FFT may be appropriate, though I have never really done anything like this before.
Has anybody done anything similar? What would be the broad methodology for analyzing the pulse train in the frequency domain? Would it be best to take an FFT at specific time intervals (for instance, every second's worth of data)?
An FFT might be useful if your pulse train were stationary over some interval (the length of an FFT) and embedded in noise. Otherwise, why not just use the reciprocal of the time between rising edges?
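To illustrate the time-domain suggestion, a rough Python sketch (it assumes a clean 0 to 5 V square wave; noisy edges would need hysteresis or debouncing first):

import numpy as np

def pulse_rate(signal, fs, threshold=2.5):
    """Estimate pulse frequency as the reciprocal of rising-edge spacing."""
    high = signal > threshold
    rising = np.flatnonzero(~high[:-1] & high[1:]) + 1   # rising-edge indices
    if len(rising) < 2:
        return None
    periods = np.diff(rising) / fs        # seconds between consecutive edges
    return 1.0 / np.median(periods)       # median is robust to jitter/outliers

# Synthetic test: a 1 kHz square wave sampled at 100 kHz for 0.1 s.
fs = 100_000
t = np.arange(fs // 10) / fs
x = 5.0 * (np.sin(2 * np.pi * 1000.0 * t) > 0)
print(pulse_rate(x, fs))  # ~1000.0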
Lately I have been experimenting with audio and FFTs, specifically the Minim library in Processing (basically Java, not that it's particularly important for this question). What I have come to understand is that with a buffer/sample size N and sample rate K, after performing a forward FFT I will get N frequency bins (only N/2 of usable data, and in fact Minim only returns N/2 bins), linearly spaced, representing the spectrum from 0 to K/2 Hz.
With Minim (as well as other typical FFT implementations) you wait to gather N samples, and then perform the forward transformation, then wait for N more samples, and so on. In order to get a reasonable frame-rate (for audio visualizations, beat detection, etc.), I must use a small sample size relative to the sampling frequency.
The problem with this, though, is that a small sample size results in a very low resolution for the low end of the spectrum when I compute logarithmically spaced averages (Since a bass octave is much narrower than a high pitched octave).
I was wondering if a possible way to squeeze out more apparent resolution would be to perform FFTs more often than every N samples, on a slightly larger sample size than I am currently using (i.e., with an input buffer of size 2048: every 100 samples, add those samples to the input buffer, remove the oldest 100 samples, and perform an FFT). It seems like this would possibly create a rolling-average type of effect (which I can live with), but I'm not too sure.
What would be the pros and cons of this approach? Are there any other ways I could increase my apparent resolution while still being able to do real-time visualization and analysis?
That approach goes by the name short-time Fourier transform. You can find all the answers to your question on Wikipedia: https://en.wikipedia.org/wiki/Short-time_Fourier_transform
It works great in practice, and you can even get better resolution out of it than you would expect from a rolling window by using the phase difference between the FFTs.
Here is one article that does pitch shifting of audio signals. How to get higher frequency resolution is well explained: http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/
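For concreteness, a bare-bones Python sketch of the sliding-window scheme from the question (the 2048/100 numbers are the ones proposed there; no phase tricks included):

import numpy as np

def overlapped_spectra(x, n_fft=2048, hop=100):
    """Sliding-window FFT: one magnitude spectrum every `hop` samples,
    each computed over the most recent `n_fft` samples."""
    window = np.hanning(n_fft)
    frames = []
    for start in range(0, len(x) - n_fft + 1, hop):
        frame = x[start:start + n_fft] * window
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)       # shape: (n_frames, n_fft // 2 + 1)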
We use the approach you describe, which we call overlapping, to make sure all the rows of a spectral waterfall are filled in. Overlap can be used to provide spectra that are spaced as closely as a single sample interval.
The primary disadvantage is the extra processing to produce all those spectra.
On the positive side, while the time resolution of each spectrum is still constrained by the FFT size, looking at closely spaced adjacent spectra seems to provide a kind of visual interpolation that, I think, lets you see the data with higher precision.
One common way this is done is to use multiple lengths of windowed FFTs on the same data, short FFTs for good time resolution, much longer FFTs for better frequency resolution of lower frequencies. Then the problem for visualization becomes picking the best FFT result out of several possible at each plot point (such as the highest contrast sub-block, etc.) and blending them attractively.
Most modern processors (in PCs and mobile phones, etc.) can easily do multiple lengths (dozens) of FFTs still in real-time for audio.
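A minimal sketch of the multi-length idea in Python (the lengths are placeholders, and real visualizers blend the results far more cleverly):

import numpy as np

def multi_resolution_spectra(block, lengths=(256, 1024, 4096)):
    """Analyze the same audio block at several FFT lengths: short windows
    for time detail, long windows for low-frequency detail.
    `block` must be at least max(lengths) samples long."""
    results = {}
    for n in lengths:
        frame = block[:n] * np.hanning(n)
        results[n] = np.abs(np.fft.rfft(frame))
    return results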
I've been playing around with Estimote Beacons for the last few days. I'm starting to doubt the effectiveness of iBeacons because of the high latency they have when it comes to determining a beacon's position.
When you move 2-3 meters it takes a few seconds until it gets the position right.
A use-case scenario like capturing a person walking past a beacon can be quite hard to realize.
Is it possible to manipulate the update/refresh rate of a CLLocationManager or a CLBeaconRegion, e.g., to every 0.1 seconds?
The reason that you are seeing it take so long for the iOS distance measurement (what they call "accuracy" in the CLBeacon object) to stabilize is because it is based on a running average of the RSSI -- the received signal strength. This signal strength measurement is inherently noisy and it bounces all around. That is why collecting multiple samples is necessary to smooth it out.
But because of this averaging, there is a lag. The most recent estimate is based on measurements from several seconds ago.
You cannot change the refresh rate of the CLLocationManager or the CLBeaconRegion, but you may be able to get an iBeacon that transmits more often than the 1 s baseline. More transmissions give you more RSSI measurements to work with, and may help smooth out the noise. Because I am not sure of the internal implementation of CoreLocation, I am not positive whether a higher iBeacon transmission rate would reduce the noise on the distance measurement.
You can always calculate your own distance measurement, too, based on RSSI and the Power calibration value sent out by an iBeacon. If you use a single RSSI sample, then there will be no lag from averaging with earlier measurements, but you will have a high degree of variability. You basically have to accept a tradeoff between filtering out noise and filtering out old measurements based on different positions.
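One common middle ground, sketched below in Python, is an exponentially weighted moving average of the RSSI, which weights recent samples more heavily than a plain running average (the alpha value is a tuning knob of my own choosing, not anything documented by Apple):

def smooth_rssi(prev_estimate, new_rssi, alpha=0.25):
    """Exponentially weighted moving average of RSSI.
    Larger alpha reacts faster to movement but lets more noise through;
    smaller alpha is smoother but lags, like a long running average."""
    if prev_estimate is None:
        return float(new_rssi)
    return alpha * new_rssi + (1.0 - alpha) * prev_estimate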
If you want to try your own calculation, you can use something like the formula below (see my answer to this question for details).
distanceInMeters = 0.89976 * (rssi/txPower)**7.7095 + 0.111
You have to set realistic expectations for how accurate this estimate is going to be. Apple generally recommends that you don't use their "accuracy" measurement inside CLBeacon unless it is in combination with other rougher measurements like "proximity", which bucketize the distance into "immediate", "near" and "far" groupings.