I'm implementing Monte Carlo localization for my robot. The robot is given a map of the environment and its starting location and orientation. My approach is as follows:
Uniformly create 500 particles around the given position
Then at each step:
motion update all the particles with odometry (my current approach is newX = oldX + odometryX * (1 + standardGaussianRandom), etc.)
assign a weight to each particle using the sonar data (for each sensor: probability *= gaussianPDF(realReading), where the Gaussian has mean predictedReading)
return the particle with the highest weight as the location estimate at this step
then resample: 9/10 of the new particles are drawn from the old ones according to their weights, and 1/10 are sampled uniformly around the predicted position
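In code, one step of this loop looks roughly like the following sketch (predicted_readings is a placeholder for however the expected sonar values are computed from the map, and sigma is an assumed noise level):

import numpy as np

def mcl_step(particles, weights, odom, sonar, rng):
    # Motion update: move each particle by the odometry scaled by
    # (1 + Gaussian noise), as described above.
    particles = particles + odom * (1 + rng.standard_normal(particles.shape))
    # Measurement update: weight = product over sensors of a Gaussian PDF
    # with mean predictedReading, evaluated at the real reading.
    predicted = predicted_readings(particles)  # placeholder map-based prediction
    sigma = 0.2                                # assumed sonar noise stddev
    weights = np.exp(-0.5 * ((sonar - predicted) / sigma) ** 2).prod(axis=1)
    return particles, weights / weights.sum()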
Now, I wrote a simulator for the robot's environment, and here is how the localization behaves: http://www.youtube.com/watch?v=q7q3cqktwZI
I'm worried that over a longer period of time the robot may get lost. If I spread the particles over a wider area, the robot gets lost even more easily.
I expect better performance. Any advice?
The biggest mistake is that you take the particle with the highest weight as your posterior state. This contradicts the main idea of the particle filter.
The set of particles you updated with the odometry readings is your proposal distribution. By taking only the particle with the highest weight into account, you completely ignore this distribution. It would be the same if you just randomly spread particles over the whole state space and then took the one particle that explains the sonar data best. You would be relying on the sonar readings alone, and as sonar data is very noisy, your estimate would be very bad. A better approach is to assign a weight to each particle, normalize the weights, multiply each particle's state by its weight, and sum them up to obtain your posterior state.
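A minimal sketch of that estimate, assuming particles is an (N, 3) array of (x, y, theta) and the weights are normalized:

import numpy as np

def posterior_estimate(particles, weights):
    # Weighted mean over the particle set. The heading angle must be
    # averaged via unit vectors to survive the wrap-around at +/- pi.
    x = np.average(particles[:, 0], weights=weights)
    y = np.average(particles[:, 1], weights=weights)
    theta = np.arctan2(np.average(np.sin(particles[:, 2]), weights=weights),
                       np.average(np.cos(particles[:, 2]), weights=weights))
    return np.array([x, y, theta])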
For your resampling step I would recommend removing the random samples around the predicted state, as they corrupt your proposal distribution. It is legitimate to generate random samples in order to recover from failures, but those should be spread over the whole state space, and explicitly not around your current prediction.
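For completeness, a sketch of low-variance (systematic) resampling with the recovery samples drawn uniformly over the whole map; the map bounds and the recovery fraction are assumptions:

import numpy as np

def resample(particles, weights, rng, recovery_frac=0.02,
             bounds=((0.0, 10.0), (0.0, 10.0))):   # assumed map extent
    n = len(particles)
    # Systematic resampling: one random offset, n evenly spaced pointers.
    pointers = (rng.random() + np.arange(n)) / n
    idx = np.minimum(np.searchsorted(np.cumsum(weights), pointers), n - 1)
    new = particles[idx]
    # A few uniform samples over the WHOLE state space for failure recovery.
    k = int(recovery_frac * n)
    new[:k, 0] = rng.uniform(*bounds[0], size=k)
    new[:k, 1] = rng.uniform(*bounds[1], size=k)
    new[:k, 2] = rng.uniform(-np.pi, np.pi, size=k)
    return new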
I have an electromagnetic sensor and electromagnetic field emitter.
The sensor will read power from the emitter. I want to predict the position of the sensor using the reading.
Let me simplify the problem: suppose the sensor and the emitter are in a one-dimensional world, where there is only a position X (not X, Y, Z), and the emitter emits power as a function of distance squared.
From the painted image below, you will see that the emitter is drawn as a circle and the sensor is drawn as a cross.
E.g. if the sensor is 5 meters away from the emitter, the reading you get on the sensor will be 5^2 = 25. So the correct position will be either 0 or 10, because the emitter is at position 5.
So, with one emitter, I cannot know the exact position of the sensor. I only know that there is a 50% chance it's at 0 and a 50% chance it's at 10.
So if I have two emitters like the following image:
I will get two readings, and I can know exactly where the sensor is. If the readings are 25 and 16, I know the sensor is at 10.
So from this fact, I want to use 2 emitters to locate the sensor.
Now that I've explained the situation, my problems are as follows:
1. The emitter's reading is a more complicated function of the distance, not just distance squared, and it also has noise, so I'm trying to model it using machine learning.
2. In some areas the emitter doesn't work well. E.g. if you are between 3 and 4 meters away, the emitter will always give you a fixed reading of 9 instead of going from 9 to 16.
3. When I train the machine learning model with the 2 readings as inputs, the prediction is very accurate. E.g. if the input is 25,36 then the output will be position 0. But this means that after training I cannot move the emitters at all. If I move one of the emitters further apart, the prediction breaks immediately, because the reading becomes something like 25,49 when the right emitter moves 1 meter to the right, and the prediction can then be anything, because the model has never seen this input pair before. And I cannot afford to train the model on every possible distance between the 2 emitters.
4. The emitters may be slightly non-identical. The difference will be in scale, e.g. one of the emitters may give 10% bigger readings. But you can ignore this problem for now.
My question is: how do I make the model work when the emitters are allowed to move? Give me some ideas.
Some of my ideas:
- I think I have to figure out the positions of the two emitters relative to each other dynamically. But after knowing the positions of both emitters, how do I tell that to the model?
- I have tried training each emitter separately instead of pairing them as inputs. But then there are many positions that cause conflicts: when you get reading=25, the model will predict the average of 0 and 10, because both are valid positions for reading=25. You might suggest training to predict distance instead of position; that would be possible if it weren't for problem number 2. Because of problem number 2, the prediction between 3 and 4 meters away will be wrong: the model will get 9 as input, and the output will be the average distance of 3.5 meters, or somewhere between 3 and 4 meters.
- Use the model to predict a probability density function over position instead of the position itself. E.g. when the reading is 9, the model should predict a uniform density from 3 to 4 meters, and then you would combine the 2 density functions from the 2 readings somehow. But I think this is not going to be as accurate as modeling the 2 emitters together, because the density function can be quite complicated; we cannot assume a normal or even a uniform distribution.
- Use some kind of optimizer to predict the position separately for each emitter, under the constraint that both predictions must coincide. If the predictions differ, the optimizer has to move them until they are at exactly the same point. Maybe reinforcement learning, where the actions are "move left", "move right", etc.
I've told you my ideas in the hope that they evoke some ideas in you. This is my best so far, but it doesn't solve the issue elegantly yet.
So ideally, I would want an end-to-end model that is fed the 2 readings and gives me the position, even when the emitters have been moved. How would I go about that?
PS. The emitters are only allowed to move before usage. During usage/prediction, the model can assume the emitters will not be moved anymore. This gives you time to run an emitter-position calibration algorithm before usage. Maybe this is helpful for you to know.
You're confusing memoizing a function with training a model; the former merely recalls previous results, while the latter is the province of AI. To train with two emitters, you need to provide useful input data with appropriate labels (right answers), and design your model topology such that it can be trained to a useful functional response for cases it has never seen.
Let the first emitter be at position 0 by definition. Your data then consists of the position of the second emitter and the two readings. The label is the sensor's position. Your given examples would look like this:
emit2  read1  read2  sensor
1      25     36     0
1      25     16     5
2      25     49     0
1.5    25     9      5     (any distance 3 < d < 4 always reads as 3^2 = 9)
Since you know that you have a squared relationship in the underlying physics, you need to include quadratic capability in your model. To handle noise, you'll want some damping capability, such as an extra node or two in a hidden layer after the first. For more complex relationships, you'll need other topologies, non-linear activation functions, etc.
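As a minimal sketch of what "quadratic capability" means, here is the same idea with hand-built quadratic features and a linear least-squares fit standing in for a hidden layer (the four rows are the table above; with so little data this is underdetermined and only illustrates the feature construction):

import numpy as np

# Columns: emit2, read1, read2; labels: sensor position (table above).
X = np.array([[1.0, 25, 36], [1.0, 25, 16], [2.0, 25, 49], [1.5, 25, 9]])
y = np.array([0.0, 5.0, 0.0, 5.0])

def quad_features(X):
    # Bias, raw inputs, squares, and pairwise products, so a linear fit
    # can express quadratic relationships in the underlying physics.
    cols = [np.ones(len(X))]
    for i in range(3):
        cols.append(X[:, i])
        cols.append(X[:, i] ** 2)
    for i in range(3):
        for j in range(i + 1, 3):
            cols.append(X[:, i] * X[:, j])
    return np.column_stack(cols)

w, *_ = np.linalg.lstsq(quad_features(X), y, rcond=None)
print(quad_features(X) @ w)  # fits the four training rows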
Can you take it from there?
I'm currently working on a program in C++ in which I compute the time-varying FFT of a WAV file. I have a question about plotting the results of an FFT.
Say, for example, I have a 70 Hz signal produced by some instrument with certain harmonics. Even though I say this signal is 70 Hz, it's a real signal, and I assume there will be some randomness with which that 70 Hz signal varies. Say I sample it for 1 second at a sample rate of 20 kHz. I realize the sample period probably doesn't need to be 1 second, but bear with me.
Because I now have 20000 samples, when I compute the FFT I will have 20000 (or 19999) frequency bins. Let's also assume that my sample rate, in conjunction with some windowing technique, minimizes spectral leakage.
My question then: will the FFT still produce a relatively ideal impulse at 70 Hz? Or will there 'appear to be' spectral leakage caused by the randomness of the original signal? In other words, what does the FFT of a sinusoid look like whose frequency is a random variable?
Some of the more common modulation schemes will add sidebands that carry the information in the modulation. Depending on the amount and type of modulation with respect to the length of the FFT, the sidebands can either appear separate from the FFT peak, or just "fatten" a single peak.
Your spectrum will appear broadened, and this happens in the real world. Look e.g. at the Voigt profile, which is a Lorentzian (the result of an ideal exponential decay) convolved with a Gaussian of a certain width, the width being determined by stochastic fluctuations, e.g. the Doppler effect on molecules in a gas being probed by a narrow-band laser.
You will not get an 'ideal' frequency peak either way. The limit for the resolution of the FFT is one frequency bin (the frequency resolution being the inverse of the length of the time vector), but even that (as @xvan pointed out) is in general broadened by the window function. If your window is nonexistent, i.e. it is in fact a square window of the length of the time vector, then you'll get spectral peaks that are convolved with a sinc function and thus broadened.
The best way to visualize this is to make a long vector and plot a spectrogram (as often shown for audio signals) with enough resolution that you can see the individual variation. The FFT of the overall signal is then the projection of the moving peaks onto the vertical (frequency) axis of the spectrogram. The FFT of a given time vector has no time resolution; it sums up all frequencies that occur during the interval you transform. So the spectrogram (people often simply use the STFT, short-time Fourier transform) has at any given time the 'full' resolution, i.e. the narrow line shape that you expect, while the FFT of the full time vector shows the sum of all your line shapes and therefore appears broadened.
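You can see both pictures with a quick synthetic experiment, reusing the 70 Hz / 20 kHz numbers from the question (a sketch; the drift magnitude is arbitrary):

import numpy as np
import matplotlib.pyplot as plt

fs = 20000
n = fs                                          # 1 second of samples
drift = np.cumsum(np.random.randn(n)) * 0.01    # slowly wandering frequency
phase = 2 * np.pi * np.cumsum(70 + drift) / fs  # integrate f(t) to get phase
x = np.sin(phase)

fig, (ax1, ax2) = plt.subplots(2)
ax1.specgram(x, NFFT=2048, Fs=fs)               # narrow peak moving in time
freqs = np.fft.rfftfreq(n, 1 / fs)
ax2.plot(freqs, np.abs(np.fft.rfft(x)))         # whole-record FFT: broadened
ax2.set_xlim(0, 200)
plt.show()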
To sum up, there are two separate effects:
a) broadening from the window function (as the commenters 1 and 2 pointed out)
b) broadening from the effect of frequency fluctuation that you are trying to simulate and that happens in real life (e.g. you sitting on a swing while receiving a radio signal).
Finally, note the significance of @xvan's comment: phi = phi(t). If the phase angle is time-dependent, then it has a nonzero derivative, and dphi/dt is a frequency shift, so your instantaneous frequency becomes f0 + dphi/dt.
Is it possible to decorrelate accelerometer data in real-time? If so, how is it done?
Background:
My application receives (X, Y, Z) accelerometer data in real time (the sample rate is 6.75 Hz). The sensor is moving in a periodic motion, but the motion is not necessarily along only one axis. The three signals x(t), y(t) and z(t) are therefore slightly correlated, and I would like to know whether I can find a rotation matrix (in real time) that rotates the measured (x, y, z) into a new vector (x*, y*, z*) such that the entire motion is along the z-axis.
I would like to implement the algorithm in C.
Thanks.
What you're trying to do is generally called "principal component analysis". The Wikipedia article is pretty good:
https://en.wikipedia.org/wiki/Principal_component_analysis
For static data you generally use the eigenvectors of the covariance matrix as your new coordinate basis.
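For example, a minimal sketch of the static case (in Python for brevity; it translates directly to C with a small linear algebra routine, and samples is an (N, 3) array of readings):

import numpy as np

def pca_rotate(samples):
    # Eigen-decomposition of the covariance matrix; the eigenvectors form
    # the rotation into the principal axes.
    centered = samples - samples.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    # eigh sorts eigenvalues ascending, so the dominant motion ends up in
    # the last output column, i.e. on the new z-axis.
    return centered @ eigvecs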
PCA in real time is doable, but not super easy. See, for example: http://www.bio-conferences.org/articles/bioconf/pdf/2011/01/bioconf_skills_00055.pdf
I'd first of all like to emphasize that Matt Timmermans' answer describes exactly what people actually do when classifying accelerometer data from clinical studies (a project I worked on).
Then: you're observing a sampled signal. In general, if you have a sensor that gives you samples at a rate of 6.75 Hz, the highest signal frequency you can detect is 6.75 Hz / 2 = 3.375 Hz. Everything with a higher frequency will inherently be aliased back and look like something with a frequency f in 0 <= f < 3.375 Hz. If you've not considered this, please go and read up on the Nyquist-Shannon sampling theorem. In particular: shield your sensors (however you do that, e.g. by employing dampeners) from all input above that limit, otherwise your measurements might be worth very little or even nothing. If your sensor does this internally (that's absolutely possible; there are enough accelerometers with analog low-pass filters), this has been taken care of. Either way, document the characteristics of your sensor.
Now, your case is a little easier because you know pretty well that your whole observation is going to be periodic, and it's measured along three orthogonal axes.
In this case, just do three discrete Fourier transforms at once, extract the "strongest" spectral component over all three channels, and find the phase of that spectral component (which is simply the complex argument of that DFT bin) in the other two channels; that gives you something you can map to a periodic movement around a specific axis in 3D space. If you want, remove those values (set the bins to 0) and search for the strongest component again, and so on.
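A sketch of that step, assuming xyz is an (N, 3) array of samples at 6.75 Hz:

import numpy as np

def dominant_component(xyz, fs=6.75):
    spec = np.fft.rfft(xyz, axis=0)           # one DFT per channel
    power = (np.abs(spec) ** 2).sum(axis=1)   # combined power over x, y, z
    k = power[1:].argmax() + 1                # strongest bin, skipping DC
    freq = np.fft.rfftfreq(len(xyz), 1 / fs)[k]
    amplitudes = np.abs(spec[k])              # per-axis amplitude
    phases = np.angle(spec[k])                # per-axis phase (complex argument)
    return freq, amplitudes, phases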
Discrete Fourier transforms can be computed at staggering speed nowadays. At 6.75 Hz, no PC in this world will ever get into trouble doing this while receiving further samples; it's a hilariously low sampling rate.
Another, more elegant approach (read: you need fewer samples to compute this) would be to use a parametric estimator; in your case, direction-of-arrival estimation from the world of RF technology with multiple antennas maps, as far as I can tell, directly onto detecting the rotational axis. The classical algorithms here are MUSIC and ESPRIT, and for your case (a limited, known number of oscillating parts), ESPRIT might be the better choice.
I have some sampled geographical trajectories to analyze. I calculated the histogram of the data in the spatial and temporal dimensions, which yielded a time-domain feature for each spatial element. I want to perform a discrete FFT to transform the time-domain features into frequency-domain features (which I think may be more robust), and then run some classification or clustering algorithms.
But I'm not sure which descriptor to use as the frequency-domain feature: a signal has an amplitude spectrum, a power spectrum and a phase spectrum, and I've read some references but am still confused about their significance. Also, what distance (similarity) function should be used when running learning algorithms on frequency-domain feature vectors (Euclidean distance? Cosine distance? Gaussian kernel? Chi-square kernel or something else?)
I hope someone can give me a clue or some material I can refer to. Thanks!
Edit
Thanks to @DrKoch, I chose the spatial element with the largest L1 norm and plotted its log power spectrum in Python, and it did show some prominent peaks. Below are my code and the figure:
import numpy as np
import matplotlib.pyplot as plt
sp = np.fft.fft(signal)
freq = np.fft.fftfreq(signal.shape[-1], d = 1.) # time slot of histogram is 1 hour
plt.plot(freq, np.log10(np.abs(sp) ** 2))
plt.show()
And I have several trivial questions to ask to make sure I totally understand your suggestion:
In your second suggestion, you said "ignore all these values."
Do you mean that the horizontal line represents the threshold and all values below it should be set to zero?
"you may search for the two, three largest peaks and use their location and probably widths as 'Features' for further classification."
I'm a little confused about the meaning of "location" and "width": does "location" refer to the log value of the power spectrum (y-axis) and "width" to the frequency (x-axis)? If so, how do I combine them into a feature vector, and how do I compare two feature vectors with "a similar frequency and similar widths"?
Edit
I replaced np.fft.fft with np.fft.rfft to calculate just the positive half and plotted both the power spectrum and the log power spectrum.
code:
sp = np.fft.rfft(signal)
freq = np.fft.rfftfreq(signal.shape[-1], d = 1.)  # frequency axis matching rfft
f, axarr = plt.subplots(2, sharex = True)         # subplots(), not subplot()
axarr[0].plot(freq, np.abs(sp) ** 2)
axarr[1].plot(freq, np.log10(np.abs(sp) ** 2))
plt.show()
figure:
Please correct me if I'm wrong:
I think I should keep the last four peaks in the first figure, using power = np.abs(sp) ** 2 and power[power < threshold] = 0, because the log power spectrum reduces the differences among the components. Then I would use the log spectrum of the new power as the feature vector to feed the classifiers.
I also saw some references suggesting applying a window function (e.g. a Hamming window) before doing the FFT to avoid spectral leakage. My raw data is sampled every 5~15 seconds and I've applied a histogram to the sampling times; is that method equivalent to applying a window function, or do I still need to apply one to the histogram data?
Generally you should extract just a small number of "Features" out of the complete FFT spectrum.
First: use the log power spectrum.
Complex numbers and phase are useless in these circumstances, because they depend on where you start/stop your data acquisition (among many other things).
Second: you will see a "noise level", i.e. most values are below a certain threshold; ignore all these values.
Third: if you are lucky, i.e. your data has some harmonic content (cycles, repetitions), you will see a few prominent peaks.
If there are clear peaks, it is even easier to detect the noise: Everything between the peaks should be considered noise.
Now you may search for the two or three largest peaks and use their locations and possibly their widths as "features" for further classification.
Location is the x-value of the peak, i.e. the "frequency". It says something about how "fast" your cycles are in the input data.
If your cycles don't have a constant frequency during the measuring interval (or you use a window before calculating the FFT), the peak will be broader than one bin. So the width of the peak says something about the "stability" of your cycles.
Based on this: two patterns are similar if the biggest peaks of both have a similar frequency and a similar width, and so on.
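A sketch of this recipe with scipy, where signal is the series from the question's code and the noise threshold is an ad-hoc choice:

import numpy as np
from scipy.signal import find_peaks

power = np.abs(np.fft.rfft(signal)) ** 2
log_power = np.log10(power + 1e-12)                   # avoid log(0)
noise_level = np.median(log_power)                    # crude noise estimate
peaks, props = find_peaks(log_power, height=noise_level + 1.0, width=1)
order = np.argsort(props["peak_heights"])[::-1][:3]   # three largest peaks
features = np.column_stack([peaks[order],             # locations (bin index)
                            props["widths"][order]])  # widths (in bins)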
EDIT
Very interesting to see the logarithmic power spectrum of one of your examples.
Now it's clear that your input contains a single harmonic (periodic, oscillating) component with a frequency (repetition rate, cycle duration) of about f0 = 0.04.
(This is a relative frequency, proportional to your sampling frequency, which is the inverse of the time between individual measurement points.)
It is not a pure sine wave, but some "interesting" waveform. Such waveforms produce peaks at 1*f0, 2*f0, 3*f0 and so on.
(So using an FFT for further analysis turns out to be a very good idea.)
At this point you should produce spectra of several measurements and see what makes measurements similar and how different measurements differ. What are the "important" features that distinguish your measurements? Things to look out for:
Absolute amplitude: height of the prominent (leftmost, highest) peaks.
Pitch (main cycle rate, speed of changes): the position of the first peak and the distance between consecutive peaks.
Exact waveform: relative amplitudes of the first few peaks.
If your most important feature is absolute amplitude, you're better off calculating the RMS (root mean square) level of your input signal.
If pitch is important, you're better off calculating the ACF (autocorrelation function) of your input signal.
Don't focus on the leftmost peaks there: they come from the high-frequency components in your input and tend to vary as much as the noise floor.
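Both are short in numpy (a sketch; picking the first significant ACF peak after lag 0 is the part that needs care, as noted above):

import numpy as np

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def acf(x):
    x = x - x.mean()
    c = np.correlate(x, x, mode="full")[x.size - 1:]
    return c / c[0]   # the lag of the first strong peak gives the cycle length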
Windows
For a high-quality analysis it is important to apply a window to the input data before applying the FFT. This reduces the influence of the "jump" between the end of your input vector and its beginning, because the FFT treats the input as a single cycle.
There are several popular windows which represent different choices in an unavoidable trade-off: precision of a single peak vs. the level of the sidelobes:
You chose a "rectangular window" (equivalent to no window at all, just starting/stopping your measurement). This gives excellent precision: your peaks have a width of just one sample. Your sidelobes (the small peaks left and right of your main peaks) are at -21 dB, very tolerable given your input data. In your case this is an excellent choice.
A Hanning window is a single cosine wave. It makes your peaks slightly broader but reduces side-lobe levels.
The Hamming window (a cosine wave slightly raised above 0.0) produces even broader peaks, but suppresses sidelobes by -42 dB. This is a good choice if you expect further weak (but important) components between your main peaks, or generally if you have complicated signals like speech, music and so on.
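Applying one to the code from the question is a one-line change (a sketch, using np.hanning; np.hamming works the same way):

import numpy as np

window = np.hanning(signal.size)
sp = np.fft.rfft(signal * window)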
Edit: Scaling
Correct scaling of a spectrum is a complicated thing, because the values of the FFT bins depend on many things, such as the sampling rate, the length of the FFT, the window, and even implementation details of the FFT algorithm (several different accepted conventions exist).
After all, the FFT should respect conservation of energy: by Parseval's theorem, the RMS (energy) of the input signal should equal the RMS (energy) of the spectrum.
On the other hand, if the spectrum is used for classification, it is enough to maintain relative amplitudes. As long as the parameters mentioned above do not change, the result can be used for classification without further scaling.
How do I get the frequency using the FFT? What's the right procedure and code?
Pitch detection typically involves measuring the interval between harmonics in the power spectrum. The power spectrum is obtained from the FFT by taking the magnitude of the first N/2 bins (sqrt(re^2 + im^2)). However, there are more sophisticated techniques for pitch detection, such as cepstral analysis, where we take the FFT of the log of the power spectrum in order to identify periodicity in the spectral peaks.
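In numpy terms, a sketch of both spectra (x is one frame of samples):

import numpy as np

def power_and_cepstrum(x):
    mag = np.abs(np.fft.rfft(x))                # sqrt(re^2 + im^2) per bin
    # Cepstrum: transform of the log spectrum. A regular harmonic spacing
    # shows up as a peak whose position corresponds to the pitch period.
    cepstrum = np.fft.irfft(np.log(mag + 1e-12))
    return mag, cepstrum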
A sustained note of a musical instrument is a periodic signal, and our friend Fourier (the second "F" in "FFT") tells us that any periodic signal can be constructed by adding a set of sine waves (generally with different amplitudes, frequencies, and phases). The fundamental is the lowest frequency component and it corresponds to pitch; the remaining components are overtones and are multiples of the fundamental's frequency. It is the relative mixture of fundamental and overtones that determines timbre, or the character of an instrument. A clarinet and a trumpet playing in unison sound "in tune" because they share the same fundamental frequency, however, they are individually identifiable because of their differing timbre (overtone mixture).
For your problem, you could sample the trumpet over a time window, calculate the FFT (which decomposes the sequence of samples into its constituent digital frequencies), and then assert that the pitch is the frequency of the bin with the greatest magnitude. If you desire, this can then be trivially quantized to the nearest musical half step, like E flat. (Look up FFT on Wikipedia if you don't understand the relationship between the sampling frequency and the resultant frequency bins, or the detriment of having too low a sampling frequency.) This will probably meet your needs, because the fundamental component usually has greater energy than any other component. The longer the window, the greater the pitch accuracy, because the bin centers become more closely spaced in frequency. However, if the window is so long that the trumpet's pitch changes appreciably over its duration, the technique's effectiveness will break down considerably.
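A sketch of exactly that procedure (the quantization assumes equal temperament with A4 = 440 Hz):

import numpy as np

def detect_pitch(x, fs):
    mag = np.abs(np.fft.rfft(x * np.hanning(x.size)))
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    f = freqs[mag[1:].argmax() + 1]             # strongest non-DC bin
    # Quantize to the nearest equal-tempered half step.
    midi = int(np.round(69 + 12 * np.log2(f / 440.0)))
    return f, 440.0 * 2.0 ** ((midi - 69) / 12.0)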
DansTuner is my open source project to solve this problem. I am in fact a trumpet player. It has pitch detection code lifted from Audacity.
I added the org.apache.commons.math.transform.FastFourierTransformer class to the project and it works perfectly.
Here is a short blog article on non-parametric techniques for estimating the PSD (power spectral density), along with some more detailed links. This might get you started on estimating the PSD and then finding the pitch.