How do I combine two electromagnetic readings to predict the position of a sensor? - machine-learning

I have an electromagnetic sensor and electromagnetic field emitter.
The sensor will read power from the emitter. I want to predict the position of the sensor using the reading.
Let me simplify the problem, suppose the sensor and the emitter are in 1 dimension world where there are only position X (not X,Y,Z) and the emitter emits power as a function of distance squared.
From the painted image below, you will see that the emitter is drawn as a circle and the sensor is drawn as a cross.
E.g. if the sensor is 5 meter away from the emitter, the reading you get on the sensor will be 5^2 = 25. So the correct position will be either 0 or 10, because the emitter is at position 5.
So, with one emitter, I cannot know the exact position of the sensor. I only know that there are 50% chance it's at 0, and 50% chance it's at 10.
So if I have two emitters like the following image:
I will get two readings. And I can know exactly where the sensor is. If the reading is 25 and 16, I know the sensor is at 10.
So from this fact, I want to use 2 emitters to locate the sensor.
Now that I've explained you the situation, my problems are like this:
The emitter has a more complicated function of the distance. It's
not just distance squared. And it also have noise. so I'm trying to
model it using machine learning.
Some of the areas, the emitter don't work so well. E.g. if you are
between 3 to 4 meters away, the emitter will always give you a fixed
reading of 9 instead of going from 9 to 16.
When I train the machine learning model with 2 inputs, the
prediction is very accurate. E.g. if the input is 25,36 and the
output will be position 0. But it means that after training, I
cannot move the emitters at all. If I move one of the emitters to be
further apart, the prediction will be broken immediately because the
reading will be something like 25,49 when the right emitter moves to
the right 1 meter. And the prediction can be anything because the
model has not seen this input pair before. And I cannot afford to
train the model on all possible distance of the 2 emitters.
The emitters can be slightly not identical. The difference will
be on the scale. E.g. one of the emitters can be giving 10% bigger
reading. But you can ignore this problem for now.
My question is How do I make the model work when the emitters are allowed to move? Give me some ideas.
Some of my ideas:
I think that I have to figure out the position of both
emitters relative to each other dynamically. But after knowing the
position of both emitters, how do I tell that to the model?
I have tried training each emitter separately instead of pairing
them as input. But that means there are many positions that cause
conflict like when you get reading=25, the model will predict the
average of 0 and 10 because both are valid position of reading=25.
You might suggest training to predict distance instead of position,
that's possible if there is no problem number 2. But because
there is problem number 2, the prediction between 3 to 4 meters away
will be wrong. The model will get input as 9, and the output will be
the average distance 3.5 meters or somewhere between 3 to 4 meters.
Use the model to predict position
probability density function instead of predicting the position.
E.g. when the reading is 9, the model should predict a uniform
density function from 3 to 4 meters. And then you can combine the 2
density functions from the 2 readings somehow. But I think it's not
going to be that accurate compared to modeling 2 emitters together
because the density function can be quite complicated. We cannot
assume normal distribution or even uniform distribution.
Use some kind of optimizer to predict the position separately for each
emitter based on the assumption that both predictions must be the same. If
the predictions are not the same, the optimizer must try to move the
predictions so that they are exactly at the same point. Maybe reinforcement
learning where the actions are "move left", "move right", etc.
I told you my ideas so that it might evoke some ideas in you. Because this is already my best but it's not solving the issue elegantly yet.
So ideally, I would want the end-to-end model that are fed 2 readings, and give me position even when the emitters are moved. How would I go about that?
PS. The emitters are only allowed to move before usage. During usage or prediction, the model can assume that the emitter will not be moved anymore. This allows you to have time to run emitters position calibration algorithm before usage. Maybe this will be a helpful thing for you to know.

You're confusing memoizing a function with training a model; the former is merely recalling previous results; the latter is the province of AI. To train with two emitters, you need to give the useful input data and appropriate labels (right answers), and design your model topology such that it can be trained to a useful functional response fro cases it has never seen.
Let the first emitter be at position 0 by definition. Your data then consists of the position of the second emitter and the two readings. The label is the sensor's position. Your given examples would look like this:
emit2 read1 read2 sensor
1 25 36 0
1 25 16 5
2 25 49 0
1.5 25 9 5 distance of 3 < d < 4 always reads as 3^2
Since you know that you have an squared relationship in the underlying physics, you need to include quadratic capability in your model. To handle noise, you'll want some dampener capability, such as an extra node or two in a hidden layer after the first. For more complex relationships, you'll need other topologies, non-linear activation functions, etc.
Can you take it from there?

Related

What is wrong with my approach of using MLP to make a chess engine?

I’m making a chess engine using machine learning, and I’m experiencing problems debugging it. I need help figuring out what is wrong with my program, and I would appreciate any help.
I made my research and borrowed ideas from multiple successful projects. The idea is to use reinforcement learning to teach NN to differentiate between strong and weak positions.
I collected 3 million games with Elo over 2000 and used my own method to label them. After researching hundreds of games, I found out, that it’s safe to assume that in the last 10 turns of any game, the balance doesn’t change, and the winning side has a strong advantage. So I picked positions from the last 10 turns and made two labels: one for a win for white and zero for black. I didn’t include any draw positions. To avoid bias, I have picked even numbers of positions labeled with wins for both sides and even number of positions for both sides with the next turn.
Each position I represented by a vector with the length of 773 elements. Every piece on every square of a chess board, together with castling rights and a next turn, I coded with ones and zeros. My sequential model has an input layer with 773 neurons and an output layer with one single neuron. I have used a three hidden layer deep MLP with 1546, 500 and 50 hidden units for layers 1, 2, and 3 respectively with dropout regularization value of 20% on each. Hidden layers are connected with the non- linear activation function ReLU, while the final output layer has a sigmoid output. I used binary crossentropy loss function and the Adam algorithm with all default parameters, except for the learning rate, which I set to 0.0001.
I used 3 percent of the positions for validation. During the first 10 epochs, validation accuracy gradually went up from 90 to 92%, just one percent behind training accuracy. Further training led to overfitting, with training accuracy going up, and validation accuracy going down.
I tested the trained model on multiple positions by hand, and got pretty bad results. Overall the model can predict which side is winning, if that side has more pieces or pawns close to a conversion square. Also it gives the side with a next turn a small advantage (0.1). But overall it doesn’t make much sense. In most cases it heavily favors black (by ~0.3) and doesn’t properly take into account the setup. For instance, it labels the starting position as ~0.0001, as if the black side has almost 100% chance to win. Sometimes irrelevant transformation of a position results in unpredictable change of the evaluation. One king and one queen from each side usually is viewed as lost position for white (0.32), unless black king is on certain square, even though it doesn’t really change the balance on the chessboard.
What I did to debug the program:
To make sure I have not made any mistakes, I analyzed, how each position is being recorded, step by step. Then I picked a dozen of positions from the final numpy array, right before training, and converted it back to analyze them on a regular chess board.
I used various numbers of positions from the same game (1 and 6) to make sure, that using too many similar positions is not the cause for the fast overfitting. By the way, even one position for each game in my database resulted in 3 million data set, which should be sufficient according to some research papers.
To make sure that the positions I use are not too simple, I analyzed them. 1.3 million of them had 36 points in pieces (knights, bishops, rooks, and queens; pawns were not included in the count), 1.4 million - 19 points, and only 0.3 million - had less.
Some things you could try:
Add unit tests and asserts wherever possible. E.g. if you know that some value is never supposed to get negative, add an assert to check that this condition really holds.
Print shapes of all tensors to check that you have really created the architecture you intended.
Check if your model outperforms some simple baseline model.
You say your model overfits, so maybe simplify it / add regularization?
Check how your model performs on the simplest positions. E.g. can it recognize a checkmate?

Object classification with Kinect using cascaded classifiers

My project is to create a software that recognizes certain objects like an apple or a coin etc. I want to use Kinect. My question is: Do I need to have a machine learning algorithm like haar classifier to recognize a object or kinect itself can do that?
Kinect itself cannot recognize objects. It will give you a dense depth map. Then you can use the depth features along with some simple features (in your case, maybe color features or gradient features would do the job). Those features you input to a classifier (SVM or Random Forest for example) to train the system. You use the trained model for testing on new samples.
Regarding Haar features, I think they could do the job but you would need a sufficiently large database of features. It all depends on what you want to detect. In the case of an apple and a coin, just color would suffice.
Refer this paper to get an idea how to perform human pose recognition using Kinect camera. You just have to pay attention to their depth features and their classifiers. Do not apply their approach directly. Your problem is simpler.
Edit: simple gradient orientations histogram
Gradient orientations can give you a coarse idea about the shape of the object (It is not a shape-feature to be specific, better shape features exist, but this one is extremely fast to calculate).
Code snippet:
%calculate gradient
[dx,dy] = gradient(double(img));
A = (atan(dy./(dx+eps))*180)/pi; %eps added to avoid division by zero.
A will contain orientation for each pixel. Segment your original image according to the depth values. For a segment having similar depth values, calculate color histogram. Extract the pixel orientations corresponding to that region, call it A_r. calculate a 9-bin (you can have more bins. Nine bins mean each bin will contain 180/9=20 degrees) histogram. Concatenate the color features and the gradient histogram. Do this for sufficient number of leaves. Then you can give this to a classifier for training.
Edit: This is a reply to a comment below.
Regarding MaxDepth parameter in opencv_traincascade
The documentation says, "Maximal depth of a weak tree. A decent choice is 1, that is case of stumps". When you perform binary classification, it takes a form of:
if yourFeatureValue>=learntThresh
class=1;
else
class=0;
end
The above type of classifier which performs thresholding on a single feature value (a scalar) is called decision stumps. There is only one split between positive and negative class (therefore maxDepth is one). For example, it would work in following scenario. Imagine you have a 1-D feature:
f=[1 2 3 4 -1 -2 -3 -4]
First 4 are class 1, rest are class 0. Decision stumps would get 100% accuracy on this data by setting the threshold to zero. Now, imagine a complicated feature space such as:
f=[1 2 3 4 5 6 7 8 9 10 11 12];
First 4 and last 4 are class 1, rest are class 0. Here, you cannot get 100% classification by decision stumps. You need two thresholds/splits. Therefore, you can construct a tree with depth value 2. You will have 2^(2-1)=2 thresholds. For depth=3, you get 4 thresholds, for depth=4, you get 8 thresholds and so on. Here, I assume a tree with a single node has height 1.
You may feel that the more the number of levels, you can achieve more accuracy, but then there is a problem of overfitting (and computation, memory storage etc.). Therefore, you have to set a good value for depth. I usually set it to 3.

Monte-Carlo localization for mobile-robot

I'm implementing Monte-Carlo localization for my robot that is given a map of the enviroment and its starting location and orientation. My approach is as follows:
Uniformly create 500 particles around the given position
Then at each step:
motion update all the particles with odometry (my current approach is newX=oldX+ odometryX(1+standardGaussianRandom), etc.)
assign weight to each particle using sonar data (formula is for each sensor probability*=gaussianPDF(realReading) where gaussian has the mean predictedReading)
return the particle with biggest probability as the location at this step
then 9/10 of new particles are resampled from the old ones according to weights and 1/10 is uniformly sampled around the predicted position
Now, I wrote a simulator for the robot's enviroment and here is how this localization behaves: http://www.youtube.com/watch?v=q7q3cqktwZI
I'm very afraid that for a longer period of time the robot may get lost. If add particles to a wider area, the robot gets lost even easier.
I expect a better performance. Any advice?
The biggest mistake is that you assume the particle with the highest weight to be your posterior state. This disagrees with the main idea of the particle filter.
The set of particles you updated with the odometry readings is your proposal distribution. By just taking the particle with the highest weight into account you completely ignore this distribution. It would be the same if you just randomly spread particles in the whole state space and then take the one particle that explains the sonar data the best. You only rely on the sonar reading and as sonar data is very noisy your estimate is very bad. A better approach is to assign a weight to each particle, normalize the weights, multiply each particle state by its weight and sum them up to obtain your posterior state.
For your resample step i would recommend to remove the random samples around the predicted state as they corrupt your proposal distribution. It is legit to generate random samples in order to recover from failures, but those are supposed to be spread over the whole state space and explicitly not around your current prediction.

Ideal Input In Neural Network For The Game Checkers

I'm designing a feed forward neural network learning how to play the game checkers.
For the input, the board has to be given and the output should give the probability of winning versus losing. But what is the ideal transformation of the checkers board to a row of numbers for input? There are 32 possible squares and 5 different possibilities (king or piece of white or black player and free position) on each square. If I provide an input unit for each possible value for each square, it will be 32 * 5. Another option is that:
Free Position: 0 0
Piece of white: 0 0.5 && King Piece of white: 0 1
Piece of black: 0.5 1 && King Piece of black: 1 0
In this case, the input length will be just 64, but I'm not sure which one will give a better result. Could anyone give any insight on this?
In case anyone is still interested in this topic—I suggest encoding the Checkers board with a 32 dimensional vector. I recently trained a CNN on an expert Checkers database and was able to acheive a suprisingly high level of play with no search, somewhat similar (I suspect) to the supervised learning step that Deepmind used to pretrain AlphaGo. I represented my input as an 8x4 grid, with entries in the set [-3, -1, 0, 1, 3] corresponding to an opposing king, opposing checker, empty, own checker, own king, repsectively. Thus, rather than encoding the board with a 160 dimensional vector where each dimension corresponds to a location-piece combination, the input space can be reduced to a 32-dimensional vector where each board location is represented by a unique dimension, and the piece at that location is encoded by a set of real numbers—this is done without any loss of information.
The more interesting question, at least in my mind, is which output encoding is most conducive for learning. One option is to encode it in the same way as the input. I would advise against this having found that simplifying the output encoding to a location (of the piece to move) and a direction (along which to move said piece) is much more advantageous for learning. While the reasons for this are likely more subtle, I suspect it is due to the enormous state space of checkers (something like 50^20 board possitions). Considering that the goal of our predictive model is to accept an input containing an enourmous number of possible states, and produce one ouput (i.e., move) from (at-most) 48 possibilities (12 pieces times 4 possible directions excluding jumps), a top priority in architecting a neural network should be matching the complexity of its input and output space to that of the actual game. With this in mind, I chose to encode the ouput as a 32 x 4 matrix, with each row representing a board location, and each column representing a direction. During training I simply unraveled this into a 128 dimensional, one-hot encoded vector (using argmax of softmax activations). Note that this output encoding lends itself to many invalid moves for a given board (e.g., moves off the board from edges and corners, moves to occupied locations, etc..)—we hope that the neural network can learn valid play given a large enough training set. I found that the CNN did a remarkable job at learning valid moves.
I’ve written more about this project at http://www.chrislarson.io/checkers-p1.
I've done this sort of thing with Tic-Tac-Toe. There are several ways to represent this. One of the most common for TTT is have input and output that represent the entire size of the board. In TTT this becomes 9 x hidden x 9. Input of -1 for X, 0 for none, 1 for O. Then the input to the neural network is the current state of the board. The output is the desired move. Whatever output neuron has the highest activation is going to be the move.
Propagation training will not work too well here because you will not have a finite training set. Something like Simulated Annealing, PSO, or anything with a score function would be ideal. Pitting the networks against each other for the scoring function would be great.
This worked somewhat well for TTT. I am not sure how it would work for Checkers. Chess would likely destroy it. For Go it would likely be useless.
The problem is that the neural network will learn patters only at fixed location. For example jumping an opponent in the top-left corner would be a totally different situation than jumping someone in the bottom left corner. These would have to be learned separately.
Perhaps better is to represent the exact state of the board in position independent way. This would require some thought. For instance you might communicate what "jump" opportunities exist. What move-towards king square opportunity's exist, etc and allow the net to learn to prioritize these.
I've tried all possibilities and intuitive i can say that the most great idea is separating all possibilities for all squares. Thus, concrete:
0 0 0: free
1 0 0: white piece
0 0 1: black piece
1 1 0: white king
0 1 1: black king
It is also possible to enhance other parameters about the situation of the game like the amount of pieces under threat or amount of possibilities to jump.
Please see this thesis
Blondie24 page 46, there is description of input for neural network.

Is it possible to use core motion for distance measurement [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Getting displacement from accelerometer data with Core Motion
Android accelerometer accuracy (Inertial navigation)
I am trying to use core motion user acceleration values, and double integrating them to derive distance covered. I move my iPhone linearly along its Y axis, against a 30 cm log ruler, on the table. First, I let the device be at rest for 10 seconds, and I calculate my offsets along the three axes, by averaging the respective user acceleration values.
The X, Y and Z offsets are subtracted from the acceleration values, when I try calculating the distance covered. After offset subtraction, these values are passed through a low pass filter and a median filter, separately of course. The filters are linear filters, and the cut-off frequency is specified by the number of neighbouring values whose mean is taken in low pass, and median in the median filter. I have experimented with varying values of this number from 1 to 100. In the end, these filtered values are double integrated using trapezoidal rule to get distances. But, the distance calculated is no where close to 30 cm. The closest value I got was some -22 cm(I am wondering why I am getting negative values even though I move the device in positive Y direction). I also came across this:
http://ajnaware.wordpress.com/2008/09/05/accelerating-iphones/
its an old post about the same thing, which says that the accelerometer readings returned appeared to come in quanta of about 0.18m/s^2 (ie. about 0.018g), resulting in a large cumulative error very quickly. Going by that, for this error to really not matter, one will have to accelerate the device by almost 1.8m/s^2, which is practically impossible for distance/length measurement purposes. for small movements, it does not look like there is a possibility of calculating distances by using an optimal filter and a higher order numerical integration method, without an impractical velocity/acceleration constraint like that. Is it possible?
How about using my acceleration vs timestamp data to interpolate a polynomial that grows over time, as I get more and more motion updates, which represents approximately an acceleration vs time curve. Double integration of ths polynomial would be a piece of cake. But, for small distances, the polynomial will have a big error component. Using a predictable known motion that my device will be subjected to, I wish to take a huge number of snapshots (calculated distance vs actual known distance) to calculate my error polynomial in a similar way, and then subtract it from my first polynomial. Can this work?
Although this does not fit StackOverflow, because it's not a question but a discussion, I'll try to sum up my thoughts about it.
As already said, the accelerometer is very inaccurate and you would need very good accuracy for this kind of task, especially if you are trying to measure such short distances. Plus, accelerometers differ from device to device, you will get different results for the same movements with different device. Plus a very huge random error.
My guess is, that you can get rid of a huge part of randomness/error by calibrating the device and making the "measurement move" a couple of times, like 10 times. After that you have enough data to get an average that might get close to the real value.
Calibration is a key part here, you have to think of a clever way to calibrate, like letting the user move the device over different distances in different speeds.
But all this is just theory. I would really like to see your results, but I doubt you get it working good enough even using the best possible filters/algorithms, since there is just too much noise.

Resources