I have a set of about 200 points (x, y) from an image. The 200 points belong to 11 classes (which I think will become the class labels). My problem is: how do I represent the x and y values as a single data element?
My first thought was to represent them separately with the labels and then, when I get a point to classify, classify x and y separately. Something tells me this is incorrect.
Please advise me on how to represent the x, y values as one data element.
I'm not sure what problem you are running into. In the kNN algorithm we can work with samples that have multiple dimensions; you just need to organize the data with a list from the Python standard library or an array from the NumPy library, such as:
group = numpy.array([[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]])
or group = [[1.0,1.1],[1.0,1.0],[0,0],[0,0.1]] to represent (1.0,1.1), (1.0,1.0), (0,0), (0,0.1).
However, I suggest using NumPy, since it provides many functions implemented in C, which keeps programs efficient.
If you use NumPy, you had better do all the operations in a vectorized (matrix) way; for example, you can use point = tile([0,0], (3,1)) and distance(group - point) (distance is a function I wrote) to calculate the distances without iteration.
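For instance, a minimal NumPy sketch of that vectorized distance computation (the query point and the data are made up, and the distance helper is written out as a plain NumPy expression):

import numpy as np

# Each row is one sample: [x, y]
group = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])

# Query point, repeated once per training sample
point = np.tile([0.0, 0.0], (group.shape[0], 1))

# Euclidean distance from the query to every sample, without a Python loop
distances = np.sqrt(((group - point) ** 2).sum(axis=1))
print(distances)  # [1.48660687 1.41421356 0.         0.1       ]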
The key is not the representation but the distance calculation. The points in your case are each a single element with two dimensions (x, y). The kNN algorithm handles the n-dimensional case itself: it finds the k nearest neighbors. So you can use the Euclidean distance d((x1, y1), (x2, y2)) = ((x1-x2)^2 + (y1-y2)^2)^0.5, where (x1, y1) is the first point, as the distance between points in your case.
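If it helps, here is a short sketch with scikit-learn's KNeighborsClassifier, assuming your 200 points are stacked into an N-by-2 array and your 11 class labels into a matching label vector (the numbers below are placeholders):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# One row per point; the two columns are the x and y coordinates (toy data)
X = np.array([[12.0, 40.0], [15.0, 42.0], [80.0, 10.0], [82.0, 12.0]])
y = np.array([0, 0, 1, 1])  # one class label per row

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# A new (x, y) point is classified as a single two-dimensional sample
print(knn.predict([[14.0, 41.0]]))  # -> [0]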
I want to do feature scaling of datasets using means and standard deviations, and my code is below; but apparently it is not universal code, since it seems to only work with one dataset. So I am wondering what is wrong with my code. Any help will be appreciated! Thanks!
X is the dataset I am currently using.
mu = mean(X);
sigma = std(X);
m = size(X, 1);
mu_matrix = ones(m, 1) * mu;
sigma_matrix = ones(m, 1) * sigma;
featureNormalize = (X-mu_matrix)/sigma;
Thank you for clarifying what you think the code should be doing in the comments.
My answer will effectively answer why what you think is happening is not what is happening.
First let's talk about the mean and std functions. When their input is a vector (whether this is vertically or horizontally aligned), then this will return a single number which is the mean or standard deviation of that vector respectively, as you might expect.
However, when the input is a matrix, you need to know what they do differently. Unless you specify the dimension along which to calculate means / stds, they operate along the first dimension, i.e. down each column, returning a single number per column. Therefore, the end result of this operation is a horizontal (row) vector.
Therefore, both mu and sigma will be horizontal vectors in your code.
Now let's move on to the 'matrix multiplication' operator (i.e. *).
When using the matrix multiplication operator, if you multiply a horizontal vector by a vertical vector (the usual matrix multiplication), your output is a single number (a scalar). However, if you reverse the orientations and multiply a vertical vector by a horizontal one, you will in fact be computing an outer product (equivalently, a Kronecker product for vectors). Since the output of the * operation is completely determined by the rows of the first input and the columns of the second input, which of the two you get depends entirely on the orientation of your inputs.
Therefore, in your case, the line mu_matrix = ones(m, 1) * mu; is not appending a vector of ones, as you say. It is performing the Kronecker product between a vertical vector of ones and the horizontal vector mu, effectively creating an m-by-n matrix with mu repeated vertically for m rows.
Therefore, at the end of this operation, as the variable naming would suggest, mu_matrix is in fact a matrix (same with sigma_matrix), having the same size as X.
Your final step is X - mu_matrix, which gives you, at each element, the difference between X and mu at that element. Then you "divide" by the sigma matrix.
Here is why I asked if you were sure you should be using ./ instead of /.
/ is the matrix division operator. With /, you are effectively performing matrix multiplication by an inverse matrix, since D / S is mathematically equivalent to D * inv(S). It seems to me you should be using ./ instead, which simply divides each element by the standard deviation of that column; this is why you had to repeat the horizontal vector over m rows in sigma_matrix in the first place, so that it could be used for elementwise division. What you are trying to do is normalise each row (i.e. observation) of a particular column by the standard deviation that is specific to that column (i.e. feature).
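For comparison, here is a minimal NumPy sketch of the same column-wise normalization; in NumPy the / operator is already elementwise (the analogue of MATLAB's ./ here), and broadcasting replaces the explicit repetition with ones(m, 1):

import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

mu = X.mean(axis=0)            # one mean per column (feature)
sigma = X.std(axis=0, ddof=1)  # one std per column; ddof=1 matches MATLAB's std

# Broadcasting subtracts/divides column by column, elementwise
X_norm = (X - mu) / sigma
print(X_norm)  # each column now has mean 0 and std 1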
I'm trying to use OpenCV to find a template in images. OpenCV has several template matching methods, but I have a hard time understanding the differences between them and when to use which just by looking at their mathematical equations:
CV_TM_SQDIFF
CV_TM_SQDIFF_NORMED
CV_TM_CCORR
CV_TM_CCORR_NORMED
CV_TM_CCOEFF
Can someone explain the major differences between all these methods in a non-mathematical way?
The general idea of template matching is to give each location in the target image I a similarity measure, or score, for the given template T. The output of this process is the image R.
Each element of R is computed from the template, which spans the ranges of x' and y', and a window of the same size in I.
Now, you have two windows and you want to know how similar they are:
CV_TM_SQDIFF - Sum of Square Differences (or SSD):
Simple Euclidean distance (squared):
Take every pair of pixels and subtract
Square the difference
Sum all the squares
CV_TM_SQDIFF_NORMED - SSD Normed
This is rarely used in practice, but the normalization part is similar in the next methods.
The numerator term is the same as above, but it is divided by a normalization factor, computed as the square root of the product of:
the sum of the squared template pixels
the sum of the squared image-window pixels
CV_TM_CCORR - Cross Correlation
Basically, this is a dot product:
Take every pair of pixels and multiply
Sum all products
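To make these scores concrete, here is a small NumPy sketch comparing a template against a single window of the same size (the pixel values are made up):

import numpy as np

# Template and one same-sized window from the image (toy values)
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])
W = np.array([[1.0, 2.5],
              [2.5, 4.0]])

ssd = ((T - W) ** 2).sum()                                   # CV_TM_SQDIFF: lower is better
ssd_normed = ssd / np.sqrt((T ** 2).sum() * (W ** 2).sum())  # CV_TM_SQDIFF_NORMED
ccorr = (T * W).sum()                                        # CV_TM_CCORR: higher is better

print(ssd, ssd_normed, ccorr)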
CV_TM_CCOEFF - Cross Coefficient
Similar to Cross Correlation, but the means of the template and of the window are subtracted first, so the score relates to their covariance (which I find hard to explain without math, but I would refer to mathworld or mathworks for some examples).
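And a minimal OpenCV usage sketch in Python (the file names are placeholders); note that for the SQDIFF variants the best match is the minimum of R, while for the other methods it is the maximum:

import cv2

img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)      # placeholder file names
tmpl = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)

method = cv2.TM_CCOEFF  # swap in the other methods to compare their behaviour
R = cv2.matchTemplate(img, tmpl, method)

min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(R)

# SQDIFF methods are distances (smaller = better); the rest are similarities
if method in (cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED):
    top_left = min_loc
else:
    top_left = max_loc
print('best match at', top_left)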
I have an object with some sensors on it whose 3D locations are known and fixed relative to each other. Let's call this object the "detector".
I have a few of these sensors' detected locations in 3D world space.
Problem: how do I get an estimated pose (position and rotation) of the "detector" in 3D world space?
I tried looking into the PnP problem, FLANN and ORB matching, and kNN for the outliers, but it seems like they all expect a camera position of some sort. I am not working with a camera at all; all I want is the pose of the "detector". Considering that OpenCV is a "vision" library, do I even need OpenCV for this?
Edit: not all sensors might be detected (indicated here by the light-green dots).
Sorry, this is kinda old, but it's never too late to do object tracking :)
OpenCV's solvePnPRansac should work for you. You don't need to supply an initial pose; just make empty Mats for rvec and tvec to hold your results.
Also, because there is no camera, just use the identity matrix for the camera matrix and zeros for the distortion coefficients.
The first time you call PnP with empty rvec and tvec, make sure to set useExtrinsicGuess = false. Save your results if they are good, and use them for the next frame with useExtrinsicGuess = true so it can optimize the function more quickly.
https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#solvepnpransac
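A rough Python sketch of that recipe (all the values are placeholders; note that solvePnPRansac expects the detected points as 2D image-plane coordinates, so the "detections" below were generated by projecting the model points from a made-up ground-truth pose with no rotation and a translation of [0, 0, 10]):

import numpy as np
import cv2

# Known sensor layout in the detector's own frame (toy values), N x 3
model_pts = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [1.0, 1.0, 0.0],
                      [0.5, 0.5, 1.0],
                      [1.0, 0.0, 1.0]])

# Detected 2D positions of those same sensors, N x 2 (projections of the toy pose above)
detected_pts = np.array([[0.0,    0.0],
                         [0.1,    0.0],
                         [0.0,    0.1],
                         [0.1,    0.1],
                         [0.0455, 0.0455],
                         [0.0909, 0.0]])

K = np.eye(3)       # "no camera": identity camera matrix
dist = np.zeros(5)  # and zero distortion coefficients

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    model_pts, detected_pts, K, dist, useExtrinsicGuess=False)

# On later frames, reuse the previous rvec/tvec as the initial guess
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    model_pts, detected_pts, K, dist, rvec=rvec, tvec=tvec, useExtrinsicGuess=True)
print(ok, rvec.ravel(), tvec.ravel())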
You definitely do not need OpenCV to estimate the position of your object in space.
This is a simple optimization problem where you need to minimize a distance to the model.
First, you need to create a model of your object's attitude in space.
def Detector(position, orientation):  # position = [x, y, z], orientation = [alpha, beta, gamma]
which should return a list or array of all positions of your points with IDs in 3D space.
You can even create a class for each of these sensor points, and a class for the whole object which has as attributes as many sensors as you have on your object.
Then you need to build an optimization algorithm for fitting your model on the detected data.
The algorithm should use the attitude x, y, z, alpha, beta, gamma as its variables. For the objective function, you can use something like the sum of distances to the corresponding IDs.
Let's say you have a 3-point object that you want to fit to 3 data points:
#Model
m1 = [x1, y1, z1]
m2 = [x2, y2, z2]
m3 = [x3, y3, z3]
#Data
p1 = [xp1, yp1, zp1]
p2 = [xp2, yp2, zp2]
p3 = [xp3, yp3, zp3]
import numpy as np

def distanceL2(pt1, pt2):
    return np.sqrt((pt1[0]-pt2[0])**2 + (pt1[1]-pt2[1])**2 + (pt1[2]-pt2[2])**2)

# You already know you want to relate "1"s, "2"s and "3"s
obj_function = distanceL2(m1, p1) + distanceL2(m2, p2) + distanceL2(m3, p3)
Now you need to dig into optimization libraries to find the best algorithm to use, depending on how fast you need your optimization to be.
Since your points in space are virtually connected, this should not be too difficult; scipy.optimize could do it.
To reduce the dimensionality of your problem, try taking one of the 'detected' points as a reference (as if that measurement were to be trusted) and find the minimum of the obj_function for this position (there are only 3 parameters left to optimize, corresponding to the orientation); then iterate over each of the points you have.
Once you have the optimum, you can look for a better position for this sensor around it and see if the distance decreases further.
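A rough SciPy sketch of this fitting idea (the model layout and detections are made up, and the rotation is parameterized with Euler angles via scipy.spatial.transform.Rotation):

import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

# Sensor layout in the detector's own frame (toy values)
model = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Detected world positions of the same sensors, in the same ID order
# (toy values: here the model simply shifted by [2, 1, 0])
detected = model + np.array([2.0, 1.0, 0.0])

def objective(params):
    x, y, z, alpha, beta, gamma = params
    R = Rotation.from_euler('xyz', [alpha, beta, gamma]).as_matrix()
    predicted = model @ R.T + np.array([x, y, z])
    # Sum of distances between corresponding IDs
    return np.linalg.norm(predicted - detected, axis=1).sum()

# With large rotations a better initial guess (or several starts) may be needed
result = minimize(objective, x0=np.zeros(6), method='Nelder-Mead')
print(result.x)    # estimated [x, y, z, alpha, beta, gamma]
print(result.fun)  # residual distance at the optimum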
I have a Kalman filter tracking a point, with a state vector (x, y, dx/dt, dy/dt).
At a given update, I have a set of candidate points which may correspond to the tracked point. I would like to iterate through these candidates and choose the one most likely to correspond to the tracked point, but only if the probability of that correspondence is greater than a threshold (e.g. p > 0.5).
Therefore I need to use the covariance and state matrices of the filter to estimate this probability. How can I do this?
Additionally, note that my state vector is four-dimensional, but the measurements are two-dimensional (x, y).
When you predict the measurement with y = Hx, you also compute the covariance of y as H*P*H.T (adding the measurement noise R gives the innovation covariance S = H*P*H.T + R). This ability to propagate uncertainty is why we carry the covariance through the Kalman filter.
The geometrical way to understand how far a given point is from your predicted point is an error ellipse or confidence region. A 95% confidence region is the ellipse scaled to 2*sigma (if that isn't intuitive, you should go read about normal distributions, because that is what the KF assumes it is working with). If the covariance is diagonal, the error ellipse will be axis-aligned. If there are co-varying terms (which there may not be if you have not introduced them anywhere via Q or R), the ellipse will be tilted.
The mathematical way is with the Mahalanobis distance, which just directly formulates the geometrical representation above as a distance. The distance scale is standard deviations, so your P=0.5 corresponds to a distance of 0.67 (again, see normal distributions if this is surprising).
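A minimal NumPy/SciPy sketch of that gating test (H, P, R and all the numbers are placeholders; since the measurement is two-dimensional, the squared Mahalanobis distance is compared against a chi-square threshold with 2 degrees of freedom):

import numpy as np
from scipy.stats import chi2

# Measurement model: we observe (x, y) out of the state [x, y, dx, dy]
H = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])

x_pred = np.array([10.0, 5.0, 1.0, 0.5])  # predicted state (toy values)
P_pred = np.diag([4.0, 4.0, 1.0, 1.0])    # predicted covariance (toy values)
R = np.diag([1.0, 1.0])                   # measurement noise (toy values)

z_pred = H @ x_pred          # predicted measurement
S = H @ P_pred @ H.T + R     # innovation covariance

def mahalanobis_sq(z):
    nu = z - z_pred                            # innovation
    return float(nu @ np.linalg.solve(S, nu))  # squared Mahalanobis distance

# Accept a candidate only if it falls inside, e.g., the 50% region (2 dof)
gate = chi2.ppf(0.5, df=2)
candidates = [np.array([10.5, 5.2]), np.array([14.0, 9.0])]
scores = [mahalanobis_sq(z) for z in candidates]
best = int(np.argmin(scores))
if scores[best] <= gate:
    print('associate candidate', best, 'd^2 =', scores[best])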
The most probable point (I suppose among the detections) will be the one nearest to the filter's prediction.
I have an even number of points in 2D. I need an algorithm that can pair up those points such that the total sum of distances between the pairs is maximized.
I think dynamic programming and a greedy approach won't work.
Can I use linear programming or the Hungarian algorithm? Or something else?
You certainly can use integer linear programming. Here is an example formulation:
Introduce a binary variable x[ij] for each unordered pair of distinct points i and j (i.e. such that i < j), where x[ij] = 1 iff points i and j are paired together.
Compute all the distances d[ij] (for i<j).
The objective is to maximize sum_{i<j} d[ij]*x[ij], subject to the constraint that each point is in exactly one pair, i.e. for every point k: sum_{i<k} x[ik] + sum_{j>k} x[kj] = 1.
Note that this also works for 3D points: you only need the distance between each pair of points.
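A small sketch of this formulation with the PuLP library (assuming PuLP and its bundled CBC solver are installed; the points are made up):

import itertools
import math
import pulp

points = [(0, 0), (1, 0), (0, 1), (5, 5)]  # an even number of 2D points (toy data)
n = len(points)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

pairs = list(itertools.combinations(range(n), 2))  # all i < j
d = {(i, j): dist(points[i], points[j]) for i, j in pairs}

prob = pulp.LpProblem("max_weight_pairing", pulp.LpMaximize)
x = {(i, j): pulp.LpVariable(f"x_{i}_{j}", cat="Binary") for i, j in pairs}

# Objective: maximize the total distance of the chosen pairs
prob += pulp.lpSum(d[p] * x[p] for p in pairs)

# Each point belongs to exactly one pair
for k in range(n):
    prob += pulp.lpSum(x[(i, j)] for (i, j) in pairs if k in (i, j)) == 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
chosen = [p for p in pairs if x[p].value() > 0.5]
print(chosen)  # e.g. [(0, 3), (1, 2)] for the toy points above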