I have a 7x6 grid in which a moving object is being tracked. The object can move randomly in any direction at any pace (it can even come to a halt) within the grid.
Input: the coordinates of the object are stored every second in a .csv file as (x-coordinate, y-coordinate, ith second), where i = 0 to n (n seconds of tracking).
Please suggest a machine learning algorithm which can predict the centroids of the clusters of coordinates described in the output below.
Output: cluster centroids of the points where the object came to a halt, one after the other (c1, c2, c3, ..., c8), named by time as shown in the picture below.
Return the last position for each object.
That should satisfy your requirements, without any learning or clustering.
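As one concrete reading of that advice, the halt centroids c1, c2, ... can be extracted directly from the trajectory by thresholding the per-second displacement, with no learning or clustering involved. A minimal sketch in Python; the CSV column order, the speed threshold and the minimum dwell time are assumptions:

    # Detect halts directly from the trajectory: a run of near-zero
    # displacements is a halt, and its centroid is the mean of that run.
    import numpy as np
    import pandas as pd

    def halt_centroids(csv_path, speed_thresh=0.1, min_dwell=3):
        df = pd.read_csv(csv_path, names=["x", "y", "t"])
        pts = df[["x", "y"]].to_numpy()
        speed = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # displacement per second
        halted = np.concatenate([[False], speed < speed_thresh])

        centroids, run = [], []
        for p, h in zip(pts, halted):
            if h:
                run.append(p)
            else:
                if len(run) >= min_dwell:            # the halt lasted long enough
                    centroids.append(np.mean(run, axis=0))
                run = []
        if len(run) >= min_dwell:
            centroids.append(np.mean(run, axis=0))
        return centroids                             # c1, c2, ... in time order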
My understanding of K-medoids is that the centroids are picked randomly from the existing points. Clusters are then formed by assigning the remaining points to the nearest centroid, and the error (absolute distance) is calculated.
a) How are new centroids picked? From the examples it seems that they are picked randomly, and the error is calculated again to see whether the new centroids are better or worse.
b) How do you know when to stop picking new centroids?
It's worth reading the Wikipedia page on the k-medoids algorithm. You are right that the k medoids are selected randomly from the n data points in the first step.
The new medoids are picked by looping over every medoid m and every non-medoid o, swapping them and calculating the cost again. If the cost increased, you undo the swap.
The algorithm stops when no swap is made during a full iteration.
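To make the swap loop concrete, here is a rough, unoptimized sketch of that PAM-style procedure (plain NumPy, Euclidean distance; the function and parameter names are made up for illustration):

    import numpy as np

    def k_medoids(X, k, seed=0):
        # X: (n, d) array of points, k: number of medoids
        rng = np.random.default_rng(seed)
        n = len(X)
        dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
        medoids = list(rng.choice(n, size=k, replace=False))           # random initial medoids

        def cost(meds):
            return dist[:, meds].min(axis=1).sum()   # each point to its nearest medoid

        best = cost(medoids)
        improved = True
        while improved:                  # stop when a full pass makes no swap
            improved = False
            for i in range(k):
                for o in range(n):
                    if o in medoids:
                        continue
                    candidate = medoids[:i] + [o] + medoids[i + 1:]
                    c = cost(candidate)
                    if c < best:         # keep the swap only if the cost decreases
                        medoids, best, improved = candidate, c, True
        return medoids, best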
The process for choosing the initial medoids is fairly complicated; many people seem to just use random initial centers instead.
After that, k-medoids always considers every possible change of replacing one of the medoids with a non-medoid. The best such change is applied if it improves the result. If no further improvement is possible, the algorithm stops.
Don't rely on vague descriptions. Read the original publications.
Before answering, a brief outline of k-medoids is needed, which I give in the first two steps; the last two steps answer your questions.
1) The first step of k-medoids is that k centroids/medoids are picked from your dataset. Suppose your dataset contains n points; the k medoids are chosen from these n points. You can pick them randomly, or you can use a smart initialization like the one used in k-means++.
2) The second step is the assignment step: for each point in your dataset, compute its distance to the k medoids, find the closest one, and add the point to the set S_j corresponding to centroid C_j (as we have k centroids C_1, C_2, ..., C_k).
3) The third step of the algorithm is the update step. This answers your question about how new centroids are picked after they have been initialized. I will explain the update step with an example to make it clearer.
Suppose you have ten points in your dataset, (x_1, x_2, ..., x_10). Now suppose it is a 2-cluster problem, so we first choose 2 centroids/medoids randomly from these ten points; let's say they are (x_2, x_5). The assignment step stays the same. In the update step, you pick points that are not medoids (points other than x_2, x_5) and repeat the assignment and update steps to compute the loss, which is the squared distance of the x_i's from the medoids. You then compare the loss obtained with medoid x_2 against the loss obtained with the non-medoid point. If the loss is reduced, you swap x_2 with the non-medoid point that reduced it; if the loss is not reduced, you keep x_2 as your medoid and do not swap.
So there can be a lot of swaps in the update step, which also makes this algorithm computationally expensive.
4) The last step answers your second question, i.e. when to stop picking new centroids. When you compare the loss of the medoid/centroid point with the loss computed for a non-medoid, if the difference is negligible you can stop and keep the medoid point as the centroid; if the loss reduction is still significant, you keep swapping until the loss no longer decreases. A small numeric sketch of this comparison follows below.
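A tiny numeric illustration of the cost comparison in steps 3 and 4, using made-up 1-D values for x_1..x_10 and the squared distance as the loss:

    import numpy as np

    X = np.array([1.0, 2.0, 2.5, 8.0, 9.0, 9.5, 1.5, 8.5, 2.2, 9.2])

    def loss(medoid_idx):
        d = (X[:, None] - X[medoid_idx][None, :]) ** 2   # squared distance to each medoid
        return d.min(axis=1).sum()                       # assign each point to its nearest medoid

    print(loss([1, 4]))   # current medoids x_2, x_5 (0-based indices 1 and 4)
    print(loss([8, 4]))   # candidate swap: replace x_2 with x_9
    # keep whichever configuration gives the smaller loss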
I hope that answers your questions.
I am interested in tracking objects across the frames of a movie to calculate the velocity of each object. These are Drosophila in well plates, so there are always 12 objects in every frame and the objects should not be allowed to merge. I have written a script that identifies each object and finds its centroid, and then minimizes the distances between those centroids and the ones detected in the previous frame.
What really surprises me is that, even without taking the previous centroids into consideration, OpenCV seems to do a really good job of automatically assigning each contour the same relative identity across frames. If I plot my video with the contour number over each blob, that number hardly changes across frames. How does this work? How does OpenCV decide which contour is returned first and which is returned last?
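For reference, a minimal sketch of the frame-to-frame centroid matching described in the question (it does not explain OpenCV's internal contour ordering). It assumes the centroids of the two frames are given as (12, 2) NumPy arrays and uses the Hungarian algorithm so that two blobs can never be assigned to the same identity:

    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from scipy.spatial.distance import cdist

    def match_centroids(prev_centroids, curr_centroids):
        cost = cdist(prev_centroids, curr_centroids)   # pairwise distance matrix
        row, col = linear_sum_assignment(cost)         # minimizes total displacement
        return dict(zip(row, col))                     # previous index -> current index

    # identity = match_centroids(prev_centroids, curr_centroids)
    # identity[i] is the index of the contour in the current frame that
    # corresponds to contour i in the previous frame.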
Is there any way to reduce the dimension of the following features from 2D coordinates (x, y) to one dimension?
Yes. In fact, there are infinitely many ways to reduce the dimension of the features. It's by no means clear, however, how they perform in practice.
A feature reduction is usually done via a principal component analysis (PCA), which involves a singular value decomposition. It finds the directions with the highest variance -- that is, the directions in which "something is going on".
In your case, a PCA might find the black line as one of the two principal components:
The projection of your data onto this one-dimensional subspace then yields the reduced form of your data.
One can already see by eye that the three feature sets can be separated along this line -- I coloured the three ranges accordingly. For your example it is even possible to separate the data sets completely. A new data point would then be classified according to the range in which its projection onto the black line (or, more generally, onto the principal component subspace) lies.
More formally, one could obtain such a division with further methods that take the PCA-reduced data as input, for example clustering methods or a k-nearest-neighbour model.
So yes, in the case of your example it could be possible to make such a strong reduction from 2D to 1D and, at the same time, obtain a reasonable model.
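A minimal sketch of that reduction with scikit-learn; the data array here is made up, your (x, y) features would go in its place:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.array([[1.0, 2.1], [1.2, 2.3], [4.0, 0.5],
                  [4.2, 0.4], [7.1, 3.0], [7.3, 3.2]])   # hypothetical 2-D features

    pca = PCA(n_components=1)
    X_1d = pca.fit_transform(X)     # projection onto the first principal component
    print(pca.components_)          # direction of the "black line"
    print(X_1d.ravel())             # one value per sample; these ranges can be thresholded
                                    # or fed into a clustering / k-NN model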
I am using OpenCV to implement a finger tracking system, and I call
calcOpticalFlowPyrLK(pGmask, nGmask, fingers, track, status, err);
to run an LK tracker.
What I am not clear about is: after running the LK tracker, how should I detect the movement of the fingers? Also, the tracker only sees the last frame and the current frame; how can I detect a series of actions or a continuous gesture, say within 5 frames?
The 4th parameter of calcOpticalFlowPyrLK (here track) will contain the calculated new positions of the input features in the second image (here nGmask).
In the simple case, you can estimate the centroids of fingers and of track separately and infer the movement from them: the decision can be based on the direction and magnitude of the vector pointing from the centroid of fingers to the centroid of track.
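A small sketch of that decision step in Python (the question's code is C++, but the idea is the same); fingers and track are assumed to be (N, 2) arrays of point coordinates and status the per-point success flags returned by the tracker:

    import numpy as np

    def movement_vector(fingers, track, status):
        good = status.ravel() == 1                  # keep only successfully tracked points
        c_prev = np.mean(fingers[good], axis=0)     # centroid in the previous frame
        c_curr = np.mean(track[good], axis=0)       # centroid in the current frame
        v = c_curr - c_prev
        magnitude = np.linalg.norm(v)
        direction = np.degrees(np.arctan2(v[1], v[0]))
        return direction, magnitude                 # e.g. "roughly rightwards, 12 px"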
Furthermore, complex movements can be treated as time series, because a movement consists of successive measurements made over a time interval. These measurements can be the direction and magnitude of the vector mentioned above, so any movement can be represented as:
("label of movement", time_series), where
time_series = {(d1, m1), (d2, m2), ..., (dn, mn)}, where
di is direction and mi is magnitude of the ith vector (i=1..n)
So the time series consists of n * 2 measurements (n samples); the only remaining question is how to recognize movements.
If you have prior information about the movement, i.e. you know how a circular movement, writing the letter "a", etc. is performed, then the question reduces to: how do you align time series with each other?
Here comes the well-known Dynamic Time Warping (DTW). It can also be considered a generative model, but it is used between pairs of sequences. DTW is an algorithm for measuring similarity between two temporal sequences which may vary in time or speed (as in our case).
In general, DTW calculates an optimal match between two given time series under certain restrictions. The sequences are warped non-linearly in the time dimension to determine a measure of their similarity that is independent of certain non-linear variations in the time dimension.
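A compact, unoptimized DTW sketch; a and b are the sequences of (direction, magnitude) pairs described above, and a gesture would be labelled by the template with the smallest DTW distance:

    import numpy as np

    def dtw_distance(a, b):
        a, b = np.asarray(a, float), np.asarray(b, float)
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])    # local distance
                D[i, j] = cost + min(D[i - 1, j],             # insertion
                                     D[i, j - 1],             # deletion
                                     D[i - 1, j - 1])         # match
        return D[n, m]

    # label = min(templates, key=lambda name: dtw_distance(observed, templates[name]))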
In image processing, how do region growing and clustering differ from each other? Please give some more information on how they differ. Thank you for reading.
Region growing:
You have to select seed points; the local area around each seed is then analyzed to decide whether the neighbouring pixels should receive the same label. http://en.wikipedia.org/wiki/Region_growing
It can be used for precise image segmentation.
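A hedged sketch of seeded region growing on a grayscale image (a 2-D NumPy array): grow from a seed as long as a neighbour's grey value stays within a tolerance of the seed's grey value. The tolerance and 4-connectivity are assumptions:

    from collections import deque
    import numpy as np

    def region_grow(img, seed, tol=10):
        h, w = img.shape
        label = np.zeros((h, w), bool)
        seed_val = float(img[seed])                 # seed is a (row, col) tuple
        queue = deque([seed])
        label[seed] = True
        while queue:
            y, x = queue.popleft()
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-connectivity
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and not label[ny, nx]
                        and abs(float(img[ny, nx]) - seed_val) <= tol):
                    label[ny, nx] = True
                    queue.append((ny, nx))
        return label    # boolean mask of the grown region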
Clustering:
There are many clustering techniques (k-means, hierarchical clustering, density-based clustering, etc.). Clustering algorithms don't require seed points as input because they are based on unsupervised learning.
It can be used for coarse image segmentation.
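As a counterpart sketch, coarse segmentation by clustering the grey values with k-means, which needs a number of clusters but no seed points (the choice of k and the use of scikit-learn are assumptions):

    import numpy as np
    from sklearn.cluster import KMeans

    def kmeans_segment(img, k=3):
        values = img.reshape(-1, 1).astype(float)          # one feature per pixel: its grey value
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(values)
        return labels.reshape(img.shape)                   # one cluster id per pixel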
I find region growing similar to some clustering algorithms. I explain my view below.
In region growing there are 2 cases:
1) Selecting seed points randomly, which is similar to k-means: the seeds play the role of the means here. We start with one seed and spread it until we cannot grow it any more (just as we start with one mean and continue until we reach convergence), and the way we grow the region is (usually) based on the Euclidean distance from the seed's grey value.
2) The second case of region growing can be considered seedless (assume we don't know how many seeds to choose, or we don't know the number of clusters). So we start with the first pixel. Then we find the neighbours of the current pixel that lie within a distance d of the mean grey value of the region (at the first iteration the mean grey value is simply the current grey value). Afterwards, we update the mean grey value. In this way region growing acts like the mean shift algorithm. If we do not update the mean grey value after each assignment, then it could be considered a DBSCAN-like algorithm.
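A short sketch of that second, seedless case with the running mean update (skipping the update would keep the comparison fixed to the starting value, closer to the DBSCAN analogy). The distance threshold d and 4-connectivity are assumptions:

    from collections import deque
    import numpy as np

    def grow_without_seeds(img, d=10):
        h, w = img.shape
        labels = np.full((h, w), -1, int)
        current = 0
        for sy in range(h):
            for sx in range(w):
                if labels[sy, sx] != -1:
                    continue
                labels[sy, sx] = current            # start a new region at any unlabeled pixel
                mean, count = float(img[sy, sx]), 1
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1
                                and abs(float(img[ny, nx]) - mean) <= d):
                            labels[ny, nx] = current
                            mean = (mean * count + float(img[ny, nx])) / (count + 1)
                            count += 1              # running mean of the region's grey value
                            queue.append((ny, nx))
                current += 1
        return labels    # every pixel ends up in some region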