I’m starting work on a line follower project, and I am required to use image processing techniques. I have a few ideas, but I would like some input because there are some doubts I would like to clarify. This is my approach: I first read the image, then apply thresholding to detect the object (the line). I do color filtering and then edge detection. After this I classify the image to detect all the lines, then extrapolate those lines so that only parallel lines are output/detected (like a lane detection algorithm). With these parallel lines I can calculate the center, to keep my vehicle centered, and the angle, to make turns.
I do not know the angles in the path, so the system must be able to turn through any angle; that’s why I will calculate the angle. I have included a picture of a line with a turn; this is the kind of turn I will be dealing with. I have managed to implement almost everything. My main problem is the change of angle, that is, the turns. After I have detected the parallel lines, how can my system know when it is time to make a turn? The question might be confusing, but basically the vehicle will keep moving forward as long as the angle is near zero. When the vehicle approaches a turn, however, it might detect two sets of parallel lines. Maybe I can define a length of the detected lines that determines whether or not the vehicle must move forward?
Any ideas would be appreciated.
If you have two lines (the center line of each path):
y1 = m1 * x + b1
y2 = m2 * x + b2
They intersect when you choose an x such that y1 and y2 are equal (if they are not parallel of course, so m1 != m2)
m1 * x + b1 = m2 * x + b2
(do a bunch of algebra)
x = (b2 - b1) / (m1 - m2)
(y should be the same for both line formulas)
When you are near this point, switch lines.
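As a minimal sketch of the algebra above, assuming both center lines are already available in slope-intercept form, the intersection and a simple "near the turn" test might look like this in Python. The function names and the `tolerance` parameter are illustrative assumptions, not part of the original answer.

```python
import math


def intersect_slope_intercept(m1, b1, m2, b2):
    """Return (x, y) where y = m1*x + b1 meets y = m2*x + b2, or None if parallel."""
    if m1 == m2:
        return None  # parallel lines never intersect
    x = (b2 - b1) / (m1 - m2)
    y = m1 * x + b1  # the second formula gives the same y
    return x, y


def is_near(position, point, tolerance):
    """True when position lies within `tolerance` of point (same units as the coordinates)."""
    return math.hypot(position[0] - point[0], position[1] - point[1]) <= tolerance


turn_point = intersect_slope_intercept(1.0, 0.0, -1.0, 4.0)
print(turn_point)                            # (2.0, 2.0)
print(is_near((1.5, 1.5), turn_point, 1.0))  # True
```

How large the tolerance should be depends on the vehicle's speed and camera geometry, which the question does not specify.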
NOTE: This won't handle the case of perfectly vertical lines, because they have infinite slope, and no y-intercept -- for that see the parametric form of lines. You will have 2 equations per line:
x = f1(t1)
y = f2(t1)
and
x = f3(t2)
y = f4(t2)
Set f1(t1) == f3(t2) and f2(t1) == f4(t2) to find the intersection of non-parallel lines. Then plug t1 into the first line formula to find (x, y)
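One way to sketch the parametric approach in Python is to take each line as a point plus a direction vector, so that f1(t1) = p1x + t1*d1x and f2(t1) = p1y + t1*d1y. This particular parametrization, the function name and the `eps` threshold are assumptions made for illustration.

```python
def intersect_parametric(p1, d1, p2, d2, eps=1e-12):
    """Intersect the 2D lines p1 + t1*d1 and p2 + t2*d2.

    Returns (x, y), or None when the directions are (nearly) parallel.
    """
    # 2D cross product of the directions; zero means parallel lines.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < eps:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve for t1, then plug it into the first line.
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# A vertical line x = 2 and a horizontal line y = 3.
print(intersect_parametric((2, 0), (0, 1), (0, 3), (1, 0)))  # (2.0, 3.0)
```

Vertical lines need no special handling here because no slope is ever computed.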
Basically, the answer by Lou Franco explains how to get the intersection of the two center lines of the paths; that intersection is a good point at which to start your turn.
I would add a suggestion on how to compute the center line of a path.
In my experience, when working with floating-point representations of lines extracted from images, the lines are never exactly parallel; they usually intersect at a point that falls outside the image (possibly far away).
The following C++ function bisector_of_lines is inspired by the method bisector_of_linesC2 found at CGAL source code.
A line is expressed as a*x+b*y+c=0, the following function
constructs the bisector of the two lines p and q.
line p is pa*x+pb*y+pc=0
line q is qa*x+qb*y+qc=0
The a, b, c of the bisector line are the last three parameters of the function: a, b and c.
In the general case, the bisector has the direction of the vector which is the sum of the normalized directions of the two lines, and which passes through the intersection of p and q. If p and q are parallel, then the bisector is defined as the line which has the same direction as p, and which is at the same distance from p and q (see the official CGAL documentation for CGAL::Line_2<Kernel> CGAL::bisector).
#include <cmath>

void
bisector_of_lines(const double &pa, const double &pb, const double &pc,
                  const double &qa, const double &qb, const double &qc,
                  double &a, double &b, double &c)
{
    // We normalize the equations of the 2 lines, and we then add them.
    double n1 = std::sqrt(pa*pa + pb*pb);
    double n2 = std::sqrt(qa*qa + qb*qb);
    a = n2 * pa + n1 * qa;
    b = n2 * pb + n1 * qb;
    c = n2 * pc + n1 * qc;
    // Care must be taken for the case when this produces a degenerate line.
    // It may be best to replace == with a tolerance check, see
    // https://stackoverflow.com/questions/19837576/comparing-floating-point-number-to-zero
    if (a == 0 && b == 0) {
        a = n2 * pa - n1 * qa;
        b = n2 * pb - n1 * qb;
        c = n2 * pc - n1 * qc;
    }
}
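For experimenting with the same idea outside C++, here is a rough Python port. It is only a sketch: the tuple-based signature and the absolute tolerance `eps` (used in place of the exact == comparison) are assumptions, not part of CGAL.

```python
import math


def bisector_of_lines(p, q, eps=1e-12):
    """Bisector of lines p and q, each given as (a, b, c) with a*x + b*y + c = 0."""
    pa, pb, pc = p
    qa, qb, qc = q
    # Scale each equation by the other line's norm, then add them.
    n1 = math.hypot(pa, pb)
    n2 = math.hypot(qa, qb)
    a = n2 * pa + n1 * qa
    b = n2 * pb + n1 * qb
    c = n2 * pc + n1 * qc
    # Degenerate case: the normals cancel, so subtract instead of add.
    if abs(a) < eps and abs(b) < eps:
        a = n2 * pa - n1 * qa
        b = n2 * pb - n1 * qb
        c = n2 * pc - n1 * qc
    return a, b, c


# Two parallel lane edges, x = 0 and x = 2, written with opposite normals.
print(bisector_of_lines((1.0, 0.0, 0.0), (-1.0, 0.0, 2.0)))  # (2.0, 0.0, -2.0), i.e. x = 1
```

In the lane-following setting, p and q would be the two detected edges of a path and the result its center line.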
I'm writing a k-means algorithm. At each step, I want to compute the distance of my n points to k centroids, without a for loop, and for d dimensions.
The problem is I have a hard time splitting on my number of dimensions with the Matlab functions I know. Here is my current code, with x being my n 2D-points and y my k centroids (also 2D-points of course), and with the points distributed along dimension 1, and the spatial coordinates along the dimension 2:
dist = @(a,b) (a - b).^2;
dx = bsxfun(dist, x(:,1), y(:,1)'); % x is (n,1) and y is (1,k)
dy = bsxfun(dist, x(:,2), y(:,2)'); % so the result is (n,k)
dists = dx + dy; % contains the squared distance of each point to the k centroids
[~,l] = min(dists, [], 2); % we then argmin on the 2nd dimension
How can I vectorize this further?
First edit 3 days later, searching on my own
Since asking this question I made progress on my own towards vectorizing this piece of code.
The code above runs in approximately 0.7 ms on my example.
I first used repmat to make it easy to do broadcasting:
dists = permute(permute(repmat(x,1,1,k), [3,2,1]) - y, [3,2,1]).^2;
dists = sum(dists, 2);
[~,l] = min(dists, [], 3);
As expected it is slightly slower since we replicate the matrix, it runs at 0.85 ms.
From this example it was pretty easy to use bsxfun for the whole thing, but it turned out to be extremely slow, running in 150 ms so more than 150 times slower than the repmat version:
dist = @(a, b) (a - b).^2;
dists = permute(bsxfun(dist, permute(x, [3, 2, 1]), y), [3, 2, 1]);
dists = sum(dists, 2);
[~,l] = min(dists, [], 3);
Why is it so slow? Isn't vectorizing always an improvement in speed, since it uses vector instructions on the CPU? Of course, simple for loops could be optimized to use them as well, but how can vectorizing make the code slower? Did I do it wrong?
Using a for loop
For the sake of completeness, here's the for loop version of my code. Surprisingly, it is the fastest, running in 0.4 ms; I'm not sure why.
for i=1:k
dists(:,i) = sum((x - y(i,:)).^2, 2);
endfor
[~,l] = min(dists, [], 2);
Note: This answer was written when the question was also tagged MATLAB. Links to Octave documentation added after the MATLAB tag was removed.
You can use the pdist2 function (MATLAB/Octave) to calculate pairwise distances between two sets of observations.
This way, you offload the bother of vectorization to the people who wrote MATLAB/Octave (and they have done a pretty good job of it).
X = rand(10,3);
Y = rand(5,3);
D = pdist2(X, Y);
D is now a 10x5 matrix where the i, jth element is the distance between the ith X and jth Y point.
You can pass it the kind of distance you want as the third argument -- e.g. 'euclidean', 'minkowski', etc, or you could pass a function handle to your custom function like so:
dist = @(a,b) sum(bsxfun(@minus, a, b).^2, 2); % a is 1-by-d, b is m-by-d; returns m-by-1 squared distances
D = pdist2(X, Y, dist);
As saastn mentions, pdist2(..., 'smallest', k) makes things easier in k-means. This returns just the smallest k values from each column of pdist2's result. Octave doesn't have this functionality, but it's easily replicated using sort() (MATLAB/Octave).
D_smallest = sort(D);
D_smallest = D_smallest(1:k, :);
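For readers who want to try the same assignment step outside MATLAB/Octave, here is a rough NumPy analogue. It is a sketch under the assumption that broadcasting plays the role of bsxfun; the function name is hypothetical and nothing here is a claim about how pdist2 works internally.

```python
import numpy as np


def assign_to_centroids(x, y):
    """x: (n, d) points, y: (k, d) centroids.

    Returns (labels, dists), where dists is the (n, k) matrix of squared
    distances and labels[i] is the index of the centroid nearest to x[i].
    """
    # Broadcasting builds an (n, k, d) array of coordinate differences.
    diff = x[:, np.newaxis, :] - y[np.newaxis, :, :]
    dists = np.sum(diff ** 2, axis=2)
    return np.argmin(dists, axis=1), dists


points = np.array([[0.0, 0.0], [10.0, 10.0], [1.0, 0.0]])
centroids = np.array([[0.0, 0.0], [9.0, 9.0]])
labels, dists = assign_to_centroids(points, centroids)
print(labels)  # [0 1 0]
```

The intermediate (n, k, d) array costs memory, so for large inputs a loop over centroids can still be the better trade-off, as the timings in the question suggest.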
I am starting to use Octave, and I am trying to understand how the underlying calculation is done when dividing a scalar by a vector.
I understand how ./ operates to give us the results: it divides 1 by every element of the column vector. However, I cannot get my head around how we get the values in the second case, 1 / (1 + a).
Example:
>> g = 1 ./ (1 + a)
g =
0.50000
0.25000
0.20000
>> g = 1 / (1 + a)
g =
0.044444 0.088889 0.111111
When you divide 1 by a vector, it gives you a vector that yields 1 when it multiplies the original vector from the left. In this sense, it is a sort of 'inverse' of the vector, although it will only be a one-way inverse. In your example:
>> (1/(1+a))*(1+a)
ans = 1
>> (1+a)*(1/(1+a))
ans =
0.088889 0.177778 0.222222
0.177778 0.355556 0.444444
0.222222 0.444444 0.555556
You could say 1/(1+a) is the left inverse of 1+a. This would also explain why the dimensions of the vector are transposed. Another way to put it: given a vector v, 1/v is the solution (w) of the vector equation w*v=1.
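The numbers in the question are consistent with the minimum-norm solution of w*v = 1, which for a nonzero column vector v is w = v' / (v'*v). The following plain-Python sketch reproduces them, assuming a = [1; 3; 4] (inferred from the ./ output shown in the question); the function name is hypothetical.

```python
def left_inverse(v):
    """Minimum-norm row vector w such that w*v = 1, for a nonzero vector v."""
    norm_sq = sum(c * c for c in v)
    return [c / norm_sq for c in v]


v = [2.0, 4.0, 5.0]  # 1 + a, assuming a = [1; 3; 4]
w = left_inverse(v)
print([round(c, 6) for c in w])            # [0.044444, 0.088889, 0.111111]
print(sum(wi * vi for wi, vi in zip(w, v)))
```

The last line prints the product w*v, which is 1 up to floating-point rounding.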
If I have three points that create an angle, what would be the best way to determine if a fourth point resides within the angle created by the previous three?
Currently, I determine the angle of the line to all three points from the origin point, and then check to see if the test angle is in between the two other angles but I'm trying to figure out if there's a better way to do it. The function is run tens of thousands of times an update and I'm hoping that there's a better way to achieve what I'm trying to do.
Let's say you have angle DEF (E is the "pointy" part), ED is the left ray and EF is the right ray.
              * D (Dx, Dy)
             /
            /       * P (Px, Py)
           /
          /
         *---------------*
   E (Ex, Ey)       F (Fx, Fy)
Step 1. Build line equation for line ED in the classic Al * x + Bl * y + Cl = 0 form, i.e. simply calculate
Al = Dy - Ey // l - for "left"
Bl = -(Dx - Ex)
Cl = -(Al * Ex + Bl * Ey)
(Pay attention to the subtraction order.)
Step 2. Build line equation for line FE (reversed direction) in the classic Ar * x + Br * y + Cr = 0 form, i.e. simply calculate
Ar = Ey - Fy // r - for "right"
Br = -(Ex - Fx)
Cr = -(Ar * Ex + Br * Ey)
(Pay attention to the subtraction order.)
Step 3. For your test point P calculate the expressions
Sl = Al * Px + Bl * Py + Cl
Sr = Ar * Px + Br * Py + Cr
Your point lies inside the angle if and only if both Sl and Sr are positive. If one of them is positive and other is zero, your point lies on the corresponding side ray.
That's it.
Note 1: For this method to work correctly, it is important to make sure that the left and right rays of the angle are indeed left and right rays. I.e. if you think about ED and EF as clock hands, the direction from D to F should be clockwise. If it is not guaranteed to be the case for your input, then some adjustments are necessary. For example, it can be done as an additional step of the algorithm, inserted between steps 2 and 3
Step 2.5. Calculate the value of Al * Fx + Bl * Fy + Cl. If this value is negative, invert signs of all ABC coefficients:
Al = -Al, Bl = -Bl, Cl = -Cl
Ar = -Ar, Br = -Br, Cr = -Cr
Note 2: The above calculations are made under assumption that we are working in a coordinate system with X axis pointing to the right and Y axis pointing to the top. If one of your coordinate axes is flipped, you have to invert the signs of all six ABC coefficients. Note, BTW, that if you perform the test described in step 2.5 above, it will take care of everything automatically. If you are not performing step 2.5 then you have to take the axis direction into account from the very beginning.
As you can see, this is a precise integer method (no floating-point calculations, no divisions). The price of that is the danger of overflows. Use appropriately sized types for multiplications.
This method has no special cases with regard to line orientations or the value of the actual non-reflex angle: it works immediately for acute, obtuse, zero and straight angles. It can easily be used with reflex angles (just perform a complementary test).
P.S. The four possible combinations of +/- signs for Sl and Sr correspond to four sectors, into which the plane is divided by lines ED and EF.
               * D
              /
    (-,+)    /    (+,+)
            /
    -------*------------* F
          / E
 (-,-)   /    (+,-)
        /
By using this method you can perform the full "which sector the point falls into" test. For an angle smaller than 180 you just happen to be interested in only one of those sectors: (+, +). If at some point you'll need to adapt this method for reflex angles as well (angles greater than 180), you will have to test for three sectors instead of one: (+,+), (-,+), (+,-).
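Steps 1 to 3, together with the optional step 2.5, can be sketched in Python as follows. The function name and the tuple-based point representation are assumptions for illustration; with integer inputs the arithmetic stays exact, because Python integers do not overflow.

```python
def point_in_angle(e, d, f, p):
    """True if p lies strictly inside the non-reflex angle DEF with vertex e."""
    (ex, ey), (dx, dy), (fx, fy), (px, py) = e, d, f, p
    # Step 1: line ED as Al*x + Bl*y + Cl = 0.
    al = dy - ey
    bl = -(dx - ex)
    cl = -(al * ex + bl * ey)
    # Step 2: line FE (reversed direction) as Ar*x + Br*y + Cr = 0.
    ar = ey - fy
    br = -(ex - fx)
    cr = -(ar * ex + br * ey)
    # Step 2.5: if F is on the negative side of ED, flip all six coefficients.
    if al * fx + bl * fy + cl < 0:
        al, bl, cl = -al, -bl, -cl
        ar, br, cr = -ar, -br, -cr
    # Step 3: p is inside when both signed values are positive.
    sl = al * px + bl * py + cl
    sr = ar * px + br * py + cr
    return sl > 0 and sr > 0


print(point_in_angle((0, 0), (1, 2), (3, 0), (2, 1)))   # True
print(point_in_angle((0, 0), (1, 2), (3, 0), (-1, 1)))  # False
```

Because of step 2.5, swapping D and F gives the same answer.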
Describe your origin point O, and the other 2 points A and B then your angle is given as AOB. Now consider your test point and call that C as in the diagram.
Now consider that we can get a vector equation of C by taking some multiple of vector OA and some multiple of OB. Explicitly
C = K1 * OA + K2 * OB
for some K1,K2 that we need to calculate. Set O to the origin by subtracting it (vectorially) from all other points. If coordinates of A are (a1,a2), B = (b1,b2) and C = (c1,c2) we have in matrix terms
[ a1 b1 ] [ K1 ] = [ c1 ]
[ a2 b2 ] [ K2 ] = [ c2 ]
So we can solve for K1 and K2 using the inverse of the matrix to give
1 / (a1b2 - b1a2) * [  b2  -b1 ] [ c1 ] = [ K1 ]
                    [ -a2   a1 ] [ c2 ] = [ K2 ]
which reduces to
K1 = (b2c1 - b1c2)/(a1b2 - b1a2)
K2 = (-a2c1 + a1c2)/(a1b2 - b1a2)
Now IF the point C lies within your angle, the multiples of the vectors OA and OB will BOTH be positive. If C lies 'under' OB, then we need a negative amount of OA to get to it similarly for the other direction. So your condition is satisfied when both K1 and K2 are greater than (or equal to) zero. You must take care in the case where a1b2 = b1a2 as this corresponds to a singular matrix and division by zero. Geometrically it means that OA and OB are parallel and hence there is no solution. The algebra above probably needs verifying for any slight typo mistake but the methodology is correct. Maybe long winded but you can get it all simply from point coordinates and saves you calculating inverse trig functions to get angles.
The above applies to angles < 180 degrees, so if your angle is greater than 180 degrees, you should check instead for
!(K1 >= 0 && K2 >= 0)
as this is exterior to the segment less than 180 degree. Remember that for 0 and 180 degrees you will have a divide by zero error which must be checked for (ensure a1b2 - b1a2 != 0 )
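A small Python sketch of the K1, K2 test for angles below 180 degrees is given here. The function names are hypothetical, and raising ValueError for the singular case is just one possible way to handle it.

```python
def angle_coefficients(o, a, b, c):
    """Return (K1, K2) with C - O = K1*(A - O) + K2*(B - O), or None if OA and OB are parallel."""
    # Move the origin to O.
    a1, a2 = a[0] - o[0], a[1] - o[1]
    b1, b2 = b[0] - o[0], b[1] - o[1]
    c1, c2 = c[0] - o[0], c[1] - o[1]
    det = a1 * b2 - b1 * a2
    if det == 0:
        return None  # singular matrix: 0 or 180 degree angle
    k1 = (b2 * c1 - b1 * c2) / det
    k2 = (-a2 * c1 + a1 * c2) / det
    return k1, k2


def inside_angle(o, a, b, c):
    """True if c lies inside (or on the boundary of) the angle AOB, for angles below 180 degrees."""
    ks = angle_coefficients(o, a, b, c)
    if ks is None:
        raise ValueError("OA and OB are parallel")
    return ks[0] >= 0 and ks[1] >= 0


print(angle_coefficients((0, 0), (3, 0), (1, 2), (2, 1)))  # (0.5, 0.5)
print(inside_angle((0, 0), (3, 0), (1, 2), (-1, 1)))       # False
```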
Yes, I meant the smallest angle in my comment above. Look at this thread for an extensive discussion on cheap ways to find the measure of the angle between two vectors. I have used the lookup-table approach on many occasions with great success.
Triangle O B C has to be positively oriented, and so does triangle O C A. To calculate the orientation, just use the shoelace formula. Both values have to be positive.
I'm trying to develop an application that uses a SOM to analyze data. However, after finishing training, I cannot find a way to visualize the result. I know that the U-Matrix is one of the methods, but I cannot understand it properly. Hence, I'm asking for a specific and detailed example of how to construct a U-Matrix.
I also read an answer at U-matrix and self organizing maps, but it only refers to a 1-row map; what about a 3x3 map? I know that for a 3x3 map:
m(1) m(2) m(3)
m(4) m(5) m(6)
m(7) m(8) m(9)
a 5x5 matrix must be created:
u(1) u(1,2) u(2) u(2,3) u(3)
u(1,4) u(1,2,4,5) u(2,5) u(2,3,5,6) u(3,6)
u(4) u(4,5) u(5) u(5,6) u(6)
u(4,7) u(4,5,7,8) u(5,8) u(5,6,8,9) u(6,9)
u(7) u(7,8) u(8) u(8,9) u(9)
but I don't know how to calculate u-weight u(1,2,4,5), u(2,3,5,6), u(4,5,7,8) and u(5,6,8,9).
Finally, after constructing U-Matrix, is there any way to visualize it using color, e.g. heat map?
Thank you very much for your time.
Cheers
I don't know if you are still interested in this but I found this link
http://www.uni-marburg.de/fb12/datenbionik/pdf/pubs/1990/UltschSiemon90
which explains very specifically how to calculate the U-matrix.
Hope it helps.
By the way, the site where I found the link has several resources on SOMs. I leave it here in case anyone is interested:
http://www.ifs.tuwien.ac.at/dm/somtoolbox/visualisations.html
The essential idea of a Kohonen map is that the data points are mapped to a
lattice, which is often a 2D rectangular grid.
In the simplest implementations, the lattice is initialized by creating a 3D
array with these dimensions:
width * height * number_features
This is the U-matrix.
Width and height are chosen by the user; number_features is just the number
of features (columns or fields) in your data.
Intuitively this is just creating a 2D grid of dimensions w * h
(e.g., if w = 10 and h = 10 then your lattice has 100 cells), then
into each cell, placing a random 1D array (sometimes called "reference tuples")
whose size and values are constrained by your data.
The reference tuples are also referred to as weights.
How is the U-matrix rendered?
In my example below, the data is comprised of rgb tuples, so the reference tuples
have a length of three and each of the three values must lie between 0 and 255.
It's with this 3D array ("lattice") that you begin the main iterative loop.
The algorithm iteratively positions each data point so that it is closest to others similar to it.
If you plot it over time (iteration number) then you can visualize cluster
formation.
The plotting tool I use for this is the brilliant Python library, Matplotlib,
which plots the lattice directly, just by passing it into the imshow function.
Below are eight snapshots of the progress of a SOM algorithm, from initialization to 700 iterations. The newly initialized (iteration_count = 0) lattice is rendered in the top left panel; the result from the final iteration, in the bottom right panel.
Alternatively, you can use a lower-level imaging library (in Python, e.g., PIL) and transfer the reference tuples onto the 2D grid, one at a time:
for y in range(h):
    for x in range(w):
        img.putpixel( (x, y), (
            SOM.Umatrix[y, x, 0],
            SOM.Umatrix[y, x, 1],
            SOM.Umatrix[y, x, 2])
        )
Here img is an instance of PIL's Image class. The image is created by iterating over the grid one pixel at a time; for each pixel, putpixel is called once on img with a tuple of three values, corresponding to the three values in an rgb tuple.
From the matrix that you create:
u(1) u(1,2) u(2) u(2,3) u(3)
u(1,4) u(1,2,4,5) u(2,5) u(2,3,5,6) u(3,6)
u(4) u(4,5) u(5) u(5,6) u(6)
u(4,7) u(4,5,7,8) u(5,8) u(5,6,8,9) u(6,9)
u(7) u(7,8) u(8) u(8,9) u(9)
The elements with single numbers like u(1), u(2), ..., u(9), just like the elements with more than two numbers like u(1,2,4,5), u(2,3,5,6), ..., u(5,6,8,9), are calculated using something like the mean, median, min or max of the values in the neighborhood.
It's a good idea to calculate the elements with two numbers first. One possible implementation is:
for i in range(self.h_u_matrix):
    for j in range(self.w_u_matrix):
        nb = (0,0)
        if not (i % 2) and (j % 2):
            nb = (0,1)
        elif (i % 2) and not (j % 2):
            nb = (1,0)
        self.u_matrix[(i,j)] = np.linalg.norm(
            self.weights[i //2, j //2] - self.weights[i //2 +nb[0], j // 2 + nb[1]],
            axis = 0
        )
In the code above, self.h_u_matrix = self.weights.shape[0]*2 - 1 and self.w_u_matrix = self.weights.shape[1]*2 - 1 are the dimensions of the U-Matrix. To calculate the other elements, it is necessary to obtain a list of their neighbors and apply, for example, a mean. The following code implements that idea:
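for i in range(self.h_u_matrix):
    for j in range(self.w_u_matrix):
        if not (i % 2) and not (j % 2):
            # Single-number element: collect the neighbors that are in bounds.
            nodelist = []
            if i > 0:
                nodelist.append((i-1,j))
            if i < self.h_u_matrix - 1:  # last row index (4 for a 5x5 U-Matrix)
                nodelist.append((i+1, j))
            if j > 0:
                nodelist.append((i,j -1))
            if j < self.w_u_matrix - 1:  # last column index
                nodelist.append((i,j+1))
            meanlist = [self.u_matrix[u_node] for u_node in nodelist]
            self.u_matrix[(i,j)] = np.mean(meanlist)
        elif (i % 2) and (j % 2):
            # Four-number element: it always has four two-number neighbors.
            nodelist = [
                (i - 1, j),
                (i + 1, j),
                (i, j - 1),
                (i, j + 1)]
            # Average the U-Matrix values at those positions, not the index tuples.
            meanlist = [self.u_matrix[u_node] for u_node in nodelist]
            self.u_matrix[(i,j)] = np.mean(meanlist)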
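Since both loops above are fragments of a class, here is a self-contained sketch that combines the two passes into one function. It assumes the weights are stored as a NumPy array of shape (rows, cols, n_features) with at least two units, and that the mean is the chosen aggregate; the function name is hypothetical.

```python
import numpy as np


def build_u_matrix(weights):
    """weights: (rows, cols, n_features) lattice -> (2*rows-1, 2*cols-1) U-matrix."""
    rows, cols = weights.shape[:2]
    h, w = 2 * rows - 1, 2 * cols - 1
    u = np.zeros((h, w))
    # Pass 1: two-number elements hold the distance between two adjacent units.
    for i in range(h):
        for j in range(w):
            if i % 2 == 0 and j % 2 == 1:    # horizontal pair
                u[i, j] = np.linalg.norm(weights[i // 2, j // 2] - weights[i // 2, j // 2 + 1])
            elif i % 2 == 1 and j % 2 == 0:  # vertical pair
                u[i, j] = np.linalg.norm(weights[i // 2, j // 2] - weights[i // 2 + 1, j // 2])
    # Pass 2: single- and four-number elements take the mean of their
    # in-bounds neighbors, all of which are two-number elements.
    for i in range(h):
        for j in range(w):
            if i % 2 == j % 2:
                nbrs = [(i + di, j + dj)
                        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= i + di < h and 0 <= j + dj < w]
                u[i, j] = np.mean([u[n] for n in nbrs])
    return u


lattice = np.array([[[0.0], [1.0]], [[2.0], [4.0]]])  # 2x2 map, one feature
print(build_u_matrix(lattice))
```

The result is an ordinary 2D array, so it can be passed to Matplotlib's imshow to display it as a heat map.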
Is it possible to perform a Cascaded Hough Transform in OpenCV? I understand it's just an HT followed by another one. The problem I'm facing is that the values returned are always rho and theta and never in y-intercept form.
Is it possible to convert these values back to y-intercept and split them into sub-spaces so I can detect vanishing points?
Or is it just better to program an implementation of HT myself in, say, Python?
You could try to populate the Hough domain with m and c parameters instead. Since y = mx + c can be rewritten as c = y - mx, instead of the usual rho = x cos(theta) + y sin(theta) you have c = y - mx.
Normally, you'd go through the thetas and calculate the rho, then increment the accumulator value for that pair of rho and theta. Here, you'd go through the values of m and calculate the values of c, then increment that m,c element in the accumulator. The bin with the most votes would be the right m,c.
// going through the image looking for edge pixels
for (i = 0;i<numrows;i++)
{
    for (j = 0;j<numcols;j++)
    {
        if (img[i*numcols + j] > 1)
        {
            for (n = first_m;n<last_m;n++)
            {
                index = i - n * j;
                accum[n][index]++;
            }
        }
    }
}
I guess where this becomes ineffective is that it's hard to define the step size for going through m, as the values should technically go from -infinity to infinity, so you'd have trouble. So much for the Hough transform in terms of m,c. Lol
I guess you could go the other way and isolate m, so it would be m = (y-c)/x. Now you cycle through a bunch of c values that make sense, which is much more manageable, though it's still hard to define your accumulator matrix because m still has no limit. I guess you could limit the values of m that you would be interested in looking for.
Yeah, it makes much more sense to go with rho and theta, convert them into y = mx + c, and then even make a brand new image and re-run the Hough transform on it.
I don't think OpenCV can perform cascaded hough transforms. You should convert them to xy space yourself. This article might help you:
http://aishack.in/tutorials/converting-lines-from-normal-to-slopeintercept-form/
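As a sketch of that conversion, rearranging rho = x cos(theta) + y sin(theta) for y gives m = -cos(theta) / sin(theta) and c = rho / sin(theta). The function name and the `eps` threshold below are assumptions; theta is taken to be in radians.

```python
import math


def normal_to_slope_intercept(rho, theta, eps=1e-9):
    """Convert a line in (rho, theta) normal form to (m, c) with y = m*x + c.

    Returns None for (near-)vertical lines, where sin(theta) is about 0 and
    the line is x = rho / cos(theta) instead.
    """
    sin_t = math.sin(theta)
    if abs(sin_t) < eps:
        return None
    m = -math.cos(theta) / sin_t
    c = rho / sin_t
    return m, c


# The line x + y = 2 has rho = sqrt(2) and theta = pi/4.
m, c = normal_to_slope_intercept(math.sqrt(2), math.pi / 4)
print(round(m, 6), round(c, 6))  # -1.0 2.0
```

The vertical case is why the (m, c) space is awkward for vanishing-point detection: those lines have to be tracked separately.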