Understanding Hough-Transformation [duplicate] - opencv

I want to understand the Hough Transformation for school.
I know that we cannot represent vertical lines, which are parallel to the Y-axis, with y = m*x + b. But we can do this with the polar coordinates r and theta, using y = -cos(theta)/sin(theta) * x + r/sin(theta).
But let's say I have a line that goes through these two points: P1(0,0) and P2(0, 100). So this is a line that lies exactly on the Y-axis.
How can this be represented by the polar coordinates r and theta?
Since r is 0, theta would also be 0. I don't understand how this line can be represented in the Hough space... :/
Can someone explain this to me?

Your equation for the Hough transform can also be written as (is more commonly written as):
r = x*cos(theta) + y*sin(theta)
This can still be solved if you set r=0. In fact, this represents all the lines that go through the pixel at (0,0).
For the case of the vertical line through (0,0), we have r=0 and theta=0 (the normal to the line points along the x-axis, so cos(theta)=1 and sin(theta)=0). This leads to:
0 = x*1 + y*0
This is satisfied for x=0 and any y. So all pixels (0,y) form this line.
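A quick numeric check of the normal form r = x*cos(theta) + y*sin(theta) may help here. This is only a sketch: the helper name `hough_r` and the three sample points are made up for illustration. Every point on the Y-axis yields the same r at theta = 0, so they all vote for the same accumulator cell, while at theta = pi/2 each point yields a different r.

```python
import math

def hough_r(x, y, theta):
    """r for the line through (x, y) whose normal makes angle theta (radians) with the x-axis."""
    return x * math.cos(theta) + y * math.sin(theta)

# Hypothetical sample points on the vertical line x = 0 (the Y-axis).
points = [(0, 0), (0, 50), (0, 100)]

# At theta = 0 all points agree on r, so (r=0, theta=0) describes the line.
r_theta_0 = [hough_r(x, y, 0.0) for x, y in points]

# At theta = pi/2 the points disagree on r, so no single cell collects all votes.
r_theta_90 = [hough_r(x, y, math.pi / 2) for x, y in points]

print(r_theta_0)
print(r_theta_90)
```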

Related

Hough transformation calculation

Hough transformation using the normal equation of a line.
To calculate the value of r for different θ while keeping x and y the same, the following formula is used:
r = sin(θ)y + cos(θ)x
But the results are not the same as shown in the slides. Am I missing something? I am a newbie, so please be gentle.
As @Ash explained in the comment section above, the trigonometric functions expect angles in radians, not degrees.
So, for θ given in degrees, r = sin(θ*pi/180)y + cos(θ*pi/180)x will give the correct answer.
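As a minimal sketch of that fix in Python (the function name and the sample point are hypothetical), `math.radians` does the θ*pi/180 conversion before the trig calls:

```python
import math

def hough_r_degrees(x, y, theta_deg):
    """r = x*cos(theta) + y*sin(theta), with theta supplied in degrees."""
    theta = math.radians(theta_deg)  # same as theta_deg * pi / 180
    return x * math.cos(theta) + y * math.sin(theta)

x, y = 10, 20  # hypothetical pixel coordinates

for theta_deg in (0, 45, 90):
    print(theta_deg, round(hough_r_degrees(x, y, theta_deg), 3))

# The mistake from the question: passing degrees straight to sin/cos,
# which silently interprets 90 as 90 radians.
wrong = x * math.cos(90) + y * math.sin(90)
print(round(wrong, 3))
```

At θ = 0 the result is just x, and at θ = 90 it is just y, which makes an easy sanity check against the slides.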

Hough Transform Accumulator to Cartesian

I'm studying a course on vision systems, and one of the questions posed was:
For the accumulator shown:
Determine the most likely r,θ combination representing the straight line of the greatest strength in the original image.
From my understanding of the accumulator this would be r = 60, θ = 150 as the 41 votes is the highest number of votes in this cluster of large votes. Am I correct with this combination?
And hence calculate the equation of this line in the form y = mx + c
I'm not sure of the conversion steps required to convert the r = 60, θ = 150 to y = mx + c with the information given since r = 60, θ = 150 denotes 1 point on the line.
State the resolution of your answer and give your reasoning
I assume the resolution has to do with the step sizes of the accumulator and not with the actual resolution of the original image, since that is irrelevant to the edges detected in the image.
Any guidance on the above 3 points would be greatly appreciated!
Yes, this is correct.
This is asking you for the slope and intercept of the line given r and theta. r and theta are not one point on the line; they are one point in the accumulator. r and theta describe a line using the line equation in polar (normal) form: r = x*cos(theta) + y*sin(theta). This is the cool thing about the Hough transform: every line in one space (i.e. image space) can be described by a point in another space (r, theta). This could be done with m and b from the line equation y = mx + b, but as we all know, m is undefined for vertical lines. This is the reason the polar line equation is used.
It is important to note that r and theta describe the segment from the origin to the closest point of the actual line in the image: r is its length and theta is its angle. This means your image line y = mx + b is orthogonal to that segment. The wiki article on the HT describes this well and shows examples. I would recommend drawing a diagram with the segment of length r at angle theta extending from the origin to the image line (drawn in red), which it meets at a right angle.
Then use trig to get two points on the red line. Two points are enough to give you m and b from the line equation.
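The trig can be sketched in a few lines of Python. This assumes the accumulator uses the convention r = x*cos(theta) + y*sin(theta) with theta in degrees measured from the x-axis (the question does not state the convention, so treat that as an assumption); the function name is made up. Solving the normal form for y gives m = -cos(theta)/sin(theta) and c = r/sin(theta).

```python
import math

def polar_to_slope_intercept(r, theta_deg):
    """Convert (r, theta in degrees) to (m, c) of y = m*x + c.

    Assumes the normal form r = x*cos(theta) + y*sin(theta).
    """
    theta = math.radians(theta_deg)
    s = math.sin(theta)
    if abs(s) < 1e-12:
        # sin(theta) = 0 means a vertical line x = r / cos(theta): no slope.
        raise ValueError("vertical line, slope is undefined")
    m = -math.cos(theta) / s
    c = r / s
    return m, c

m, c = polar_to_slope_intercept(60, 150)
print(round(m, 3), round(c, 3))  # about 1.732 and 120.0

# Cross-check: the foot of the perpendicular (r*cos, r*sin) must lie on the line.
theta = math.radians(150)
foot_x, foot_y = 60 * math.cos(theta), 60 * math.sin(theta)
print(abs(m * foot_x + c - foot_y) < 1e-9)
```

Under that convention, r = 60 and theta = 150 degrees would give roughly y = 1.732x + 120.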
I'm not entirely sure what "resolution" refers to in this context. But it does seem like your line estimator will have some precision loss since r is every 20 mm and theta is every 15 degrees. Perhaps it is asking what degree of error you could get given an accumulator of this resolution.

What input x maximizes the activation function in an autoencoder hidden layer?

Hi, while reading Stanford's Machine Learning materials on autoencoders, I found a formula that is hard to prove by myself. Link to Material
Question is:
" What input image x would cause ai to be maximally activated? "
Screen shot of the Question and Context:
Many thanks to your answers in advance!
While this can be solved rigorously using the KKT conditions and Lagrange multipliers, there is a more intuitive way to figure out the result. I assume that f(.) is a monotonically increasing, sigmoid-type nonlinearity (the argument also works for ReLU, which is non-decreasing). Then maximizing f(w1x1+...+w100x100 + b) under the constraint (x1)^2+...+(x100)^2 <= 1 is equivalent to maximizing w1x1+...+w100x100 + b under the same constraint.
Call g = w1x1+...+w100x100 + b, so that we can refer to it later. g is a linear (affine) function of the x terms, so its direction of steepest increase, the gradient, is the same at every point (x1,...,x100) of its domain: it is simply (w1,w2,...,w100). In other words, wherever we start, moving in the direction (w1,w2,...,w100) gives the largest increase in the function. To make things simpler and easier to visualize, assume that we are in R^2 and the function is w1x1 + w2x2 + b:
The optimum x1 and x2 are constrained to lie in or on the circle C: (x1)^2 + (x2)^2 = 1. Assume that we start at the origin (0,0). If we move in the direction of the gradient (w1,w2) (blue arrow), the function attains its largest value where the blue arrow intersects the circle. That intersection has coordinates c*(w1,w2), where c is a positive scalar satisfying c^2(w1^2 + w2^2) = 1, so c = 1 / sqrt(w1^2 + w2^2). At the intersection we therefore have x1 = w1/sqrt(w1^2 + w2^2) and x2 = w2/sqrt(w1^2 + w2^2), which is the solution we seek. This extends in the same way to the 100-dimensional case.
You may ask why we started at the origin and not at any other point in the circle. Note that the red line is perpendicular to the gradient vector, and the function is constant along that line. Draw that (u1,u2) line anywhere, preserving its orientation, as long as it intersects the circle C, and choose any point on it that lies within the circle. Wherever you are on the (u1,u2) line, you start at the same value of g. Then, as you move in the (w1,w2) direction, the longest path that stays within the circle is the one passing through the origin, which is therefore the path along which g increases the most.
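The result x = w / ||w|| can also be checked numerically. The sketch below uses made-up random weights and a made-up bias (nothing here comes from the Stanford material) and compares g at the claimed maximizer against g at many random points on the unit sphere. It is a sanity check, not a proof.

```python
import math
import random

def maximizing_input(w):
    """Unit-norm input pointing along w: x_j = w_j / sqrt(sum of w_k^2)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]

def pre_activation(w, x, b):
    """g(x) = w1*x1 + ... + wn*xn + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

random.seed(0)
w = [random.uniform(-1, 1) for _ in range(100)]  # hypothetical weights
b = 0.5                                          # hypothetical bias

x_star = maximizing_input(w)
best = pre_activation(w, x_star, b)  # equals ||w|| + b

# Evaluate g at random unit vectors; none should exceed `best`.
trials = []
for _ in range(1000):
    v = [random.gauss(0, 1) for _ in range(100)]
    n = math.sqrt(sum(vi * vi for vi in v))
    trials.append(pre_activation(w, [vi / n for vi in v], b))

print(best, max(trials))
```

Because f is monotone, the x that maximizes g also maximizes f(g), so checking g alone is enough.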

How to determine the DirectionVector of a line?

I have a programming problem in the context of geometric shape recognition (rectangles, ovals, etc.).
In this context, if I have a simple line, from say (x1,y1) to (x2,y2), made up of a series of points (x-y pairs):
How would I calculate the DIRECTION VECTOR for this line? I understand the math behind it, but I'm finding the algorithm provided by my client a bit vague. I'm stuck at step 3) of this algorithm.
The following is the algorithm (in English as opposed to pseudocode), exactly as provided by my client.
1) Break the points that make up a "stroke" or "line" up into sets of X (where by default X = 20 - we will adjust) points = a PointSet
2) For each PointSet, find the EndPoint (average of the points at the ends) for the first and last Y points (where by default Y = X/5).
3) Find the DirectionVector of the PointSet= Subtract the CentrePoints
4) For each pair of PointSets, find the AngleChange = the angle between the DirectionVectors of the PointSets.
and so on.......
I am trying to figure out what point (3) means...
Any help would be deeply appreciated, folks! Thanks in advance.
If the segment from (x1,y1) to (x2,y2) is short, then you can approximate its direction vector simply by: (x2-x1)*i + (y2-y1)*j.
Otherwise, you could use principal component analysis (PCA) to estimate the direction vector as the principal axis of the individual points forming the segment.
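One possible reading of steps 2) to 4), sketched in Python: average the first Y points and the last Y points of a PointSet, and take the difference of those two averages as the DirectionVector. This is an interpretation, not something the client's text confirms (it says "CentrePoints" in step 3 but "EndPoint" in step 2), and all function names and sample data below are made up.

```python
import math

def mean_point(points):
    """Average of a list of (x, y) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def direction_vector(point_set, y=None):
    """Averaged end point minus averaged start point (assumed meaning of step 3)."""
    if y is None:
        y = max(1, len(point_set) // 5)  # default Y = X/5 from step 2
    start = mean_point(point_set[:y])
    end = mean_point(point_set[-y:])
    return (end[0] - start[0], end[1] - start[1])

def angle_change(d1, d2):
    """Signed angle in radians from d1 to d2 (step 4)."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    dot = d1[0] * d2[0] + d1[1] * d2[1]
    return math.atan2(cross, dot)

# Two hypothetical PointSets of X = 20 points: one horizontal, then one vertical.
horizontal = [(i, 0) for i in range(20)]
vertical = [(19, i) for i in range(20)]

d1 = direction_vector(horizontal)
d2 = direction_vector(vertical)
print(d1, d2, math.degrees(angle_change(d1, d2)))
```

Averaging a few points at each end, rather than using the raw first and last points, makes the direction less sensitive to jitter in individual samples.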

How to test proximity of lines (Hough transform) in OpenCV

(This is a follow-up from this previous question).
I was able to successfully use OpenCV / Hough transforms to detect lines in pictures (scanned text). At first it would detect a great many lines (at least one line per line of text), but after adjusting the 'threshold' parameter by trial and error, it now only detects "real" lines.
(The 'threshold' parameter is dependent on image size, which is a bit of a problem if one has to deal with images of different resolutions, but that's another story.)
My problem is that the Hough transform sometimes detects two lines where there is only one; those two lines are very near one another and (apparently) parallel.
=> How can I identify that two lines are almost parallel and very near one another? (so that I can keep only one).
If you use the standard or multiscale Hough transform, you will end up with the rho and theta coordinates of the lines in polar coordinates. Rho is the distance to the origin, and theta is normally the angle between the detected line and the Y axis. Without looking into the details of the Hough transform in OpenCV, this is a general rule in those coordinates: two lines will be almost parallel and very near one another when:
- their thetas are nearly identical AND their rhos are nearly identical
OR
- their thetas are near 180 degrees apart AND their rhos are near each other's negative
I hope that makes sense.
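The two conditions above could be sketched like this in Python. It assumes each line is a (rho, theta) pair with theta in radians, which is what cv2.HoughLines returns. The function names and the tolerance defaults are made up and would need tuning for the image resolution.

```python
import math

def lines_are_close(line_a, line_b, rho_tol=10.0, theta_tol=math.radians(3)):
    """True if two (rho, theta) lines are nearly parallel and nearly coincident."""
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # Smallest absolute difference between the two angles, in [0, pi].
    d_theta = abs((theta_a - theta_b + math.pi) % (2 * math.pi) - math.pi)
    # Case 1: nearly identical thetas and nearly identical rhos.
    same = d_theta <= theta_tol and abs(rho_a - rho_b) <= rho_tol
    # Case 2: thetas about 180 degrees apart and rhos near each other's negative.
    flipped = abs(d_theta - math.pi) <= theta_tol and abs(rho_a + rho_b) <= rho_tol
    return same or flipped

def drop_near_duplicates(lines, **tolerances):
    """Keep the first line of every group of mutually close lines."""
    kept = []
    for line in lines:
        if not any(lines_are_close(line, k, **tolerances) for k in kept):
            kept.append(line)
    return kept

# Hypothetical detections: two pairs of near-duplicates plus one distinct line.
detected = [
    (5.0, 0.01),
    (-4.0, math.pi - 0.01),  # same line as the first, seen from the other side
    (100.0, 1.57),
    (103.0, 1.58),           # near-duplicate of the previous line
    (300.0, 1.57),
]
print(drop_near_duplicates(detected))
```

Keeping the first line of each group is the simplest choice. Averaging the group would be an alternative, but it needs care around the theta wrap-around.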
That's interesting about the theta being the angle between the line and the y-axis.
Generally, theta is visualized as the angle from the x-axis to the line perpendicular to the line in question, and rho is the length of this perpendicular. Thus, theta = 90 degrees and rho = 20 would mean a horizontal line 20 pixels from the origin (y = 20).
A nice image is shown on Hough Transform question
