Unable to properly calculate [a, b] space in Hough transformation for circle detection

Code of function performing accumulation in [a, b] space
minr: minimum radius.
maxr: maximum radius.
Magnitude: Binary output after edge detection using Sobel.
Gradient: Direction map. Calculated with 'atan(Ix./Iy)' where Ix is horizontal and Iy is vertical.
function [A] = accumulation(minr, maxr, magnitude, gradient)
    [rows, cols, ~] = size(magnitude);
    A = zeros(rows, cols, maxr);
    for row = 1:rows
        for col = 1:cols
            for r = minr:maxr
                a = row - r * cos(gradient(row, col));
                b = col - r * sin(gradient(row, col));
                a = round(a);
                b = round(b);
                if (a > 0 && a <= rows && b > 0 && b <= cols)
                    A(a, b, r) = A(a, b, r) + (magnitude(row, col)/r);
                end
            end
        end
    end
end
Output
Although I am using a 3-dimensional accumulator array, the following image is shown in 2D just to illustrate the issue.
Steps performed before accumulation in [a, b] space
Smoothing using 3x3 Gaussian filter.
Edge detection using Sobel operators which returns magnitude and direction map.
Thresholding and thinning.
Source
I am using the Circle Detection Using Hough Transforms documentation by Jaroslav Borovicka for guidance.

One issue I see in your code is that you set only one point per r. You need two. Note that the gradient gives you the orientation of the edge, but you don't know the direction towards the center -- unless you're computing the gradient of an image with solid disks and you know the contrast with the background (i.e. it's always black on white or white on black). Typically one sets a point at distance r in direction theta and another in direction theta + pi.
Another problem you might be having is inaccuracy in the computation of the gradient. If you compute it on a binarized image, the direction of the gradient will be off by a lot. Smoothing your grey-value image before computing the gradient might help (or better, use Gaussian gradients).
"Smoothing using 3x3 Gaussian filter" is wrong by definition. See the link above.
"Thresholding and thinning" -- try not thresholding. Your code is set up to accumulate using gradient magnitude as weights. Use those, they'll help.
Finally, don't use atan, use atan2 instead.
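To make these suggestions concrete, here is a minimal Python sketch (not the poster's MATLAB, just an illustration) of an accumulation loop that votes at both theta and theta + pi, uses atan2 of the Sobel responses, and keeps the unthresholded magnitude as the vote weight. The names gx, gy, accumulate are my own, not from the question.

import numpy as np

def accumulate(gx, gy, minr, maxr):
    """Hough-circle accumulation: vote at the two candidate centers per radius
    (theta and theta + pi), weighted by gradient magnitude.
    gx: horizontal (column) derivative, gy: vertical (row) derivative."""
    rows, cols = gx.shape
    mag = np.hypot(gx, gy)              # gradient magnitude, used as vote weight
    theta = np.arctan2(gy, gx)          # full-quadrant gradient direction
    A = np.zeros((rows, cols, maxr + 1))
    for row in range(rows):
        for col in range(cols):
            if mag[row, col] == 0:
                continue
            for r in range(minr, maxr + 1):
                for sign in (1, -1):    # both directions along the gradient
                    a = int(round(row - sign * r * np.sin(theta[row, col])))
                    b = int(round(col - sign * r * np.cos(theta[row, col])))
                    if 0 <= a < rows and 0 <= b < cols:
                        A[a, b, r] += mag[row, col] / r
    return A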

Related

OpenCv warpPerspective meaning of elements of homography

I have a question regarding the meaning of the elements of a projective transformation matrix, e.g. in a homography used by OpenCV warpPerspective.
I know the basics of an affine transformation, but here I'm more interested in the projective part, meaning the elements A31 and A32 in the matrix shown below:
A11 A12 A13
A21 A22 A23
A31 A32 1
I played around with the values a bit, keeping all the other elements fixed. That is:
1 0 0
0 1 0
A31 A32 1
to have just the projective elements.
But what exactly do the elements A31 and A32 cause? A13 and A23, for example, are responsible for the horizontal and vertical translation.
Is there a simple explanation for these two elements? Something like: a positive value means ...., a negative value means ... .
I hope someone can help me.
Newton's descriptions are correct, but it might be helpful to actually see the transformations to understand what's going on, and how they can work together with other values in the transformation matrix. I'll give some Python/OpenCV examples with animations to show what these values do.
import numpy as np
import cv2
img = cv2.imread('img1.png')
h, w = img.shape[:2]
# initializations
max_m20 = 2e-3
nsteps = 50
M = np.eye(3)
So here I'm setting the transformation matrix to be the identity (no transformation). We want to see the effect of changing the element at (2, 0) in the transformation matrix M, so we'll animate by looping through nsteps values linearly spaced from 0 to max_m20.
for m20 in np.linspace(0, max_m20, nsteps):
    M[2, 0] = m20
    warped = cv2.warpPerspective(img, M, (w, h))
    cv2.imshow('warped', warped)
    k = cv2.waitKey(1)
    if k & 0xFF == ord('q'):
        break
I applied this on an image taken from Oxford's Visual Geometry Group.
So indeed, we can see that this is similar to rotating your camera around a point aligned with the left edge of the image, or rotating the image itself around an axis. However, it is a little different from that. Note that the top edge stays along the top the whole time, which is a little strange. If we were rotating around an axis as above, we would expect the top edge to start coming down on the right side too. Like this:
Well, if you're thinking in terms of transformations, one easy way to get this effect is to take the transformation above and add some skew distortion, so that the top right side is pushed down as the bottom right corner is pushed up. And that's exactly how this view was created:
M = np.eye(3)
max_m20 = 2e-3
max_m10 = 0.6
for m20, m10 in zip(np.linspace(0, max_m20, nsteps), np.linspace(0, max_m10, nsteps)):
    M[2, 0] = m20
    M[1, 0] = m10
    warped = cv2.warpPerspective(img, M, (w, h))
    cv2.imshow('warped', warped)
    k = cv2.waitKey(1)
    if k & 0xFF == ord('q'):
        break
So the right way to think about the perspective in these matrices is, IMO, to consider the skew entries and the last row together. Those are the two places in the homography matrix where angles actually get modified*; otherwise, it's just rotation, scaling, and translation, all of which are angle preserving.
*Note: Actually, angles can be changed in one more way that I didn't mention. Affine transformations allow for non-uniform scaling, which means you can stretch a shape in width but not in height, or vice versa, and that would also change the angles. Imagine a triangle stretched only in width; its angles would change. So it turns out that non-uniform scaling (i.e. when the two diagonal scale entries of the transformation matrix have different values) can also modify angles, in addition to the perspective change and shearing distortions.
Note that in these examples, the same applies to the second entry in the last row together with the other skew location; the only difference is that it happens at the top instead of the left side. Negative values in both cases are akin to rotating the plane about that axis towards the camera instead of away from it.
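For completeness, here is a minimal sketch of the analogous animation for the other perspective entry, M[2, 1]. It reuses the img, w, h, and nsteps variables set up above; max_m21 is just an assumed value mirroring max_m20.

# Animate the other perspective entry, M[2, 1], which tilts the plane
# about the top edge instead of the left edge.
M = np.eye(3)
max_m21 = 2e-3
for m21 in np.linspace(0, max_m21, nsteps):
    M[2, 1] = m21
    warped = cv2.warpPerspective(img, M, (w, h))
    cv2.imshow('warped', warped)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break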
The (3,1) and (3,2) elements of the homography matrix change the plane of the image; that's the difference between affine and homography matrices. For instance, consider this: A31 changes the plane of your image along the left edge. It's like sticking your image to a pole like a flag and rotating it; positive is clockwise and negative is the reverse. The other element does the same from the top edge. But together they set a plane for your image. That's the simplest way I could put it.

Code for a multiple quadratic (or polynomial) least squares (surface fit)?

For a machine vision project I am trying to search image data for quadratic surfaces (f(x,y) = Ax^2+Bx+Cy^2+Dy+Exy+F). My plan is to iterate through regions of the data and perform a surface fit, look at the error, and see if it's a continuous surface (which would probably indicate a feature in the image).
I was previously able to find quadratic curves (f(x) = Ax^2+Bx+C) in the image data by sampling lines, using the equations on this site (link).
This worked well and was promising, but it would be much more useful for my task to find 2D regions that form continuous surfaces.
I see lots of articles indicating that least squares regression scales up to multiple dimensions, but I'm not able to find code for this. Hopefully there is a "closed form" (non-iterative, just computed from your data points) solution, like the one described above for 1D data. Does anybody know of some source or pseudocode that accomplishes this? Thanks.
(Sorry if my terminology is a bit off.)
I'm not sure what your background is, but if you know some linear algebra you will find the Wikipedia article on linear least squares useful.
Let's take the following example. Say we have the following image
and we want to know how well it fits a 2D quadratic function in a least squares sense.
Probably the most straightforward way to solve the problem is to compute the optimal coefficients in a least squares sense, then check the error.
First we need to describe the matrices.
Let X be a matrix containing every x,y coordinate in the image, taking the form
X = [x1 x1^2 y1 y1^2 x1*y1 1;
     x2 x2^2 y2 y2^2 x2*y2 1;
     ...
     xN xN^2 yN yN^2 xN*yN 1];
For the example image above, X would be a 100x6 matrix.
Let y be the image intensity values in a vector of the form
y = [img(x1,y1);
     img(x2,y2);
     ...
     img(xN,yN)]
In this case y is a 100 element column vector.
We want to minimize the least squares objective function S with respect to the vector of coefficients b
S(b) = |y - X*b|^2
where |.| is the L2 norm and b is the vector of desired coefficients, ordered to match the columns of X:
b = [A;
     B;
     C;
     D;
     E;
     F]
Taking the vector derivative of S(b) with respect to b, setting it to zero, and solving for b leads to the standard least squares solution
b = inv(X'*X)*X'*y
where inv is the matrix inverse, ' is transpose, and * is matrix multiplication.
MATLAB example.
% Generate an image
% define x,y coordinates for each location in the image
[x,y] = meshgrid(1:10,1:10);
% true coefficients
b_true = [0.1 0.5 0.3 -0.4 0.4 124];
% magnitude of noise
P = 2;
% create image
img = b_true(1).*x + b_true(2).*x.^2 + b_true(3).*y + b_true(4).*y.^2 + b_true(5).*x.*y + b_true(6);
noise = P*randn(10,10);
img = img + noise;
% Begin least squares optimization
% create matrices
X = [x(:) x(:).^2 y(:) y(:).^2 x(:).*y(:) ones(size(x(:)))];
y = img(:);
% estimated coefficients
b = (X.'*X)\(X.')*y
% mean square error (expected to be near P^2)
E = 1/numel(y) * sum((y - X*b).^2)
Output
b =
0.0906
0.5093
0.1245
-0.3733
0.3776
124.5412
E =
3.4699
In your application you would probably want to define some threshold such that when E < threshold you accept the image (or image region) as a quadratic polynomial.
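If it helps, here is a short Python/NumPy sketch of the same closed-form fit applied to one image region, with a simple mean-square-error threshold. The function and parameter names are my own, not from the answer above; the coefficient order follows the columns of X as in the MATLAB example.

import numpy as np

def fit_quadratic_surface(region, mse_threshold):
    """Least-squares fit of f(x,y) = b1*x + b2*x^2 + b3*y + b4*y^2 + b5*x*y + b6
    to a 2D image region; returns coefficients, error, and an accept flag."""
    rows, cols = region.shape
    x, y = np.meshgrid(np.arange(cols), np.arange(rows))
    x, y = x.ravel().astype(float), y.ravel().astype(float)
    X = np.column_stack([x, x**2, y, y**2, x * y, np.ones_like(x)])
    z = region.ravel().astype(float)
    b, *_ = np.linalg.lstsq(X, z, rcond=None)   # closed-form least squares
    mse = np.mean((z - X @ b) ** 2)
    return b, mse, mse < mse_threshold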

Are there standard methods for computing the direction field of an image?

I'd like to compute a sort of direction field on a 2D image, as (poorly) illustrated in this Photoshop mockup. NOTE: this is NOT a vector field as you learn about in differential equations. Instead, it is something that draws along the lines one would see if they computed level sets of the image.
Are there known methods for obtaining this type of direction field (red lines) of an image? It almost behaves like the normal to the gradient, but that isn't exactly it either, since there are places where the gradient is zero and I'd like the direction field defined at those locations as well.
I was able to find a paper on how to do this for fingerprint processing that went into enough detail that their results were repeatable. It's unfortunately behind a paywall, but here it is for anyone interested and able to access the full text:
Systematic methods for the computation of the directional fields and singular points of fingerprints
EDIT: As requested, here is a quick and dirty summary (in Python) of how this is achieved in the above paper.
A naive approach would be to average the gradient in a small square neighborhood around the target pixel, much like the superimposed grid on the image in the question, and then compute the normal. However, if you simply average the gradient, opposite gradients in the region can cancel each other out (e.g. when computing the orientation along a ridge). Thus, it is common to work with squared gradients, since gradients pointing in opposite directions then become aligned. There is a clever formula for the squared gradient in terms of the original gradient (Gx, Gy); I won't give the derivation, but the formula is
[Gs_x, Gs_y] = [Gx^2 - Gy^2, 2*Gx*Gy]
which is exactly what the numerator and denominator in the code below compute.
Now, take the sum of the squared gradients over the region (with some piecewise-defined compensation for how the angles wrap around). Finally, halving the arctangent of that sum gives the orientation field.
If you run the following code on a smooth grayscale bitmap image with the grid-size chosen appropriately and then plot the orientation field O alongside your original image, you'll see how the orientation field more or less gives the angles I asked about in my original question.
from scipy import misc
import numpy as np
import math

# Import the grayscale image
# (note: scipy.misc.imread was removed in recent SciPy; imageio.imread is a common replacement)
bmp = misc.imread('path/filename.bmp')

# Compute the gradient - VERY important to convert to floats!
grad = np.gradient(bmp.astype(float))

# Set the block size (superimposed grid on the sample image in the question)
blockRadius = 5

# Compute the orientation field. Result will be a matrix of angles in [0, \pi),
# one for each pixel in the original (grayscale) image.
O = np.zeros(bmp.shape)
for x in range(0, bmp.shape[0]):
    for y in range(0, bmp.shape[1]):
        numerator = 0.
        denominator = 0.
        # sum the squared-gradient components over the block
        for i in range(max(0, x - blockRadius), min(bmp.shape[0], x + blockRadius)):
            for j in range(max(0, y - blockRadius), min(bmp.shape[1], y + blockRadius)):
                numerator = numerator + 2. * grad[0][i, j] * grad[1][i, j]
                denominator = denominator + (math.pow(grad[0][i, j], 2.) - math.pow(grad[1][i, j], 2.))
        # half-angle arctangent, handled piecewise (equivalent to 0.5*atan2)
        if denominator == 0:
            O[x, y] = 0
        elif denominator > 0:
            O[x, y] = (1. / 2.) * math.atan(numerator / denominator)
        elif numerator >= 0:
            O[x, y] = (1. / 2.) * (math.atan(numerator / denominator) + math.pi)
        else:
            O[x, y] = (1. / 2.) * (math.atan(numerator / denominator) - math.pi)

# Map the angles into [0, \pi)
for x in range(0, bmp.shape[0]):
    for y in range(0, bmp.shape[1]):
        if O[x, y] <= 0:
            O[x, y] = O[x, y] + math.pi
Cheers!
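As a side note, the same per-pixel loops can be written as a few array operations. Here is a minimal vectorized sketch, assuming SciPy is available and using a box average over the block; the function name is mine, not from the paper.

import numpy as np
from scipy.ndimage import uniform_filter

def orientation_field(img, block_radius=5):
    """Half-angle of the block-averaged squared gradient, in [0, pi)."""
    g0, g1 = np.gradient(img.astype(float))              # gradients along axis 0 and 1
    size = 2 * block_radius
    num = uniform_filter(2.0 * g0 * g1, size=size)        # averaged 2*Gx*Gy
    den = uniform_filter(g0 * g0 - g1 * g1, size=size)    # averaged Gx^2 - Gy^2
    return np.mod(0.5 * np.arctan2(num, den), np.pi)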

How to efficiently find and remove 1 pixel bands of image intensity changes?

We're having some visual artifacts on a normal map for a shader because of some single-pixel bands which contrast strongly with their surroundings. Just to be clear, edges are not an issue, only these single-pixel bands.
Using something like typical Sobel edge detection would not work in this case because, centered on such a band, it would detect 0. I can think of other modifications to the kernel which might work, such as
-1 -2 -1
2 4 2
-1 -2 -1
but I assumed there was likely a "correct" mathematical way to do such an operation.
In the end, I want to smooth these lines out using the surrounding pixels (so a selective blur). The lines could appear in any orientation, so if I were to use the above kernel, I would need to apply it in both directions and add the results to get the line intensity, similar to applying the Sobel kernel.
I assume that you have lines of 1-pixel width in your image that are brighter or darker than their surroundings, and that you want to find them, remove them from the image, and replace the removed pixels by an average of the local neighborhood.
I developed an algorithm for this and it works on my example data (since you did not give any data). It has two parts:
Identification of lines
I could not think of a simple yet effective filter to detect lines (they are connected, so one would probably need to look at correlations), so I used a simple single-pixel detection filter:
-1 -1 -1
-1 8 -1
-1 -1 -1
and then some suitable thresholding.
Extrapolation of data from outside a mask into the mask
A very elegant solution (using only convolutions) is to take the data outside the mask and convolve it with a Gaussian, then take the negated mask (1 outside the mask, 0 inside) and convolve it with the very same Gaussian, and then divide the two pixelwise. The result within the mask is the desired blurring.
What this is mathematically: a weighted averaging of the data, often called normalized convolution.
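To make the convolution trick concrete, here is a minimal Python sketch of this step, assuming a float grayscale array and a boolean mask of the detected line pixels; the function name and sigma are my own choices.

import numpy as np
from scipy.ndimage import gaussian_filter

def fill_masked(data, mask, sigma=1.0):
    """Normalized convolution: replace masked pixels with a Gaussian-weighted
    average of the surrounding unmasked pixels."""
    keep = (~mask).astype(float)                    # 1 outside the mask, 0 inside
    num = gaussian_filter(data * keep, sigma)       # smoothed data, zeros inside the mask
    den = gaussian_filter(keep, sigma)              # smoothed weights
    smoothed = num / np.maximum(den, 1e-12)         # weighted average (avoid division by zero)
    return np.where(mask, smoothed, data)           # only replace pixels inside the mask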
Here is my phantom data:
And this is the identification of the lines
And the final result shows that the distortion has been suppressed tenfold:
And finally my code (in Matlab):
%% create phantom data with lines (1pixel wide bands)
[x, y] = ndgrid(1:100, 1:100);
original = 3 * x - 2 * y + 100 * sin(x / 2) + 120 * cos(y / 3); % funny shapes
bw = original > mean(original(:)); % black and white
distortion = bwmorph(bw,'remove'); % some lines
data = original + max(original(:)) * distortion; % phantom
% show
figure();
subplot(1,3,1); imagesc(original); axis image; colormap(hot); title('original');
subplot(1,3,2); imagesc(distortion); axis image; title('distortion');
subplot(1,3,3); imagesc(data); axis image; title('image');
%% line detection
% filter by single pixel filter
pixel_filtered = filter2([-1,-1,-1;-1,8,-1;-1,-1,-1], data);
% create mask by simple thresholding
mask = pixel_filtered > 0.2 * max(pixel_filtered(:));
% show
figure();
subplot(1,2,1); imagesc(pixel_filtered); axis image; colormap(hot); title('filtered');
subplot(1,2,2); imagesc(mask); axis image; title('mask');
%% line removal and interpolation
% smoothing kernel: gaussian
smooth_kernel = fspecial('gaussian', [3, 3], 1);
smooth_kernel = smooth_kernel ./ sum(smooth_kernel(:)); % normalize to one
% smooth image outside mask and divide by smoothed negative mask
smoothed = filter2(smooth_kernel, data .* ~mask) ./ filter2(smooth_kernel, ~mask);
% within the mask, set the data to the smoothed values
reconstruction = data .* ~mask + smoothed .* mask;
% show
figure();
subplot(1,3,1); imagesc(reconstruction); axis image; colormap(hot); title('reconstruction');
subplot(1,3,2); imagesc(original); axis image; title('original');
subplot(1,3,3); imagesc(reconstruction - original); axis image; title('difference');

Finding isocurve from triangulation with known uv at all vertices

I have a 3D triangulated mesh which is similar to a curved piece of paper, in that it has 4 edges and lives in 3-dimensional space. The edges may have different lengths and curve differently, but the mesh could theoretically be continuously morphed to look like a piece of paper. A uv coordinate has been assigned to every vertex, and the range of u and v is between 0 and 1. Some vertices are obviously on the border. For the bottom border, u is in the range [0,1] and v is 0. Top border vertices have u within [0,1] and v = 1. The left and right borders have u = 0 or u = 1 (respectively) with v within [0,1].
Now think about the "isocurve" where u = 0.5. This would be the "line" (or collection of line segments?) from bottom to top of the "middle" of the surface. How would I go about finding that?
Or, let's say I wanted to find the 3D coordinate corresponding to the uv coordinate (0.2,0.7). How would I get there?
I don't want to implement this by putting data through a renderer (OpenGL, etc). I'm sure there must be a standard method. It feels like the inverse of a texture mapping function.
Essentially both of your questions boil down to the same thing: how to convert between UV and XYZ coordinates.
This is an interpolation problem. Considering a single triangle in your mesh, you know both the UV and XYZ coordinates at the 3 vertices. As such, you have the right amount of data to interpolate X,Y,Z as linear functions of U,V:
X(U,V) = a0 + a1*U + a2*V
Y(U,V) = b0 + b1*U + b2*V
Z(U,V) = c0 + c1*U + c2*V
The problem then becomes how to determine the coefficients ai,bi,ci. This can be done by solving a set of linear equations based on the given vertex data. For example, the X coefficients for a given triangle can be found by solving:
[X1]   [1 U1 V1]   [a0]
[X2] = [1 U2 V2] * [a1]
[X3]   [1 U3 V3]   [a2]
Once you have all of these coefficients for each triangle in the mesh you can then determine an XYZ coordinate for any UV pair:
1. Locate the triangle that contains the UV point.
2. Evaluate the X(U,V),Y(U,V),Z(U,V) functions for the given triangle.
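If it's useful, here is a minimal Python/NumPy sketch of step 2 for a single triangle. The function name and arguments are mine, and it assumes you have already located the containing triangle as in step 1.

import numpy as np

def uv_to_xyz(tri_uv, tri_xyz, u, v):
    """Linearly interpolate the XYZ position of a UV point inside one triangle.
    tri_uv: (3, 2) array of the triangle's vertex UV coordinates.
    tri_xyz: (3, 3) array of the corresponding vertex XYZ coordinates."""
    # Solve [1 Ui Vi] * coeffs = [Xi Yi Zi] for all three coordinates at once
    A = np.column_stack([np.ones(3), tri_uv])     # 3x3 system matrix
    coeffs = np.linalg.solve(A, tri_xyz)          # columns hold (a0,a1,a2), (b0,b1,b2), (c0,c1,c2)
    return np.array([1.0, u, v]) @ coeffs         # evaluate X(U,V), Y(U,V), Z(U,V)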
