Perfect homography with openCV. Hough? - opencv

For an OpenCV project, I need to find a "perfect" homography from descriptor matching. I pushed the parameters: many corners, a very low reprojection threshold...
It's close but I need almost pixel perfect homography.
I call the image I'm looking for the marker, and the image that contains the marker the query.
My idea was to refine the positions of the 4 corners of the marker in the query image, and then recalculate the homography from these 4 refined corners.
I was thinking of using Hough to detect line intersections. That gives me corners which are indeed good candidates.
Now I need a score function to assess which of these candidates is the "best". Here is what I tried:
1/
- Let's call H a homography.
- test = query + H*marker. So if my homography were "perfect", test would be identical to query (except for shadows...).
2/ Then I calculate the "difference" naively, as below. I sum the absolute differences and then divide by the area (otherwise the smaller the quadrilateral, the better the score). It is not reliable... I gave more weight to large differences by raising them to a power, and it still fails.
It really seemed like a good idea to me, but it's an epic fail for now: the difference does not help find the "better" corners. Any ideas?
Thank you very much,
Michaël
def diffImageScore(testImage, workingImage, queryPersp):
    # Area of the projected marker quadrilateral, used to normalize the score.
    area = quadrilatereArea(queryPersp) * 1.0
    score = 0
    height, width, depth = testImage.shape
    img1Gray = cv2.cvtColor(cv2.blur(testImage, (3, 3)), cv2.COLOR_BGR2GRAY)
    img2Gray = cv2.cvtColor(cv2.blur(workingImage, (3, 3)), cv2.COLOR_BGR2GRAY)
    for i in range(0, height):
        for j in range(0, width):
            # Cast to int first so the uint8 subtraction cannot wrap around.
            s1 = abs(int(img1Gray[i, j]) - int(img2Gray[i, j]))
            # Cube the difference to give more weight to large errors.
            s1 = pow(s1, 3)
            score += s1
    return score / area
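The helper quadrilatereArea used above is not shown in the question. A minimal sketch of what it might look like, assuming queryPersp can be read as four (x, y) corners listed in order around the perimeter, is the shoelace formula. The name quadrilateral_area and the input layout are assumptions made for illustration only.

```python
def quadrilateral_area(points):
    """Area of a simple polygon given as [(x, y), ...] in perimeter order (shoelace formula)."""
    n = len(points)
    twice_area = 0.0
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# A 4x3 axis-aligned rectangle and a sheared parallelogram.
rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
skewed = [(0, 0), (2, 0), (3, 1), (1, 1)]
print(quadrilateral_area(rect), quadrilateral_area(skewed))
```

If the corners are not in perimeter order, the polygon self-intersects and the result is no longer the area of the marker region.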

Related

How exactly does dp parameter in cv::HoughCircles work?

I read a similar question on Stack Overflow. I tried, but I still cannot understand how it works.
I read the OpenCV documentation for cv::HoughCircles; here is its explanation of the dp parameter:
Inverse ratio of the accumulator resolution to the image resolution. For example, if dp=1 , the accumulator has the same resolution as the input image. If dp=2 , the accumulator has half as big width and height.
Here is my question. For example, if dp = 1, the accumulator has the same size as the image, so there is a consistent one-to-one match between pixels in the image and positions in the accumulator. But if dp = 2, how are they matched?
Thanks in advance.
There is no such thing as a one-to-one match here. You have an image with pixels and a Hough space, which is used for voting for circles. This parameter is just a convenient way to specify the size of the Hough space relative to the image size.
Please take a look at this answer for more details.
EDIT:
Your image has (x,y)-coordinates. Your circle Hough space has (a,b,r)-coordinates, where (a,b) are the circle centers and r are the radii. Let's say you find an edge pixel. Now you vote for each circle that could pass through this edge pixel. I found this nice picture of Hough space with a single vote, i.e. a single edge pixel (continuous case). In practice this vote happens within the 3D accumulator matrix. You can think of it as a rasterization of this continuous case.
Now, as already mentioned, the dp parameter defines the size of this accumulator matrix relative to your image size. The bigger the dp parameter, the lower the resolution of your rasterization. It's like taking photos at different resolutions. If you downsize your photo, multiple pixels are reduced to a single one. The same happens if you shrink your accumulator matrix, i.e. increase your dp parameter. Multiple votes for different circle centers (which lie next to each other) and radii (which are of similar size) are now merged, i.e. you get less accurate circle parameters, but a more "robust" voting.
Please be aware that the OpenCV implementation is a little bit more complicated (they use the Hough gradient method instead of the standard Hough transform) but the considerations still apply.
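To make the merging of votes concrete, here is a small sketch of a center accumulator whose resolution is controlled by dp. It is a simplified 2D illustration of the idea described above, not OpenCV's actual Hough gradient implementation; the function names and the way image coordinates are mapped to cells (integer division by dp) are assumptions.

```python
import math


def accumulate_centers(center_votes, image_size, dp):
    """Vote for circle centers in an accumulator that is dp times coarser than the image."""
    width, height = image_size
    acc_w = math.ceil(width / dp)
    acc_h = math.ceil(height / dp)
    acc = [[0] * acc_w for _ in range(acc_h)]
    for x, y in center_votes:
        # Image coordinates map to accumulator cells by dividing by dp.
        acc[int(y / dp)][int(x / dp)] += 1
    return acc


def peak_votes(acc):
    """Highest vote count found in any accumulator cell."""
    return max(max(row) for row in acc)


# Four candidate centers that lie next to each other in a 20x20 image.
votes = [(10, 10), (11, 10), (10, 11), (11, 11)]
fine = accumulate_centers(votes, (20, 20), dp=1)
coarse = accumulate_centers(votes, (20, 20), dp=2)
print(peak_votes(fine), peak_votes(coarse))  # 1 4
```

With dp=1 every candidate lands in its own cell, so no cell stands out. With dp=2 the four neighbours share one cell, which gives a stronger peak at the cost of a center known only to within two pixels.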

SIFT matches and recognition?

I am developing an application where I am using SIFT + RANSAC and homography to find an object (OpenCV C++, Java). The problem I am facing is that when there are many outliers, RANSAC performs poorly.
For this reason I would like to try what the author of SIFT said to be pretty good: voting.
I have read that we should vote in a 4-dimensional feature space, where the 4 dimensions are:
Location [x, y] (some say translation)
Scale
Orientation
While with OpenCV it is easy to get the match scale and orientation with:
cv::KeyPoint::octave
cv::KeyPoint::angle
I am having a hard time understanding how I can calculate the location.
I have found an interesting slide where with only one match we are able to draw a bounding box:
But I don't get how I could draw that bounding box with just one match. Any help?
You are looking for the largest set of matched features that fit a geometric transformation from image 1 to image 2. In this case, it is the similarity transformation, which has 4 parameters: translation (dx, dy), scale change ds, and rotation d_theta.
Let's say you have matched two features: f1 from image 1 and f2 from image 2. Let (x1,y1) be the location of f1 in image 1, let s1 be its scale, and let theta1 be its orientation. Similarly you have (x2,y2), s2, and theta2 for f2.
The translation between two features is (dx,dy) = (x2-x1, y2-y1).
The scale change between two features is ds = s2 / s1.
The rotation between two features is d_theta = theta2 - theta1.
So, dx, dy, ds, and d_theta are the dimensions of your Hough space. Each bin corresponds to a similarity transformation.
Once you have performed Hough voting, and found the maximum bin, that bin gives you a transformation from image 1 to image 2. One thing you can do is take the bounding box of image 1 and transform it using that transformation: apply the corresponding translation, rotation and scaling to the corners of the image. Typically, you pack the parameters into a transformation matrix, and use homogeneous coordinates. This will give you the bounding box in image 2 corresponding to the object you've detected.
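The formulas above can be sketched in a few lines. This is an illustration under stated assumptions, not the answer's own code: features are represented as (x, y, scale, angle in degrees) tuples, and the transformation is applied by rotating and scaling about f1 and then translating by (dx, dy), so that f1 itself lands exactly on f2. The function names are hypothetical.

```python
import math


def similarity_from_match(f1, f2):
    """f1, f2: (x, y, scale, angle_deg) of one matched feature pair."""
    x1, y1, s1, a1 = f1
    x2, y2, s2, a2 = f2
    dx, dy = x2 - x1, y2 - y1
    ds = s2 / s1
    d_theta = (a2 - a1) % 360.0
    return dx, dy, ds, d_theta


def transform_point(p, f1, params):
    """Map point p of image 1 into image 2: rotate and scale about f1, then translate."""
    dx, dy, ds, d_theta = params
    t = math.radians(d_theta)
    px, py = p[0] - f1[0], p[1] - f1[1]
    rx = ds * (math.cos(t) * px - math.sin(t) * py)
    ry = ds * (math.sin(t) * px + math.cos(t) * py)
    return (f1[0] + dx + rx, f1[1] + dy + ry)


f1 = (50, 40, 2.0, 10.0)
f2 = (120, 90, 4.0, 100.0)
params = similarity_from_match(f1, f2)

# Bounding box of a 100x80 template, mapped into image 2.
corners = [(0, 0), (100, 0), (100, 80), (0, 80)]
box = [transform_point(c, f1, params) for c in corners]
print(params)
print(box)
```

This is how a single match is enough to draw a bounding box: one match fixes all four parameters. The sign convention of the rotation depends on how the detector reports angles in image coordinates (y pointing down), so it may need to be flipped for a given library.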
When using the Hough transform, you create a signature storing the displacement vectors of every feature from the template centroid (either (w/2,h/2) or computed with the help of central moments).
E.g. for 10 SIFT features found on the template, their positions relative to the template's centroid form a vector<{a,b}>. Now, let's search for this object in a query image: every SIFT feature found in the query image that is matched with one of the template's 10 casts a vote for its corresponding centroid.
votemap(feature.x - a*, feature.y - b*)+=1 where a,b correspond to this particular feature's vector.
If several of those features cast their votes at the same point (clustering is essential), you have found an object instance.
Signature and voting are reverse procedures. Let's assume V=(-20,-10). During the search in the novel image, when the two matches are found, we detect their orientation and size and cast the respective vote. E.g. for the right box, the centroid will be V'=0.5*(20*cos(-10°) - 10*sin(-10°), 20*sin(-10°) + 10*cos(-10°)) away from the SIFT feature, i.e. the vector (+20,+10) rotated by -10 degrees and halved, because the box is half the size and rotated by -10 degrees.
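A small sketch of the centroid voting described above, under stated assumptions: each match is given as (qx, qy, a, b, scale_ratio, rot_deg), where (qx, qy) is the feature position in the query image, (a, b) is the stored displacement of the template feature from the template centroid, and the last two values are the relative scale and rotation of the match. Votes are clustered by rounding to a coarse grid; the bin size of 4 pixels and the function name are illustrative choices.

```python
import math
from collections import Counter


def vote_for_centroid(matches, bin_size=4):
    """Return ((bin_x, bin_y), votes) for the most voted centroid cell."""
    votes = Counter()
    for qx, qy, a, b, s, rot in matches:
        t = math.radians(rot)
        # (a*, b*): the stored displacement, rotated and scaled like the match.
        a_star = s * (math.cos(t) * a - math.sin(t) * b)
        b_star = s * (math.sin(t) * a + math.cos(t) * b)
        cx, cy = qx - a_star, qy - b_star
        votes[(round(cx / bin_size), round(cy / bin_size))] += 1
    return votes.most_common(1)[0]


matches = [
    (80, 90, -20, -10, 1.0, 0.0),   # three consistent matches: centroid at (100, 100)
    (115, 105, 15, 5, 1.0, 0.0),
    (100, 130, 0, 30, 1.0, 0.0),
    (10, 10, 5, 5, 1.0, 0.0),       # an outlier voting elsewhere
]
print(vote_for_centroid(matches))  # ((25, 25), 3)
```

Votes that fall close to a cell boundary can be split between neighbouring cells, which is why a real implementation would usually smooth the vote map or vote into adjacent cells as well.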
To complete Dima's answer, one needs to add that the 4D Hough space is quantized into a (possibly small) number of 4D boxes, where each box corresponds to the similarity given by its center.
Then, for each possible similarity obtained via a tentative matching of features, add 1 to the corresponding box (or cell) in the 4D space. The output similarity is given by the cell with the most votes.
In order to compute the transform from 1 match, just use the formulas in Dima's answer. For several pairs of matches, you may need to use a least squares fit.
Finally, the transform can be applied with the function cv::warpPerspective(), where the third row of the perspective matrix is set to [0,0,1].
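The quantization step might be sketched as follows. Each tentative match contributes one (dx, dy, ds, d_theta) tuple, which is mapped to a 4D cell, and the cell with the most votes wins. The bin widths (20 pixels, a factor of 2 in scale, 30 degrees) and the function name are illustrative assumptions, not values taken from the answers.

```python
import math
from collections import Counter


def hough_vote_4d(param_list, xy_bin=20.0, log_scale_bin=1.0, angle_bin=30.0):
    """Quantize (dx, dy, ds, d_theta) tuples into 4D cells and return the best cell."""
    votes = Counter()
    for dx, dy, ds, d_theta in param_list:
        cell = (
            math.floor(dx / xy_bin),
            math.floor(dy / xy_bin),
            math.floor(math.log2(ds) / log_scale_bin),  # scale is binned on a log axis
            math.floor((d_theta % 360.0) / angle_bin),
        )
        votes[cell] += 1
    return votes.most_common(1)[0]


params = [
    (70, 50, 2.2, 95.0),     # three roughly consistent similarities
    (72, 48, 2.5, 100.0),
    (68, 55, 3.0, 110.0),
    (-10, 200, 0.5, 300.0),  # outliers
    (5, 5, 1.0, 0.0),
]
cell, count = hough_vote_4d(params)
print(cell, count)  # (3, 2, 1, 3) 3
```

The matches that voted for the winning cell can then be passed to a least squares fit to refine the transform.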

How to detect 45 degree edges in an image

If, instead of getting all edges, I only want edges that make 45 degree angles, what is a method to detect these?
Would it be possible to detect all edges, then somehow run a constrained hough transform to detect which edges form 45 degrees?
What is wrong with using a diagonal structuring element and simply convolving the image?
Details
Please read here and it should become clear how to build the structuring element. If you are familiar with convolution, then you can build a simple structure matrix which amplifies diagonals without any theory:
{ 0, 1, 2},
{-1, 0, 1},
{-2, -1, 0}
The idea is: you want to amplify the pixels in the image where what lies 45deg below is different from what lies 45deg above. That's the case when you are at a 45deg edge.
As an example, the following picture
convolved with the above matrix gives a gray-level image in which the highest pixel values lie on the lines that are exactly 45deg.
Now the approach is to simply binarize the image. Et voilà.
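A library-free sketch of this filter-then-binarize approach on tiny synthetic images. The kernel is the one given above, applied as a plain sliding-window sum (correlation) over interior pixels; the test images and the threshold of 900 are assumptions chosen for illustration.

```python
KERNEL_45 = [[0, 1, 2],
             [-1, 0, 1],
             [-2, -1, 0]]


def correlate3x3(img, kernel):
    """Sliding-window weighted sum over interior pixels (borders are left at 0)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(kernel[i][j] * img[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out


def binarize(resp, thresh):
    """Keep only pixels whose absolute response reaches the threshold."""
    return [[1 if abs(v) >= thresh else 0 for v in row] for row in resp]


# A diagonal step edge (bright above the main diagonal) and a horizontal step edge.
diag = [[255 if x > y else 0 for x in range(7)] for y in range(7)]
horiz = [[255 if y < 3 else 0 for x in range(7)] for y in range(7)]

resp_diag = correlate3x3(diag, KERNEL_45)
resp_horiz = correlate3x3(horiz, KERNEL_45)
print(max(map(max, resp_diag)), max(map(max, resp_horiz)))  # 1020 765
```

Note that the horizontal edge still produces a sizeable response (765 against 1020 for the diagonal), so the binarization threshold has to sit between the two; here a threshold of 900 keeps only the diagonal edge. The kernel responds to one of the two diagonal directions; its mirror image would be needed for the other.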
First of all, it is possible to do this as post processing.
The result of Hough is in the parameter space of (angle,radius).
So you can simply take a slice of, say, angle=(45-5,45+5) over all radii.
An alternative is to make the edge detection itself output only 45/135 degree edges.
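The post-processing variant can be sketched as a simple filter over detected lines. This assumes the Hough output is available as (rho, theta) pairs with theta in radians; the function name and the tolerance handling are illustrative.

```python
import math


def lines_near_angle(lines, target_deg=45.0, tol_deg=5.0):
    """Keep (rho, theta) lines whose theta is within tol_deg of target_deg (mod 180)."""
    kept = []
    for rho, theta in lines:
        deg = math.degrees(theta) % 180.0
        diff = abs(deg - target_deg)
        diff = min(diff, 180.0 - diff)  # angles wrap around at 180 degrees
        if diff <= tol_deg:
            kept.append((rho, theta))
    return kept


lines = [
    (10.0, math.radians(44)),
    (20.0, math.radians(90)),
    (5.0, math.radians(47)),
    (7.0, math.radians(135)),
]
print([rho for rho, _ in lines_near_angle(lines)])                    # [10.0, 5.0]
print([rho for rho, _ in lines_near_angle(lines, target_deg=135.0)])  # [7.0]
```

In the usual normal parametrization, theta is the angle of the line's normal rather than of the line itself, so depending on the axis convention a 45 degree line may show up at theta = 45 or theta = 135 degrees. Filtering for both values covers either case.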
If you use a kernel but want line equations, then you'll still have to perform a line fit after the edge pixels are found. If you're certain the lines are exactly 45 degrees, then knowing the (x,y) point on any discovered line or line segment is sufficient to find the line equation.
Hough (rho, theta) parameter space can use whatever ranges of rho and theta that you'd like. You might preprocess the image to favor neighbor pixels at the proper angle. For example, give a "bonus point" to an edge pixel if it has 8-neighbors at the appropriate angle. You can certainly mix a kernel-based method (such as halirutan suggested) with a parametric or parameterless Hough algorithm.
A recent implementation of Hough runs at blazing fast speeds, so if you're looking for a quick solution you might download the open source code and then simply filter the output.
"Real-time line detection through an improved Hough transform voting scheme"
by Fernandes and Oliveira
http://www.ic.uff.br/~laffernandes/projects/kht/index.html

Opencv match contour image

I'd like to know what would be the best strategy to compare a group of contours (in fact, edges resulting from Canny edge detection) from two pictures, in order to know which pair is more alike.
I have this image:
http://i55.tinypic.com/10fe1y8.jpg
And I would like to know how can I calculate which one of these fits best to it:
http://i56.tinypic.com/zmxd13.jpg
(it should be the one on the right)
Is there any way to compare the contours as a whole?
I can easily rotate the images but I don't know what functions to use in order to calculate that the reference image on the right is the best fit.
Here is what I've already tried using OpenCV:
matchShapes function - I tried this function using 2 gray scales images and I always get the same result in every comparison image and the value seems wrong as it is 0,0002.
So what I realized about matchShapes, but I'm not sure it's the correct assumption, is that the function works with pairs of contours and not full images. Now this is a problem because although I have the contours of the images I want to compare, they are hundreds and I don't know which ones should be "paired up".
So I also tried to compare all the contours of the first image against the other two in a for loop, but I might be comparing, for example, the contour of the 5 against the circle contour of the two reference images and not against the contour of the 2.
I also tried the simple cv::compare function and matchTemplate, neither with success.
Well, for this you have a couple of options depending on how robust you need your approach to be.
Simple Solutions (with assumptions):
For these methods, I'm assuming the images you supplied are what you are working with (i.e., the objects are already segmented and approximately the same scale). Also, you will need to correct the rotation (at least in a coarse manner). You might do something like iteratively rotating the comparison image every 10, 30, 60, or 90 degrees, or whatever coarseness you feel you can get away with.
For example,
bestMetric = infinity
for(degrees = 0; degrees < 360; degrees += 10)
    coinRot = rotate(compareCoin, degrees)
    // you could also try Cosine Similarity, or even matchTemplate here.
    metric = SAD(coinRot, targetCoin)
    // SAD is a distance: the smaller it is, the better the match
    if(metric < bestMetric)
        bestMetric = metric
        coinRotation = degrees
Sum of Absolute Differences (SAD): This will allow you to quickly compare the images once you have determined an approximate rotation angle.
Cosine Similarity: This operates a bit differently by treating each image as a 1D vector, and then computing the angle between the two vectors in that high-dimensional space. The better the match, the smaller the angle will be.
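Both metrics, together with the rotation search, can be sketched without any library. This is a toy version under stated assumptions: images are square lists of lists of gray values, and the search only tries multiples of 90 degrees (a stand-in for the finer 10 degree steps suggested above, which would need a real image rotation routine). All function names are hypothetical.

```python
import math


def sad(img_a, img_b):
    """Sum of absolute differences: 0 for identical images, larger means less alike."""
    return sum(abs(a - b)
               for row_a, row_b in zip(img_a, img_b)
               for a, b in zip(row_a, row_b))


def cosine_similarity(img_a, img_b):
    """Cosine of the angle between the flattened images: 1.0 means same direction."""
    va = [p for row in img_a for p in row]
    vb = [p for row in img_b for p in row]
    dot = sum(a * b for a, b in zip(va, vb))
    norm = math.sqrt(sum(a * a for a in va)) * math.sqrt(sum(b * b for b in vb))
    return dot / norm if norm else 0.0


def rotate90_cw(img):
    """Rotate a list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def best_rotation(candidate, target):
    """Try 0/90/180/270 degree rotations and keep the one with the lowest SAD."""
    best_deg, best_sad = None, float("inf")
    rotated = candidate
    for deg in (0, 90, 180, 270):
        score = sad(rotated, target)
        if score < best_sad:
            best_deg, best_sad = deg, score
        rotated = rotate90_cw(rotated)
    return best_deg, best_sad


target = [[0, 255], [0, 0]]
candidate = rotate90_cw(target)          # the target turned by 90 degrees
print(best_rotation(candidate, target))  # (270, 0)
```

Because SAD is a distance, the search keeps the minimum; with cosine similarity the comparison would be reversed and the maximum kept.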
Complex Solutions (possibly more robust):
These solutions will be more complex to implement, but will probably yield more robust classifications.
Hausdorff Distance: This answer will give you an introduction to using this method. This solution will probably also need the rotation correction to work properly.
Fourier-Mellin Transform: This method is an extension of Phase Correlation, which can extract the rotation, scale, and translation (RST) transform between two images.
Feature Detection and Extraction: This method involves detecting "robust" (i.e., scale and/or rotation invariant) features in the image and comparing them against a set of target features with RANSAC, LMedS, or simple least squares. OpenCV has a couple of samples using this technique in matcher_simple.cpp and matching_to_many_images.cpp. NOTE: With this method you will probably not want to binarize the image, so there are more detectable features available.

Finding location of rectangles in an image with OpenCV

I'm trying to use OpenCV to "parse" screenshots from the iPhone game Blocked. The screenshots are cropped to look like this:
I suppose for right now I'm just trying to find the coordinates of each of the 4 points that make up each rectangle. I did see the sample file squares.c that comes with OpenCV, but when I run that algorithm on this picture, it comes up with 72 rectangles, including the rectangular areas of whitespace that I obviously don't want to count as one of my rectangles. What is a better way to approach this? I tried doing some Google research, but for all of the search results, there is very little relevant usable information.
A similar issue has already been discussed:
How to recognize rectangles in this image?
As for your data, rectangles you are trying to find are the only black objects. So you can try to do a threshold binarization: black pixels are those ones which have ALL three RGB values less than 40 (I've found it empirically). This simple operation makes your picture look like this:
After that you could apply the Hough transform to find lines (discussed in the topic I referred to), or you can do it more simply. Compute integral projections of the black pixels onto the X and Y axes. (The projection onto X is a vector whose entry for each x_i is the number of black pixels whose first coordinate equals x_i.) You then get candidate x and y values as the peaks of the projections. Then look through all the possible segments bounded by the found x and y values (if there are a lot of black pixels between (x_i, y_j) and (x_i, y_k), there is probably a line). Finally, compose the line segments into rectangles!
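The thresholding and projection steps might look like this in a library-free sketch. The image is assumed to be a list of rows of (R, G, B) tuples, the limit of 40 comes from the answer above, and the peak criterion (a minimum count) is an illustrative simplification.

```python
def black_mask(rgb_image, limit=40):
    """1 where ALL three channel values are below the limit, else 0."""
    return [[1 if max(px) < limit else 0 for px in row] for row in rgb_image]


def projections(mask):
    """Count black pixels per column (projection onto X) and per row (onto Y)."""
    proj_x = [sum(col) for col in zip(*mask)]
    proj_y = [sum(row) for row in mask]
    return proj_x, proj_y


def peaks(proj, min_count):
    """Indices whose black-pixel count reaches min_count."""
    return [i for i, v in enumerate(proj) if v >= min_count]


# A 6x5 white image with a black rectangle outline from (1, 1) to (4, 3).
WHITE, BLACK = (255, 255, 255), (10, 10, 10)
img = [[WHITE] * 6 for _ in range(5)]
for x in range(1, 5):
    img[1][x] = img[3][x] = BLACK
img[2][1] = img[2][4] = BLACK

proj_x, proj_y = projections(black_mask(img))
print(peaks(proj_x, 3), peaks(proj_y, 4))  # [1, 4] [1, 3]
```

The peaks give the candidate x and y positions of the rectangle sides; checking how many black pixels lie between pairs of candidates then confirms which segments actually exist.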
Here's a complete Python solution. The main idea is:
Apply pyramid mean shift filtering to help threshold accuracy
Otsu's threshold to get a binary image
Find contours and filter using contour approximation
Here's a visualization of each detected rectangle contour
Results
import cv2

# Load the image, smooth it, convert to grayscale, and threshold with Otsu
image = cv2.imread('1.png')
blur = cv2.pyrMeanShiftFiltering(image, 11, 21)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours; the return signature differs between OpenCV versions
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Keep only contours that approximate to four vertices
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        x, y, w, h = cv2.boundingRect(approx)
        cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
I wound up just building on my original method and doing as Robert suggested in his comment on my question. After I get my list of rectangles, I then run through and calculate the average color over each rectangle. I check to see if the red, green, and blue components of the average color are each within 10% of the gray and blue rectangle colors, and if they are I save the rectangle, if they aren't I discard it. This process gives me something like this:
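The color check described above could be sketched as follows. The reference colors are placeholders, not the game's real colors, and "within 10%" is read here as within 10% of each reference component; both are assumptions.

```python
def average_color(pixels):
    """Mean (R, G, B) over a list of pixel tuples."""
    n = len(pixels)
    return tuple(sum(p[ch] for p in pixels) / n for ch in range(3))


def is_close(color, reference, tolerance=0.10):
    """True if every component is within tolerance (as a fraction) of the reference."""
    return all(abs(c - r) <= tolerance * r for c, r in zip(color, reference))


def keep_rectangle(pixels, references, tolerance=0.10):
    """Keep a rectangle if its average color matches any reference color."""
    avg = average_color(pixels)
    return any(is_close(avg, ref, tolerance) for ref in references)


# Hypothetical reference colors for the gray and blue blocks.
REFERENCES = [(128, 128, 128), (40, 60, 200)]

gray_block = [(130, 126, 125), (126, 130, 131)]
whitespace = [(255, 255, 255)] * 4
print(keep_rectangle(gray_block, REFERENCES), keep_rectangle(whitespace, REFERENCES))
```

In practice the pixels for each rectangle would come from cropping the screenshot to the rectangle's bounding box.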
From this, it's trivial to get the information I need (orientation, starting point, and length of each rectangle, considering the game window as a 6x6 grid).
The blocks look like bitmaps - why don't you use simple template matching with different templates for each block size/color/orientation?
Since your problem is the small rectangles I would start by removing them.
Since those lines are much thinner than the borders of the rectangles I would start by applying morphological operations on the image.
Using a structural element that looks like this:
element = [ 1 1
1 1 ]
should remove lines that are less than two pixels wide. After the small lines are removed the rectangle finding algorithm of OpenCV will most likely do the rest of the job for you.
The erosion can be done in OpenCV with the function cvErode.
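To see why a 2x2 element removes thin lines, here is a library-free sketch of binary erosion. It assumes the lines to remove are foreground (1) pixels and anchors the element at its top-left corner, which may differ from the anchor a library routine uses by default.

```python
def erode_2x2(binary):
    """A pixel survives only if the whole 2x2 block starting at it is foreground."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h - 1):
        for c in range(w - 1):
            if (binary[r][c] and binary[r][c + 1]
                    and binary[r + 1][c] and binary[r + 1][c + 1]):
                out[r][c] = 1
    return out


# A one-pixel-wide horizontal line (row 0) and a 3x3 solid block (rows 2-4, cols 1-3).
grid = [
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
eroded = erode_2x2(grid)
for row in eroded:
    print(row)
```

The thin line disappears entirely, while the solid block shrinks to 2x2 but survives, so a rectangle finder run afterwards only sees the thick shapes.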
Try one of the many corner detectors, like the Harris corner detector. Also, it is in general a good idea to try that at multiple resolutions, so do some preprocessing at varying magnifications.
It appears that you want some sort of color-dominated square. Then you can suppress the other colors by first using something like cvSplit and then thresholding the color, so only that region remains. Follow that with a cropping operation. I think that could work as well.

Resources