Fuzzy Template matching? - opencv

I'm attempting to wrap my head around the basics of CV. The bit that initially got me interested was template matching (it was mentioned in a PyCon talk unrelated to CV), so I figured I'd start there.
I started with this image:
Out of which I want to detect Mario. So I cut him out:
I understand the concept of sliding the template around the image to find the best fit, and following a tutorial, I'm able to find Mario with the following code:
import time
import cv  # legacy OpenCV 1.x Python bindings (in later releases: import cv2.cv as cv)

def match_template(img, template):
    s = time.time()
    img_size = cv.GetSize(img)
    template_size = cv.GetSize(template)
    # The result map is (W - w + 1) x (H - h + 1): one score per template position.
    img_result = cv.CreateImage((img_size[0] - template_size[0] + 1,
                                 img_size[1] - template_size[1] + 1),
                                cv.IPL_DEPTH_32F, 1)
    cv.Zero(img_result)
    cv.MatchTemplate(img, template, img_result, cv.CV_TM_CCORR_NORMED)
    min_val, max_val, min_loc, max_loc = cv.MinMaxLoc(img_result)
    print min_val
    print max_val
    print min_loc
    print max_loc
    # max_loc is the top-left corner of the best-matching region.
    cv.Rectangle(img, max_loc,
                 (max_loc[0] + template.width, max_loc[1] + template.height),
                 cv.Scalar(120.), 2)
    print time.time() - s
    cv.NamedWindow("Result")
    cv.ShowImage("Result", img)
    cv.WaitKey(0)
    cv.DestroyAllWindows()
So far so good, but then I came to realize that this is incredibly fragile. It will only ever find Mario with that specific background, and with that specific animation frame being displayed.
So I'm curious: given that Mario will always have the same Mario-ish attributes (size, colors), is there a technique with which I could find him regardless of whether his current frame is standing still or one of the various run-cycle sprites? Kind of like the fuzzy matching you can do on strings, but for images.
Maybe, since he's the only red thing, there is a way of simply tracking the red pixels?
The whole other issue is removing the background from the template. Maybe that would help the MatchTemplate function find Mario even though he doesn't exactly match the template? As of now, I'm not entirely sure how that would work (I see that there is a mask param in MatchTemplate, but I'll have to investigate further).
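For reference, this is roughly what that mask parameter looks like in the newer cv2 API: a minimal sketch, assuming the template's background was filled with pure black when Mario was cut out, and using placeholder file names.

import cv2
import numpy as np

# Placeholder file names; substitute the actual scene and cut-out template.
scene = cv2.imread("scene.png")
template = cv2.imread("mario_template.png")

# Treat pure-black template pixels as background and exclude them from matching.
mask = np.uint8(np.any(template > 0, axis=2)) * 255
mask = cv2.merge([mask, mask, mask])   # mask must match the template's channels

# Newer OpenCV builds accept a mask for TM_CCORR_NORMED (and TM_SQDIFF).
result = cv2.matchTemplate(scene, template, cv2.TM_CCORR_NORMED, mask=mask)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

h, w = template.shape[:2]
cv2.rectangle(scene, max_loc, (max_loc[0] + w, max_loc[1] + h), (0, 0, 255), 2)
print(max_val)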
My main question is whether template matching is the way to go for detecting an image that is mostly the same but varies (like when he's walking), or whether there is another technique I should look into.
Update:
Attempts at matching other Marios
Going off of mmgp's suggestion that it should be workable for matching other things, I ran a couple of tests.
I used this as the template to match:
And then took a couple of screen shots to test the matching against.
For the first, I successfully find Mario, and get a max value of 1.
However, trying to find jumping Mario results in a complete misfire.
Now granted, the Mario in the template and the Mario in the scene are facing opposite directions, and they are different animation frames, but I would think they would still match a lot more closely than anything else in the image -- if only for the colors alone. But it targets the platform as the closest match to the template.
Note that the max value for this one was 0.728053808212.
Next I tried a scene without mario to see what would happen.
But oddly enough, I get the exact same result as the image with jumping Mario -- right down to the similarity value: 0.728053808212. Mario being in the picture is just as accurate as him not being in the picture.
Really strange! I don't know the actual details of the underlying algorithm, but I'd imagine, from a standard-deviation perspective, the boxes in the scene that at least match the red of template Mario's suit would be closer to the mean distance than a blue platform, right? So it's extra confusing that the match isn't even in the general area of where I would expect it to be.
I'm guessing this is user error on my end, or maybe just a misunderstanding.
Why would a scene with a similar Mario have as much of a match as a scene with no Mario at all?

No method is infallible, but template matching does have a good chance of working here. It might require some pre-processing, and until there is a larger sample (a short video, for example) to demonstrate the possible problems, there isn't much point in trying more advanced methods simply because some library implements them for you -- especially if you don't know under which conditions they are expected to work.
For instance, here are the results I get using template matching (red rectangles) -- all of them use the template http://i.stack.imgur.com/EYs9B.png, even the last one:
To achieve that I started by considering only the red channel of both the template and the input image. From that we easily calculate the internal morphological gradient, and only then perform the matching. In order not to get a rectangle when Mario is not present, you need to set a minimum threshold for the matching. Here is the template and one of the images after these two transformations:
And here is some sample code to achieve that:
import sys

import cv2
import numpy

img = cv2.imread(sys.argv[1])
# Red channel only (OpenCV loads BGR, so channel 2 is red).
img2 = img[:, :, 2]
# Internal morphological gradient: the image minus its erosion.
img2 = img2 - cv2.erode(img2, None)

template = cv2.imread(sys.argv[2])[:, :, 2]
template = template - cv2.erode(template, None)

ccnorm = cv2.matchTemplate(img2, template, cv2.TM_CCORR_NORMED)
print ccnorm.max()

loc = numpy.where(ccnorm == ccnorm.max())
threshold = 0.4
th, tw = template.shape[:2]
for pt in zip(*loc[::-1]):
    # Skip weak maxima so no rectangle is drawn when Mario is absent.
    if ccnorm[pt[::-1]] < threshold:
        continue
    cv2.rectangle(img, pt, (pt[0] + tw, pt[1] + th),
                  (0, 0, 255), 2)
# Third argument: path for the annotated output image.
cv2.imwrite(sys.argv[3], img)
I expect it to fail in more varied situations, but there are a couple of easy adjustments that could be made.

Template matching doesn't always give good results. You should look into keypoint matching.
Step1: Find Keypoints
Let's assume that you managed to cut out Mario or get an ROI image of Mario. Make this image your template image. Now, find keypoints in the main image and also in the template. So now you have two sets of keypoints: one for the image and the other for Mario (the template).
You can use SIFT, SURF, ORB depending on your preferences.
[EDIT]:
This is what I got using this method with SIFT and FLANN-based kNN matching. I haven't done the bounding box part.
Since your template is very small, SIFT and SURF would not give many keypoints. But to get a good number of feature points, you could try the Harris corner detector. I applied Harris corners to the image and I got pretty good points on Mario.
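A minimal sketch of the detection step, assuming a recent OpenCV build where ORB is exposed as cv2.ORB_create (the constructor names changed across versions), and placeholder file names:

import cv2

# Placeholder file names for the scene and the cut-out Mario template.
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("mario_template.png", cv2.IMREAD_GRAYSCALE)

# ORB ships with stock OpenCV; SIFT/SURF can be swapped in if your build has them
# (e.g. cv2.SIFT_create or cv2.xfeatures2d.SURF_create).
detector = cv2.ORB_create(nfeatures=500)

# Keypoints plus descriptors for both images.
kp_scene, des_scene = detector.detectAndCompute(scene, None)
kp_tmpl, des_tmpl = detector.detectAndCompute(template, None)

print("%d scene keypoints, %d template keypoints" % (len(kp_scene), len(kp_tmpl)))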
Step2: Match Keypoints
If you have used SIFT or SURF, you'd have descriptors for both the image and the template. Match these keypoints using kNN or some other efficient matching algorithm. If you are using OpenCV, I'd suggest looking into the FLANN-based matcher. After matching the keypoints, you will want to filter out the incorrect matches. You can do this with k-nearest neighbours: depending on the distance to the nearest match, you can discard weak keypoints. You can filter your matches further using the forward-backward error.
Forward-Backward Error Estimation:
Match the template keypoints to the image keypoints. This gives you one set of matches.
Match the image keypoints to the template keypoints. This gives you another set of matches.
The intersection of these two sets filters out incorrect matches (a sketch of this two-way check follows below).
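A sketch of the matching and filtering, using ORB descriptors with a brute-force Hamming matcher (a FLANN-based matcher would replace BFMatcher for SIFT/SURF float descriptors); the 0.75 ratio is the usual Lowe suggestion, not a value prescribed by this answer:

import cv2

# Recompute descriptors (placeholder file names, as in the previous sketch).
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("mario_template.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create(nfeatures=500)
kp_tmpl, des_tmpl = orb.detectAndCompute(template, None)
kp_scene, des_scene = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

def good_matches(des_a, des_b, ratio=0.75):
    # k-NN match plus Lowe's ratio test to drop ambiguous matches.
    out = []
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            out.append(pair[0])
    return out

# Forward-backward (symmetry) check: keep only pairs matched in both directions.
fwd = good_matches(des_tmpl, des_scene)   # template -> scene
bwd = good_matches(des_scene, des_tmpl)   # scene -> template
bwd_pairs = {(m.trainIdx, m.queryIdx) for m in bwd}
symmetric = [m for m in fwd if (m.queryIdx, m.trainIdx) in bwd_pairs]
print(len(symmetric))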
[EDIT]:
If you are using the Harris corner detector, you'd get only points and not keypoints. You can either convert them into keypoints or write your own brute-force matcher. It's not that difficult.
Step3: Estimation
After filtering the keypoints, you'd have a cluster of keypoints near your object (in this case, Mario) and a few scattered keypoints. To eliminate these scattered keypoints, you could use clustering. DBSCAN clustering will help you get a good cluster of points.
Step4: Bounding Box Estimation
Now you have a cluster of keypoints. Using k-means, you should try to find the center of the cluster. Once you obtain the center of the cluster, you can estimate the bounding box.
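A rough sketch of Steps 3 and 4 together, assuming scikit-learn is available for DBSCAN; the eps/min_samples values and the toy points are placeholders, and the cluster mean stands in for the k-means center:

import numpy as np
from sklearn.cluster import DBSCAN

# pts: (N, 2) scene coordinates of the filtered matches, e.g.
# np.float32([kp_scene[m.trainIdx].pt for m in symmetric]); toy points used here.
pts = np.float32([[102, 210], [105, 214], [98, 208], [101, 215], [300, 40]])

labels = DBSCAN(eps=20, min_samples=3).fit_predict(pts)

# Label -1 marks scattered outliers; keep the largest real cluster.
valid = labels[labels >= 0]
if valid.size:
    best = np.bincount(valid).argmax()
    cluster = pts[labels == best]
    center = cluster.mean(axis=0)          # stands in for the k-means center
    x0, y0 = cluster.min(axis=0)
    x1, y1 = cluster.max(axis=0)
    print("center", center, "bbox", (x0, y0, x1, y1))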
I hope this helps.
[EDIT]
Trying to match points using Harris corners. After filtering the Harris corners, I'm using a brute-force method to match the points. Some better algorithm might give you better results.

Related

How to extract the paper contours in this image (opencv)?

I'm trying to extract the geometries of the papers in the image below, but I'm having some trouble grabbing the contours. I don't know which threshold algorithm to use (here I used a static threshold of 10, which is probably not ideal).
And as you can see, I can get the correct number of images, but I can't get the proper bounds using this method.
Simply applying Otsu just doesn't work; it doesn't capture the geometries.
I assume I need to apply some edge detection, but I'm not sure what to do once I apply Canny or some other edge detector.
I also tried Sobel in both directions (+ve and -ve in x and y), but I'm unsure how to extract the contours from there.
How do I grab these contours?
Below are some previews of the intermediate images leading up to the final convex hull results.
**Original Image**
**Sharpened**
**Dilate, Sharpen, Erode, Sharpen**
**Convex Hulls of Approximated Polygons** (which doesn't fully capture the desired regions)
Sorry in advance about the horrible formatting, I have no idea how to make images smaller or title them nicely in SOF
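A minimal sketch of the Canny-then-contours direction, with a placeholder blur size, Canny thresholds, and area cut-off that would all need tuning to this image:

import cv2

img = cv2.imread("papers.png")                     # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)

# Edge map instead of a global threshold; 50/150 are placeholder thresholds.
edges = cv2.Canny(gray, 50, 150)
edges = cv2.dilate(edges, None, iterations=2)      # close small gaps in the outlines

# OpenCV 4.x signature: findContours returns (contours, hierarchy).
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    if cv2.contourArea(c) < 1000:                  # drop tiny specks; arbitrary cut-off
        continue
    hull = cv2.convexHull(c)
    cv2.drawContours(img, [hull], -1, (0, 255, 0), 2)

cv2.imwrite("hulls.png", img)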

Generalized hough transform for center of an object calculation based of SIFT points?

Do you have any suggestions for this?
I am following the steps in a research paper to re-implement it for my specific problem. I am not a professional programmer, and I have been struggling with this for more than a month. When I reach the scoring step with the generalized Hough transform, I do not get any result; what I get is a blank image without the object center being found.
What I did includes the following steps:
I define a spatially constrained area for a training image and extract SIFT features within the area. The red point in the center represents the object center in the template (training) image.
And these are the interest points extracted by SIFT in the query image:
Keypoints are matched according to some conditions: they should be quantized to the same visual word and be spatially consistent. So I get the following points after applying the matching conditions:
I have 15 and 14 points for the template and query images, respectively. I send these points, along with the coordinates of the object center in the template image, to the generalized Hough transform (code that I found on GitHub). The code works properly on its default images. However, given the few points the algorithm returns for me, I do not know what I am doing wrong.
I thought maybe it was because of the theta calculation, so I changed this line to return the absolute values of the y and x differences, but it did not help.
In line 20 they only consider 90 degrees for binning. Could I ask what the reason is, and how I can define the binning according to my problem and the range of rotation angles around the center of the object?
- Does binning range affect the center calculation?
I would really appreciate it if you could let me know what I am doing wrong here.
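For context, a minimal sketch of the center-voting step with matched keypoints (not the GitHub code referenced above; the variable names and the per-pixel accumulator are assumptions):

import numpy as np

def vote_for_center(matches, kp_tmpl, kp_query, tmpl_center, query_shape):
    # Generalized-Hough-style voting: each matched keypoint predicts the object
    # center by rotating and scaling its stored offset to the template center.
    # Assumes matches came from matcher.match(des_tmpl, des_query), so queryIdx
    # indexes the template keypoints and trainIdx the query keypoints.
    acc = np.zeros(query_shape[:2], dtype=np.float32)   # one accumulator cell per pixel
    cx, cy = tmpl_center
    for m in matches:
        kt = kp_tmpl[m.queryIdx]
        kq = kp_query[m.trainIdx]
        dx, dy = cx - kt.pt[0], cy - kt.pt[1]            # offset to center in the template
        dtheta = np.deg2rad(kq.angle - kt.angle)         # rotation between the two keypoints
        s = kq.size / kt.size if kt.size else 1.0        # scale change
        vx = s * (dx * np.cos(dtheta) - dy * np.sin(dtheta))
        vy = s * (dx * np.sin(dtheta) + dy * np.cos(dtheta))
        px, py = int(round(kq.pt[0] + vx)), int(round(kq.pt[1] + vy))
        if 0 <= px < acc.shape[1] and 0 <= py < acc.shape[0]:
            acc[py, px] += 1
    # With only ~15 votes, smooth acc (e.g. with a Gaussian blur) before taking its argmax.
    return acc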

Detect triangles, ellipses and rectangles from an image

I am trying to detect the regions of traffic signs. Using OpenCV, my approach is as follows:
The color image:
Using Tan-Triggs preprocessing to get rid of the illumination variance:
Equalize histogram:
And binarize (Cv2.Threshold(blobs, blobs, 127, 255, ThresholdTypes.BinaryInv)):
Iterate over each blob using ConnectedComponents and get the mean color value using the blob as a mask. If it is red, then it may be a red sign.
Then get contours of this blob using FindContours.
Simplify the contours using ApproxPolyDP and check the points of each contour:
If 3 points then triangle shape is acceptable --> candidate for triangle sign
If 4 points then shape is acceptable --> candidate
If more than 4 points, BBox dimensions are acceptable and most of the points are on the ellipse fitted (FitEllipse) --> candidate
This approach works for the separated blobs in the binary image, like the circular 100 km sign in my example. However, if there is a connection to outside objects, like the bottom-left part of the triangle in the binary image, it fails, because the mean value of that blob is then far from red.
Using erosion helps in some cases, but makes it worse in many of the other images.
Using different threshold values for the binarization also works for some, but fails on many, just like the erosion.
Using HoughCircle is just very slow and I couldn't manage to get good results playing with the parameters.
I have tried using matchShapes but couldn't get good results.
Can anybody show me another way the achieve what I want (with a reasonable computational time)?
Any information, or code in any language, is welcome.
Edit:
Using the circularity measure (C = P^2 / (4πA)) or the approach I described above, triangle and ellipse shapes can be found when they are separated. However, when the contour is like this, for example:
I could not find a robust way to extract the triangle piece. If I could, I would check the mean color and decide whether it's a red sign candidate.
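For reference, a minimal sketch of that circularity check on a single OpenCV contour; the 1.2 cut-off in the usage comment is only an illustrative placeholder, not a tuned value:

import cv2
import numpy as np

def circularity(contour):
    # C = P^2 / (4 * pi * A): 1 for a perfect circle, larger for any other shape.
    area = cv2.contourArea(contour)
    if area == 0:
        return float("inf")
    perimeter = cv2.arcLength(contour, True)
    return (perimeter * perimeter) / (4.0 * np.pi * area)

# Usage on a binary image (OpenCV 4.x findContours signature):
# contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# round_candidates = [c for c in contours if circularity(c) < 1.2]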
Sorry, I don't have the kudos to comment, but can't you use the red colour?
import cv2
import numpy as np

img = cv2.imread("ms0QB.png")
grey = np.zeros(img.shape[:2], np.uint8)

# Convert to HSV and keep only strongly red hues (red wraps around 0/180 in OpenCV).
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = np.logical_or(hsv[:, :, 0] > 160, hsv[:, :, 0] < 10)
grey[mask] = 255

cv2.imshow("hue>160 or hue<10", grey)
cv2.waitKey()

SIFT/SURF/ORB for detection and orientation of simple pattern

My project is centered on positioning several small objects with a stationary camera. I've drawn some crisp, simple graphical pattern images (like this), printed them out, and am trying to detect them in the image. My straightforward method:
1. Color masking and blob detection for primary segmentation, to get the possible positions of the patterns [this works fine, I guess].
2. Run SIFT/SURF/ORB detection on these small image patches to compare them with a pattern stored in a file.
3. Collect the coordinates of the keypoints in the found matches, then compute the homography and get the precise position/rotation of the pattern in the image.
I'm writing in Python with OpenCV. My initialization code for ORB + BFMatcher is:
pt_detector = cv2.ORB()
pt_matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
And for SURF + FLANN I write:
pt_detector = cv2.SURF(400)
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
pt_matcher = cv2.FlannBasedMatcher(index_params, search_params)
And then I go simply:
kp_r, des_r = pt_detector.detectAndCompute(pattern, None)
kp_o, des_o = pt_detector.detectAndCompute(obj_res, None)
matches = pt_matcher.match(des_r, des_o)
The problem: detectors find matches all over the sample and template, and though they generally manage to detect the pattern, the orientation is messed up.
Here's an example of matching between the template image (left) and the pattern actually found in the camera frame (right). The camera image is masked and thresholded, of course. These 10 best matches from SIFT+FLANN are plainly terrible. I've tried SURF and SIFT with the FLANN matcher, and ORB + BFMatcher, on default settings, with no results. I suppose the problem lies in the parameters of the descriptor and matcher.
Can anyone tell me how I should set up the descriptor and matcher for robust matching of simple patterns? Or maybe there is another approach to this task?
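One direction worth noting: once the matches are filtered, fitting a constrained transform with RANSAC is more robust than reading the orientation off individual matches. A sketch assuming the question's kp_r/kp_o keypoints, a filtered matches list, and an OpenCV version that provides cv2.estimateAffinePartial2D:

import cv2
import numpy as np

# kp_r (pattern), kp_o (camera patch) and a filtered list of DMatch objects are
# assumed to come from the question's snippet above.
src = np.float32([kp_r[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_o[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# A similarity transform (rotation + uniform scale + translation) is enough for a
# flat printed pattern seen roughly front-on; RANSAC rejects the bad matches.
M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC,
                                         ransacReprojThreshold=3.0)
if M is not None:
    angle = np.degrees(np.arctan2(M[1, 0], M[0, 0]))   # pattern rotation in the frame
    scale = np.hypot(M[0, 0], M[1, 0])
    print("rotation %.1f deg, scale %.2f, %d inliers" % (angle, scale, int(inliers.sum())))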

Opencv match contour image

I'd like to know the best strategy for comparing a group of contours (in fact, edges resulting from Canny edge detection) from two pictures, in order to know which pair is most alike.
I have this image:
http://i55.tinypic.com/10fe1y8.jpg
And I would like to know how can I calculate which one of these fits best to it:
http://i56.tinypic.com/zmxd13.jpg
(it should be the one on the right)
Is there anyway to compare the contours as a whole?
I can easily rotate the images but I don't know what functions to use in order to calculate that the reference image on the right is the best fit.
Here it is what I've already tried using opencv:
matchShapes function - I tried this function using two grayscale images and I always get the same result for every comparison image, and the value seems wrong, as it is 0.0002.
So what I realized about matchShapes (though I'm not sure it's the correct assumption) is that the function works with pairs of contours and not full images. This is a problem because, although I have the contours of the images I want to compare, there are hundreds of them and I don't know which ones should be "paired up".
So I also tried to compare all the contours of the first image against the other two with a for loop, but I might be comparing, for example, the contour of the 5 against the circle contour of the two reference images rather than the contour of the 2.
I also tried the simple cv::compare function and matchTemplate, neither with success.
Well, for this you have a couple of options depending on how robust you need your approach to be.
Simple Solutions (with assumptions):
For these methods, I'm assuming the images you supplied are what you are working with (i.e., the objects are already segmented and approximately the same scale). Also, you will need to correct the rotation (at least in a coarse manner). You might do something like iteratively rotating the comparison image every 10, 30, 60, or 90 degrees, or whatever coarseness you feel you can get away with.
For example,
for(degrees = 10; degrees < 360; degrees += 10)
    coinRot = rotate(compareCoin, degrees)
    // you could also try Cosine Similarity, or even matchTemplate here.
    metric = SAD(coinRot, targetCoin)
    if(metric > bestMetric)
        bestMetric = metric
        coinRotation = degrees
Sum of Absolute Differences (SAD): This will allow you to quickly compare the images once you have determined an approximate rotation angle.
Cosine Similarity: This operates a bit differently, by treating the image as a 1D vector and then computing the high-dimensional angle between the two vectors. The better the match, the smaller the angle.
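A small sketch of both measures on equal-size grayscale images (NumPy used for brevity; the file names are placeholders):

import cv2
import numpy as np

a = cv2.imread("coin_rotated.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
b = cv2.imread("coin_target.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
assert a.shape == b.shape  # both measures assume equal-size, roughly aligned images

# Sum of Absolute Differences: lower means more similar.
sad = np.abs(a - b).sum()

# Cosine similarity: flatten to 1-D vectors; values near 1 mean a small angle.
cos = np.dot(a.ravel(), b.ravel()) / ((np.linalg.norm(a) + 1e-9) * (np.linalg.norm(b) + 1e-9))

print("SAD: %.0f  cosine similarity: %.4f" % (sad, cos))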
Complex Solutions (possibly more robust):
These solutions will be more complex to implement, but will probably yield more robust classifications.
Hausdorff Distance: This answer will give you an introduction to using this method. This solution will probably also need the rotation correction to work properly. (A short sketch of the distance computation follows after this list.)
Fourier-Mellin Transform: This method is an extension of Phase Correlation, which can extract the rotation, scale, and translation (RST) transform between two images.
Feature Detection and Extraction: This method involves detecting "robust" (i.e., scale and/or rotation invariant) features in the image and comparing them against a set of target features with RANSAC, LMedS, or simple least squares. OpenCV has a couple of samples using this technique in matcher_simple.cpp and matching_to_many_images.cpp. NOTE: With this method you will probably not want to binarize the image, so there are more detectable features available.
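A sketch of the Hausdorff comparison on two contour point sets, assuming SciPy is available; in practice the contours would come from cv2.findContours on each image:

import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(contour_a, contour_b):
    # Symmetric Hausdorff distance between two OpenCV contours of shape (N, 1, 2).
    a = contour_a.reshape(-1, 2).astype(np.float64)
    b = contour_b.reshape(-1, 2).astype(np.float64)
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

# Toy example: a square and the same square shifted 5 px along x gives a distance of 5.
sq1 = np.array([[[0, 0]], [[10, 0]], [[10, 10]], [[0, 10]]])
sq2 = sq1 + np.array([5, 0])
print(hausdorff(sq1, sq2))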
