Gauge percent similarity of line drawings in OpenCV - opencv

I'm attempting to take two images in OpenCV, both drawn with a program like MS Paint or a simple drawing app, and gauge their percent similarity to each other. I have passing familiarity with some OpenCV image processing methods, but the approaches I've tried so far haven't been effective.
What I've thought of doing so far has been:
Simplest approach - comparing the two images pixel by pixel. This is easy to code up but scale / rotation invariant. The solution needs to be able to recognize an imperfect version
Hausdorff distance. This seems readymade for this problem, and I read a couple of other stack overflow posts about using it, but it takes in contours and I'm not sure how to extract contours from one image and match them to contours in another. One of the images might be empty, or they might be drastically different.
Feature extraction / matching - The approach I've tried so far has been to use feature detectors (ORB, AKAZE) paired with a Flann-based matcher, and have gotten extremely poor results so far. I'm currently changing this to SIFT/SURF and brute-force, but it doesn't seem like this has worked very well.
What are other possible computer vision algorithms I can try? I attached two sample images that are representative of what I'm trying to compare (and I did grayscale the image before processing, so color is not a factor).
Thank you!

Related

OpenCV Matching Exposure across images

I was wondering if its possible to match the exposure across a set of images.
For example, lets say you have 5 images that were taken at different angles. Images 1-3,5 are taken with the same exposure whilst the 4th image have a slightly darker exposure. When I then try to combine these into a cylindrical panorama using (seamFinder with: gc_color, surf detection, MULTI_BAND blending,Wave correction, etc.) the result turns out with a big shadow in the middle due to the darkness from image 4.
I've also tried using exposureCompensator without luck.
Since I'm taking the pictures in iOS, I maybe could increase exposure manually when needed? But this doesn't seem optimal..
Have anyone else dealt with this problem?
This method is probably overkill (and not just a little) but the current state-of-the-art method for ensuring color consistency between different images is presented in this article from HaCohen et al.
Their algorithm can correct a wide range of errors in image sets. I have implemented and tested it on datasets with large errors and it performs very well.
But, once again, I suppose this is way overkill for panorama stitching.
Sunreef has provided a very good paper, but it does seem overkill because of the complexity of a possible implementation.
What you want to do is to equalize the exposure not on the entire images, but on the overlapping zones. If the histograms of the overlapped zones match, it is a good indicator that the images have similar brightness and exposure conditions. Since you are doing more than 1 stitch, you may require a global equalization in order to make all the images look similar, and then only equalize them using either a weighted equalization on the overlapped region or a quadratic optimiser (which is again overkill if you are not a professional photographer). OpenCV has a simple implmentation of a simple equalization compensation algorithm.
The detail::ExposureCompensator class of OpenCV (sample implementation of such a stitiching is here) would be ideal for you to use.
Just create a compensator (try the 2 different types of compensation: GAIN and GAIN_BLOCKS)
Feed the images into the compensator, based on where their top-left cornes lie (in the stitched image) along with a mask (which can be either completely white or white only in the overlapped region).
Apply compensation on each individual image and iteratively check the results.
I don't know any way to do this in iOS, just OpenCV.

OpenCV compare similar hand drawn images

I am trying to compare two mono-chrome, basic hand drawn images, captured electronically. The scale may be different but the essences of the image is the same. I want to compare one hand drawn image to a save library of images and get a relative score of how similar they are. Think of several basic geometric shapes, lines, and curves that make up a drawing.
I have tried several techniques without much luck. Pixel based comparisons are too exact. I have tried scaling and cropping images and that did not get accurate results.
I have tried OpenCV with C# and have had a little success. I have experimented with SURF and it works for a few images, but not others that the eye can tell are very similar.
So now my question: Are there any examples of using openCV or commercial software that can support comparing drawings that are not exact? I prefer C# but I am open to any solutions.
Thanks in advance for any guidance.
(I have been working on this for over a month and have searched the internet and Stack Overflow without success. I of course could have missed something)
You need to extract features from these images and after that using a basic euclidean distance would be enough to calculate similarity. But hand writtend drawn thins are not easy to extract features. For example, companies that work on face recognition generally have much less accuracy on drawn face portraits.
I have a suggestion for you. For a machine learning homework, one of my friends got the signature recognition assingment. I do not fully know how he did it with a high accuracy, but I know feature extraction part. Firtstly he converted it to binary image. And than he calculated the each row's black pixel count. Than he used that features to train a NN or etc.
So you can use this similar approach to extract features. Than use a euclidean distance to calculate similarities.

Comparing similar images as photographs -- detecting difference, image diff

The situation is kind of unique from anything I have been able to find asked already, and is as follows: If I took a photo of two similar images, I'd like to be able to highlight the differing features in the two images. For example the following two halves of a children's spot the difference game:
The differences in the images will be bits missing/added and/or colour change, and the type of differences which would be easily detectable from the original image files by doing nothing cleverer than a pixel-by-pixel comparison. However the fact that they're subject to the fluctuations of light and imprecision of photography, I'll need a far more lenient/clever algorithm.
As you can see, the images won't necessarily line up perfectly if overlaid.
This question is tagged language-agnostic as I expect answers that point me towards relevant algorithms, however I'd also be interested in current implementations if they exist, particularly in Java, Ruby, or C.
The following approach should work. All of these functionalities are available in OpenCV. Take a look at this example for computing homographies.
Detect keypoints in the two images using a corner detector.
Extract descriptors (SIFT/SURF) for the keypoints.
Match the keypoints and compute a homography using RANSAC, that aligns the second image to the first.
Apply the homography to the second image, so that it is aligned with the first.
Now simply compute the pixel-wise difference between the two images, and the difference image will highlight everything that has changed from the first to the second.
My general approach would be to use an optical flow to align both images and perform a pixel by pixel comparison once they are aligned.
However, for the specifics, standard optical flows (OpenCV etc.) are likely to fail if the two images differ significantly like in your case. If that indeed fails, there are recent optical flow techniques that are supposed to work even if the images are drastically different. For instance, you might want to look at the paper about SIFT flows by Ce Liu et al that addresses this problem with sparse correspondences.

OCR detection with openCV

I'm trying to create a simpler OCR enginge by using openCV. I have this image: https://dl.dropbox.com/u/63179/opencv/test-image.png
I have saved all possible characters as images and trying to detect this images in input image.
From here I need to identify the code. I have been trying matchTemplate and FAST detection. Both seem to fail (or more likely: I'm doing something wrong).
When I used the matchTemplate method I found the edges of both the input image and the reference images using Sobel. This provide a working result but the accuracy is not good enough.
When using the FAST method it seems like I cant get any interresting descriptions from the cvExtractSURF method.
Any recomendations on the best way to be able to read this kind of code?
UPDATE 1 (2012-03-20)
I have had some progress. I'm trying to find the bounding rects of the characters but the matrix font is killing me. See the samples below:
My font: https://dl.dropbox.com/u/63179/opencv/IMG_0873.PNG
My font filled in: https://dl.dropbox.com/u/63179/opencv/IMG_0875.PNG
Other font: https://dl.dropbox.com/u/63179/opencv/IMG_0874.PNG
As seen in the samples I find the bounding rects for a less complex font and if I can fill in the space between the dots in my font it also works. Is there a way to achieve this with opencv? If I can find the bounding box of each character it would be much more simple to recognize the character.
Any ideas?
Update 2 (2013-03-21)
Ok, I had some luck with finding the bounding boxes. See image:
https://dl.dropbox.com/u/63179/opencv/IMG_0891.PNG
I'm not sure where to go from here. I tried to use matchTemplate template but I guess that is not a good option in this case? I guess that is better when searching for the exact match in a bigger picture?
I tried to use surf but when I try to extract the descriptors with cvExtractSURF for each bounding box I get 0 descriptors... Any ideas?
What method would be most appropriate to use to be able to match the bounding box against a reference image?
You're going the hard way with FASt+SURF, because they were not designed for this task.
In particular, FAST detects corner-like features that are ubiquituous iin structure-from-motion but far less present in OCR.
Two suggestions:
maybe build a feature vector from the number and locations of FAST keypoints, I think that oyu can rapidly check if these features are dsicriminant enough, and if yes train a classifier from that
(the one I would choose myself) partition your image samples into smaller squares. Compute only the decsriptor of SURF for each square and concatenate all of them to form the feature vector for a given sample. Then train a classifier with these feature vectors.
Note that option 2 works with any descriptor that you can find in OpenCV (SIFT, SURF, FREAK...).
Answer to update 1
Here is a little trick that senior people taught me when I started.
On your image with the dots, you can project your binarized data to the horizontal and vertical axes.
By searching for holes (disconnections) in the projected patterns, you are likely to recover almost all the boudnig boxes in your example.
Answer to update 2
At this point, you're back the my initial answer: SURF will be of no good here.
Instead, a standard way is to binarize each bounding box (to 0 - 1 depending on background/letter), normalize the bounding boxes to a standard size, and train a classifier from here.
There are several tutorials and blog posts on the web about how to do digit recognition using neural networks or SVM's, you just have to replace digits by your letters.
Your work is almost done! Training and using a classifier is tedious but straightforward.

how to find shapes that are slightly elongated oval / rectangle with curved corners / sometimes sector of a circle?

how to recognise a zebra crossing from top view using opencv?
in my previous question the problem is to find the curved zebra crossing using opencv.
now I thought that the following way would be much easier way to detect it,
(i) canny it
(ii) find the contours in it
(iii) find the black stripes in it, in my case it is slightly oval in shape
now my question is how to find that slightly oval shape??
look here for images of the crossing: www.shaastra.org/2013/media/events/70/Tab/422/Modern_Warfare_ps_v1.pdf
Generally speaking, I believe there are two approaches you can consider.
One approach is the more brute force image analysis approach, as you described. Here you are applying heuristics based on your knowledge of the problem in order to identify the pixels involved in the parts of the path. Note that 'brute force' here is not a bad thing, just an adjective.
An alternative approach is to apply pattern recognition techniques to find the regions of the image which have high probability of being part of the path. Here you would be transforming your image into (hopefully) meaningful features: lines, points, gradient (eg: Histogram of Oriented Gradients (HOG)), relative intensity (eg: Haar-like features) etc. and using machine learning techniques to figure out how these features describe the, say, the road vs the tunnel (in your example).
As you are asking about the former, I'm going to focus on that here. If you'd like to know more about the latter have a look around the Internet, StackOverflow, or post specific questions you have.
So, for the 'brute force image analysis' approach, your first step would probably be to preprocess the image as you need;
Consider color normalization if you are going to analyze color later, this will help make your algorithm robust to lighting differences in your garage vs the event studio. It'll also improve robustness to camera collaboration differences, though the organization hosting the competition provide specs for the camera they will use (don't ignore this bit of info).
Consider blurring the image to reduce noise if you're less interested in pixel by pixel values (eg edges) and more interested in larger structures (eg gradients).
Consider sharpening the image for the opposite reasons of blurring.
Do a bit of research on image preprocessing. It's definitely an explored topic but hardly 'solved' (whatever that would mean). There are lots of things to try at this stage and, of course, crap in => crap out.
After that you'll want to generate some 'features'..
As you mentioned, edges seem like an appropriate feature space for this problem. Don't forget that there are many other great edge detection algorithms out there other than Canny (see Prewitt, Sobel, etc.) After applying the edge detection algorithm, though, you still just have pixel data. To get to features you'll want probably want to extract lines from the edges. This is where the Hough transform space will come in handy.
(Also, as an idea, you can think about colorspace in concert with the edge detectors. By that, I mean, edge detectors usually work in the B&W colorspace, but rather than converting your image to grayscale you could convert it to an appropriate colorspace and just use a single channel. For example, if the game board is red and the lines on the crosswalk are blue, convert the image to HSV and grab the hue channel as input for the edge detector. You'll likely get better contrast between the regions than just grayscale. For bright vs. dull use the value channel, for yellow vs. blue use the Opponent colorspace, etc.)
You can also look at points. Algorithms such as the Harris corner detector or the Laplacian of Gaussian (LOG) will extract 'key points' (with a different definition for each algorithm but generally reproducible).
There are many other feature spaces to explore, don't stop here.
Now, this is where the brute force part comes in..
The first thing that comes to mind is parallel lines. Even in a curve, the edges of the lines are 'roughly' parallel. You could easily develop an algorithm to find the track in your game by finding lines which are each roughly parallel to their neighbors. Note that line detectors like the Hough transform are usually applied such that they find 'peaks', or overrepresented angles in the dataset. Thus, if you generate a Hough transform for the whole image, you'll be hard pressed to find any of the lines you want. Instead, you'll probably want to use a sliding window to examine each area individually.
Specifically speaking to the curved areas, you can use the Hough transform to detect circles and elipses quite easily. You could apply a heuristic like: two concentric semi-circles with a given difference in radius (~250 in your problem) would indicate a road.
If you're using points/corners you can try to come up with an algorithm to connect the corners of one line to the next. You can put a limit on the distance and degree in rotation from one corner to the next that will permit rounded turns but prohibit impossible paths. This could elucidate the edges of the road while being robust to turns.
You can probably start to see now why these hard coded algorithms start off simple but become tedious to tweak and often don't have great results. Furthermore, they tend to rigid and inapplicable to other, even similar, problems.
All that said.. you're talking about solving a problem that doesn't have an out of the box solution. Thinking about the solution is half the fun, and half the challenge. Everything I've described here is basic image analysis, computer vision, and problem solving. Start reading some papers on these topics and the ideas will come quickly. Good luck in the competition.

Resources