I'd like help choosing or finding a good algorithm for the following problem:
I want to recognize a template in an image. The template is text in a non-standard font, so OCR will probably not handle it; instead I want to recognize it using a template matching algorithm. Please refer to the image:
As you can see there is a background. I drew this image myself, so the background is simple; usually it is not: it has illumination variations and is typically tinted a single color. So I want to match this template, but I want the algorithm to be invariant to the background color.
I've tried OpenCV's cvMatchTemplate. It works well when the template is present in the image, but if I rotate the object under the camera, or remove it so that no template is present, the algorithm finds many false-positive matches.
So I also want an algorithm that is rotation-invariant.
Can you suggest any?
Look at Hu moments: they are rotation- and scale-invariant, and OpenCV's matchShapes method does most of the work for you.
I'm trying to do OCR on some forms that, however, have a background texture, as follows:
This texture causes the OCR programs to ignore the text, tagging it as an image region.
I considered using morphology. A closing operation with a star-shaped structuring element ends up as follows:
This result is still not good enough for the OCR.
When I manually erase the 'pepper' noise and apply adaptive thresholding, an image like the following gives good results with the OCR:
Do you have any other ideas for the problem?
Thanks
For the given image, a 5x5 median filter does a little better than the closing. From there, binarization with an adaptive threshold can remove more of the background.
Anyway, the resulting quality will depend a lot on the images and perfect results can't be achieved.
Maybe have a look at this: https://code.google.com/p/ocropus/source/browse/DIRS?repo=ocroold (see ocr-doc-clean).
Considering that you know the font size, you could also try connected-component filtering, perhaps combined with a morphological operation. To retain the commas, be careful when a smaller connected component lies near one whose size is similar to the characters you are trying to read.
The background pattern is very regular and directional, so filtering in the Fourier domain should do a pretty good job here. Try, for example, a Butterworth filter.
A concrete example of such filtering using GIMP can be found here
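For illustration, here is a plain low-pass Butterworth filter in the Fourier domain using only NumPy; for a directional texture you would rather notch the specific off-centre peaks, but the mechanics are the same, and the cutoff and order below are assumptions to tune for the actual texture frequency:

```python
import numpy as np

def butterworth_lowpass(gray, d0=10.0, order=2):
    # Centre the spectrum so low frequencies sit in the middle
    f = np.fft.fftshift(np.fft.fft2(gray.astype(float)))
    rows, cols = gray.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    d = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)  # distance from centre
    # Butterworth response: ~1 near the centre, falls off past d0
    h = 1.0 / (1.0 + (d / d0) ** (2 * order))
    filtered = np.fft.ifft2(np.fft.ifftshift(f * h)).real
    return np.clip(filtered, 0, 255).astype(np.uint8)
```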
I am using some functions that are already available in OpenCV, such as color contour tracking and image matching. I am trying to identify a pink duck, more specifically the head of the duck, but these two functions don't give me the outcome I expect, for reasons such as:
the color approach doesn't always work well because of changes in lighting, which change the color seen by the camera.
with image matching, I use one image of the duck taken from a specific position, and it can identify the duck only when it is in that position; I want to identify it even when I rotate the duck or move it around.
Does anyone have ideas about a better way to track a certain object?
Thank you
Have you tried converting the image into the HSV colourspace? This colourspace largely separates out the effects of lighting, so it might improve your colour-based segmentation.
To identify the head of the duck, once you have identified the duck as a whole you could perhaps identify the orientation (using template matching with a set of templates from different viewpoints, or haar cascades, or ...) and then use the known orientation and an empirical rule to determine where the head is located. For example, if you detect the duck in an upright position within a certain bounding box, the head is assumed to be located in the top third of that bounding box.
I think it might take a little more than what OpenCV provides in a straightforward way.
Given your specific question, you might just want to try shape descriptors of some sort.
Basically, take pictures of the duck's head from various angles and capture the shapes from them.
Now you can build a likelihood model (forgive me for not using a very accurate term) that validates the hypothesis that a given captured shape indeed belongs to the class of duck heads. Color can be an additional feature that might help.
If you are new to this field, try to get hold of Duda and Hart, Pattern Classification. It doesn't contain a solution to the find-the-duck problem, but it will shape your thinking.
Let's say I have an image or a two-dimensional pattern similar to a QR code; call it a template. Now I have a set of subimages that I want to match against the template and, importantly, find their precise location in the template. I think a similar problem is solved in 'smart papers' (http://en.wikipedia.org/wiki/Anoto) and in the Kinect's grid of infrared dots.
Does anyone have clues how something similar can be implemented (even just keywords to look up)?
I had a few ideas:
OpenCV's template matching method: poor results when the subimage is rotated, scaled, or skewed.
SURF feature detection and matching: pretty good, but the results get worse when the subimage is a really small chunk of the template. Besides, I think a specifically designed pattern would improve location finding compared with an arbitrary image. Also, SURF seems like overkill; I need something efficient that can handle real-time mobile camera streams.
creating an image consisting of many QR codes that store only coordinates as data: the drawback is that the QR codes would have to be pretty small to allow fine-grained positioning, but then it becomes difficult to recognize them. Pros: they use only black ink and have many white spaces (ink conservation).
a two-dimensional colorful gradient image (similar to a color-model map): I think this would be sensitive to lighting.
QR codes are square. Using feature detection to find the grid, you can unproject it; then OpenCV's template matching will work fine.
I have an image of the Target logo that I am trying to use to find Target logos in other images. I currently run two different detection algorithms to help me detect any logos in the image. The first is histogram-based: I search the image for a general area where the colors are very similar. From there I run SIFT to find the object I am looking for. This works on most logos; however, SIFT isn't picking up any keypoints in this Target logo.
I was wondering if there was anything I could do to help locate some keypoints in the image. Any advice is greatly appreciated.
Below is the image that isn't being picked up by SIFT:
Thanks in advance.
EDIT
I tried using Julien's idea of template matching with different scales and rotations of the model, but still got poor results. I have included an image that I am testing against.
There are no keypoints in your image...
Why?
Because there are no keypoints in a plane of uniform color (why would there be? since it is uniform, nothing stands out).
Because everything in your image is symmetric, keypoints wouldn't really help much: for some feature extractors they would all have the same feature vectors.
Because there are no corners or high gradients in crossing directions, which is what produces keypoints for many feature detectors.
What you could try is a template matching method. If you are searching for this logo without big changes (rotation, translation, noise, etc.), a simple correlation is the easiest.
If you want to go further, one idea, which I have never implemented but which could be fun, would be to take a set of versions of this image that you scale, rotate, warp, desaturate, and add noise to, and then apply template matching against this set of images derived from your original template...
This idea comes from SIFT and the wavelet transform, where we use families of functions varied in certain ways (rotation, noise, frequency, etc.) to make the transform robust against the basic changes that occur in any image you want to inspect.
That could be an idea for you!
Here is an image summarizing the idea: you rotate and scale your template, which creates new rotated/scaled templates that you can try to match. This increases robustness (even if it can be very slow if you vary many parameters). I'm not saying this is an algorithm, but it could be a fun and very basic thing to try...
Julien,
There is another reason this logo is problematic for feature matching. Most features work pretty badly on artificial images that have no smoothness: all the derivatives are exactly one pixel wide, and feature detectors rely on derivatives. You have to smooth the image a bit. Of course, for this specific logo that will not help, due to the high symmetry. You can use the Hough transform to detect circles inside circles; it should give you better results than template matching.
I think you can try using MSER features: https://en.wikipedia.org/wiki/Maximally_stable_extremal_regions
See an example:
https://www.mathworks.com/examples/matlab-computer-vision/mw/vision_product-TextDetectionExample-automatically-detect-and-recognize-text-in-natural-images
I want to make a program that checks printed pages for errors.
PDF file: please refer to the second page, top-right picture.
As you can see, that system could identify the errors made by the printer.
I want to know how this was achieved. What existing literature is there about this?
Or do you have any ideas?
Thank you
This can be very easy or very difficult.
If your images are black and white and your scan is quite precise, you can try a simple subtraction between the images (scanned and pattern).
If your scan captures the image with a possible deformation or translation, then you will first need an image registration algorithm.
If your scan has background noise, you will have trouble with the subtraction, and then it becomes very difficult.
Maybe some image samples would help us suggest a more specific algorithm.
I think you need to somehow compare the two images in a way that is robust to deformation. As mentioned before, subtracting the two images can be a first step. A more sophisticated approach is to use the distance transform (or chamfering-based methods for template matching) to compare how similar the two images are in the presence of some deformation. Even more sophisticated solutions can use methods like shape contexts.