How to segment ROI using SIFT/SURF - machine-learning

SIFT is used for feature extraction. Most of the tutorials that I have seen only show the features detected using SIFT. I need to identify an ROI using SIFT. The images look like this but are in worse condition (taken from different angles, some blurred, with more text and numbers in other places too)
I need to extract this and then perform digit recognition:
What are the ways to segment this part? I was going for SIFT/SURF but couldn't find any tutorial to segment out the ROI. If there are any other suggestions then please provide the link.
Edit: Images that I have are grayscale
Edit1: This is just an example image I got from Google. My dataset only has grayscale images, not color ones.

Related

Object Detection for 2-D Shapes using Feature Detection?

My objective -
Input: A PNG floorplan (with many electrical equipment symbols on it), and a user who selects one of those symbols using a bounding box.
Output: The same PNG floorplan but with all matching symbols highlighted
I have been looking into feature detection as a way to find matching symbols, but I can't find any examples online of it being used on 2D objects. I only ever see it used on photos or live in videos. Does feature detection work for 2D objects as well? If not, why not?
For those interested, I have been developing in C#, using an OpenCV wrapper API called Emgu CV (it has all the OpenCV functions and some more).
You can take a look at research on logo recognition. You can use a classical feature detector such as SIFT or SURF and then calculate shape invariants from the extracted features, such as the orientation of triangles formed by the features.
Here is a classic paper to look at for some ideas:
Scalable Logo Recognition
I guess your input is a binary / grayscale image with mainly lines, arrows, circles, ...
Local feature detection (e.g. ORB, SURF, SIFT, ...) is best suited for "high entropy" images, by which I mean photos of scenes with a lot of texture.
Here you have geometrical shapes, so I think a geometric shape detection algorithm would be a better choice, such as the Generalized Hough Transform.

Sharpening image using OpenCV OCR

I've been trying to work on an image processing script /OCR that will allow me to extract the letters (using tesseract) from the boxes found in the image below.
Following a lot of processing, I was able to get the picture to look like this
In order to remove the noise I inverted the image, followed by flood filling and Gaussian blurring. This is what I ended up with next.
After running it through some thresholding and erosion to remove the noise (erosion being the step that distorted the text) I was able to get the image to look like this before running it through tesseract
This is a pretty good rendering and gives fairly accurate results through tesseract, though it sometimes fails because it reads the hash (#) as an H or W. This leads me to my question!
Is there a way using opencv, skimage, PIL (opencv preferably) I can sharpen this image in order to increase my chances of tesseract properly reading my image? OR Is there a way I can get from the third to final image WITHOUT having to use erosion which ultimately distorted the text in the image.
Any help would be greatly appreciated!
OpenCV does have functions like filter2D that convolve an arbitrary kernel with a given image. In particular, you can use kernels designed for image sharpening. The main question is whether this will improve the results of your OCR library or not. The image is already pretty sharp, and the noise in the image is not a result of blur. I have never worked with tesseract myself, but I am fairly sure that it already does all the noise reduction it can, and 'helping' it in this process may actually have the opposite effect. For example, any sharpening process tends to amplify noise (as opposed to noise reduction processes, which usually blur the image). Most computer vision libraries give better results when provided with raw (unprocessed) images.
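To make the point about noise amplification concrete, here is a minimal sketch in plain C++ (no OpenCV) of a convolution with a common 3x3 sharpening kernel. The `Image` alias and the function name `sharpen3x3` are made up for this illustration, and the border handling is simplified, so treat it as a stand-in for what a call to filter2D with such a kernel would roughly do, not as its actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Image = std::vector<std::vector<int>>;  // grayscale, values 0..255

// Convolve with the common 3x3 sharpening kernel
//    0 -1  0
//   -1  5 -1
//    0 -1  0
// Border pixels are copied unchanged to keep the sketch short
// (an assumption of this example, not how OpenCV treats borders).
Image sharpen3x3(const Image& src) {
    const int rows = static_cast<int>(src.size());
    const int cols = rows ? static_cast<int>(src[0].size()) : 0;
    for (const auto& row : src) assert(static_cast<int>(row.size()) == cols);

    Image dst = src;
    for (int r = 1; r + 1 < rows; ++r) {
        for (int c = 1; c + 1 < cols; ++c) {
            const int v = 5 * src[r][c] - src[r - 1][c] - src[r + 1][c]
                          - src[r][c - 1] - src[r][c + 1];
            dst[r][c] = std::min(255, std::max(0, v));  // saturate to 8 bits
        }
    }
    return dst;
}
```

On a flat patch of value 100 with a single pixel at 110, the sharpened pixel becomes 5*110 - 4*100 = 150, so a deviation of 10 grows to 50. That is the noise amplification mentioned above.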
Edit (after question update):
There are multiple ways to do so. The first one that I would test is this: your first binary image is pretty clean and sharp. Instead of using morphological operations that reduce the quality of the letters, switch to filtering contours. Use the findContours function to find all contours in the image and store their hierarchy (i.e. which contour is inside which). From all the found contours you actually need only the contours on the first and second levels, i.e. the outer and inner contours of each letter (contours at level zero are the outermost contours). Other contours can be discarded. Among the contours that do belong to the first level, you can discard those whose bounding box is too small to be a real letter. After those two discarding procedures I would expect that most of the remaining contours are parts of the letters. Draw them on a white image and run OCR. (If you want white letters on a black background you will need to invert the order of vertices in the contours.)
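The "discard what is too small to be a letter" step can be sketched without OpenCV as follows. Note that this is a swapped-in technique: it uses plain connected-component flood fill instead of findContours and ignores the contour hierarchy entirely. The name `dropSmallComponents` and the thresholds `minW`/`minH` are hypothetical and would need tuning to the expected letter size.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

using Binary = std::vector<std::vector<int>>;  // 1 = ink, 0 = background

// Keep only 8-connected ink components whose bounding box is at least
// minW x minH pixels; smaller components are treated as noise and erased.
Binary dropSmallComponents(const Binary& img, int minW, int minH) {
    const int rows = static_cast<int>(img.size());
    const int cols = rows ? static_cast<int>(img[0].size()) : 0;
    for (const auto& row : img) assert(static_cast<int>(row.size()) == cols);

    Binary out(rows, std::vector<int>(cols, 0));
    std::vector<std::vector<char>> seen(rows, std::vector<char>(cols, 0));

    for (int r0 = 0; r0 < rows; ++r0) {
        for (int c0 = 0; c0 < cols; ++c0) {
            if (!img[r0][c0] || seen[r0][c0]) continue;

            // Breadth-first flood fill collects one component and its box.
            std::vector<std::pair<int, int>> pixels;
            std::queue<std::pair<int, int>> q;
            q.push({r0, c0});
            seen[r0][c0] = 1;
            int top = r0, bottom = r0, left = c0, right = c0;
            while (!q.empty()) {
                const std::pair<int, int> p = q.front();
                q.pop();
                pixels.push_back(p);
                top = std::min(top, p.first);
                bottom = std::max(bottom, p.first);
                left = std::min(left, p.second);
                right = std::max(right, p.second);
                for (int dr = -1; dr <= 1; ++dr) {
                    for (int dc = -1; dc <= 1; ++dc) {
                        const int nr = p.first + dr, nc = p.second + dc;
                        if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                        if (!img[nr][nc] || seen[nr][nc]) continue;
                        seen[nr][nc] = 1;
                        q.push({nr, nc});
                    }
                }
            }

            // Copy the component to the output only if its box is big enough.
            if (right - left + 1 >= minW && bottom - top + 1 >= minH) {
                for (const auto& p : pixels) out[p.first][p.second] = 1;
            }
        }
    }
    return out;
}
```

Unlike erosion, this leaves the pixels of the surviving components untouched, which is the reason for preferring a filtering approach here.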

OCR detection with openCV

I'm trying to create a simple OCR engine using openCV. I have this image: https://dl.dropbox.com/u/63179/opencv/test-image.png
I have saved all possible characters as images and am trying to detect these images in the input image.
From here I need to identify the code. I have been trying matchTemplate and FAST detection. Both seem to fail (or more likely: I'm doing something wrong).
When I used the matchTemplate method I found the edges of both the input image and the reference images using Sobel. This provides a working result, but the accuracy is not good enough.
When using the FAST method it seems like I can't get any interesting descriptors from the cvExtractSURF method.
Any recommendations on the best way to read this kind of code?
UPDATE 1 (2012-03-20)
I have had some progress. I'm trying to find the bounding rects of the characters but the matrix font is killing me. See the samples below:
My font: https://dl.dropbox.com/u/63179/opencv/IMG_0873.PNG
My font filled in: https://dl.dropbox.com/u/63179/opencv/IMG_0875.PNG
Other font: https://dl.dropbox.com/u/63179/opencv/IMG_0874.PNG
As seen in the samples I find the bounding rects for a less complex font and if I can fill in the space between the dots in my font it also works. Is there a way to achieve this with opencv? If I can find the bounding box of each character it would be much more simple to recognize the character.
Any ideas?
Update 2 (2013-03-21)
Ok, I had some luck with finding the bounding boxes. See image:
https://dl.dropbox.com/u/63179/opencv/IMG_0891.PNG
I'm not sure where to go from here. I tried to use matchTemplate but I guess that is not a good option in this case? I guess that is better when searching for an exact match in a bigger picture?
I tried to use SURF but when I try to extract the descriptors with cvExtractSURF for each bounding box I get 0 descriptors... Any ideas?
What method would be most appropriate to use to be able to match the bounding box against a reference image?
You're going the hard way with FAST+SURF, because they were not designed for this task.
In particular, FAST detects corner-like features that are ubiquitous in structure-from-motion but far less present in OCR.
Two suggestions:
1. Maybe build a feature vector from the number and locations of FAST keypoints. I think that you can rapidly check whether these features are discriminative enough and, if yes, train a classifier from that.
2. (The one I would choose myself) Partition your image samples into smaller squares. Compute only the SURF descriptor for each square and concatenate all of them to form the feature vector for a given sample. Then train a classifier with these feature vectors.
Note that option 2 works with any descriptor that you can find in OpenCV (SIFT, SURF, FREAK...).
Answer to update 1
Here is a little trick that senior people taught me when I started.
On your image with the dots, you can project your binarized data to the horizontal and vertical axes.
By searching for holes (disconnections) in the projected patterns, you are likely to recover almost all the bounding boxes in your example.
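A rough sketch of the projection trick for the horizontal axis, in plain C++. The names `columnProjection` and `splitAtGaps` are invented for this example, and the `minGap` parameter is an added assumption: with a dot-matrix font there may be empty columns inside a single character, so gaps narrower than `minGap` are bridged instead of being treated as a character boundary.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Binary = std::vector<std::vector<int>>;  // 1 = ink, 0 = background

// Count the ink pixels in each column (projection onto the horizontal axis).
std::vector<int> columnProjection(const Binary& img) {
    const int cols = img.empty() ? 0 : static_cast<int>(img[0].size());
    std::vector<int> proj(cols, 0);
    for (const auto& row : img) {
        assert(static_cast<int>(row.size()) == cols);
        for (int c = 0; c < cols; ++c) proj[c] += row[c] ? 1 : 0;
    }
    return proj;
}

// Split a projection into [start, end) runs of inked bins. A run is only
// closed when at least minGap consecutive empty bins follow it.
std::vector<std::pair<int, int>> splitAtGaps(const std::vector<int>& proj,
                                             int minGap) {
    assert(minGap >= 1);
    std::vector<std::pair<int, int>> runs;
    int start = -1;    // first bin of the current run, -1 if none is open
    int lastInk = -1;  // last inked bin seen in the current run
    for (int i = 0; i < static_cast<int>(proj.size()); ++i) {
        if (proj[i] == 0) continue;
        if (start < 0) {
            start = i;                            // open the first run
        } else if (i - lastInk - 1 >= minGap) {
            runs.push_back({start, lastInk + 1}); // gap is wide enough: close
            start = i;
        }
        lastInk = i;
    }
    if (start >= 0) runs.push_back({start, lastInk + 1});
    return runs;
}
```

Running the same split on a row projection of each column run gives the vertical extent, and together the two runs define a bounding box per character.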
Answer to update 2
At this point, you're back to my initial answer: SURF will be of no use here.
Instead, a standard way is to binarize each bounding box (to 0 - 1 depending on background/letter), normalize the bounding boxes to a standard size, and train a classifier from here.
There are several tutorials and blog posts on the web about how to do digit recognition using neural networks or SVMs; you just have to replace the digits with your letters.
Your work is almost done! Training and using a classifier is tedious but straightforward.
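The binarize-and-normalize step from this answer could look roughly like the sketch below (plain C++, no OpenCV). The function name `boxToFeature`, its parameters, and the use of nearest-neighbour sampling are choices made for this illustration. It also assumes letters are brighter than the threshold; flip the comparison for dark letters on a light background.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Gray = std::vector<std::vector<int>>;  // grayscale, values 0..255

// Crop the box (x, y, w, h), binarize it to 0/1, and resample it to
// size x size with nearest-neighbour sampling. The result is a flat
// feature vector (row by row) that can be fed to a classifier.
std::vector<float> boxToFeature(const Gray& img, int x, int y, int w, int h,
                                int size, int threshold) {
    assert(w > 0 && h > 0 && size > 0);
    assert(y >= 0 && y + h <= static_cast<int>(img.size()));
    assert(x >= 0 && x + w <= static_cast<int>(img[0].size()));

    std::vector<float> feature;
    feature.reserve(static_cast<std::size_t>(size) * size);
    for (int i = 0; i < size; ++i) {
        const int srcRow = y + (i * h) / size;      // nearest source row
        for (int j = 0; j < size; ++j) {
            const int srcCol = x + (j * w) / size;  // nearest source column
            feature.push_back(img[srcRow][srcCol] > threshold ? 1.0f : 0.0f);
        }
    }
    return feature;
}
```

Every bounding box then yields a vector of the same length (`size * size`), which is what most classifiers expect as input.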

Image preprocessing for text recognition

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?
I've included two sample images here.
Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.
The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.
Thanks
There is no such thing as a best set. Keep in mind that digital images can be acquired by different capture devices, and each device can embed its own preprocessing (filters) and other characteristics that can drastically change the image and even add noise to it. So every case has to be treated (preprocessed) differently.
However, there are common operations that can be used to improve the detection. For instance, a very basic one is to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noise from images you might be interested in erode/dilate operations. I demonstrate some of these operations in this post.
Also, there are other interesting posts about OCR and OpenCV that you should take a look at:
Simple Digit Recognition OCR in OpenCV-Python
Basic OCR in OpenCV
Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:
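#include <opencv2/opencv.hpp>
int main(int argc, char** argv) {
    if (argc < 2) return 1;                 // expects the image path as argument
    cv::Mat new_img = cv::imread(argv[1]);
    if (new_img.empty()) return 1;          // file missing or unreadable
    cv::bitwise_not(new_img, new_img);      // invert the colors
    double thres = 100;
    double color = 255;
    cv::threshold(new_img, new_img, thres, color, cv::THRESH_BINARY);
    cv::imwrite("inv_thres.png", new_img);
    return 0;
}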
Try morphological image processing. Have a look at this. However, it works only on binary images, so you will have to binarize the image (threshold?). Although it is simple, it depends on font size, so one structuring element will not work for all font sizes. If you want a generic solution, there are a number of papers on text detection in images; a search for this term in Google Scholar should provide you with some useful publications.

Detect a pattern highlighted by infrared light with openCV

For a project I've to detect a pattern and track it in space despite rotation, noise, etc.
It's highlighted with IR light and recorded with an IR camera:
Picture: https://i.stack.imgur.com/RJuVS.png
As in this picture, it will only be a very simple shape, and we can choose which one we're going to use.
I need direction on how to approach recognizing these shapes, please.
What I do currently is thresholding and erosion to get a cleaner shape and then a contour detection and a polygon approximation.
What should I do then? I tried Hu moments but it wasn't good at all.
Could you please give me a global approach to recognize and track such pattern in space?
Can you choose which shape to project?
If so, I would recommend using a few concentric circles. Then, using the Hough transform for circles, you can easily find the center of the shape even when tracking is extremely hard (large movement/low frame rate).
If you must use a rectangular shape, then there is a good open-source project which does that. It is part of a project to read street signs and auto-translate them.
Here is a link: http://code.google.com/p/signfinder/
This source is not large and it would be easy to cut out the relevant part.
It uses "good features to track" of openCV in module CornerFinder.
Hope it helped
It is possible. You need the following steps: thresholding the image, some morphological enhancement, blob extraction and normalization of blob size, blob shape analysis, and comparison of the analysis results with the pattern that you want to track.
There are many methods for blob shape analysis. Simple methods: geometric dimensions, area, perimeter, circularity measurement, bit quads and others (for example, William K. Pratt "Digital Image Processing", chapter 18). Complex methods: spatial moments, template matching, neural networks and others.
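As an example of one of the simple measures, circularity is commonly defined as 4*pi*area / perimeter^2. The sketch below is plain C++ with invented helper names (`circularity`, `matchesPattern`) and assumes the blob's area and perimeter have already been measured by some earlier step; the tolerance is a hypothetical tuning parameter.

```cpp
#include <cassert>
#include <cmath>

// Circularity = 4 * pi * area / perimeter^2.
// It equals 1 for a perfect disc and gets smaller for elongated or
// jagged blobs (an ideal square gives pi / 4).
double circularity(double area, double perimeter) {
    assert(perimeter > 0.0);
    const double kPi = 3.14159265358979323846;
    return 4.0 * kPi * area / (perimeter * perimeter);
}

// Compare a measured shape descriptor with the value expected for the
// tracked pattern, within an absolute tolerance.
bool matchesPattern(double measured, double expected, double tolerance) {
    return std::fabs(measured - expected) <= tolerance;
}
```

On real images the perimeter is estimated from pixels, so measured values will only be approximately equal to the ideal ones, and the tolerance has to absorb that.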
In any event, it is very hard to answer exactly without knowing the pattern shapes that you want to track.
hope it helped

Resources