Image preprocessing for text recognition

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?
I've included two sample images here.
Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.
The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.
Thanks

There's no single best set. Keep in mind that digital images can be acquired by different capture devices, and each device can embed its own preprocessing system (filters) and other characteristics that can drastically change the image and even add noise to it. So every case has to be treated (preprocessed) differently.
However, there are common operations that can be used to improve detection. A very basic one is to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noise from images you might be interested in erode/dilate operations. I demonstrate some of these operations in this post.
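As a minimal sketch of that basic grayscale-plus-threshold step (assuming OpenCV's C++ API; the question's EmguCV wrappers are analogous), Otsu's method avoids hand-tuning the threshold value:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    // load as single-channel grayscale
    cv::Mat gray = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    cv::Mat binary;
    // Otsu's method derives the threshold from the image histogram,
    // so no manual threshold value is needed
    cv::threshold(gray, binary, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    cv::imwrite("binary.png", binary);
    return 0;
}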
Also, there are other interesting posts about OCR and OpenCV that you should take a look at:
Simple Digit Recognition OCR in OpenCV-Python
Basic OCR in OpenCV
Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:
// read as single-channel grayscale so the threshold yields a true binary image
cv::Mat new_img = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
// invert so the (dark) text becomes bright
cv::bitwise_not(new_img, new_img);
double thres = 100;
double max_val = 255;
cv::threshold(new_img, new_img, thres, max_val, cv::THRESH_BINARY);
cv::imwrite("inv_thres.png", new_img);

Try morphological image processing. Have a look at this. However, it works only on binary images, so you will have to binarize the image first (threshold?). Although it is simple, it is dependent on font size: one structuring element will not work for all font sizes. If you want a generic solution, there are a number of papers on text detection in images; a search for this term on Google Scholar should turn up some useful publications.
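As an illustration (a sketch under my own assumptions, using OpenCV's C++ API), an opening removes speckles and a closing bridges small gaps in strokes; the 3x3 element is a guess that, as noted above, will not fit every font size:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    cv::Mat binary = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    cv::threshold(binary, binary, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);
    // the element size (3x3 here) must roughly match the stroke width,
    // which is why a single element does not work for all font sizes
    cv::Mat element = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3));
    cv::morphologyEx(binary, binary, cv::MORPH_OPEN, element);   // remove small speckles
    cv::morphologyEx(binary, binary, cv::MORPH_CLOSE, element);  // bridge small gaps in strokes
    cv::imwrite("morph.png", binary);
    return 0;
}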

Related

Text In Image Detection

I'd like to be able to detect and recognize text in an image (such as in the example shown below). I have tried many techniques such as Otsu thresholding, blob detection, filtering and general cleaning up of the image such as Gaussian Blur etc. I have then fed the pre-processed image into OCR software in order to recognize the text, however the results are unsatisfactory as the appearance of the text in my scenario can vary a lot.
What are the best techniques, other than those listed, that I can use to detect and recognise text in images?
I'm currently using the OpenCV library.

Sharpening image using OpenCV OCR

I've been trying to work on an image processing/OCR script that will allow me to extract the letters (using tesseract) from the boxes found in the image below.
Following a lot of processing, I was able to get the picture to look like this
In order to remove the noise I inverted the image, followed by flood filling and Gaussian blurring. This is what I ended up with next.
After running it through some thresholding and erosion to remove the noise (erosion being the step that distorted the text), I was able to get the image to look like this before running it through tesseract
This, while a pretty good rendering, allows for fairly accurate results through tesseract, though it sometimes fails because it reads the hash (#) as an H or W. This leads me to my question!
Is there a way, using opencv, skimage, or PIL (opencv preferably), that I can sharpen this image in order to increase my chances of tesseract properly reading it? OR is there a way I can get from the third to the final image WITHOUT having to use erosion, which ultimately distorted the text in the image?
Any help would be greatly appreciated!
OpenCV does have functions like filter2D that convolve an arbitrary kernel with a given image. In particular you can use kernels that are meant for image sharpening. The main question is whether this will improve the results of your OCR library or not. The image is already pretty sharp and the noise in the image is not a result of blur. I never worked with tesseract myself, but I am fairly sure that it already does all the noise reduction it can, and 'helping' it in this process may actually have the opposite effect. For example, any sharpening process tends to amplify noise (as opposed to noise reduction processes, which usually blur images). Most computer vision libraries give better results when provided with raw (unprocessed) images.
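For reference, a minimal filter2D sketch (my own illustration, assuming OpenCV's C++ API; the 3x3 kernel is a standard sharpening kernel, and whether it actually helps tesseract is exactly the doubt raised above):

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    cv::Mat img = cv::imread(argv[1]);
    // standard 3x3 sharpening kernel: boosts the center pixel, subtracts the neighbours
    cv::Mat kernel = (cv::Mat_<float>(3, 3) <<
         0, -1,  0,
        -1,  5, -1,
         0, -1,  0);
    cv::Mat sharpened;
    cv::filter2D(img, sharpened, -1, kernel); // -1 keeps the source depth
    cv::imwrite("sharpened.png", sharpened);
    return 0;
}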
Edit (after question update):
There are multiple ways to do so. The first one that I would test is this: your first binary image is pretty clean and sharp. Instead of using morphological operations that reduce the quality of the letters, switch to filtering contours. Use the findContours function to find all contours in the image and store their hierarchy (i.e. which contour is inside which). From all the found contours you actually need only the contours on the first and second levels, i.e. the outer and inner contours of each letter (contours at level zero are the outermost contours). Other contours can be discarded. Among the contours that do belong to the first level, you can discard those whose bounding box is too small to be a real letter. After those two discarding procedures I would expect that most of the remaining contours are the ones that are parts of the letters. Draw them on a white image and run OCR. (If you want white letters on a black background you will need to invert the order of vertices in the contours.)
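A rough sketch of that contour-filtering idea (my own illustration, assuming OpenCV's C++ API; the minimum box size is a made-up value, and the level counting assumes the letters' outer contours sit at the top of the hierarchy):

#include <opencv2/opencv.hpp>
#include <vector>

int main(int argc, char** argv) {
    // findContours treats white as foreground; invert first if your letters are dark
    cv::Mat binary = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    cv::threshold(binary, binary, 128, 255, cv::THRESH_BINARY);

    std::vector<std::vector<cv::Point>> contours;
    std::vector<cv::Vec4i> hierarchy; // per contour: [next, prev, first_child, parent]
    cv::findContours(binary, contours, hierarchy, cv::RETR_TREE, cv::CHAIN_APPROX_SIMPLE);

    cv::Mat canvas(binary.size(), CV_8UC1, cv::Scalar(255)); // white canvas
    const int min_w = 5, min_h = 10; // hypothetical minimum letter size, tune per image
    for (int i = 0; i < (int)contours.size(); ++i) {
        // nesting level = number of parents above this contour
        int level = 0;
        for (int p = hierarchy[i][3]; p != -1; p = hierarchy[p][3]) ++level;
        if (level > 1) continue; // keep only outer (0) and inner (1) contours
        cv::Rect box = cv::boundingRect(contours[i]);
        if (level == 0 && (box.width < min_w || box.height < min_h))
            continue; // outer contour too small to be a real letter
        cv::drawContours(canvas, contours, i, cv::Scalar(0), 1); // black outlines on white
    }
    cv::imwrite("letters.png", canvas);
    return 0;
}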

How to detect Hotspots in an image

How do I detect a hotspot in an image using opencv? I have tried googling but couldn't find a clue.
Description:
I need to filter good images from a live video stream. In this case I just need to detect the hotspot in a frame. I need to do this in opencv.
What is HotSpot?
Hot spots are shiny areas on a subject’s face which are caused by a flash reflecting off a shiny surface or by uneven lighting. It tends to make the subject look as if they are sweating, which is not a good look.
Update :
http://answers.opencv.org/question/7223/hotspots-in-an-image/
http://en.wikipedia.org/wiki/Specular_highlight
The above two links may also be helpful for my post.
Image with HotSpot:
Image Without HotSpot:
An automatic rough indication of these "hotspot" areas can be obtained by Gaussian filtering followed by binarization. The expectation is that the "hotspot" is much brighter than the area around it, so after Gaussian filtering it will be at least slightly highlighted and, at the same time, image artifacts are reduced thanks to the nature of low-pass filtering.
Example results follow: binarization at 0.75 (the range is always [0, 1]) after a simple conversion to grayscale, and binarization at 0.85 after Gaussian filtering on the B (brightness) channel of the HSB colorspace:
In both cases large components were removed due to the assumption that "hotspots" aren't too big.
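A sketch of that pipeline, assuming OpenCV's C++ API; the blur size, the 0.85 threshold (scaled to the 8-bit range), and the maximum component area are assumptions that would need tuning:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    cv::Mat gray = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    // low-pass filter: bright highlights survive, small artifacts are smoothed away
    cv::GaussianBlur(gray, gray, cv::Size(9, 9), 0);
    cv::Mat mask;
    cv::threshold(gray, mask, 0.85 * 255, 255, cv::THRESH_BINARY);
    // drop connected components that are too large to be hotspots
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(mask, labels, stats, centroids);
    const int max_area = 2000; // hypothetical upper bound, tune per image resolution
    for (int i = 1; i < n; ++i) // label 0 is the background
        if (stats.at<int>(i, cv::CC_STAT_AREA) > max_area)
            mask.setTo(0, labels == i);
    cv::imwrite("hotspots.png", mask);
    return 0;
}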

Balancing contrast and brightness between stitched images

I'm working on an image stitching project, and I understand there are different approaches to dealing with the contrast and brightness of an image. I could of course deal with this issue before I even stitched the images, but the result is still not as consistent as I would hope. So my question is whether it's possible by any chance to "balance", or rather "equalize", the contrast and brightness in color pictures after the stitching has taken place?
You want to determine the histogram equalization function not from the entire images, but from the zone where they will touch or overlap. You obviously want to have identical histograms in the overlap area, so this is where you calculate the functions. You then apply the equalization functions that accomplish this to the entire images. If you have more than two stitches, you still want to have global equalization beforehand, and then use a weighted application of the overlap-equalizing functions that decreases their impact as you move away from the stitched edge.
Apologies if this is all obvious to you already, but your general question leads me to a general answer.
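As a very rough sketch of that idea (my own illustration, not part of the answer): assuming grayscale images, a hypothetical 100-pixel-wide overlap strip, and a hand-rolled matchLUT helper, one can match the overlap histograms and apply the resulting mapping to the whole second image:

#include <opencv2/opencv.hpp>
#include <vector>

// hypothetical helper: builds a lookup table mapping the histogram of src onto ref
static cv::Mat matchLUT(const cv::Mat& src, const cv::Mat& ref) {
    auto cdf = [](const cv::Mat& img) {
        int channels[] = {0};
        int histSize[] = {256};
        float range[] = {0, 256};
        const float* ranges[] = {range};
        cv::Mat hist;
        cv::calcHist(&img, 1, channels, cv::Mat(), hist, 1, histSize, ranges);
        std::vector<double> c(256);
        double sum = 0;
        for (int i = 0; i < 256; ++i) { sum += hist.at<float>(i); c[i] = sum; }
        for (double& v : c) v /= sum; // normalized cumulative histogram
        return c;
    };
    std::vector<double> cs = cdf(src), cr = cdf(ref);
    cv::Mat lut(1, 256, CV_8U);
    for (int g = 0, j = 0; g < 256; ++g) {
        while (j < 255 && cr[j] < cs[g]) ++j; // first ref level whose CDF reaches src's
        lut.at<uchar>(0, g) = (uchar)j;
    }
    return lut;
}

int main(int argc, char** argv) {
    cv::Mat a = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    cv::Mat b = cv::imread(argv[2], cv::IMREAD_GRAYSCALE);
    // hypothetical 100-pixel overlap: right edge of a meets left edge of b
    cv::Mat overlapA = a(cv::Rect(a.cols - 100, 0, 100, a.rows));
    cv::Mat overlapB = b(cv::Rect(0, 0, 100, b.rows));
    cv::Mat adjusted;
    cv::LUT(b, matchLUT(overlapB, overlapA), adjusted); // apply to the entire image b
    cv::imwrite("b_matched.png", adjusted);
    return 0;
}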
You may want to have a look at the Exposure Compensator class provided by OpenCV.
Exposure compensation is done in 3 steps:
Create your exposure compensator
Ptr<ExposureCompensator> compensator = ExposureCompensator::createDefault(expos_comp_type);
You input all of your images along with the top left corners of each of them. You can leave the masks completely white by default unless you want to specify certain parts of the image to work on.
compensator->feed(corners, images, masks);
Now that it has all the information about how the images overlap, you can compensate each image individually:
compensator->apply(image_index, corners[image_index], image, mask);
The compensated image will be stored in image.
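Pieced together, a self-contained sketch might look like the following; the two input images, their corner positions, and the GAIN compensator type are assumptions for illustration:

#include <opencv2/opencv.hpp>
#include <opencv2/stitching.hpp>
#include <string>
#include <vector>

int main(int argc, char** argv) {
    using namespace cv;
    using namespace cv::detail;

    // two already-warped images with assumed top-left corners in panorama coordinates
    std::vector<Mat> imgs = { imread(argv[1]), imread(argv[2]) };
    std::vector<Point> corners = { Point(0, 0), Point(400, 0) }; // assumed placement
    std::vector<UMat> images(2), masks(2);
    for (int i = 0; i < 2; ++i) {
        imgs[i].copyTo(images[i]);
        masks[i].create(imgs[i].size(), CV_8U);
        masks[i].setTo(Scalar::all(255)); // all-white masks: compensate the whole images
    }

    Ptr<ExposureCompensator> compensator =
        ExposureCompensator::createDefault(ExposureCompensator::GAIN);
    compensator->feed(corners, images, masks);

    for (int i = 0; i < 2; ++i) {
        compensator->apply(i, corners[i], imgs[i], masks[i]); // compensated in place
        imwrite("compensated_" + std::to_string(i) + ".png", imgs[i]);
    }
    return 0;
}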

Most efficient way to create a contrast on swimmers in a pool considering noise

I want to count the number of people in a pool, for statistical purposes. I will use artificial intelligence and image processing on the images generated by a security camera located on the ceiling of the pool. The camera is static, so it has no axis of rotation.
For the image processing step, I would like to focus on the swimmers, and try to remove the rest of the pool. I need a good contrast between the background and the swimmers.
The problem is that the output images of the camera have a lot of "noise", such as sunlight, rays of light, black lines at the bottom of the water, flags in the air, and cables separating the lanes.
Here is an example of what the images look like. The real images will just be of better quality, because this example was taken by photographing the output with my cellphone.
What is the most efficient way of removing sunrays/light rays from my images? Maybe using a filter?
How can I create a high contrast between the swimmers and the background, considering the black lines at the bottom of the water?
Because the camera does not move, I can obtain other images with the same background (except for the sunrays); maybe I could use the differences between the images to extract the swimmers?
I am looking for any ideas/filters/references.
My suggestion is to analyse the image in HSV space. For information: H (hue) corresponds to the color, and S (saturation) is the purity of the color.
If you are using MATLAB, use the function rgb2hsv(); in OpenCV, use cvtColor() to convert the color space.
Here is a little experiment I have done on your image. I converted the image to HSV space and posted a false-color map of it. With this, what you may do is clustering, something like k-means, to identify the people.
The exact commands to regenerate it in Octave/MATLAB are:
>> im = imread( '9Nd5s.png' );
>> hsv = rgb2hsv( im );
>> imagesc( hsv(:,:,1) ), colormap( hot )
Hope this is helpful; let me know if you need any more help. Your problem seems interesting to work on.
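For an OpenCV C++ equivalent (a sketch of the same idea, not from the answer; the cluster count K = 4 and the use of all three HSV channels are assumptions), k-means on the HSV pixels might look like this:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv) {
    cv::Mat img = cv::imread(argv[1]);
    cv::Mat hsv;
    cv::cvtColor(img, hsv, cv::COLOR_BGR2HSV);

    // one row per pixel, 3 float features (H, S, V)
    cv::Mat samples = hsv.reshape(1, hsv.rows * hsv.cols);
    samples.convertTo(samples, CV_32F);

    int K = 4; // assumed cluster count: water, lane lines, swimmers, reflections
    cv::Mat labels, centers;
    cv::kmeans(samples, K, labels,
               cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 10, 1.0),
               3, cv::KMEANS_PP_CENTERS, centers);

    // visualize: paint each pixel with its cluster center
    cv::Mat clustered(hsv.size(), hsv.type());
    for (int i = 0; i < samples.rows; ++i) {
        int c = labels.at<int>(i);
        clustered.at<cv::Vec3b>(i / hsv.cols, i % hsv.cols) =
            cv::Vec3b((uchar)centers.at<float>(c, 0),
                      (uchar)centers.at<float>(c, 1),
                      (uchar)centers.at<float>(c, 2));
    }
    cv::cvtColor(clustered, clustered, cv::COLOR_HSV2BGR);
    cv::imwrite("clusters.png", clustered);
    return 0;
}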
