Sharpening image using OpenCV OCR - opencv

I've been trying to work on an image processing script /OCR that will allow me to extract the letters (using tesseract) from the boxes found in the image below.
Following alot of processing, I was able to get the picture to look like this
In order to remove the noise I inverted the image followed by floodfilling and gaussian blurring to remove noise. This is what I ended up with next.
After running it through some threholding and erosion to remove the noise (erosion being the step that distorted the text) I was able to get the image to look like this before running it through tesseract
This, while a pretty good rendering, allows for fairly accurate results through tesseract. Though it sometimes fails because it reads the hash (#) as a H or W. This leads me to my question!
Is there a way using opencv, skimage, PIL (opencv preferably) I can sharpen this image in order to increase my chances of tesseract properly reading my image? OR Is there a way I can get from the third to final image WITHOUT having to use erosion which ultimately distorted the text in the image.
Any help would be greatly appreciated!

OpenCV does has functions like filter2D that convolves arbitrary kernel with given image. In particular you can use kernels that are used for image sharpening. The main question is whether this will improve the results of your OCR library or not. The image is already pretty sharp and the noise in the image is not a result of blur. I never worked with teseract myself, but I am fairly sure that it already does all the noise reduction it could. And 'helping' him in this process may actually have opposite effect. For example any sharpening process tends to amplify noise (as opposite to noise reduction processes that usually are blurring images). Most of computer vision libraries give better results when provided with raw (unprocessed) images.
Edit (after question update):
There multiple ways to do so. The first one that I would test is this: Your first binary image is pretty clean and sharp. Instead of of using morphological operations that reduce quality of letters switch to filtering contours. Use findContours function to find all contours in the image and store their hierarchy (i.e. which contour is inside which). From all the found contours you actually need only the contours on first and second levels, i.e. outer and inner contours of each letter (contours at zero level are the outermost contours). Other contours can be discarded. Among the contours that do belong to first level you can discard those whose bounding box is too small to be a real letter. After those two discarding procedures I would expect that most of the remaining contours are the ones that are parts of the letters. Draw them on white image and run OCR. (If you want white letters on black background you will need to invert the order of vertices in the contours).

Related

How to denoise and extract ROI of binarised animal footprint images

I am currently working on an animal identification using footprint recognition project. My main task is to process an animal footprint taken from natural substrate and identify the animal the footprint belongs to. The first step is to preprocess the image and extract the ROI. This is where I am having difficulty as the processed images contain a lot of noise.
I have carried out a series of preprocessing steps all of which did reduce the noise but not sufficiently. The image below shows the outcome I have achieved thus far.
In order of left to right, the first image in the top row is an example of an animal footprint that I need to classify. The second image is one example of an image that will be used to train the system and classify the animal species (in this case a bear species). The third and fourth image in the first row show the grayscale and logarithmic transformation of the test image respectively.
The first image in the bottom row is a median blur of the image, the second shows adaptive thresholding. The third image shows the result of an eight-neighbour connectivity test where any pixel missing a neighbour is removed. The fourth image shows the image when erosion is applied after dilation. The last image shows the detected contours.
I have tried working with the contours to remove contours less than a certain area but it still didn't produce a better representation of the image. Displaying the largest contour just displays the entire image.
Using connected components detects a large number because of the high level of noise. I have tried working with blob detection and again did not achieve the desired results.
Im looking for the best and most efficient way to denoise the images and extract ROI.
Sample image:
One simple and effective way is to open the binary image then close it. Opening would help you with the white dots in the center of the image and closing would fill the undesired black spots in the white areas and finally you would have a neat footprint.
I applied binary threshold and followed it with morphological closing operation.
This is what I obtained:
This is the result of binary thresholding.
This is the result of morphological closing.
You have to do further processing to extract the foot perfectly.
I would suggest applying contour now. It should work fine.

Ideas to process challenging image

I'm working with Infra Red image that is an output of a 3D sensor. This sensors project a Infra Red pattern in order to draw a depth map, and, because of this, the IR image has a lot of white spots that reduce its quality. So, I want to process this image to make it smoother in order to make it possible to detect objects laying in the surface.
The original image looks like this:
My objective is to have something like this (which I obtained by blocking the IR projecter with my hand) :
An "open" morphological operation does remove some noise, but I think first there should be some noise removal operation that addresses the white dots.
Any ideas?
I should mention that the algorithm to reduce the noise has to run on real time.
A median filter would be my first attempt .... possibly followed by a Gaussian blur. It really depends what you want to do with it afterwards.
For example, here's your original image after a 5x5 median filter and 5x5 Gaussian blur:
The main difficulty in your images is the large radius of the white dots.
Median and morphologic filters should be of little help here.
Usually I'm not a big fan of these algorithms, but you seem to have a perfect use case for a decomposition of your images on a functional space with a sketch and an oscillatary component.
Basically, these algorithms aim at solving for the cartoon-like image X that approaches the observed image, and that differs from Y only through the removal of some oscillatory texture.
You can find a list of related papers and algorithms here.
(Disclaimer: I'm not Jérôme Gilles, but I know him, and I know that
most of his algorithms were implemented in plain C, so I think most of
them are practical to implement with OpenCV.)
What you can try otherwise, if you want to try simpler implementations first:
taking the difference between the input image and a blurred version to see if it emphasizes the dots, in which case you have an easy way to find and mark them. The output of this part may be enough, but you may also want to fill the previous place of the dots using inpainting,
or applying anisotropic diffusion (like the Rudin-Osher-Fatemi equation) to see if the dots disappear. Despite its apparent complexity, this diffusion can be implemented easily and efficiently in OpenCV by applying the algorithms in this paper. TV diffusion can also be used for the inpainting step of the previous item.
My main point on the noise removal was to have a cleaner image so it would be easier to detect objects. However, as I tried to find a solution for the problem, I realized that it was unrealistic to remove all noise from the image using on-the-fly noise removal algorithms, since most of the image is actually noise.. So I had to find the objects despite those conditions. Here is my aproach
1 - Initial image
2 - Background subtraction followed by opening operation to smooth noise
3 - Binary threshold
4 - Morphological operation close to make sure object has no edge discontinuities (necessary for thin objects)
5 - Fill holes + opening morphological operations to remove small noise blobs
6 - Detection
Is the IR projected pattern fixed or changes over time?
In the second case, you could try to take advantage of the movement of the dots.
For instance, you could acquire a sequence of images and assign each pixel of the result image to the minimum (or a very low percentile) value of the sequence.
Edit: here is a Python script you might want to try

Image Processing in low quality SEM image

I am not very expirienced in image processing...but have acquired some very noisy SEM images and it's hard to distinguish the particles I want to segment from the background. I know it's a general question but still...can you direct me to how I should go about it?
thank you in advance
Adi
It's difficult to be too specific without knowing what you want to achieve. If it's just improvement of the image in visual terms, a median filter would be my first suggestion. Use a small kernel size to avoid eroding edges too much.
As an example, here is a section of your original unprocessed image:
And here is the same section after applying a median filter with a 2 pixel radius:
It looks to me like the image is also slightly defocused, to which you can attempt some deconvolution algorithms like the Wiener filter, for example.
If you want to process the image in some way, for example to count the blobs (although that image is a really poor starting point), you can threshold it to get a binary image and then use morphological operations to refine the content. For example, I took the median filtered image, thresholded to binary and then performed a morphological open operation (an erode followed by a dilate):
To refine the segmentation and to split some of the touching particles, you can try a watershed segmentation on the binary image, like this:
Note that all of these images I produced using ImageJ, if you want to experiment further.

What is the correct method to auto-crop objects from light background?

I'm trying to extract objects from scanned images. There could be a few documents on a white background, and I need to crop and rotate them automatically. This seems like a rather simple task, but I've got stuck at some point and get bad results all the time.
I've tried to:
Binarise the image and get connected components by performing morphological operations.
Perform watershed segmentation by using dilated and eroded binary images as mask components.
Apply Canny detector and fill the contours.
None of this gets me good results. If the object does't have contrast edges (i.e a piece of paper on white background), it splits into a lot of separate components. If I connect these components by applying excessive dilation, background noise also expands and everything becomes a mess.
For example, I have an image:
After applying Canny detector and filling the contours I get something like this:
As you can see, the components are not connected. They are eve too far from each other to be connected by a reasonable amount of dilation. And when I apply watershed to this mask combined with some background points, it yields very bad results.
Some images are noisy:
In this particular case I was able to obtain contour of the whole passport by Canny detector because of it's contrast edges. But threshold method doesn't work here.
If the images are always on a very light background, then you can binarize with a threshold close to the maximum possible value. After that it is a matter of correcting the binary image to get the objects, but this step will vary depending on how your other images look like.
For instance, the following image at left is what we get with a threshold at 99% of the maximum value after a gaussian filtering on the input. After removing components connected to the border and other small components, and also combining with some basic morphological tools, we get the image at right.
This may seem a bit wishy-washy but bear with me:
This looks like quite a challenging case for image processing recipes involving only edge detection, morphological operations and segmentation.
What you are not exploiting here is that you (I believe) know what your document should look like. You are currently looking at completely general solutions which do not take into account this prior knowledge. If you can get some training data then you can go all the way from simple template/patch-based matching (SSD, Normalized Cross-Correlation) to more sophisticated object detection techniques to find the position and rotation of your documents.
My guess is that if your objects are always more or less the same and at the same scale (e.g. passports scanned at a fixed resolution/similar machines) then you can get away with a fairly crude approach. There won't be any one correct method. It's also likely that the technique you end up using will not work until you have done a significant amount of parameter tweaking, so don't give up on anything too quickly.

Image preprocessing for text recognition

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?
I've included two sample images here.
Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.
The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.
Thanks
There's nothing like the best set. Keep in mind that digital images can be acquired by different capture devices and each device can embed its own preprocessing system (filters) and other characteristics that can drastically change the image and even add noises to them. So every case would have to be treated (preprocessed) differently.
However, there are commmon operations that can be used to improve the detection, for instance, a very basic one would be to convert the image to grayscale and apply a threshold to binarize the image. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noises from images you might be interested in erode/dilate operations. I demonstrate some of these operations on this post.
Also, there are other interesting posts about OCR and OpenCV that you should take a look:
Simple Digit Recognition OCR in OpenCV-Python
Basic OCR in OpenCV
Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:
cv::Mat new_img = cv::imread(argv[1]);
cv::bitwise_not(new_img, new_img);
double thres = 100;
double color = 255;
cv::threshold(new_img, new_img, thres, color, CV_THRESH_BINARY);
cv::imwrite("inv_thres.png", new_img);
Try morphological image processing. Have a look at this. However, it works only on binary images - so you will have to binarize the image( threshold?). Although, it is simple, it is dependent on font size, so one structure element will not work for all font sizes. If you want a generic solution, there are a number of papers for text detection in images - A search of this term in google scholar should provide you with some useful publications.

Resources