Method to separate dotted numbers from an image - opencv

I'm building a specialized OCR system to recognize dotted numbers like those shown above. (The sample picture may not contain all special cases; see below.) We decided to separate the number string and recognize each digit before putting them together to form the final result.
The question is:
How to clearly separate all digits with OpenCV or other image algorithms?
Our difficulty lies in:
1. The image I uploaded is a synthesized one, produced from handpicked digits with slight morphing to simulate anomalies in actual use, e.g. some dots are linked together, some dots are eroded, and some dots are offset. We failed to determine their contours using morphology.
2. However, sometimes a digit may be skewed heavily, like an italic glyph with kerning, making a "clean and complete" bounding box impossible.
Some of the ideas we thought of are:
1. Find a way to draw slanted lines to separate the digits instead of traditional vertical lines. We assume that these dotted numbers should have been upright monospace characters, and that only shear will occur, not rotation.
2. Any method better than simple morphology that links the dots of each digit together while keeping the dots of separate digits apart would also be useful.
EDIT: Please don't comment below the original question; just submit your answer. I appreciate any help, no matter how simple your answer may seem.
EDIT: Since the image I provided is somewhat idealized compared to real situations, a simple morphological operation won't solve the problem. Also, I'm looking for a solution that separates the characters; linking the dots together is not the only option.
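For reference, the dot-linking idea can be sketched as below: a minimal toy example using SciPy's ndimage as a stand-in for the equivalent OpenCV morphology calls (the image, dot spacing, and element size are all invented for illustration):

```python
import numpy as np
from scipy import ndimage

# Toy stand-in for a dotted-digit image: two "digits", each a vertical
# column of single-pixel dots with 1-pixel gaps, 4 columns apart.
img = np.zeros((9, 9), dtype=bool)
img[0:9:2, 2] = True
img[0:9:2, 6] = True
_, n_before = ndimage.label(img)   # every dot is its own component

# A tall, thin closing element bridges the vertical gaps inside a digit
# but cannot reach across the wider horizontal gap between digits.
closed = ndimage.binary_closing(img, structure=np.ones((3, 1), dtype=bool))
_, n_after = ndimage.label(closed)
print(n_before, n_after)  # 10 dots collapse into 2 digit blobs
```

The same principle applies with cv2.morphologyEx and a vertical structuring element; the hard part, as noted above, is that real dot spacing varies, so a single element size will not fit all cases.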

Related

Segmentation - Separating Touching Objects

I have built a system to segment a binary image containing handwritten symbols and classify them (specifically for music). I'm aware there are commercial applications which do this but this is me trying to do it ground-up as a project.
For the purposes of simplicity, let's assume that I will have two elements in my whole image:
AND
I've built something which segments an image into regions and classifies them. This works fine most of the time.
However, sometimes the elements touch, at which point my classifier breaks down. For example:
OR
What's the best way to separate the two? I've done quite a bit of research but I think my lack of domain knowledge may be letting me down here!
Things I've found:
Template matching doesn't work well, as the symbols are handwritten
Thinning/eroding doesn't really work either, especially when two sharps overlap (above right), as they degrade too much.
Watershed filling doesn't really work on two complex shapes
Things which might work and I'd appreciate a "go for it" or "avoid" vote before I go down the rabbit hole.
Slide a window L->R of varying sizes and try to classify its contents. Pick the window and position with the highest positive classification confidence.
Take projections (horizontal and vertical) and "cut" the image at the minimum value (which would be the thinnest place along the respective axis)
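A toy sketch of the projection idea, assuming a NumPy binary image (the blob sizes and bridge position are invented):

```python
import numpy as np

def projection_cut(binary):
    # Vertical projection profile: cut at the interior column with the
    # smallest sum, i.e. the "thinnest" place along the x axis.
    profile = binary.sum(axis=0)
    cut = 1 + int(np.argmin(profile[1:-1]))
    return cut, binary[:, :cut], binary[:, cut:]

# Two touching blobs joined by a single-pixel bridge at column 4.
img = np.zeros((5, 9), dtype=np.uint8)
img[:, 0:4] = 1
img[:, 5:9] = 1
img[2, 4] = 1
cut, left, right = projection_cut(img)
print(cut)  # 4
```

Note that this only finds one cut; for more than two touching symbols you would cut recursively or take every local minimum below a threshold.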
This seems to me like a very hard problem, and I do not have a good general solution. Especially the case of multiple connected # will be difficult to solve.
In your particular case I would try the following, assuming that there are usually not more than two or three symbols clumped together:
When a blob is too big for a single symbol
for each possible symbol
take a region in the top left, top right, bottom left, bottom right corner with the correct size for the symbol
run your recognition for that region
if successful, remove the recognized symbol and repeat for the rest
This is not a very sophisticated solution, and how well it works strongly depends on your particular character recognition.
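The loop above might look like the following sketch; `recognize` is a placeholder classifier (here a made-up fill-ratio test), not a real recognizer:

```python
import numpy as np

def split_oversized_blob(blob, sym_h, sym_w, recognize, max_symbols=4):
    """Greedy corner-trial segmentation (sketch). `recognize` is a
    user-supplied classifier: window -> (label, confidence) or None."""
    blob = blob.copy()
    found = []
    for _ in range(max_symbols):
        h, w = blob.shape
        corners = {(0, 0), (0, w - sym_w), (h - sym_h, 0), (h - sym_h, w - sym_w)}
        best = None
        for r, c in corners:
            hit = recognize(blob[r:r + sym_h, c:c + sym_w])
            if hit and (best is None or hit[1] > best[1][1]):
                best = ((r, c), hit)
        if best is None:
            break
        (r, c), (label, _conf) = best
        found.append(label)
        blob[r:r + sym_h, c:c + sym_w] = 0  # remove it, repeat for the rest
    return found

# Toy demo: two 4x4 "symbols" fused into one 4x8 blob, with a fake
# recognizer that accepts any window at least half filled.
blob = np.ones((4, 8), dtype=np.uint8)
recognize = lambda w: ('sym', w.mean()) if w.mean() >= 0.5 else None
symbols = split_oversized_blob(blob, 4, 4, recognize)
print(symbols)  # ['sym', 'sym']
```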
Another idea:
If most of your shapes tend to have thin vertical segments, you may be able to identify these segments via probabilistic Hough transform, and use the found vertical line segments as starting points for your recognition, whenever a blob contains more than one symbol.
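A minimal stand-in for that idea, with the Hough accumulator specialized to purely vertical lines (each foreground pixel votes for its column; OpenCV's probabilistic HoughLinesP would be the general tool):

```python
import numpy as np

def vertical_segment_columns(binary, min_run):
    # Degenerate "Hough": theta is fixed at vertical, so each foreground
    # pixel simply votes for its column; a column qualifies if it holds
    # a long enough unbroken vertical run.
    hits = []
    h, w = binary.shape
    for c in range(w):
        run = best = 0
        for r in range(h):
            run = run + 1 if binary[r, c] else 0
            best = max(best, run)
        if best >= min_run:
            hits.append(c)
    return hits

img = np.zeros((10, 12), dtype=bool)
img[1:9, 3] = True      # a tall vertical stroke at column 3
img[4, 5:10] = True     # a horizontal stroke: too short vertically
cols = vertical_segment_columns(img, min_run=6)
print(cols)  # [3]
```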
Yet another idea to separate shapes:
split the blob at the biggest convexity defect that has a given minimum distance from the border of the blob. Caveat: this works best for convex shapes, and probably not at all for your # signs
Alternative 4:
In sheet music, the same sort of symbols tend to appear together, like # followed by a note on the same row, or multiple # at the start of the line in a certain pattern. It might be worthwhile to have a special combined recognizer for such symbols that tend to clump together.
(On that note, how do you currently separate the symbols from the staff lines?)

Segmentation for connected characters

How can I segment if the characters are connected? I just tried using watershed with distance transform (http://opencv-code.com/tutorials/count-and-segment-overlapping-objects-with-watershed-and-distance-transform/) to find the number of components but it seems that it does not perform well.
It requires the objects to be separated after thresholding in order to perform well.
That said, how can I segment the characters effectively? I need help/ideas.
As attached is the example of binary image.
An example of a heavily connected case.
I believe there are two approaches here: 1) redo the binarization step that produced the images you have right now; 2) consider different possibilities based on image size. Let us focus on the second approach, given the question.
In your smallest image, only two digits are connected, and that happens only when considering 8-connectivity. If you handle your image as 4-connected, then there is nothing to do because there are no two components connected that should be separated. This is shown below. The right image can be obtained simply by finding the points that are connected to another one only when considering 8-connectivity. In this case, there are only two such points, and by removing them we disconnect the two digits '1'.
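The 4- vs 8-connectivity distinction can be checked directly with SciPy's labeling (a toy example; real images would use the same two structuring elements):

```python
import numpy as np
from scipy import ndimage

# Two strokes that touch only corner-to-corner (diagonally).
img = np.array([[1, 0],
                [1, 0],
                [0, 1],
                [0, 1]], dtype=np.uint8)

four = ndimage.generate_binary_structure(2, 1)    # 4-connectivity
eight = ndimage.generate_binary_structure(2, 2)   # 8-connectivity

_, n4 = ndimage.label(img, structure=four)
_, n8 = ndimage.label(img, structure=eight)
print(n4, n8)  # 2 components under 4-connectivity, 1 under 8
```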
In your other image this is no longer the case, and I don't have a simple method for it that could also be applied to the smaller image without making that one worse. But, actually, we could consider upscaling both images to some common size, using nearest-neighbor interpolation so we stay within the binary representation. By resizing both of your images so their width equals 200, keeping the aspect ratio, we can apply the following morphological method to both of them. First do a thinning:
Now, as can be seen, the morphological branch points are the ones connecting your digits (there is another one at the left-most digit 'six' too, which will be handled). We can extract these branch points and apply a morphological closing with a vertical line of height 2*H+1 (H being your image height), so that no matter where a point is, its closing produces a full vertical line. Since your image is not so small anymore, this line doesn't need to be 1 point wide; in fact I used a line 6 points wide. Since some of the branch points are horizontally close, this closing operation joins them into the same vertical line. If a branch point is not close to any other, a subsequent erosion removes its vertical line; this is how we eliminate the branch point belonging to the digit 'six' at left. After applying these steps, we obtain the following image at left. Subtracting the original image from it, we get the image at right.
If we apply these same steps to the '8011' image, we end up with exactly the same image we started with. But that is still fine, because applying the simple method that removes points connected only under 8-connectivity, we obtain the separated components as before.
It is common to use "smearing algorithms" for this. Also known as Run Length Smoothing Algorithm (RLSA). It is a method that segments black and white images into blocks. You can find some information here or look around on the internet to find an implementation of the algorithm.
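A minimal horizontal RLSA sketch (the gap threshold and test row are invented; a full RLSA also does a vertical pass and combines the two results):

```python
import numpy as np

def rlsa_horizontal(binary, max_gap):
    # Run Length Smoothing: fill every horizontal run of background
    # pixels no longer than max_gap that sits between two foreground
    # pixels on the same row.
    out = binary.copy()
    for row in out:
        fg = np.flatnonzero(row)
        for a, b in zip(fg[:-1], fg[1:]):
            if b - a - 1 <= max_gap:
                row[a:b] = 1
    return out

line = np.array([[1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1]], dtype=np.uint8)
smeared = rlsa_horizontal(line, max_gap=2)
print(smeared[0].tolist())  # the 2-gap is filled, the 4-gap is kept
```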
Not sure if I want to help you solve captchas, but one idea would be to use erosion. Depending on how many pixels you have to work with it might be able to sufficiently separate the characters without destroying them. This would likely be best used as a pre-processing step for some other segmentation algorithm.

Finding data entry points in a blank, scanned application form

I am a relative newcomer to image processing and this is the problem I'm facing - Say I have the image of an application form, like this:
Now I would like to detect all the locations where data is to be entered. In this case, those would be the rectangles divided into a number of boxes, like so (not all fields marked):
I can live with the photograph box also being detected. I've tried running the squares.cpp sample in the OpenCV sources, which does not quite get me what I want. I also tried the modified version here; the results were worse (my use case is definitely very different from the OP's in that question).
Also, Hough transforming to get the lines is not really working with or without blur/threshold, as the noise in the scanned image contributes extraneous lines; thresholding also takes away parts of the combs (the small squares), so the line detection is not up to the mark.
Note that this form is not a scanned copy of a printed form, but the real input might very well be a noisy, scanned image of a printed form.
While I'm definitely sure that this is possible (at least with some tolerance allowed) and I'm trying to get at the solution, it would be really helpful to get insights and ideas from other people who might have tried something like this or enjoy hacking on CV problems. Also, it would be really nice if the answers explained why a particular operation was done (e.g., dilation to try and fill up any holes left by thresholding, etc.).
Are the forms consistent in any way? Are the "such boxes" the same size on all forms? If you can rely on a consistent size, like the character boxes in the form above, you could use template matching.
Otherwise, the problem seems to be: find any/all rectangles on the image (with a post processing step to filter out any that have a significant amount of markings within, or to merge neighboring rectangles).
The more you can take advantage of the consistencies between the forms, the easier the problem will be. Use any context you can get.
EDIT
Using the gradients (computed by using a Sobel kernel in both the x and the y direction) you can weed out a lot of the noise.
Using both you can find the direction of the gradients (equation can be found here: en.wikipedia.org/wiki/Sobel_operator). Let's say we define a discriminating feature of a box to be a vertical or horizontal gradient. If the pixel's gradient has an orientation that's either straight horizontal or straight vertical, keep it, set all else to white.
To make this more robust to noise, you can use a sliding window (3x3) in which you compute the median orientation. If the median (or mean) orientation of the window is vertical or horizontal, keep the current (middle of the window) pixel, otherwise set it to white.
You can use OpenCV for the gradient computation, and possibly the orientation/phase calculation, but you'll probably have to write the actual sliding-window code yourself; I'm not intimately familiar with OpenCV.
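A sketch of that pipeline, assuming NumPy/SciPy in place of OpenCV: central differences stand in for the Sobel kernel, and a 3x3 median vote over the binary mask approximates the median-orientation window described above. The tolerance and test images are invented:

```python
import numpy as np
from scipy import ndimage

def axis_aligned_edge_mask(gray, tol_deg=10.0):
    # Gradient via central differences (a stand-in for a Sobel kernel).
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    # Fold the orientation into [0, 90): near 0 means a horizontal
    # gradient (vertical edge), near 90 a vertical gradient.
    ang = np.degrees(np.arctan2(gy, gx)) % 90.0
    aligned = (ang < tol_deg) | (ang > 90.0 - tol_deg)
    mask = aligned & (mag > 0)
    # 3x3 median vote: keep a pixel only if its window mostly agrees.
    return ndimage.median_filter(mask.astype(np.uint8), size=3).astype(bool)

step = np.zeros((8, 8))
step[:, 4:] = 255.0                                      # vertical step edge: kept
diag = np.add.outer(np.arange(8), np.arange(8)) * 32.0   # 45-degree ramp: rejected
vmask = axis_aligned_edge_mask(step)
dmask = axis_aligned_edge_mask(diag)
print(vmask.any(), dmask.any())  # True False
```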

What's a simple and efficient method for extracting line segments from a simple 2D image?

Specifically, I'm trying to extract all of the relevant line segments from screenshots of the game 'asteroids'. I've looked through the various methods for edge detection, but none seem to fit my problem for two reasons:
They detect smooth contours, whereas I just need the detection of straight line segments, and only those within a certain range of length. Now, these constraints should make my task considerably easier than the general case, but I don't want to just use a full blown edge detector and then clear the result of curved lines, as that would be prohibitively costly. Speed is of the utmost importance for my purposes.
They output a modified image where the edges are highlighted, whereas I want a set of pixel coordinates for the endpoints of the detected line segments. Alternatively, a list of all of the pixels included in each segment would work as well.
I have an inkling that one possible solution would involve a Hough transform, but I don't know how to use it to get the actual locations of the line segments (i.e. endpoints in pixel space). Even if I did, I have no idea whether that would be the simplest or most efficient approach, hence the general wording of the question title.
Lastly, here's a sample image:
Notice that all of the major lines are similar in length and density, and that the overall image contrast is very high. I'm hoping the solution to my problem will exploit these features, because again, efficiency is paramount.
One caveat: while most of the line segments in this context are part of a polygon, I don't want a solution that relies on this fact.
Have a look at the Line Segment Detector algorithm.
Here's what they do :
You can find an impressive video at the bottom of the page.
There's a C implementation (that works with C++ compilers) that works out of the box. There are just one or two files, and no additional dependencies.
But be warned: the algorithm is released under the GNU Affero GPL (AGPL) license.
Also check out EDlines http://ceng.anadolu.edu.tr/cv/EDLines/
It is very fast and provides very useful output.

How to detect and correct broken lines or shapes in a bitmap?

I want to find an algorithm which can detect broken lines or shapes in a bitmap. Consider a situation in which I have a bitmap with just two colors, black and white (like the images used in coloring books). There are curves and lines which should be connected to each other, but due to scanning errors, white bits appear where black ones should be. How should I detect them? (After this, I want to convert the bitmaps to a vector file; I plan to work with the potrace algorithm.)
If you have any ideas, please let me know.
Here is a simple algorithm to heal small gaps:
First, use a filter which creates a black pixel when any of its eight neighbors is black. This will grow your general outline.
Next, use a thinning filter which removes the extra outline but leaves the filled gaps alone.
See this article for some filters and parameters: Image Processing Lab in C#
The simplest approach is to use a morphological technique called closing.
This will work only if the gaps in the lines are quite small in relation to how close the different lines are to each other.
How you choose the structuring element for the closing can also make the result better or worse.
The Wikipedia article is very theoretical (or mathematical) so you might want to turn to Google or any book on Image Processing to get a better explanation on how it is done.
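A minimal closing example with SciPy's ndimage (the gap width and element size are made up): the two fragments of a broken line merge back into one component, as long as the element is wider than the gap but narrower than the spacing between separate lines.

```python
import numpy as np
from scipy import ndimage

# A horizontal line broken by a 2-pixel scanning gap.
line = np.zeros((3, 11), dtype=bool)
line[1, 0:4] = True
line[1, 6:11] = True
_, n_before = ndimage.label(line)

# Closing with a horizontal element wider than the gap bridges it.
closed = ndimage.binary_closing(line, structure=np.ones((1, 5), dtype=bool))
_, n_after = ndimage.label(closed)
print(n_before, n_after)  # two fragments become one component
```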
Maybe Hough Transform can help you. Bonus: you get the lines parameters for your vector file.
