Refining a CAPTCHA with a little noise

I'm trying to crack a particular web CAPTCHA. I'm planning to do it by segmenting the characters and passing them to an ANN. (For features I will mostly be using the method of moments, as it seems difficult to remove the noise completely.)
The captcha is very noisy, and unfortunately there is no color difference between the noise and the actual text, so separation based on color will not work. After quite some thought, I managed to implement a flood-fill style algorithm on the pixels of the captcha to separate small disconnected components, and after this I ended up with something like this:
Most of the noise is gone but some of it is left around the letters themselves (since it is touching the text).
I'm not an expert on image filters, and I'm finding it very difficult to find the right filter to reduce the remaining noise and enhance the characters.
Any ideas on what filter(s) I could use for this purpose?
(Note: I'm not using any image manipulation tool/library for this. I'm writing raw pixel manipulation code, but I can implement most filters given their convolution kernel)
The problem is that due to this noise, it is becoming difficult to segment the characters. Clearly trying to find vertical lines with no dark pixels is not going to work, since there is noise and some of the letters are touching.
Any ideas on how I could segment these efficiently?
EDIT: Original image

What about trying morphological operators like closing and opening? They are very easy to implement and are a simple but efficient tool.
After one closing with a 3x3 cross structuring element (kernel) and binarising the image the noise is almost gone:
I am sure just a bit more trying will render great results.
Edit: to clear things up a little, a closing is a dilation followed by an erosion (the other way around for an opening). A dilation assigns every pixel in your image the maximal value of all pixels in the kernel (structuring element) around it; conversely, an erosion assigns every pixel the minimal value of all pixels in the kernel around it.
Also take a look at the wikipedia link and the external links in there.
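Since the question mentions raw pixel code, here is a minimal pure-Python sketch of the closing described above. It assumes a grayscale image stored as a list of rows with dark text on a light background; the function names are illustrative, and skipping out-of-image neighbours is just one possible border policy.

```python
# Offsets of a 3x3 cross structuring element: centre plus its 4 neighbours.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _morph(img, pick):
    """Apply pick (max or min) over the cross neighbourhood of every pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbours that fall outside the image are skipped (assumption).
            vals = [img[y + dy][x + dx]
                    for dy, dx in CROSS
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            out[y][x] = pick(vals)
    return out


def dilate(img):
    return _morph(img, max)  # each pixel takes the brightest neighbour


def erode(img):
    return _morph(img, min)  # each pixel takes the darkest neighbour


def closing(img):
    return erode(dilate(img))  # dilation followed by erosion


def opening(img):
    return dilate(erode(img))  # erosion followed by dilation
```

With dark text on white, the dilation wipes out dark specks smaller than the kernel and the erosion restores the thickness of the strokes that survived. A binarisation threshold would be applied afterwards.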

Related

What Kind of pre-processing techniques can I apply to make an object more clear?

I applied a few denoising techniques to MRI images but could not work out which techniques are applicable to my data to make the cartilage clearer. First I applied contrast-limited adaptive histogram equalization (CLAHE) with this function:
J = adapthisteq(I)
But I got a white image. This is the original image and the manual segmentation of two thin objects (cartilage):
Then I read a paper whose authors had applied some preprocessing to microscopy images: an anisotropic diffusion filter (ADF), then the K-SVD algorithm, and then Batch Orthogonal Matching Pursuit (Batch-OMP). I applied the first two, and the output is as follows:
It seems my object is not clear. It should be brighter than the other objects. I do not know what kind of algorithms are applicable to make the cartilage clearer. I would really appreciate any help.
Extended:
This is the object:
Edited (now knowing exactly what you are looking for)
The difference between your cartilage and the surrounding tissue is very slight, and for that reason I do not think you can afford to do any filtration. What I mean by this is that the two things I can kind of catch with my eye are that the edge of the cartilage is very sharp (the grey-to-black drop-off), and that there seems to be a texture regularity in the cartilage that is smoother than the rest of the image. To be honest, these features are incredibly hard to even pick out by eye, and a common rule of thumb is that if you can't do it with your eye, vision processing is going to be rough.
I still think you want to do histogram stretching to increase your contrast.
1: In order to do a clean global contrast stretch you will need to remove the bone/skin edge/whatever that bright white line on the left is from the image. To do this, I would suggest looking at the intensity histogram and setting a cut-off after the first peak (make sure to limit this to some value well above what cartilage could be, in case there is no white signal). After determining that value, cut all pixels above that intensity from the image.
2: There appear to be low-frequency gradients in this image (the background seems to vary in intensity). Global histogram management (normalization) doesn't handle this well; CLAHE can handle it if set up well. But a far simpler solution worth trying is just hitting the image with a high-pass filter, as this will help to remove some of those (low-frequency) background shifts. (After this step you should see no bulk intensity variation across the image.)
3: I think you should try various implementations of histogram stretching. Your goal in your histogram stretch implementation is to make the cartilage look more unique in the image compared to all other tissue.
This is by far the hardest step, as you need to actually take a stab at what makes that tissue different from the rest of the tissue. I am at work, but when I get off, I will try to brainstorm some concepts for this final segmentation step here. In the meantime, what you want to try to identify is anything unique about the cartilage tissue at this point. My top ideas are cylindrical style color gradient, surface roughness, edge sharpness, location, and size/shape.
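As a rough illustration of steps 1 and 3 only, the sketch below clamps everything above a chosen cut-off and then stretches the remaining range linearly. It assumes an integer grayscale image stored as a list of rows; the cut-off is taken as a parameter because choosing it from the histogram is data-dependent, and clamping (rather than masking out) the bright pixels is an assumption about what "cut" should mean.

```python
def clip_and_stretch(img, cutoff, out_max=255):
    """Clamp intensities above cutoff, then stretch linearly to 0..out_max."""
    # Step 1: suppress the very bright structure by clamping it to the cut-off.
    clipped = [[min(p, cutoff) for p in row] for row in img]
    lo = min(min(row) for row in clipped)
    hi = max(max(row) for row in clipped)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [[0] * len(row) for row in clipped]
    # Step 3: linear stretch, using integer arithmetic to stay exact.
    return [[(p - lo) * out_max // (hi - lo) for p in row] for row in clipped]
```

For example, `clip_and_stretch(img, cutoff=100)` spreads the 0..100 range of the input over the full 0..255 output range, so small differences among the darker tissues become easier to see. The high-pass step is not included here.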

Ideas to process challenging image

I'm working with an infrared (IR) image that is the output of a 3D sensor. The sensor projects an infrared pattern in order to build a depth map and, because of this, the IR image has a lot of white spots that reduce its quality. So I want to process this image to make it smoother, so that it becomes possible to detect objects lying on the surface.
The original image looks like this:
My objective is to have something like this (which I obtained by blocking the IR projector with my hand):
An "open" morphological operation does remove some noise, but I think first there should be some noise removal operation that addresses the white dots.
Any ideas?
I should mention that the algorithm to reduce the noise has to run in real time.
A median filter would be my first attempt, possibly followed by a Gaussian blur. It really depends on what you want to do with it afterwards.
For example, here's your original image after a 5x5 median filter and 5x5 Gaussian blur:
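For reference, a median filter is simple to write by hand. This is a minimal, unoptimised sketch assuming a grayscale image stored as a list of rows; `radius=2` gives the 5x5 window mentioned above, and windows are simply truncated at the image border (an assumption).

```python
def median_filter(img, radius=1):
    """Replace each pixel by the median of its (2*radius+1)^2 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window, truncated where it leaves the image.
            window = sorted(
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1)))
            out[y][x] = window[len(window) // 2]
    return out
```

This version sorts every window from scratch, so it is far too slow for real-time use on large frames; an optimised library implementation would be the practical choice there.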
The main difficulty in your images is the large radius of the white dots.
Median and morphological filters should be of little help here.
Usually I'm not a big fan of these algorithms, but you seem to have a perfect use case for a decomposition of your images on a functional space with a sketch and an oscillatory component.
Basically, these algorithms aim at solving for the cartoon-like image X that approaches the observed image Y, and that differs from Y only through the removal of some oscillatory texture.
You can find a list of related papers and algorithms here.
(Disclaimer: I'm not Jérôme Gilles, but I know him, and I know that most of his algorithms were implemented in plain C, so I think most of them are practical to implement with OpenCV.)
What you can try otherwise, if you want to try simpler implementations first:
taking the difference between the input image and a blurred version to see if it emphasizes the dots, in which case you have an easy way to find and mark them. The output of this part may be enough, but you may also want to fill the previous place of the dots using inpainting,
or applying anisotropic diffusion (like the Rudin-Osher-Fatemi equation) to see if the dots disappear. Despite its apparent complexity, this diffusion can be implemented easily and efficiently in OpenCV by applying the algorithms in this paper. TV diffusion can also be used for the inpainting step of the previous item.
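The first of these two simpler ideas can be sketched as follows. This assumes an integer grayscale image stored as a list of rows and uses a plain box blur as the blurred version; the radius and threshold values are placeholders that would need tuning to the actual dot size and brightness.

```python
def box_blur(img, radius):
    """Mean of the (2*radius+1)^2 window, truncated at the image border."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - radius), min(h, y + radius + 1))
                    for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(vals) // len(vals)
    return out


def bright_dot_mask(img, radius=2, threshold=40):
    """Mark pixels that are much brighter than their blurred surroundings."""
    blurred = box_blur(img, radius)
    return [[img[y][x] - blurred[y][x] > threshold
             for x in range(len(img[0]))]
            for y in range(len(img))]
```

The resulting mask marks the candidate dots; filling them in (inpainting) is a separate step that this sketch does not cover.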
My main reason for removing the noise was to get a cleaner image so that it would be easier to detect objects. However, as I tried to find a solution for the problem, I realized that it was unrealistic to remove all noise from the image using on-the-fly noise removal algorithms, since most of the image is actually noise. So I had to find the objects despite those conditions. Here is my approach:
1 - Initial image
2 - Background subtraction followed by opening operation to smooth noise
3 - Binary threshold
4 - Morphological operation close to make sure object has no edge discontinuities (necessary for thin objects)
5 - Fill holes + opening morphological operations to remove small noise blobs
6 - Detection
Is the projected IR pattern fixed, or does it change over time?
In the second case, you could try to take advantage of the movement of the dots.
For instance, you could acquire a sequence of images and assign each pixel of the result image to the minimum (or a very low percentile) value of the sequence.
Edit: here is a Python script you might want to try
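The answerer's script is not reproduced here. As a stand-in, and not the original script, a minimal sketch of the per-pixel minimum (or low-rank) idea could look like this, assuming the frames are equally sized grayscale images stored as lists of rows:

```python
def temporal_low_rank(frames, rank=0):
    """Per pixel, take the rank-th smallest value across the frame sequence.

    rank=0 is the plain minimum; a small positive rank behaves like a very
    low percentile and is a little more robust to dark outliers.
    """
    h, w = len(frames[0]), len(frames[0][0])
    return [[sorted(f[y][x] for f in frames)[rank] for x in range(w)]
            for y in range(h)]
```

This only helps if the dots move between frames, so that every pixel is free of a dot in at least `rank + 1` of them.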

What is the correct method to auto-crop objects from light background?

I'm trying to extract objects from scanned images. There could be a few documents on a white background, and I need to crop and rotate them automatically. This seems like a rather simple task, but I've got stuck at some point and get bad results all the time.
I've tried to:
Binarise the image and get connected components by performing morphological operations.
Perform watershed segmentation by using dilated and eroded binary images as mask components.
Apply Canny detector and fill the contours.
None of this gets me good results. If the object doesn't have high-contrast edges (e.g. a piece of paper on a white background), it splits into a lot of separate components. If I connect these components by applying excessive dilation, background noise also expands and everything becomes a mess.
For example, I have an image:
After applying Canny detector and filling the contours I get something like this:
As you can see, the components are not connected. They are even too far from each other to be connected by a reasonable amount of dilation. And when I apply watershed to this mask combined with some background points, it yields very bad results.
Some images are noisy:
In this particular case I was able to obtain the contour of the whole passport with the Canny detector because of its high-contrast edges. But the threshold method doesn't work here.
If the images are always on a very light background, then you can binarize with a threshold close to the maximum possible value. After that it is a matter of correcting the binary image to get the objects, but this step will vary depending on what your other images look like.
For instance, the following image at left is what we get with a threshold at 99% of the maximum value after Gaussian filtering of the input. After removing components connected to the border and other small components, and also combining with some basic morphological tools, we get the image at right.
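A minimal sketch of the thresholding and component clean-up, under a few assumptions: the image is an already-smoothed 8-bit grayscale list of rows, anything darker than the threshold counts as object, components are 4-connected, and the `min_size` value is a placeholder. The Gaussian filtering and the extra morphological steps are left out.

```python
from collections import deque


def foreground_mask(img, fraction=0.99, max_value=255):
    """True where the pixel is darker than fraction * max_value."""
    threshold = fraction * max_value
    return [[p < threshold for p in row] for row in img]


def components(mask):
    """Yield each 4-connected True region as a list of (y, x) coordinates."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(y, x)]), []
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield comp


def clean_mask(mask, min_size=4):
    """Drop components that touch the image border or are too small."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for comp in components(mask):
        touches_border = any(y in (0, h - 1) or x in (0, w - 1)
                             for y, x in comp)
        if not touches_border and len(comp) >= min_size:
            for y, x in comp:
                out[y][x] = True
    return out
```

Typical use would be `clean_mask(foreground_mask(smoothed))`, after which each remaining component is a candidate document whose bounding box and orientation can be estimated.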
This may seem a bit wishy-washy but bear with me:
This looks like quite a challenging case for image processing recipes involving only edge detection, morphological operations and segmentation.
What you are not exploiting here is that you (I believe) know what your document should look like. You are currently looking at completely general solutions which do not take into account this prior knowledge. If you can get some training data then you can go all the way from simple template/patch-based matching (SSD, Normalized Cross-Correlation) to more sophisticated object detection techniques to find the position and rotation of your documents.
My guess is that if your objects are always more or less the same and at the same scale (e.g. passports scanned at a fixed resolution/similar machines) then you can get away with a fairly crude approach. There won't be any one correct method. It's also likely that the technique you end up using will not work until you have done a significant amount of parameter tweaking, so don't give up on anything too quickly.
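To make the crude end of that spectrum concrete, here is a minimal sum-of-squared-differences (SSD) template search. It assumes grayscale images stored as lists of rows and a template at the same scale and orientation as the object; it does not handle rotation, which would need, for example, a search over rotated copies of the template.

```python
def best_match_ssd(image, template):
    """Return (ssd, y, x) for the template position with the lowest SSD."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Sum of squared pixel differences at this offset.
            ssd = sum((image[y + j][x + i] - template[j][i]) ** 2
                      for j in range(th) for i in range(tw))
            if best is None or ssd < best[0]:
                best = (ssd, y, x)
    return best
```

Normalized cross-correlation follows the same sliding-window pattern but is less sensitive to brightness changes between scans.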

Finding data entry points in a blank, scanned application form

I am a relative newcomer to image processing and this is the problem I'm facing - Say I have the image of an application form, like this:
Now I would like to detect all the locations where data is to be entered. In this case, it would be the rectangles divided into a number of boxes, like so (not all fields marked):
I can live with the photograph box also being detected. I've tried running the squares.cpp sample in the OpenCV sources, which does not quite get me what I want. I also tried the modified version here - the results were worse (my use case is definitely very different from the OP's in that question).
Also, a Hough transform to get the lines is not really working, with or without blur and threshold, as the noise in the scanned image contributes extraneous lines. In addition, thresholding takes away parts of the combs (the small squares), and hence the line detection is not up to the mark.
Note that this form is not a scanned copy of a printed form, but the real input might very well be a noisy, scanned image of a printed form.
While I'm sure that this is possible (at least with some tolerance allowed) and I'm trying to get at the solution, it would be really helpful to get insights and ideas from other people who might have tried something like this or who enjoy hacking on CV problems. Also, it would be really nice if the answers explained why a particular operation was done (e.g., dilation to try and fill up any holes left by thresholding, etc.).
Are the forms consistent in any way? Are the "such boxes" the same size on all forms? If you can rely on a consistent size, like the character boxes in the form above, you could use template matching.
Otherwise, the problem seems to be: find any/all rectangles on the image (with a post processing step to filter out any that have a significant amount of markings within, or to merge neighboring rectangles).
The more you can take advantage of the consistencies between the forms, the easier the problem will be. Use any context you can get.
EDIT
Using the gradients (computed by using a Sobel kernel in both the x and the y direction) you can weed out a lot of the noise.
Using both you can find the direction of the gradients (equation can be found here: en.wikipedia.org/wiki/Sobel_operator). Let's say we define a discriminating feature of a box to be a vertical or horizontal gradient. If the pixel's gradient has an orientation that's either straight horizontal or straight vertical, keep it, set all else to white.
To make this more robust to noise, you can use a sliding window (3x3) in which you compute the median orientation. If the median (or mean) orientation of the window is vertical or horizontal, keep the current (middle of the window) pixel, otherwise set it to white.
You can use OpenCV for the gradient computation, and possibly the orientation/phase calculation, but you'll probably need to write the actual sliding window code yourself. I'm not intimately familiar with OpenCV.
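The per-pixel version of this orientation test (without the sliding-window median) can be sketched in plain Python. The image is assumed to be a grayscale list of rows; the magnitude and angle tolerances are placeholder values, and border pixels are simply set to white.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def _apply(kernel, img, y, x):
    """Correlate a 3x3 kernel with the neighbourhood centred on (y, x)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def axis_aligned_edges(img, min_magnitude=100, tolerance_deg=10, white=255):
    """Keep pixels whose gradient is roughly horizontal or vertical."""
    h, w = len(img), len(img[0])
    out = [[white] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = _apply(SOBEL_X, img, y, x)
            gy = _apply(SOBEL_Y, img, y, x)
            if math.hypot(gx, gy) < min_magnitude:
                continue  # flat region: no reliable orientation
            # Fold the orientation into [0, 90): 0 means axis-aligned.
            angle = math.degrees(math.atan2(gy, gx)) % 90
            if min(angle, 90 - angle) <= tolerance_deg:
                out[y][x] = img[y][x]
    return out
```

Box outlines produce gradients near 0 or 90 degrees and are kept, while diagonal strokes and much of the speckle are set to white.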

How to detect and correct broken lines or shapes in a bitmap?

I want to find an algorithm which can find broken lines or shapes in a bitmap. Consider a situation in which I have a bitmap with just two colors, black and white (images used in coloring books). There are some curves and lines which should be connected to each other, but due to some scanning errors, white bits sit in place of black ones. How should I detect them? (After this job, I want to convert the bitmaps into a vector file. I want to work with the potrace algorithm.)
If you have any ideas, please let me know.
Here is a simple algorithm to heal small gaps:
First, use a filter which creates a black pixel when any of its eight neighbors is black. This will grow your general outline.
Next, use a thinning filter which removes the extra outline but leaves the filled gaps alone.
See this article for some filters and parameters: Image Processing Lab in C#
The simplest approach is to use a morphological technique called closing.
This will work only if the gaps in the lines are quite small in relation to how close the different lines are to each other.
How you choose the structuring element to perform the closing can also make performance better or worse.
The Wikipedia article is very theoretical (or mathematical) so you might want to turn to Google or any book on Image Processing to get a better explanation on how it is done.
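For a two-colour bitmap the operation is short enough to write directly. The sketch below assumes the image is a list of rows of booleans where True means black ink, and uses the full 3x3 (eight-neighbour) structuring element; out-of-image neighbours are ignored, which is one possible border policy.

```python
def _window(mask, y, x):
    """Values in the 3x3 neighbourhood of (y, x), truncated at the border."""
    h, w = len(mask), len(mask[0])
    return [mask[yy][xx]
            for yy in range(max(0, y - 1), min(h, y + 2))
            for xx in range(max(0, x - 1), min(w, x + 2))]


def grow(mask):
    """Dilation of the ink: black if any pixel in the window is black."""
    return [[any(_window(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def shrink(mask):
    """Erosion of the ink: black only if the whole window is black."""
    return [[all(_window(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def heal_gaps(mask):
    """Closing of the ink: grow, then shrink back to the original width."""
    return shrink(grow(mask))
```

A single pass bridges gaps of one or two pixels; larger gaps need a larger structuring element, at the risk of merging lines that are close together.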
Maybe the Hough transform can help you. Bonus: you get the line parameters for your vector file.

Resources