How do I segment the connected characters in this case? - opencv

It seems that I need some advice on segmenting connected characters (see the image below).
As you can see, C and U, as well as 4, 9 and 9, are connected, so when I try to draw contours they are joined into one block. Unfortunately, there are plenty of such problematic images, so I need to find a solution.
I have tried using different morphological transforms (erosion, dilation, opening), but that doesn't solve the problem.
Thanks in advance for any recommendations.

It seems to me that the best solution will be to work on the preprocessing, if there is a possibility.
Otherwise, you can try machine learning techniques. You may get inspiration from Viola-Jones or Histograms of Oriented Gradients + SVM (even though those algorithms solve a problem that differs from optical character recognition, I got plenty of insights from them). In other words, try sliding a window of predefined aspect ratio horizontally across the image and recognizing the character inside it at each position. The drawback is that you will need to train a model, which may require a lot of data.
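A minimal sketch of the sliding-window idea, assuming the text line has already been cropped so that the window can span the full image height. The names sliding_windows and scan_line, the default aspect ratio and step, and the classify callback are all hypothetical; the classifier itself (for example a trained HOG + SVM model) is not shown and has to be supplied by you.

```python
import numpy as np

def sliding_windows(image, aspect_ratio=0.6, step=4):
    """Yield (x_offset, crop) for full-height windows moved left to right.

    aspect_ratio is window width / window height (an assumed value,
    to be tuned to the font in your images).
    """
    height, width = image.shape[:2]
    win_w = max(1, int(round(height * aspect_ratio)))
    for x in range(0, width - win_w + 1, step):
        yield x, image[:, x:x + win_w]

def scan_line(image, classify, aspect_ratio=0.6, step=4):
    """Run a user-supplied classifier on every window.

    classify(crop) must return (label, score); it is a placeholder
    for whatever model you train.
    """
    results = []
    for x, crop in sliding_windows(image, aspect_ratio, step):
        label, score = classify(crop)
        results.append((x, label, score))
    return results
```

Neighbouring windows will usually fire on the same character, so in practice you would keep only the best-scoring window in each neighbourhood (non-maximum suppression) before reading off the string.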
As I said earlier, it may be a good idea to reconsider the image preprocessing step. By the way, it seems that in the case of "C" and "U", erosion may help.
Good luck!:)

Related

Recognition of repeated pattern in an image

Consider an image that is a composite of a repeated pattern of varying size and unknown topography (as shown below).
How do we find the repeated pattern (along with its location)?
An easy way to do this is to compute the autocorrelation of the image. At least the blocks with the same size can be identified this way.
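For illustration, here is a rough sketch of the autocorrelation approach using NumPy's FFT. It computes the circular autocorrelation (the image is treated as wrapping around at the borders) and reads the repeat distance off the first strong peak. The function names and the 0.99 peak threshold are assumptions made for this example, and it only handles copies of the same size, as noted above.

```python
import numpy as np

def autocorrelation(image):
    """Circular autocorrelation of a 2-D array, computed via the FFT."""
    centered = np.asarray(image, dtype=float)
    centered = centered - centered.mean()
    spectrum = np.fft.fft2(centered)
    acf = np.fft.ifft2(spectrum * np.conj(spectrum)).real
    # Normalise so that zero shift equals 1 (assumes a non-constant image).
    return acf / acf[0, 0]

def dominant_period(acf, axis):
    """Smallest non-zero shift along `axis` with a near-maximal correlation."""
    profile = acf[:, 0] if axis == 0 else acf[0, :]
    half = profile[1:len(profile) // 2 + 1]      # shifts 1 .. n/2
    peaks = np.flatnonzero(half >= 0.99 * half.max())
    return int(peaks[0]) + 1

# Synthetic example: an 8x12 random tile repeated 4 times down, 3 times across.
rng = np.random.default_rng(0)
tile = rng.random((8, 12))
image = np.tile(tile, (4, 3))
acf = autocorrelation(image)
period_rows = dominant_period(acf, axis=0)
period_cols = dominant_period(acf, axis=1)
```

On real images the peaks are much less clean than in this synthetic case, so expect to smooth the profile or lower the threshold.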
A more elaborate way is explained in this post. You first of course will need to subdivide your big image into small images.
I'd have a look at the SIFT and RANSAC algorithms. They might not be exactly what you need, but they'll lead you in the right direction. What makes this hard is that you don't know ahead of time which features you're looking for, so you will need some overseeing algorithm to help you make guesses.
Open source implementation
https://robwhess.github.io/opensift/
Wikipedia with some good links at the bottom as well as descriptions of similar algorithms

OpenCV and SVM training for luggage detection

In my project, I am trying to differentiate luggage from anything else, usually a human.
For the moment, I use OpenCV and SVM training with 2 classes, one with luggage and another with humans. Before feeding in the frames, I convert them to grayscale, but I don't apply any additional filters. The predictions are not very accurate.
I am wondering whether applying additional filters to the frames before training might give a better result, for example contour detection: if the contour is close to a rectangle, it is luggage; otherwise it is 'something else'. I am also thinking about switching to a ONE_CLASS method.
What do you think? Or do you have better ideas?
Regards,
Julien.
After giving the question much thought, I think anomaly detection is the best way to go. I got the idea because you mentioned the ONE_CLASS method.
Assuming that luggage has a rectangular shape in an image, your suggestion that "anything close to a rectangle is luggage" is also a viable approach. You then have only one class, 'Luggage'.
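One simple way to put a number on "close to a rectangle" is the ratio between the area of the segmented object and the area of its bounding box. The sketch below assumes you already have a binary foreground mask of the object; the function names and the 0.85 threshold are hypothetical and would need tuning on your data. With OpenCV contours, the same ratio could be computed from cv2.contourArea and cv2.boundingRect.

```python
import numpy as np

def rectangularity(mask):
    """Foreground area divided by the area of its axis-aligned bounding box.

    Returns a value in (0, 1]; 1.0 means the object fills its box exactly.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0.0
    box_h = ys.max() - ys.min() + 1
    box_w = xs.max() - xs.min() + 1
    return float(ys.size) / float(box_h * box_w)

def looks_like_luggage(mask, threshold=0.85):
    """Hypothetical decision rule: 'rectangular enough' counts as luggage."""
    return rectangularity(mask) >= threshold
```

An axis-aligned box penalises tilted luggage, so a rotated bounding rectangle (cv2.minAreaRect in OpenCV) may be the better denominator if the bags are not upright.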
As the term implies, 'anomaly detection' is used to detect objects that do not conform to a particular pattern. In other words, it is used to detect outliers (objects other than those present in the dataset).
Since you are focusing on luggage alone, I presume this approach is the best.
You could try other approaches as well, in case you come across any.
So the rectangle approximation method seems to fit my requirements. I haven't tested it with a lot of images yet, so I am not 100% sure I'll go for it. As always, there are exceptions: when the color of the luggage is close to the color of the background, the result is not accurate. Is there a way to amplify the difference between two close colors?
Regards,
Julien.

How to enhance colors and contrast of a noisy image

I previously asked "How to extract numbers from an image" and eventually got that step working, but some test cases lead to awful output when I try to recognize the digits. Consider this image as an example:
This image has low contrast (from my point of view). I tried to adjust its contrast, but the results are still unacceptable. I also tried sharpening it and then applying gamma correction, but the results are still poor, so the extracted numbers are not recognized well by the classifier.
This is the image after sharpening + gamma correction:
Number 4 after separation:
Could anybody tell me the best ideas for solving such a problem?
Sharpening is not always the best tool to approach a problem like this. Contrary to what the name implies, sharpening does not "recover" information to add detail and edges back into an image. Instead, sharpening is a class of operations that increase local contrast along edges.
Because your original image is highly degraded, this sharpening operation looks to be adding a lot of noise in, and generally not making anything better.
There is another class of algorithms called "deblurring" algorithms that attempt to actually reconstruct image detail through (much more complex) mathematical models. Some versions of this are blind deconvolution, regularized deconvolution, and Wiener deconvolution.
However, it is important to note that all of these methods are approximations: once image content is lost through an operation such as blurring, it can (almost) never be fully recovered. Also, these methods are generally much more complex.
The best way to handle these situations is to make sure that they never happen. Ensure good focus during image capture, use a system with a resolution well suited to your task, and control the lighting environment. However, when these measures do not or cannot work, image reconstruction techniques are needed.
Your image is blurred, and I suggest you try Wiener deconvolution. You can assume the point spread function is a Gaussian and observe how the deconvolution behaves. Since you do not know the blur kernel in advance, blind deconvolution is an alternative.
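As a starting point for experimenting, here is a minimal frequency-domain sketch of Wiener deconvolution with an assumed Gaussian point spread function. The sigma of the Gaussian and the noise_to_signal constant are guesses you would have to tune by eye; neither comes from your image. The filter applied is F = conj(H) G / (|H|^2 + K), where G is the spectrum of the blurred image, H the spectrum of the PSF and K the noise-to-signal constant.

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Centred Gaussian point spread function, normalised to sum to 1."""
    rows, cols = shape
    y = np.arange(rows) - rows // 2
    x = np.arange(cols) - cols // 2
    yy, xx = np.meshgrid(y, x, indexing="ij")
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()

def wiener_deconvolve(blurred, psf, noise_to_signal=1e-2):
    """Wiener filter in the Fourier domain for a grayscale image.

    psf must have the same shape as the image and be centred;
    noise_to_signal is an assumed constant (larger = smoother result).
    """
    H = np.fft.fft2(np.fft.ifftshift(psf))   # move the PSF centre to (0, 0)
    G = np.fft.fft2(np.asarray(blurred, dtype=float))
    F = np.conj(H) * G / (np.abs(H) ** 2 + noise_to_signal)
    return np.fft.ifft2(F).real
```

Try a few values of sigma and noise_to_signal and look at the result: if the constant is too small the output is dominated by amplified noise and ringing, and if it is too large the image stays blurred.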

Detecting "city" background versus "desert" background in images using image processing/computer vision

I'm searching for algorithms/methods that are used to classify or differentiate between two outdoor environments. Given an image with vehicles, I need to be able to detect whether the vehicles are in a natural desert landscape, or whether they're in the city.
I've searched but can't seem to find relevant work on this. Perhaps because I'm new at computer vision, I'm using the wrong search terms.
Any ideas? Is there any work (or related) available in this direction?
I'd suggest reading Prince's Computer Vision: Models, Learning, and Inference (free PDF available). It covers image classification, as well as many other areas of CV. I was fortunate enough to take the Machine Vision course at UCL which the book was designed for and it's an excellent reference.
Addressing your problem specifically, a simple MAP or MLE model on pixel colours will probably provide a reasonable benchmark. From there you could look at more involved models and feature engineering.
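To make the baseline concrete, here is a small sketch of a MAP classifier over the mean colour of each image, with one diagonal Gaussian per class fitted by maximum likelihood. The class name, the choice of mean RGB as the only feature, and the variance floor are assumptions for illustration; dropping the log-prior term turns the decision into plain MLE.

```python
import numpy as np

class GaussianColourModel:
    """One diagonal Gaussian per class over the mean RGB colour of an image."""

    @staticmethod
    def _feature(image):
        # Mean colour of an H x W x 3 image (an assumed, very crude feature).
        return np.asarray(image, dtype=float).reshape(-1, 3).mean(axis=0)

    def fit(self, images, labels):
        feats = np.array([self._feature(img) for img in images])
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.means_ = np.array([feats[labels == c].mean(axis=0) for c in self.classes_])
        # Small floor keeps the variance strictly positive.
        self.vars_ = np.array([feats[labels == c].var(axis=0) + 1e-6 for c in self.classes_])
        self.log_priors_ = np.log(np.array([np.mean(labels == c) for c in self.classes_]))
        return self

    def predict(self, image):
        x = self._feature(image)
        log_lik = -0.5 * np.sum(
            np.log(2.0 * np.pi * self.vars_) + (x - self.means_) ** 2 / self.vars_,
            axis=1,
        )
        # MAP decision: likelihood times prior, in log space.
        return self.classes_[np.argmax(log_lik + self.log_priors_)]
```

You would fit it on a list of labelled images, for example labels "desert" and "city", and call predict on each new frame. Modelling individual pixel colours instead of the per-image mean is the natural next step.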
Seemingly complex classifications such as "civilization" vs "nature" can often be solved simply, with the help of certain heuristics along with classification based on color. Like Gilevi said, city scenes are sure to contain many straight lines and right angles, while desert scenes are dominated by rolling dunes and so on.
To address this directly, you could use OpenCV's Hough line transform on the images (tuned for this problem, of course) and look at:
a) how many lines are fit to the image at a given threshold
b) of the lines that are fit, what the expected angle between two of them is; if the angles are uniformly distributed then chances are it's nature, but if the angles are clumped around multiples of pi/2 (more right angles and straight lines) then it is more likely to be a cityscape.
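Point (b) can be turned into a single number roughly as follows. The sketch assumes you already have the line angles in radians, for example the theta values returned by cv2.HoughLines, and it measures how many pairs of lines differ by approximately a multiple of pi/2. The function name and the 5 degree tolerance are made up for this example.

```python
import numpy as np

def right_angle_score(thetas, tolerance=np.deg2rad(5)):
    """Fraction of line pairs whose angle difference is within `tolerance`
    of a multiple of pi/2 (parallel or perpendicular pairs)."""
    thetas = np.asarray(thetas, dtype=float).ravel()
    if thetas.size < 2:
        return 0.0
    i, j = np.triu_indices(thetas.size, k=1)          # all unordered pairs
    diff = np.abs(thetas[i] - thetas[j]) % (np.pi / 2)
    dist = np.minimum(diff, np.pi / 2 - diff)         # distance to nearest multiple
    return float(np.mean(dist <= tolerance))
```

For uniformly distributed angles the score should hover around tolerance / (pi/4), which is about 0.11 for 5 degrees, while a scene dominated by horizontal and vertical edges pushes it towards 1. Combined with the line count from (a), this gives two features to threshold or feed into a classifier.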
Color components, textures, and degree of smoothness (variation or gradient of the image) may differentiate desert and city backgrounds. You may also try the Hough transform, which detects lines that can be treated as city features (buildings, roads, bridges, cars, etc.).
I would recommend this research, which is very similar to your project. The article presents a comparison of different classification techniques for building a scene classifier (urban, highway, and rural) based on images.
See my answer here: How to match texture similarity in images?
You can use the same method. I have solved problems like the one you describe with it in the past.
The problem you are describing is that of scene categorization. Search for works that use the SUN database.
However, you are only working with two relatively different categories, so I don't think you need to kill yourself implementing state-of-the-art algorithms. I think taking GIST features + color features and training a non-linear SVM would do the trick.
Urban environments are usually characterized by a lot of horizontal and vertical lines, and GIST captures that information.

How to recognize the same image at a different size?

We, as humans, can recognize these two images as the same image:
For a computer, it is easy to recognize the two images when they are the same size, so we need a preprocessing step such as scaling before recognition. But if we look closely at the scaling process, we see that it's not an efficient approach.
Could you help me find a way to convert images into objects that do not depend on size or pixel location, to use as input for a recognition method?
Thanks in advance.
I have several ideas:
Let the image have several color thresholds. This way you get large areas of the same color. The shapes of those areas can be traced with curves, which are mathematical objects. Do this for both the larger and the smaller image and see whether the curves match.
Try to define key spots in the image. I don't know exactly how this works, but you can look up face detection algorithms. Such an algorithm has a mathematical model of how a face should look. If you define enough objects in such algorithms, you can locate multiple objects in both images and see whether they match at the same spots.
You could also check whether the Predator algorithm accepts images of multiple sizes. If so, your problem is solved.
It looks like you assume that the human brain recognizes images in a computationally efficient way, which is not really true. The algorithm is so complicated that we have not yet worked it out, and a large part of the brain is devoted to processing visual data.
When it comes to software, there are some scale-invariant (or affine-invariant) algorithms. One such algorithm is the LeNet-5 neural network.
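To show what a size-independent description of an image can look like in practice, here is a sketch using normalised central moments. This is a different and much simpler technique than LeNet-5, swapped in because it fits in a few lines. The moments are unchanged by translation and, up to pixel discretisation, by uniform scaling. The function name is made up for this example.

```python
import numpy as np

def normalized_central_moments(image, max_order=3):
    """Return {(p, q): eta_pq} for 2 <= p + q <= max_order.

    eta_pq = mu_pq / m00 ** (1 + (p + q) / 2), where mu_pq are the
    central moments of the (grayscale or binary) image.
    """
    img = np.asarray(image, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xbar = (x * img).sum() / m00
    ybar = (y * img).sum() / m00
    etas = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p + q < 2:
                continue
            mu = (((x - xbar) ** p) * ((y - ybar) ** q) * img).sum()
            etas[(p, q)] = mu / m00 ** (1 + (p + q) / 2)
    return etas
```

Comparing the moment dictionaries of the small and the large image with a tolerance then tells you whether the two shapes agree regardless of size. This only works on well-segmented shapes; for natural photographs, local feature descriptors such as SIFT are the more usual choice.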

Resources