This is my first question here!
I have found the Hough Transform method not very robust to noise. Can I totally avoid it and efficiently find lines in the image?
Thanks!
There are various edge detectors that provide more general output than lines, such as the Canny edge detector. With suitable processing, this can give you 'edge chains' made of 'edgels'. From there, you can do some analysis of the edge chains to split it into line segments (generally based on looking for 'kinks' in the chain to identify corners.) This can be a fairly noisy process too, though, and there are lots of different kinds of heuristic criteria you might use to refine your set of lines further.
In general, the Hough Transform is a first-pass that gives you a (noisy) set of candidate lines in the image and is suitable when you don't know where lines might be. From there, you need to do further analysis of your candidates to determine which match what you're actually looking for. Like most problems in computer vision, the details of how you actually apply these basic tools depends very much on what you're attempting to analyze.
Related
Consider an image which is a composite of repeated pattern of varying size and unknown topography (as shown below)
How do we find the repeated pattern (along with its location) ?
An easy way to do this is to compute the autocorrelation of the image. At least the blocks with the same size can be identified this way.
A more elaborate way is explained in this post. You first of course will need to subdivide your big image into small images.
I'd have a look at the SIFT and RANSAC algorithm, it might not be exactly what you need, but it'll lead you in the right direction. What makes this hard is that you don't know which features you're looking for ahead of time so you will need some overseeing algorithm helping you make guesses.
Open source implementation
https://robwhess.github.io/opensift/
Wikipedia with some good links at the bottom as well as descriptions of similar algorithms
how to recognise a zebra crossing from top view using opencv?
in my previous question the problem is to find the curved zebra crossing using opencv.
now I thought that the following way would be much easier way to detect it,
(i) canny it
(ii) find the contours in it
(iii) find the black stripes in it, in my case it is slightly oval in shape
now my question is how to find that slightly oval shape??
look here for images of the crossing: www.shaastra.org/2013/media/events/70/Tab/422/Modern_Warfare_ps_v1.pdf
Generally speaking, I believe there are two approaches you can consider.
One approach is the more brute force image analysis approach, as you described. Here you are applying heuristics based on your knowledge of the problem in order to identify the pixels involved in the parts of the path. Note that 'brute force' here is not a bad thing, just an adjective.
An alternative approach is to apply pattern recognition techniques to find the regions of the image which have high probability of being part of the path. Here you would be transforming your image into (hopefully) meaningful features: lines, points, gradient (eg: Histogram of Oriented Gradients (HOG)), relative intensity (eg: Haar-like features) etc. and using machine learning techniques to figure out how these features describe the, say, the road vs the tunnel (in your example).
As you are asking about the former, I'm going to focus on that here. If you'd like to know more about the latter have a look around the Internet, StackOverflow, or post specific questions you have.
So, for the 'brute force image analysis' approach, your first step would probably be to preprocess the image as you need;
Consider color normalization if you are going to analyze color later, this will help make your algorithm robust to lighting differences in your garage vs the event studio. It'll also improve robustness to camera collaboration differences, though the organization hosting the competition provide specs for the camera they will use (don't ignore this bit of info).
Consider blurring the image to reduce noise if you're less interested in pixel by pixel values (eg edges) and more interested in larger structures (eg gradients).
Consider sharpening the image for the opposite reasons of blurring.
Do a bit of research on image preprocessing. It's definitely an explored topic but hardly 'solved' (whatever that would mean). There are lots of things to try at this stage and, of course, crap in => crap out.
After that you'll want to generate some 'features'..
As you mentioned, edges seem like an appropriate feature space for this problem. Don't forget that there are many other great edge detection algorithms out there other than Canny (see Prewitt, Sobel, etc.) After applying the edge detection algorithm, though, you still just have pixel data. To get to features you'll want probably want to extract lines from the edges. This is where the Hough transform space will come in handy.
(Also, as an idea, you can think about colorspace in concert with the edge detectors. By that, I mean, edge detectors usually work in the B&W colorspace, but rather than converting your image to grayscale you could convert it to an appropriate colorspace and just use a single channel. For example, if the game board is red and the lines on the crosswalk are blue, convert the image to HSV and grab the hue channel as input for the edge detector. You'll likely get better contrast between the regions than just grayscale. For bright vs. dull use the value channel, for yellow vs. blue use the Opponent colorspace, etc.)
You can also look at points. Algorithms such as the Harris corner detector or the Laplacian of Gaussian (LOG) will extract 'key points' (with a different definition for each algorithm but generally reproducible).
There are many other feature spaces to explore, don't stop here.
Now, this is where the brute force part comes in..
The first thing that comes to mind is parallel lines. Even in a curve, the edges of the lines are 'roughly' parallel. You could easily develop an algorithm to find the track in your game by finding lines which are each roughly parallel to their neighbors. Note that line detectors like the Hough transform are usually applied such that they find 'peaks', or overrepresented angles in the dataset. Thus, if you generate a Hough transform for the whole image, you'll be hard pressed to find any of the lines you want. Instead, you'll probably want to use a sliding window to examine each area individually.
Specifically speaking to the curved areas, you can use the Hough transform to detect circles and elipses quite easily. You could apply a heuristic like: two concentric semi-circles with a given difference in radius (~250 in your problem) would indicate a road.
If you're using points/corners you can try to come up with an algorithm to connect the corners of one line to the next. You can put a limit on the distance and degree in rotation from one corner to the next that will permit rounded turns but prohibit impossible paths. This could elucidate the edges of the road while being robust to turns.
You can probably start to see now why these hard coded algorithms start off simple but become tedious to tweak and often don't have great results. Furthermore, they tend to rigid and inapplicable to other, even similar, problems.
All that said.. you're talking about solving a problem that doesn't have an out of the box solution. Thinking about the solution is half the fun, and half the challenge. Everything I've described here is basic image analysis, computer vision, and problem solving. Start reading some papers on these topics and the ideas will come quickly. Good luck in the competition.
What method is suitable to capture (detect) MRZ from a photo of a document? I'm thinking about cascade classifier (e.g. Viola-Jones), but it seems a bit weird to use it for this problem.
If you know that you will look for text in a passport, why not try to find passport model points on it first. Match template of a passport to it by using ASM/AAM (Active shape model, Active Appearance Model) techniques. Once you have passport position information you can cut out the regions that you are interested in. This will take some time to implement though.
Consider this approach as a great starting point:
Black top-hat followed by a horisontal derivative highlights long rows of characters.
Morphological closing operation(s) merge the nearby characters and character rows together into a single large blob.
Optional erosion operation(s) remove the small blobs.
Otsu thresholding followed by contour detection and filtering away the contours which are apparently too small, too round, or located in the wrong place will get you a small number of possible locations for the MRZ
Finally, compute bounding boxes for the locations you found and see whether you can OCR them successfully.
It may not be the most efficient way to solve the problem, but it is surprisingly robust.
A better approach would be the use of projection profile methods. A projection profile method is based on the following idea:
Create an array A with an entry for every row in your b/w input document. Now set A[i] to the number of black pixels in the i-th row of your original image.
(You can also create a vertical projection profile by considering columns in the original image instead of rows.)
Now the array A is the projected row/column histogram of your document and the problem of detecting MRZs can be approached by examining the valleys in the A histogram.
This problem, however, is not completely solved, so there are many variations and improvements. Here's some additional documentation:
Projection profiles in Google Scholar: http://scholar.google.com/scholar?q=projection+profile+method
Tesseract-ocr, a great open source OCR library: https://code.google.com/p/tesseract-ocr/
Viola & Jones' Haar-like features generate many (many (many)) features to try to describe an object and are a bit more robust to scale and the like. Their approach was a unique approach to a difficult problem.
Here, however, you have plenty of constraint on the problem and anything like that seems a bit overkill. Rather than 'optimizing early', I'd say evaluate the standard OCR tools off the shelf and see where they get you. I believe you'll be pleasantly surprised.
PS:
You'll want to preprocess the image to isolate the characters on a white background. This can be done quite easily and will help the OCR algorithms significantly.
You might want to consider using stroke width transform.
You can follow these tips to implement it.
I'm trying to detect the presence of lines from a picture of a geometrical drawing. For example, there is a triangle, and I'm looking for the bisector of one of its angles. So I know exactly where and how long the line should be.
My approach so far is to detect all the lines with the Hough transform function and look for my line among those. However, this is rather slow and conceptually not very nice. Indeed, since I know the two extremities of the segment I'm looking for, it seems more natural to directly look for at this very place.
To do this, my first intuition was to use the result of Canny and loop through each pixel that should contain a line. Before I implement this, I was wondering if something similar already existed or if someone more expert in CV would recommend another approach. I've seen this but I'm looking for lines, not contour, so not sure this would work...
The Canny idea might work quite well. Since you were looking for other ideas, you might also consider trying corner detection. Assuming the geomerical shape is segmented you might try either the classic Harris corners or you might try some of the newer feature detectors like SURF, SIFT, FAST, etc. You may have to adjust the parameter sensitivities of these detectors in order to isolate the corners, but that might allow you to compute the lines based on a set of points. Also, a combination of Canny edge detection and feature detection might be more robust as well.
Just a thought :) Hope it helps!
Specifically, I'm trying to extract all of the relevant line segments from screenshots of the game 'asteroids'. I've looked through the various methods for edge detection, but none seem to fit my problem for two reasons:
They detect smooth contours, whereas I just need the detection of straight line segments, and only those within a certain range of length. Now, these constraints should make my task considerably easier than the general case, but I don't want to just use a full blown edge detector and then clear the result of curved lines, as that would be prohibitively costly. Speed is of the utmost importance for my purposes.
They output a modified image where the edges are highlights, whereas I want a set of pixel coordinates depicting the endpoints of the detected line segments. Alternatively, a list of all of the pixels included in each segment would work as well.
I have an inkling that one possible solution would involve a hough transform, but I don't know how to use this to get the actual locations of the line segments (i.e. endpoints in pixel space). Though even if I did, I have no idea if that would be the simplest or most efficient way of doing things, hence the general wording of the question title.
Lastly, here's a sample image:
Notice that all of the major lines are similar in length and density, and that the overall image contrast is very high. I'm hoping the solution to my problem will exploit these features, because again, efficiency is paramount.
One caveat: while most of the line segments in this context are part of a polygon, I don't want a solution that relies on this fact.
Have a look at the Line Segment Detector algorithm.
Here's what they do :
You can find an impressive video at the bottom of the page.
There's a C implementation (that works with C++ compilers) that works out of the box. There are just one or two files, and no additional dependencies
But, be warned, the algorithm is under the GNU Allegro GPL license.
Also check out EDlines http://ceng.anadolu.edu.tr/cv/EDLines/
Very fast and provides a very useful output