I'm just learning OpenCV, and have a question about line detection. I have a situation where I need to detect a horizontal black line on a white background. I am guaranteed that the line will always show up horizontally (within a few degrees) and need to detect where it is in the images from the camera.
My thought is, since it is always horizontal, I can just search vertically for the "edge" through a few columns of the image, and call it good. Maybe even narrow the number of pixels I'm capturing from the camera as an extra boost in speed.
Is there a built-in function for this type of line detection, though?
I don't need the extra power, and I can't afford the processing time of Canny or Hough; I just want to find a guaranteed-horizontal line as fast as possible.
The images (with my solution running) look like this:
It would be easier to suggest a solution if you provide an example of the type of image you are talking about.
Also, I wonder why you can't use the Canny detector, which is not so computationally expensive. Furthermore, you can downsize your image, compute the edges, and filter for horizontal edges.
On the other hand, knowing that your horizontal edges are always horizontal on the image plane, you can use Template matching.
The method I ended up going with is a for loop. After thresholding the image, I search along two columns to find all the "edges" or changes in value. Then I process this list to find only horizontal pairs of edges.
I then find all lines that are close enough together, and have a desired infill (boolean comparison against thresholded image), which effectively only finds the strips of tape I am interested in tracking.
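A minimal sketch of that column-scan idea, assuming a grayscale frame; the threshold value and the two column positions are arbitrary placeholders (the full, commented code is in the gist linked below):

```python
import cv2
import numpy as np

def find_line_edges(gray, cols=(100, 500), thresh=128):
    # Threshold so the dark tape becomes 1 and the white background 0.
    _, binary = cv2.threshold(gray, thresh, 1, cv2.THRESH_BINARY_INV)
    edges_per_col = []
    for c in cols:
        column = binary[:, c].astype(np.int8)
        # Row indices where the value flips -- the "edges" along this column.
        edges_per_col.append(np.flatnonzero(np.diff(column)) + 1)
    return edges_per_col  # pair up nearby edges across the two columns afterwards
```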
This takes about 1/50th the time of JUST the Canny call, not including the findContours, etc. that are also necessary. I have not tested against Hough, but I believe this will still be significantly faster.
Since the biggest issue was processing speed, I made several other optimizations as well.
Code can be found here (It's very well commented, I promise): Code as a gist
I am a relative newcomer to image processing and this is the problem I'm facing - Say I have the image of an application form, like this:
Now I would like to detect all the locations where data is to be entered. In this case, these are the rectangles divided into a number of boxes, like so (not all fields marked):
I can live with the photograph box also being detected. I've tried running the squares.cpp sample in the OpenCV sources, which does not quite get me what I want. I also tried the modified version here - the results were worse (my use case is definitely very different from the OP's in that question).
Also, Hough transforming to get the lines is not really working, with or without blur and threshold, as the noise in the scanned image contributes extraneous lines; moreover, thresholding takes away parts of the combs (the small squares), so the line detection is not up to the mark.
Note that this form is not a scanned copy of a printed form, but the real input might very well be a noisy, scanned image of a printed form.
While I'm quite sure that this is possible (at least with some tolerance allowed) and I'm trying to get at the solution, it would be really helpful to get insights and ideas from other people who might have tried something like this or who enjoy hacking on CV problems. Also, it would be really nice if the answers explained why a particular operation was done (e.g., dilation to try and fill up any holes left by thresholding, etc.).
Are the forms consistent in any way? Are the "such boxes" the same size on all forms? If you can rely on a consistent size, like the character boxes in the form above, you could use template matching.
Otherwise, the problem seems to be: find any/all rectangles on the image (with a post processing step to filter out any that have a significant amount of markings within, or to merge neighboring rectangles).
The more you can take advantage of the consistencies between the forms, the easier the problem will be. Use any context you can get.
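A hedged sketch of the "find all rectangles" route: threshold, find contours, and keep convex four-sided approximations. The Otsu binarization, the 0.02 epsilon, and the area cutoff are assumptions, and the findContours signature is for OpenCV 4.x.

```python
import cv2

def find_rectangles(gray, min_area=100):
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    rects = []
    for cnt in contours:
        # Approximate the contour and keep convex quadrilaterals of useful size.
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        if len(approx) == 4 and cv2.isContourConvex(approx) and cv2.contourArea(approx) > min_area:
            rects.append(approx.reshape(-1, 2))
    return rects
```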
EDIT
Using the gradients (computed by using a Sobel kernel in both the x and the y direction) you can weed out a lot of the noise.
Using both, you can find the direction of the gradients (the equation can be found here: en.wikipedia.org/wiki/Sobel_operator). Let's say we define a discriminating feature of a box to be a vertical or horizontal gradient. If a pixel's gradient has an orientation that's either straight horizontal or straight vertical, keep it; set all else to white.
To make this more robust to noise, you can use a sliding window (3x3) in which you compute the median orientation. If the median (or mean) orientation of the window is vertical or horizontal, keep the current (middle of the window) pixel, otherwise set it to white.
You can use OpenCV for the gradient computation, and possibly the orientation/phase calculation, but you'll probably need to write the actual sliding-window code yourself. I'm not intimately familiar with OpenCV.
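A rough sketch of the gradient-orientation filtering, assuming a grayscale input; the angle tolerance and magnitude cutoff are made-up values, and the sliding-window median step is left out for brevity:

```python
import cv2
import numpy as np

def keep_axis_aligned_gradients(gray, angle_tol_deg=10, mag_thresh=50):
    # Sobel gradients in x and y.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag, ang = cv2.cartToPolar(gx, gy, angleInDegrees=True)  # ang in [0, 360)
    ang = np.mod(ang, 90.0)                                  # fold 90/180/270 onto [0, 90)
    near_axis = np.minimum(ang, 90.0 - ang) < angle_tol_deg  # close to horizontal or vertical
    out = np.full(gray.shape, 255, dtype=np.uint8)           # start with an all-white image
    out[(mag > mag_thresh) & near_axis] = 0                  # keep strong axis-aligned gradients
    return out
```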
So, my problem is that I have to find common points between two images of a microchip. Here's an example of two images:
Between these two images, we can clearly see some common patterns, like the wires on the bottom right of the first image that can be found in relatively the same place in the second image. Also, the sort of white Z shape in the first image can be seen in the second image; it's a bit harder to see, but it's there.
I tried to match them with SURF (OpenCV) and found no common points at all. I tried applying some filters to both images, like edge detection, thresholding, and other filters I could find in GIMP, but whatever I tried, no common points were ever found.
I'd like to know if you have any ideas for solving this problem. My suggestion right now would be to manually match key features in both images with line segments, but preferably it should be automated.
A solution that uses OpenCV would be preferable, but I'm looking for any possible suggestion. In OpenCV, all the pattern matching situations I have seen were problems far more obvious than this one: no differences in color, and so on.
Unless real-time performance is required, try a simple approach to test whether the rotation can be automated:
Circuit boards like the ones in the images are often based on perpendicular straight line segments. Hence you can "despeckle" and remove stuff like coffee stains by finding line segments.
Think about creating a kernel that has a line of dark pixels on one side and bright pixels on the other. Convolve it with the image (or cross-correlate it) to identify all pixels that lie on a bright/dark transition which is nearly vertical or horizontal (a rough sketch follows after the notes below).
You may interlace to speed things up.
Edges of stains and speckles may survive this if you also accept representations at angles close to 45°.
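A minimal sketch of that bright/dark kernel idea; cv2.filter2D performs the cross-correlation, and the kernel size, the hypothetical filename, and the response threshold are assumptions:

```python
import cv2
import numpy as np

gray = cv2.imread("chip.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # hypothetical filename

# Bright row above, dark row below: responds strongly to horizontal transitions.
k_horizontal = np.array([[ 1,  1,  1],
                         [ 0,  0,  0],
                         [-1, -1, -1]], dtype=np.float32)
k_vertical = k_horizontal.T

# filter2D computes a cross-correlation with the kernel.
resp_h = cv2.filter2D(gray, cv2.CV_32F, k_horizontal)
resp_v = cv2.filter2D(gray, cv2.CV_32F, k_vertical)

# Keep only strong, nearly axis-aligned responses; this boolean mask is the sparse point set.
points = (np.abs(resp_h) > 100) | (np.abs(resp_v) > 100)
```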
The resulting image can be interpreted as a sparse point cloud.
You can now use RANSAC or other similar approaches to describe many of the remaining correlations, as line segments.
* Use a two-point line segment as the input model for RANSAC, and discard fits with little support.
* Determine infinite lines that have many inliers.
* Use growth or binning approaches to segment the lines.
Benefits:
* High likelihood that the line segment representations correspond to circuitry actually present in the image; with a two-point description of the segments, possible transforms are easy.
* Easy interpretation of the data, as it can be overlaid in OpenCV.
Rotation should be easily found as the rotation that aligns most of the found lines with the horizontal and/or vertical axes.
Apply the rotation.
Repeat for both images.
Now you can determine the best translation between the images by a simple x,y cross-correlation.
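A hedged sketch of those last two steps, assuming both images are grayscale and the same size. For brevity it substitutes Canny + HoughLinesP for the custom kernel / RANSAC pipeline described above when estimating the rotation, and all parameter values are placeholders:

```python
import cv2
import numpy as np

def dominant_rotation(gray):
    # Estimate how far the dominant lines are from the horizontal/vertical axes.
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=40, maxLineGap=5)  # assumes some lines are found
    angles = []
    for x1, y1, x2, y2 in lines[:, 0]:
        a = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 90  # fold onto [0, 90)
        angles.append(a if a < 45 else a - 90)             # deviation from nearest axis
    return float(np.median(angles))

def align_and_translate(img_a, img_b):
    # Rotate img_b so its dominant lines match img_a, then find the shift.
    rot = dominant_rotation(img_b) - dominant_rotation(img_a)
    h, w = img_b.shape
    M = cv2.getRotationMatrix2D((w / 2, h / 2), rot, 1.0)
    img_b_rot = cv2.warpAffine(img_b, M, (w, h))
    shift, _ = cv2.phaseCorrelate(np.float32(img_a), np.float32(img_b_rot))
    return rot, shift  # rotation in degrees, (dx, dy) translation
```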
If the top image is always of that quality (quasi bilevel patterns, easy edge detection), I would try a good geometric matching algorithm (such as Cognex or Halcon), training with the top image and searching the bottom one.
Maybe it is worth compensating for rotation first (I hope there is no scaling). You could do that by determining the dominant edge direction, possibly using a Hough transform, or, much better, by careful mechanical alignment of the sensors.
Anyway, the chances of success are low; this is a difficult problem.
I am currently facing a, in my opinion, rather common problem which should be quite easy to solve, but so far all my approaches have failed, so I am turning to you for help.
I think the problem is explained best with some illustrations. I have some patterns like these two:
I also have an image like this (probably better, because the photo this one originated from was quite poorly lit):
(Note how the template was scaled to roughly fit the size of the image.)
The ultimate goal is a tool which determines whether the user shows a thumbs-up/thumbs-down gesture, and also some angles in between. So I want to match the patterns against the image and see which one resembles the picture the most (or, to be more precise, the angle the hand is showing). I know the direction in which the thumb is pointing in the pattern, so if I find the pattern that looks identical, I also have the angle.
I am working with OpenCV (with Python bindings) and have already tried cvMatchTemplate and MatchShapes, but so far it's not really working reliably.
I can only guess why MatchTemplate failed, but I think a smaller pattern with a smaller white area fits fully into the white area of a picture, thus producing the best matching score even though it's obvious that they don't really look the same.
Are there some methods hidden in OpenCV I haven't found yet, or is there a known algorithm for this kind of problem that I should reimplement?
Happy New Year.
A few simple techniques could work:
After binarization and segmentation, find Feret's diameter of the blob (a.k.a. the farthest distance between points, or the major axis).
Find the convex hull of the point set, flood fill it, and treat it as a connected region. Subtract the original blob (with the thumb) from the filled hull. The difference will be the area between the thumb and the fist, and the position of that area relative to the center of mass should give you an indication of the rotation.
Use a watershed algorithm on the distances of each point to the blob edge. This can help identify the connected thin region (the thumb).
Fit the largest circle (or largest inscribed polygon) within the blob. Dilate this circle or polygon until some fraction of its edge overlaps the background. Subtract this dilated figure from the original image; only the thumb will remain.
If the size of the hand is consistent (or relatively consistent), then you could also perform N morphological erode operations until the thumb disappears, then N dilate operations to grow the fist back to roughly its original size. Subtract this fist-only blob from the original blob to get the thumb blob. Then use the thumb blob's direction (Feret's diameter) and/or its center of mass relative to the fist blob's center of mass to determine the direction (see the sketch below).
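A minimal sketch of that erode-N / dilate-N technique, assuming an already-binarized mask with the hand in white; the iteration count N depends on hand size and is an assumption:

```python
import cv2
import numpy as np

def split_thumb(mask, n=10):
    kernel = np.ones((3, 3), np.uint8)
    fist = cv2.erode(mask, kernel, iterations=n)    # erode until the thumb disappears
    fist = cv2.dilate(fist, kernel, iterations=n)   # grow the fist back to roughly original size
    thumb = cv2.subtract(mask, fist)                # what remains is (mostly) the thumb
    return fist, thumb

def direction(fist, thumb):
    # Compare centers of mass to get the pointing direction of the thumb.
    mf, mt = cv2.moments(fist), cv2.moments(thumb)
    cf = (mf["m10"] / mf["m00"], mf["m01"] / mf["m00"])
    ct = (mt["m10"] / mt["m00"], mt["m01"] / mt["m00"])
    return np.degrees(np.arctan2(ct[1] - cf[1], ct[0] - cf[0]))
```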
Techniques to find critical points (regions of strong direction change) are trickier. At the simplest, you might also use corner detectors and then check the distance from one corner to another to identify the place where the inner edge of the thumb meets the fist.
For more complex methods, look into papers about shape decomposition by authors such as Kimia, Siddiqi, and Xiaofing Mi.
MatchTemplate seems like a good fit for the problem you describe. In what way is it failing for you? If you are actually masking the thumbs-up/thumbs-down/thumbs-in-between signs as nicely as you show in your sample image then you have already done the most difficult part.
MatchTemplate does not include rotation and scaling in the search space, so you should generate more templates from your reference image at all rotations you'd like to detect, and you should scale your templates to match the general size of the found thumbs up/thumbs down signs.
[edit]
The result array for MatchTemplate contains a value that specifies how well the template fits the image at that location. If you use CV_TM_SQDIFF, then the lowest value in the result array is the location of the best fit; if you use CV_TM_CCORR or CV_TM_CCOEFF, then it is the highest value. If your scaled and rotated template images all have the same number of white pixels, then you can compare the best-fit values you find for the different template images, and the template image that has the best fit overall is the one you want to select.
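As an illustration, a hedged sketch of searching over a bank of rotated templates; it uses the normalized CV_TM_CCOEFF_NORMED score so values from different templates are directly comparable, and the 15-degree step is an assumption:

```python
import cv2

def best_rotation(image, template, step=15):
    best = (-1.0, None, None)  # (score, angle, top-left location)
    h, w = template.shape
    for angle in range(0, 360, step):
        # Rotate the template about its center and match it against the image.
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated = cv2.warpAffine(template, M, (w, h))
        result = cv2.matchTemplate(image, rotated, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if max_val > best[0]:
            best = (max_val, angle, max_loc)
    return best
```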
There are tons of rotation/scaling independent detection functions that could conceivably help you, but normalizing your problem to work with MatchTemplate is by far the easiest.
For the more advanced stuff, check out SIFT, Haar feature based classifiers, or one of the others available in OpenCV.
I think you can get excellent results if you just compute the two points that have the furthest shortest path going through white. The direction in which the thumb is pointing is just the direction of the line that joins the two points.
You can do this easily by sampling points on the white area and using Floyd-Warshall.
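A rough sketch of that idea, assuming a binary mask with the hand in white and using SciPy's Floyd-Warshall on a subsampled set of white pixels; the sampling stride and the neighbor radius are assumptions:

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import floyd_warshall

def furthest_white_pair(mask, stride=8, radius=12):
    ys, xs = np.nonzero(mask)                    # coordinates of white pixels
    pts = np.column_stack([xs, ys])[::stride]    # subsample to keep the graph small
    d = cdist(pts, pts)
    d[d > radius] = 0                            # only nearby samples are connected (0 = no edge)
    dist = floyd_warshall(d, directed=False)     # all-pairs shortest paths through white
    dist[~np.isfinite(dist)] = -1                # ignore disconnected pairs
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    return pts[i], pts[j]                        # endpoints of the longest shortest path
```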
Specifically, I'm trying to extract all of the relevant line segments from screenshots of the game 'asteroids'. I've looked through the various methods for edge detection, but none seem to fit my problem for two reasons:
They detect smooth contours, whereas I just need the detection of straight line segments, and only those within a certain range of length. Now, these constraints should make my task considerably easier than the general case, but I don't want to just use a full blown edge detector and then clear the result of curved lines, as that would be prohibitively costly. Speed is of the utmost importance for my purposes.
They output a modified image in which the edges are highlighted, whereas I want a set of pixel coordinates giving the endpoints of the detected line segments. Alternatively, a list of all of the pixels included in each segment would work as well.
I have an inkling that one possible solution would involve a Hough transform, but I don't know how to use it to get the actual locations of the line segments (i.e., endpoints in pixel space). And even if I did, I have no idea whether that would be the simplest or most efficient way of doing things, hence the general wording of the question title.
Lastly, here's a sample image:
Notice that all of the major lines are similar in length and density, and that the overall image contrast is very high. I'm hoping the solution to my problem will exploit these features, because again, efficiency is paramount.
One caveat: while most of the line segments in this context are part of a polygon, I don't want a solution that relies on this fact.
Have a look at the Line Segment Detector algorithm.
Here's what they do:
You can find an impressive video at the bottom of the page.
There's a C implementation (that works with C++ compilers) that works out of the box. There are just one or two files, and no additional dependencies.
But be warned: the algorithm is under the GNU Affero GPL (AGPL) license.
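Depending on your OpenCV version, a wrapper for the same detector may also be available (it was removed and later restored across 4.x releases for license reasons, so check your build). A minimal sketch:

```python
import cv2

gray = cv2.imread("asteroids.png", cv2.IMREAD_GRAYSCALE)  # hypothetical filename
lsd = cv2.createLineSegmentDetector()
lines, widths, precisions, nfas = lsd.detect(gray)
# Each entry of `lines` is one segment as [x1, y1, x2, y2] endpoint coordinates,
# i.e. the "endpoints in pixel space" asked for in the question.
for x1, y1, x2, y2 in lines[:, 0]:
    print((x1, y1), (x2, y2))
```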
Also check out EDLines: http://ceng.anadolu.edu.tr/cv/EDLines/
It is very fast and provides very useful output.
I am new to image processing, and I want to identify the QR code in the image.
A QR code has three finder patterns, and first I need to find them.
So I tried some methods. The first is based on binarization, but when the image has shadows and strong differences in illumination, it is difficult to make a good binary image.
The adaptive threshold depends on the size of the sliding window, which may not be good for big barcodes. So, even assuming I can make a good binary image, can you suggest methods of finding the barcode's finder patterns and the barcode itself? The easiest way, for QR codes, is to find all contours in the image and select those which are square shaped and include two square-shaped contours nested inside.
Another method is to scan each horizontal line of the image to find the finder pattern; this depends on how well the binary image was made.
So I see a way of solving this problem, but I want to know whether there are any other methods of finding the barcode's finder patterns. I think pattern matching is not good here. Can you also suggest a good binarization method that does not depend on illumination? I tried many adaptive threshold binarization methods, but they have a common problem: if the image contains a big black square, the binary image will not contain a whole square, but a square with some white patches in the middle, because the sliding window in the adaptive threshold method is not big enough.
You can look at the method used by ZXing: http://code.google.com/p/zxing/source/browse/trunk under core/src/com/google/zxing/qrcode/Detector.java
Basically it looks across the image for black-white-black-white-black in approximately a 1:1:3:1:1 pattern. Unless the angle of rotation is near 45, 135, 225, or 315 degrees, and unless the code is severely perspective-distorted, this method will find a finder pattern. It then cross-checks in a couple of ways -- it looks vertically at that point in the image to confirm it also finds such a pattern there. It also has a few more checks to throw out false positives, and then determines which pattern is which.
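A simplified sketch of that row-scanning check (not ZXing's actual code): it run-length encodes each row of a binarized image and flags windows of five runs whose widths are roughly in a 1:1:3:1:1 ratio. The tolerance and the binarization convention (1 = black) are assumptions, and the vertical cross-check mentioned above is left out.

```python
import numpy as np

def finder_pattern_candidates(binary, tol=0.5):
    hits = []
    for y, row in enumerate(binary):                 # binary: 1 = black, 0 = white
        # Run-length encode the row.
        change = np.flatnonzero(np.diff(row.astype(np.int8))) + 1
        starts = np.concatenate(([0], change))
        lengths = np.diff(np.concatenate((starts, [len(row)])))
        values = row[starts]
        # Slide over every window of five consecutive runs starting with black.
        for i in range(len(lengths) - 4):
            if values[i] != 1:
                continue
            runs = lengths[i:i + 5].astype(float)
            unit = runs.sum() / 7.0                  # a finder pattern is 7 modules wide
            expected = np.array([1, 1, 3, 1, 1]) * unit
            if np.all(np.abs(runs - expected) < tol * unit):
                hits.append((int(starts[i] + runs.sum() / 2), y))  # (center x, row y)
    return hits
```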
You could also try a Threshold Hysteresis with a rate of change control. Here is the link to the normal Threshold Hysteresis. Set the first threshold to a typical white value. Set the second threshold to less than the lowest white value in the corners.
The difference is that you want to check the difference between pixels for all values in between the first and second threshold. Ideally if the difference is positive, then act normally. But if it is negative, you only want to threshold if the difference is small.
This will be able to compensate for lighting variations, but will ignore the large changes between the background and the barcode. The final result is a binary object image, not an image of edges. Also there is no adaptive window to try and size properly.
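For reference, a minimal sketch of the plain threshold hysteresis that this builds on (the rate-of-change control described above is not implemented here); the threshold values are assumptions and the image is assumed to be grayscale with a bright background:

```python
import cv2
import numpy as np

def hysteresis_threshold(gray, low=120, high=200):
    weak = (gray >= low).astype(np.uint8)    # candidate background pixels
    strong = (gray >= high).astype(np.uint8) # definite background pixels (seeds)
    # Label the weak regions and keep only those that contain a strong seed.
    n, labels = cv2.connectedComponents(weak)
    keep = np.zeros(n, dtype=bool)
    keep[np.unique(labels[strong == 1])] = True
    keep[0] = False                          # label 0 is the non-weak area
    background = keep[labels].astype(np.uint8) * 255
    return 255 - background                  # barcode (non-background) becomes white
```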