How to detect exact, predefined shapes with hough transform, like a "W"?

How to detect exact, predefined shapes with hough transform, like a "W"? - image-processing

Let's say I have some system that scans documents, where all documents use the same font and font size.
In these documents, there will always be the same looking letter "W". Let's say it is always 20 px large. How can I set up the hough transform to recognize this letter "W" at 20 px large in my documents?

A quick Google search yields the following information of interest:
Generalizing the Hough Transform to Detect Arbitrary Shapes
and it looks like a lecture using the above paper as its source.
Also, if it's an actual "W", would an OCR engine like Tesseract be better suited to your needs?

The Hough transform for lines finds best fit line equations. You would need to do additional processing to find just the line segments. If the character thickness is several pixels, then to effectively find lines you might want to reduce the thickness to one pixel. There are techniques to do that, but also various algorithmic traps.
Once you have your line segments, you would still have to write an algorithm to identify characters based on the relative position and angle of the line segments. It's harder than it first appears.
A normalized cross-correlation (template matching) could work if you're certain that the image will always be in a certain rotation, the characters will always be the same size, etc. But even for scans you'll see some rotation and some variation in contrast.
All that aside, it's likely cheaper in the long run to use a commercial OCR package or reasonably good open source project. OCR is hard to implement if you're not already familiar with image processing.

Related

Extract coordinates from image file

How to get an array of coordinates of a (drawn) line in image? Coordinates should be relative to image borders. Input: *.img . Output array of coordinates (with fixed step). Any 3rd party software to do this? For example there is high contrast difference - white background and color black line; or red and green etc.
Example:

Oh, you mean non-straight lines. You need to define a "line". Intuitively, you might mean a connected area of the image with a high aspect ratio between the length of its medial axis and the distance between medial axis and edges (ie relatively long and narrow, even if it winds around). Possible approach:
Threshold or select by color. Perhaps select by color based on a histogram of colors, or posterize as described here: Adobe Photoshop-style posterization and OpenCV, then call scipy.ndimage.measurements.label()
For each area above, skeletonize. Helpful tutorial: "Skeletonization using OpenCV-Python". However, you will likely need the distance to the edges as well, so use skimage.morphology.medial_axis(..., return_distance=True)
Do some kind of cleanup/filtering on the skeleton to remove short branches, etc. Thinking about your particular use, and assuming your lines don't loop around, you can just find the longest single path in the skeleton. This is where you can also decide if a shape is a "line" or not, based on how long the longest path in its skeleton is, relative to distance to the edges. Not sure how to best do that in opencv, but "Analyze Skeleton" in Fiji/ImageJ will let you filter by branch length.
What is left is the most elongated medial axis of the original "line" shape. You can resample that to some step that you prefer, or fit it with a spline, etc.
Due to the nature of what you want to do, it is hard to come up with a sample code that will work on a range of images. This is likely to require some careful tuning. I recommend using a small set of images (corpus), running any version of your algo on them and checking the results manually until it is pretty good, then trying it on a large corpus.
EDIT: Original answer, only works for straight lines:
You probably want to use the Hough transform (OpenCV tutorial).
Python sample code: Horizontal Line detection with OpenCV
EDIT: Related question with sample code to skeletonize: How can I get a full medial-axis line with its perpendicular lines crossing it?

pattern recognition between two very different images

So, my problem is that I have to find common points between two images of a microchip. Here's an example of two images:
Between these two images, we can clearly see some common pattern like the wires on the bottom right of the first images that can be found in relatively the same place in the second image. Also, the sort of white Z shape in the first image can be seen in the second images, a bit harder, but it's there.
I tried to match them with SURF (OpenCV), found no common point at all. Tried to apply some filter on both images, like edge detection, thresholding, and other filter that I could found in GIMP, but whatever I tried, no common point were ever found.
I'd like to know if you have any idea to solve this problem ? My suggestion right now would be to manually match key features in both images with line segments, but preferably, it should be automated.
A solution that uses OpenCV would be preferable, but I'm looking for any suggestion possible. In OpenCV, all pattern matching situation that I saw were problems way more obvious that this one. No difference in color and so on.

Unless realtime is required, do a simple approach to test if rotation can be automated:
Circuit boards like the ones in the images, are often based on perpendicular straight line segments. Hence you can "despeckle" and remove stuff like coffee stains, by finding linesegments.
Think about creating a kernel, that have a line with dark pixels on one side, and bright pixels on the other. Fold it on the image (or cross-correlate it) to identify all pixels that have a sequence of bright/dark pixels which are nearly vertical or horizontal.
you may interlace to speed things up.
edges of stains and speckles may survive this, if you want angles close to 45* representatations!
The resulting image can be interpreted as a sparse pointcloud.
You can now use RANSAC or other similar approaches to describe many of the remaining correlations, as line segments.
* use a 2 point line segment as input model for RANSAC, Degrade if small.
* Determine infinite lines that have many inliers
* use growth or binninng approaches to segmentate lines.
benefits:
high likelyhood of line segment representations that are actually present as circuitry in image. 2 point description of segments, possible transforms are easy.
easy interpretation of data, as it can be overlayed in openCV
Rotation should be easily found as the rotation that matches most found lines to horizontal and/or vertical axis'es.
apply rotation.
repeat for both images.
now you can determine best translation between the images, by simple x,y cross correlation.

If the top image is always of that quality (quasi bilevel patterns, easy edge detection), I would try a good geometric matching algorithm (such as Cognex or Halcon), training with the top image and searching the bottom one.
Maybe it is worth to first compensate rotation (I hope there is no scaling). You would do that by determining the dominant edge direction, possibly using a Hough transform. Or, much better, by careful mechanical alignment of the sensors.
Anyway, chances of success are low, this is a difficult problem.

What's a simple and efficient method for extracting line segments from a simple 2D image?

Specifically, I'm trying to extract all of the relevant line segments from screenshots of the game 'asteroids'. I've looked through the various methods for edge detection, but none seem to fit my problem for two reasons:
They detect smooth contours, whereas I just need the detection of straight line segments, and only those within a certain range of length. Now, these constraints should make my task considerably easier than the general case, but I don't want to just use a full blown edge detector and then clear the result of curved lines, as that would be prohibitively costly. Speed is of the utmost importance for my purposes.
They output a modified image where the edges are highlights, whereas I want a set of pixel coordinates depicting the endpoints of the detected line segments. Alternatively, a list of all of the pixels included in each segment would work as well.
I have an inkling that one possible solution would involve a hough transform, but I don't know how to use this to get the actual locations of the line segments (i.e. endpoints in pixel space). Though even if I did, I have no idea if that would be the simplest or most efficient way of doing things, hence the general wording of the question title.
Lastly, here's a sample image:
Notice that all of the major lines are similar in length and density, and that the overall image contrast is very high. I'm hoping the solution to my problem will exploit these features, because again, efficiency is paramount.
One caveat: while most of the line segments in this context are part of a polygon, I don't want a solution that relies on this fact.

Have a look at the Line Segment Detector algorithm.
Here's what they do :
You can find an impressive video at the bottom of the page.
There's a C implementation (that works with C++ compilers) that works out of the box. There are just one or two files, and no additional dependencies
But, be warned, the algorithm is under the GNU Allegro GPL license.

Also check out EDlines http://ceng.anadolu.edu.tr/cv/EDLines/
Very fast and provides a very useful output

Analysis and transformation of the image on the basis of this analysis for better OCR results

I have an OCR project, but it works good only with images in which the text is fairly straight, not upside down. (not rotated text)
So I want to make OCR to be able to recognize any kind of images, even upside down. But I don't know what are approaches to solve this problem.
I need something like analysis of lines of letters, but even then I can't identify if line is upside down or not.

If the images you are performing OCR on are from a magazine or book where there is lots of text on multiple lines, I suggest trying to find the rotation of the page.
Probably the simplest way to do this is applying the hough transform for lines. Since the empty space between each line of text should be a a broad white line this could work without any preprocessing of the image. Otherwise try blurring it or using the "close" morphological operation to make the lines of text into opaque blocks.
Once you find the lines in the image with the hough transform you should just extract the principal angle of rotation (like the mean angle of all lines) and rotate it back.

My answer to you will be very high level as this is not simple, as you can imagine. You probably are doing some sort of image segmentation, where you segment each character of your text. But in order to recognize the characters, even when they are rotated, you need to use a feature vector with rotational invariant characteristics. To do it some people are using
Zernike Moment
Neocognitron neural network - widely used for handwriting
I don't think it's a simple task

Not sure if you are creating an OCR engine or using one. Most commercial OCR engines can detect that a page is upside-down (or 90 degree rotated) and auto-rotate it. For example, my company's GlyphReader OCR Engine can do that.
One simple solution is to take a portion of your image and run it through the engine at the four angles until you get back a good amount of recognized text. You can use a dictionary to see if what you are getting back is words and confidence levels to see how sure the engine is of its recognition.
If your engine can report confidence levels, and they are reporting consistently under some threshold, then you should stop and see if the document is rotated.
For 90 and 270, a hough transform will tell you whether the lines in the image are horizontal or vertical. It can also tell you if they are just slightly rotated off the horizontal so that you can correct that as well.

Image processing / super light OCR

I have 55 000 image files (in both JPG and TIFF format) which are pictures from a book.
The structure of each page is this:
some text
--- (horizontal line) ---
a number
some text
--- (horizontal line) ---
another number
some text
There can be from zero to 4 horizontal lines on any given page.
I need to find what the number is, just below the horizontal line.
BUT, numbers strictly follow each other, starting at one on page one, so in order to find the number, I don't need to read it: I could just detect the presence of horizontal lines, which should be both easier and safer than trying to OCR the page to detect the numbers.
The algorithm would be, basically:
for each image
count horizontal lines
print image name, number of horizontal lines
next image
The question is: what would be the best image library/language to do the "count horizontal lines" part?

Probably the easiest way to detect your lines is using the Hough transform in OpenCV (which has wrappers for many languages).
The OpenCV Hough tranform will detect all lines in the image and return their angles and start/stop coordinates. You should only keep the ones whose angles are close to horizontal and of adequate length.
O'Reilly's Learning OpenCV explains in detail the function's input and output (p.156).

If you have good contrast, try running connected components and analyze the result. It can be an alternative to finding lines through Hough and cover the case when your structured elements are a bit curved or a line algorithm picks up the lines you don’t want it to pick up.
Connected components is a super fast, two raster scan algorithm and will give you a mask with all you connected elements in it marked with different labels and accounted for. You can discard anything short ( in terms of aspect ratio). Overall, this can be more general, faster but probably a bit more involved than running Hough transform. The Hough transform on the other hand will be more tolerable for contrast artifacts and even accidental gaps in lines.
OpenCV has the function findContours() that find components for you.

you might want to try John' Resig's OCR and Neural Nets in Javascript

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart