Detecting drawn lines and dots on a notebook paper - image-processing

I have a picture of a notebook page (with a square grid) on which lines and dots have been drawn, as in the description. The output should be a data structure containing information about the boundaries and the dots. How can one accomplish that? If possible, the program should also process this dynamically (i.e., given a video).

Yes, this can be accomplished with various image-processing techniques.
One well-known technique that can help is the Canny edge detector, which finds the pronounced edges within an image (the OpenCV documentation covers it in detail). Various Python and C# image-processing libraries make this very easy; take OpenCV, for example.
Detecting dots in the middle of the cells would be up to you, unless someone knows of a library that makes that easy as well. I suggest looking at each square detected via the Canny edges and checking whether there are any dark pixel values around its center.
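A minimal sketch of both ideas in Python/OpenCV (the file name, square-size bounds, and darkness threshold are all assumptions to tune):

```python
import cv2

# Load the notebook photo in grayscale (the file name is an assumption).
img = cv2.imread("notebook.png", cv2.IMREAD_GRAYSCALE)

# Canny edge detection; the thresholds usually need tuning per image.
edges = cv2.Canny(img, 50, 150)

# Find contours in the edge map and keep roughly square ones (OpenCV 4 signature).
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
dots = []
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if 10 < w < 60 and 0.8 < w / float(h) < 1.25:  # plausible grid square
        # Check for dark pixel values around the square's center.
        cx, cy = x + w // 2, y + h // 2
        if img[cy - 2 : cy + 3, cx - 2 : cx + 3].mean() < 100:  # threshold is a guess
            dots.append((cx, cy))

print(dots)
```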
The choice of data structure is also up to you.
Remember that a video is just a sequence of images, so apply the same technique to every frame.
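For instance (assuming the single-image analysis above is applied per frame):

```python
import cv2

cap = cv2.VideoCapture("notebook.mp4")  # or 0 for a live camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # ...run the same contour/dot analysis on `edges` as for a single image
cap.release()
```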

Related

How to do a perspective transformation of an image which is missing corners using opencv java

I am trying to build a document scanner using OpenCV and want to auto-crop an uploaded image. I have a few use cases where there is a gap in the border because the document is partly out of frame in the captured image.
Example image
Below is the Canny edge detection of the given image.
The borders are missing there, so findContours does not return proper results.
How can I handle such images?
Neither automatic Canny edge detection nor dilation works in such cases, because dilation can only join small gaps between edges.
Also, some documents might have only 2 or 3 sides captured by the camera; how can we crop away the other areas that are not required?
Example image:
Is there any specific technique for handling such documents?
Please suggest a few ideas.
Your problem is unusual. One way to solve it that comes to mind is to:
Add white borders around the image.
https://docs.opencv.org/3.4/dc/da3/tutorial_copyMakeBorder.html
Find lines in the edge image.
http://www.robindavid.fr/opencv-tutorial/chapter5-line-edge-and-contours-detection.html
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
Run the probabilistic Hough line transform (HoughLinesP).
Crop the image along these lines. This will certainly work for an image like the first one.
For an image like the second one you can use the perpendicular and parallel lines (a rough sketch of steps 1-3 follows).
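Something along these lines, as an untested sketch (the file name and all thresholds are placeholders):

```python
import cv2
import numpy as np

img = cv2.imread("document.jpg")  # file name is an assumption

# 1. Add white borders so edges cut off by the frame can be closed.
img = cv2.copyMakeBorder(img, 20, 20, 20, 20,
                         cv2.BORDER_CONSTANT, value=(255, 255, 255))

# 2. Edge detection, then 3. probabilistic Hough lines.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=30)

# Each entry is [x1, y1, x2, y2]; the cropping logic along these
# lines is application-specific and omitted here.
for line in lines:
    x1, y1, x2, y2 = line[0]
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
```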
Your algorithm will certainly have to be fairly complex to work well. The easiest way, if it is possible, is to take a picture of the whole document.
Good luck!

Feature detection on a small, noisy image with OpenCV

I have an image that is quite noisy and small (the relevant portion is 381 × 314 pixels), and its features are very subtle.
The source image and the cropped relevant area are here as well: http://imgur.com/a/O8Zc2
The task is to count the number of whitish dots within the relevant area using Python, but I would be happy with just isolating the lighter dots and lines within the area and removing the background structure (in this case, the cell).
With OpenCV I have tried histogram equalization (destroys the details), finding contours (didn't work), and using color ranges (too close in color?).
Any suggestions or guidance on other things to try? I don't believe I can get a higher-resolution image, so is this task possible with such a difficult source?
(This is not a Python answer, since I have never used the Python/OpenCV bindings. The images below were created using Mathematica, but I only used basic image-processing operations, so you should be able to implement this in Python on your own.)
A very general "trick" in image processing is to think about removing the thing you're looking for instead of actually looking for it, because removing it is often much easier than finding it. You could, for instance, apply a morphological opening, a median filter, or a Gaussian filter to the image:
These filters effectively remove details smaller than the filter size and leave the coarser structures more or less untouched. So you can just take the difference from the original image and look for local maxima:
(You'll have to play around with different "detail removal filters" and filter sizes. There's no way to tell which one works best from just one image.)
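In Python/OpenCV the same idea could look roughly like this (the filter size and maxima threshold are things to experiment with):

```python
import cv2
import numpy as np

img = cv2.imread("cells.png", cv2.IMREAD_GRAYSCALE)  # file name assumed

# "Remove" the small bright details with a detail-removal filter;
# a median filter is one option, morphological opening another.
background = cv2.medianBlur(img, 21)

# The difference keeps only what the filter removed: the small dots.
detail = cv2.subtract(img, background)

# Local maxima: pixels equal to the maximum of their neighborhood
# and bright enough to matter.
dilated = cv2.dilate(detail, np.ones((7, 7), np.uint8))
maxima = (detail == dilated) & (detail > 20)
print("dot candidates:", np.count_nonzero(maxima))
```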

square detection, image processing

I am looking for an efficient way to detect the small boxes around the numbers (see images).
I have already tried the Hough transform, with no success. Any ideas? I need some hints! I am using OpenCV...
For inspiration, you can have a look at:
the Matlab video sudoku solver demo and explanation;
Sudoku Grab, an iPhone app whose author explains the computer-vision part on his blog.
Alternatively, if you are always hunting for the same grid you could deploy something like this:
Make a perfect artificial template of the grid and detect or save the coordinates of all its corners.
In the target image, do the same thing, for example with Harris corner points. Be creative; you might also be able to use the distinct triangles that can be found in your images.
Using the coordinates from the template and the found Harris points, determine the affine transformation x = Ax' between the template and the target image. That transformation can then be used to map the template grid onto the target image; at the very least this will give you some prior information to help guide further segmentation.
The gist of the idea and examples of estimating the affine matrix A can be found on the websites for Zisserman's book Multiple View Geometry in Computer Vision and of Peter Kovesi.
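A sketch of the estimation step with OpenCV (the corner arrays are assumed, already-matched template/target points):

```python
import cv2
import numpy as np

# Matched corner coordinates: template_pts[i] corresponds to target_pts[i].
# These arrays are assumed inputs (e.g. from Harris corner matching).
template_pts = np.array([[0, 0], [300, 0], [0, 300], [300, 300]], np.float32)
target_pts = np.array([[12, 18], [310, 25], [8, 315], [305, 320]], np.float32)

# Robustly estimate the 2x3 affine matrix A (RANSAC rejects bad matches).
A, inliers = cv2.estimateAffine2D(template_pts, target_pts,
                                  method=cv2.RANSAC)

# Map any template grid point onto the target image.
grid_point = np.array([[[150, 150]]], np.float32)
print(cv2.transform(grid_point, A))
```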
I'd start by trying to detect the rectangular boundary of the overall sheet, then apply a perspective transform to make it truly rectangular, and crop that portion of the image out. If possible, then try to make the alternating white and grey sub-rectangles have an equal background brightness; maybe try adaptive histogram equalization.
Then the Hough transform might perform better. Alternatively, you could take an approach that is broadly similar to this demonstration by Robert Bemis on MATLAB Central (it analyses a DNA microarray image rather than Lotto cards, but it is essentially finding the bounding boxes of items arranged in a grid). At a high level, the approach is to calculate the autocorrelation along columns and rows of pixels to detect the periodicity of the items in the grid, and to use that to impose a bounding box on each item.
Sorry that the above advice is mostly MATLAB-based; I'm afraid I'm not an OpenCV user, but hopefully it gives you some ideas at least.
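For the first two steps, an OpenCV sketch might look like this (the four sheet corners are assumed to have been detected already):

```python
import cv2
import numpy as np

img = cv2.imread("sheet.jpg")  # file name assumed

# Four detected corners of the sheet, ordered TL, TR, BL, BR (assumed input).
corners = np.float32([[40, 60], [620, 50], [30, 470], [640, 480]])
w, h = 600, 420  # target size, chosen to taste
dst = np.float32([[0, 0], [w, 0], [0, h], [w, h]])

# Perspective transform to make the sheet truly rectangular, then crop.
M = cv2.getPerspectiveTransform(corners, dst)
rect = cv2.warpPerspective(img, M, (w, h))

# Adaptive histogram equalization (CLAHE) to even out the background.
gray = cv2.cvtColor(rect, cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equalized = clahe.apply(gray)
```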

Finding a grid in an image

Given a match-3 game screenshot (for example http://www.gameplay3.com/images/games/jewel-quest-ii-01S.jpg), what would be the correct way to find the bounding box of the grid (the table of tiles)? The board doesn't have to be a perfect rectangle (as can be seen in the screenshot), but each cell is completely square.
I've tried several games and found that there are some per-game image transformations that can enhance the tiles inside the grid (for example, in this game it's enough to take the V channel of the HSV color space). I can then enlarge the tiles so that they overlap, find the largest contour in the image, and get the bounding box from it.
The problem with the above approach is that every game (or even each level within the same game) may need a different transformation to get hold of the tiles. So the question is: is there a standard way to enhance either the tiles inside the grid or the grid's lines? (I've tried finding lines with the Hough transform, but although the grid seems quite visible to the eye, Hough doesn't find it.)
Also, what if the screenshot is obtained using a phone camera instead of taking a screenshot of a desktop? In my experience, captured images have less well-defined colors (depending on the lighting) and can also be slightly distorted, as there is no way to hold the phone exactly parallel to the screen.
I would go with the following approach for a screenshot (a sketch of the first three steps follows the list):
Find edges in the image using, for example, a Canny-style edge detector.
Perform a Hough line transform. This should work quite nicely on the edge image.
If you have some information about the size of the tiles, you can eliminate false-positive lines using some sort of spatial model of the grid (e.g. keep only lines that have a small angle to the x/y axes of the image, and/or use the distances/angles between tile borders).
Identify tile borders under the found Hough lines by looking for Canny edges under/next to the lines.
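A sketch of steps 1-3 (the angle tolerance and thresholds are parameters to tune):

```python
import cv2
import numpy as np

img = cv2.imread("screenshot.png", cv2.IMREAD_GRAYSCALE)  # file name assumed
edges = cv2.Canny(img, 50, 150)

# Standard Hough line transform on the edge image.
lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=120)

# Spatial model: keep only near-horizontal / near-vertical lines.
grid_lines = []
for line in lines:
    rho, theta = line[0]
    folded = theta % (np.pi / 2)  # fold both axes onto [0, 90) degrees
    if min(folded, np.pi / 2 - folded) < np.deg2rad(3):
        grid_lines.append((rho, theta))
```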
Which implementation of the Hough transform did you use? How did you preprocess the image?
Another approach would be to use some sort of machine learning. As you are working in OpenCV, you could use a Haar-like feature detector. An example of face detection using Haar-like features can be found here:
OpenCV Haar Face Detector example
Another machine-learning approach would be a Histogram of Oriented Gradients (HOG) descriptor in combination with a Support Vector Machine (SVM). An example is located here:
HOG example
You can find general information about HOG detection at:
HOG detection
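A minimal sketch of the HOG + SVM combination (the window size is an assumption, and the training patches/labels are assumed to exist):

```python
import cv2
import numpy as np
from sklearn.svm import LinearSVC

# HOG descriptor over a fixed 64x64 window (all sizes are assumptions).
hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)

def describe(patch):
    # Resize each candidate patch to the HOG window and flatten the descriptor.
    patch = cv2.resize(patch, (64, 64))
    return hog.compute(patch).ravel()

# `patches` (grayscale tile/background crops) and `labels` are assumed
# training data gathered beforehand.
def train(patches, labels):
    X = np.array([describe(p) for p in patches])
    return LinearSVC().fit(X, labels)

# clf = train(patches, labels)
# is_tile = clf.predict([describe(candidate_patch)])
```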

Analyzing an image and transforming it based on that analysis for better OCR results

I have an OCR project, but it only works well with images in which the text is fairly straight and not upside down (i.e. not rotated).
So I want the OCR to be able to recognize any kind of image, even upside down, but I don't know what approaches exist for solving this problem.
I need something like an analysis of the lines of letters, but even then I can't tell whether a line is upside down or not.
If the images you are performing OCR on come from a magazine or book, where there is lots of text on multiple lines, I suggest trying to find the rotation of the page.
Probably the simplest way to do this is to apply the Hough transform for lines. Since the empty space between the lines of text should form broad white lines, this may work without any preprocessing of the image. Otherwise, try blurring it, or use the "close" morphological operation to turn the lines of text into opaque blocks.
Once you have found the lines in the image with the Hough transform, just extract the principal angle of rotation (e.g. the mean angle of all lines) and rotate the image back.
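A sketch of that pipeline (the kernel size and Hough threshold are guesses to tune):

```python
import cv2
import numpy as np

img = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)  # file name assumed

# Close the (inverted) text into opaque blocks so each text line
# becomes one strong boundary.
closed = cv2.morphologyEx(255 - img, cv2.MORPH_CLOSE,
                          np.ones((5, 25), np.uint8))
edges = cv2.Canny(closed, 50, 150)
lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=150)

# Principal angle: mean deviation of all lines from the horizontal.
angles = [line[0][1] - np.pi / 2 for line in lines]
skew = np.degrees(np.mean(angles))

# Rotate back around the image center.
h, w = img.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), skew, 1.0)
deskewed = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REPLICATE)
```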
My answer will be very high-level, because, as you can imagine, this is not simple. You are probably doing some sort of image segmentation in which you segment each character of your text. But in order to recognize the characters even when they are rotated, you need a feature vector with rotation-invariant characteristics. For this, some people use:
Zernike moments
the Neocognitron neural network, widely used for handwriting
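Zernike moments, for example, are available in the mahotas library; a small sketch (the radius and degree are parameters you would tune, and the glyph image here is a dummy stand-in):

```python
import mahotas
import numpy as np

# `char_img` is assumed: a binary image of one segmented character,
# roughly centered. Here it is just a dummy rectangle.
char_img = np.zeros((64, 64), bool)
char_img[20:44, 28:36] = True

# The magnitudes of Zernike moments are rotation-invariant, so the
# same character at any angle yields (nearly) the same vector.
features = mahotas.features.zernike_moments(char_img, radius=32, degree=8)
print(features)
```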
I don't think it's a simple task.
I am not sure whether you are creating an OCR engine or using one. Most commercial OCR engines can detect that a page is upside down (or rotated by 90 degrees) and auto-rotate it; for example, my company's GlyphReader OCR engine can do that.
One simple solution is to take a portion of your image and run it through the engine at each of the four angles until you get back a good amount of recognized text. You can use a dictionary to check whether what comes back consists of real words, and confidence levels to see how sure the engine is of its recognition.
If your engine reports confidence levels and they are consistently under some threshold, you should stop and check whether the document is rotated; a sketch of the four-angle test follows below.
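As an illustration with the open-source Tesseract engine via pytesseract (a stand-in here; any engine that reports per-word confidences works the same way):

```python
import cv2
import numpy as np
import pytesseract

def mean_confidence(img):
    # Average word confidence Tesseract reports for an image.
    data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    confs = [float(c) for c in data["conf"] if float(c) >= 0]
    return np.mean(confs) if confs else 0.0

img = cv2.imread("page.png")  # file name assumed
rotations = {
    0: img,
    90: cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE),
    180: cv2.rotate(img, cv2.ROTATE_180),
    270: cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE),
}
best = max(rotations, key=lambda a: mean_confidence(rotations[a]))
print("page appears rotated by", best, "degrees")
```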
For 90 and 270 degrees, a Hough transform will tell you whether the lines in the image are horizontal or vertical. It can also tell you whether they are slightly rotated off the horizontal, so that you can correct that as well.
