Compare two hand-drawn canvas diagrams for similarity in PHP - image-processing

I'm implementing a feature where users can draw a diagram on a canvas for a prepositional phrase. When the user is done with the diagram, I convert and save it as an image. Now I want to find out whether the diagram drawn by the user is correct. To do that, I compare the user-drawn diagram with the correct diagram I have in my system. I've tried pHash, but the score isn't satisfying. Is there a way to do this in PHP/JavaScript?
Below is the sample image on which I want to perform the comparison.
Thank you.

Related

How can I transform an image to match a circular model in OpenCV

I'm trying to make a program that can take an image of a dartboard and read the score. So far I can get the position of each dart by comparing it to a model image as you can see here:
However, this only works if the input image is practically the same. In this other case the board is in a slightly different perspective, so I was thinking maybe I could transform the image to match the model image and then do the process shown above.
So my question is: how can I transform this last image to match the shape and perspective of the model dart board with OpenCV?
The dart board is basically planar. Thus, you can model the wanted transformation by a homography. Now you can perform simple feature extraction and matching like here, or, if speed is not as important, utilize an intensity-based parametric alignment algorithm (more accurate).
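A minimal sketch of the feature-matching route in Python/OpenCV (file names, the feature count, and the RANSAC threshold are illustrative, not from the original answer):

    import cv2
    import numpy as np

    model = cv2.imread("board_model.png", cv2.IMREAD_GRAYSCALE)
    photo = cv2.imread("board_photo.png", cv2.IMREAD_GRAYSCALE)

    # Detect and describe keypoints (ORB here; SIFT works as well).
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(model, None)
    kp2, des2 = orb.detectAndCompute(photo, None)

    # Match descriptors and keep the strongest correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Robustly estimate the homography and warp the photo into the model frame.
    H, mask = cv2.findHomography(dst, src, cv2.RANSAC, 5.0)
    aligned = cv2.warpPerspective(photo, H, (model.shape[1], model.shape[0]))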
However, as already mentioned in the comments, it will not be as simple afterwards. The dart flights will (depending on the distortion) most likely cover an area of your board which does not coincide with the actual score. Actually, even with a frontal view it is difficult to say.
I assume you will have to find the point at which each dart sticks in your board. Furthermore, I think this will be easier with a view from a certain angle. Maybe you can fit line segments just in the area where you detected a difference beforehand.
I don't think comparing an image against a model that was captured with a different subject and from a different angle is a good idea. There will be lots of small differences even after perfectly matching them geometrically - shading, lighting, color differences, etc.
I would just capture an image every time the game begins (as a reference) and extract the features (straight lines seem good enough); then, after the game, capture an image, subtract the reference, and do blob analysis to find the darts.
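A rough sketch of that subtract-and-blob idea in Python/OpenCV (file names, the difference threshold, and the minimum blob area are placeholders to tune):

    import cv2
    import numpy as np

    reference = cv2.imread("board_before.png", cv2.IMREAD_GRAYSCALE)
    current = cv2.imread("board_after.png", cv2.IMREAD_GRAYSCALE)

    # Subtract the reference and keep only strong differences (the darts).
    diff = cv2.absdiff(current, reference)
    _, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

    # Blob analysis: keep connected components large enough to be darts.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    darts = [centroids[i] for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] > 100]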

Analyzing a hand-drawn flowchart diagram

I'm trying to detect objects and text in a hand-drawn diagram.
My goal is to be able to "parse" something like this into an object structure for further processing.
My first aim is to detect text, lines and boxes (arrows etc... are not important (for now ;))
I can do dilation, erosion, Otsu thresholding, inversion, etc., and easily get to something like this
What I need some guidance for are the next steps.
I have several ideas:
Contour Analysis
OCR using UNIPEN
Edge detection
Contour Analysis
I've been reading about "Contour Analysis for Image Recognition in C#" on CodeProject, which could be a great way to recognize boxes etc., but my issue is that the boxes are connected and therefore do not form separate objects to match with a template.
Therefore I need some advice on whether this is a feasible way to go.
OCR using UNIPEN
I would like to use UNIPEN (see "Large pattern recognition system using multi neural networks" on CodeProject) to recognize handwritten letters and then "remove" them from the image leaving only the boxes and lines.
Edge detection
Another way could be to detect all lines and corners and in that way infer the boxes and lines that the image consists of. In that case, ideas on how to straighten the lines and find the 90-degree corners would be helpful.
Generally, I think I just need some pointers on which strategy to apply, not code samples (though they would be great ;))
I will try to answer about the contour analysis and the lines between them.
If you need to turn the interconnected boxes into separate objects, that can be achieved easily enough:
close the gaps in the box edges with morphological closing
perform connected components labeling and look for compact objects (e.g. objects whose area is close to the area of their bounding box)
You will get the insides of the boxes. These can be elliptical or rectangular or any shape you may find in common diagrams; the contour analysis can tell you which. A problem may arise for enclosed background areas (e.g. the space between the ABC links in your example diagram). You might eliminate these on the criterion that their bounding box overlaps with multiple other objects' bounding boxes.
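A rough sketch of those two steps in Python/OpenCV, assuming the diagram has already been binarized with white strokes on a black background (the kernel size, minimum area, and compactness threshold are guesses to tune):

    import cv2
    import numpy as np

    binary = cv2.imread("diagram_binary.png", cv2.IMREAD_GRAYSCALE)

    # Close small gaps in the box edges so each interior becomes one sealed region.
    kernel = np.ones((5, 5), np.uint8)
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Invert so the box interiors become foreground, then label connected components.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(cv2.bitwise_not(closed))
    boxes = []
    for i in range(1, n):
        x, y, w, h, area = stats[i]
        # Keep compact objects: area close to the area of their bounding box.
        if area > 200 and area / float(w * h) > 0.7:
            boxes.append((x, y, w, h))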
Now find line segments with HoughLinesP. If a segment finishes or starts within a certain distance of the edge of one of the objects, you can assume it is connected to that object.
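For the line segments, a sketch using the probabilistic Hough transform (all parameters are illustrative, and the near_box helper is a hypothetical way to test whether a segment endpoint lies close to one of the boxes found above):

    import cv2
    import numpy as np

    binary = cv2.imread("diagram_binary.png", cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(binary, 50, 150)

    # Probabilistic Hough transform returns segments as (x1, y1, x2, y2).
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                               minLineLength=30, maxLineGap=5)

    def near_box(pt, box, dist=10):
        # Hypothetical helper: is a segment endpoint within `dist` px of a box?
        x, y, w, h = box
        return (x - dist <= pt[0] <= x + w + dist) and (y - dist <= pt[1] <= y + h + dist)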
As an added touch you could try to detect arrow ends on either side by checking the width profile of the line segments in a neighbourhood of their endpoints.
It is an interesting problem; I will try to remember it and give it to my students to sink their teeth into.

Image registration algorithms / techniques used to enable extraction of fields from scanned document?

I am trying to determine the best method to extract handwritten data from a scanned document.
The handwritten data is in specific boxed areas. I generated the digital version of the document, so I know the co-ordinates of the boxed areas, and I could also generate additional variations of the document if need be (e.g. a version that is masked to make the fields easier to extract).
The reason I can't just extract the fields using the co-ordinates from document generation is there is shifting/scaling/perspective modifications which are occurring during the scanning process, which can push/pull the co-ordinates for each individual box differently (the scanned document does have corner markers used for alignment, but even so unintended transformations commonly take place).
At a high level, I assume there are two ways to address this issue: step through the co-ordinates of each box on the page and attempt to "correct" them with some technique/algorithm, or compare a completed form with a blank form (masked?) and try to extract the correct fields that way.
What is the most efficient technique / algorithm to adjust for these modifications and accurately extract the areas which contain handwriting? Are there other options?
There are many possible techniques that can achieve nearly 100% accuracy for your problem.
Just follow the steps described on this page: http://www.codeproject.com/Articles/24809/Image-Alignment-Algorithms. In short, you first compute the optical flow between the two images and then estimate the transformation that produces that optical flow.
Note: this approach works best when matched images are almost identical.
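As a comparable intensity-based option in OpenCV's Python API (a stand-in, not the article's exact method), findTransformECC estimates the warp directly from pixel intensities; file names and the motion model below are assumptions:

    import cv2
    import numpy as np

    template = cv2.imread("blank_form.png", cv2.IMREAD_GRAYSCALE)
    scanned = cv2.imread("scanned_form.png", cv2.IMREAD_GRAYSCALE)

    # Start from the identity affine warp and refine it by ECC maximization.
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 500, 1e-6)
    _, warp = cv2.findTransformECC(template, scanned, warp,
                                   cv2.MOTION_AFFINE, criteria)

    # Warp the scan into the template frame (note the inverse-map flag).
    aligned = cv2.warpAffine(scanned, warp, (template.shape[1], template.shape[0]),
                             flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)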
Your second approach would do. Some more details: since you have printed letters in the form such as "Section A", "A6" and "other", and you mentioned you have corner markers for alignment, you could use them as landmarks. Perform template matching to find the coordinates of the landmarks in the original and scanned documents. Then use these two sets of landmarks (the corner marks might be sufficient) to generate an affine transformation M = cv2.getAffineTransform(landmarks1, landmarks2), and apply cv2.warpAffine(img, M, ...) to the scanned image to transform it to match the original document. With this, the boxes will be aligned properly (there might still be a little bit of shift), and then you can locate each box correctly. See https://www.geeksforgeeks.org/python-opencv-affine-transformation/
After typing all the above I found this webpage talking about the same thing with code: https://learnopencv.com/feature-based-image-alignment-using-opencv-c-python/
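A sketch of that landmark-based route (the corner-marker template, the search regions, and the file names are placeholders specific to your form):

    import cv2
    import numpy as np

    original = cv2.imread("blank_form.png", cv2.IMREAD_GRAYSCALE)
    scanned = cv2.imread("scanned_form.png", cv2.IMREAD_GRAYSCALE)
    marker = cv2.imread("corner_marker.png", cv2.IMREAD_GRAYSCALE)

    def find_marker(img, roi):
        # Template-match the corner marker inside a rough region of interest.
        x, y, w, h = roi
        res = cv2.matchTemplate(img[y:y + h, x:x + w], marker, cv2.TM_CCOEFF_NORMED)
        _, _, _, max_loc = cv2.minMaxLoc(res)
        return (x + max_loc[0], y + max_loc[1])

    # Rough search regions for three corner markers (known from document generation).
    rois = [(0, 0, 300, 300), (1700, 0, 300, 300), (0, 2700, 300, 300)]
    landmarks1 = np.float32([find_marker(original, r) for r in rois])
    landmarks2 = np.float32([find_marker(scanned, r) for r in rois])

    # Affine transform mapping the scan onto the original; after warping, the
    # boxes can be cropped using the co-ordinates known from generation.
    M = cv2.getAffineTransform(landmarks2, landmarks1)
    registered = cv2.warpAffine(scanned, M, (original.shape[1], original.shape[0]))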

Algorithm for capturing machine readable zones

What method is suitable to capture (detect) an MRZ from a photo of a document? I'm thinking about a cascade classifier (e.g. Viola-Jones), but it seems a bit weird to use it for this problem.
If you know that you will look for text in a passport, why not try to find passport model points on it first? Match a template of a passport to it using ASM/AAM (Active Shape Model, Active Appearance Model) techniques. Once you have the passport position information, you can cut out the regions that you are interested in. This will take some time to implement, though.
Consider this approach as a great starting point:
Black top-hat followed by a horizontal derivative highlights long rows of characters.
Morphological closing operation(s) merge the nearby characters and character rows together into a single large blob.
Optional erosion operation(s) remove the small blobs.
Otsu thresholding followed by contour detection and filtering away the contours which are apparently too small, too round, or located in the wrong place will get you a small number of possible locations for the MRZ.
Finally, compute bounding boxes for the locations you found and see whether you can OCR them successfully.
It may not be the most efficient way to solve the problem, but it is surprisingly robust.
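A rough sketch of those steps in Python/OpenCV (kernel sizes, the erosion count, and the contour filters are illustrative and need tuning per document type):

    import cv2
    import numpy as np

    gray = cv2.imread("passport.jpg", cv2.IMREAD_GRAYSCALE)

    # Black top-hat highlights the dark characters against the light background.
    rect = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5))
    blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rect)

    # Horizontal derivative emphasizes long rows of characters.
    grad = np.absolute(cv2.Sobel(blackhat, cv2.CV_32F, 1, 0, ksize=3))
    grad = cv2.normalize(grad, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Closing merges nearby characters into long blobs; Otsu binarizes the result.
    closed = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, rect)
    _, thresh = cv2.threshold(closed, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    thresh = cv2.erode(thresh, None, iterations=2)  # optional: drop small blobs

    # Keep wide, flat contours in the lower half of the image as MRZ candidates.
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    candidates = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w > 5 * h and y > gray.shape[0] * 0.5:
            candidates.append((x, y, w, h))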
A better approach would be the use of projection profile methods. A projection profile method is based on the following idea:
Create an array A with an entry for every row in your b/w input document. Now set A[i] to the number of black pixels in the i-th row of your original image.
(You can also create a vertical projection profile by considering columns in the original image instead of rows.)
Now the array A is the projected row/column histogram of your document, and the problem of detecting MRZs can be approached by examining the valleys in the A histogram.
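A minimal sketch of a horizontal projection profile in Python/OpenCV, assuming black text (0) on a white background after thresholding (the file name and the 0.5 cut-off are placeholders):

    import cv2
    import numpy as np

    gray = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # A[i] = number of black pixels in row i; character rows peak, blank rows
    # form the valleys mentioned above.
    A = np.count_nonzero(bw == 0, axis=1)

    # Example: flag rows whose black-pixel count exceeds half the maximum.
    dense_rows = np.where(A > 0.5 * A.max())[0]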
This problem, however, is not completely solved, so there are many variations and improvements. Here's some additional documentation:
Projection profiles in Google Scholar: http://scholar.google.com/scholar?q=projection+profile+method
Tesseract-ocr, a great open source OCR library: https://code.google.com/p/tesseract-ocr/
Viola & Jones' Haar-like features generate many (many (many)) features to try to describe an object and are a bit more robust to scale and the like. Their approach was a unique approach to a difficult problem.
Here, however, you have plenty of constraint on the problem and anything like that seems a bit overkill. Rather than 'optimizing early', I'd say evaluate the standard OCR tools off the shelf and see where they get you. I believe you'll be pleasantly surprised.
PS:
You'll want to preprocess the image to isolate the characters on a white background. This can be done quite easily and will help the OCR algorithms significantly.
You might want to consider using stroke width transform.
You can follow these tips to implement it.

square detection, image processing

I am looking for an efficient way to detect the small boxes around the numbers (see images).
I already tried to use the Hough transform with no success. Any ideas? I need some hints! I am using OpenCV...
For inspiration, you can have a look at the
MATLAB video Sudoku solver demo and explanation
Sudoku Grab, an iPhone app, whose author explains the computer vision part on his blog
Alternatively, if you are always hunting for the same grid you could deploy something like this:
Make a perfect artificial template of the grid and detect or save all coordinates from all corners.
In the target image, do the same thing, for example with Harris points. Be creative; you might also be able to use the distinct triangles that can be found in your images.
Using the coordinates from the template and the found Harris points, determine the affine transformation x = Ax' between the template and the target image. That transformation can then be used to map the template grid onto the target image. At the very least this will give you some prior information to help guide further segmentation.
The gist of the idea and examples of estimating the affine matrix A can be found on the site of Zisserman's book Multiple View Geometry in Computer Vision and on Peter Kovesi's site.
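A hedged sketch of that estimation step in Python/OpenCV; the corner coordinates below are placeholders, and pairing the detected Harris points with the template corners (the hard, problem-specific part) is assumed to have been done already:

    import cv2
    import numpy as np

    target = cv2.imread("lotto_card.png", cv2.IMREAD_GRAYSCALE)

    # Harris-based corner detection in the target image.
    corners = cv2.goodFeaturesToTrack(target, maxCorners=200, qualityLevel=0.01,
                                      minDistance=10, useHarrisDetector=True)

    # Placeholder correspondences: template_pts come from the artificial grid,
    # target_pts are the Harris points matched to them.
    template_pts = np.float32([[0, 0], [500, 0], [500, 300], [0, 300]])
    target_pts = np.float32([[32, 41], [521, 57], [510, 348], [25, 330]])

    # Estimate x = Ax' robustly and map the template grid onto the target image.
    A, inliers = cv2.estimateAffine2D(template_pts, target_pts, method=cv2.RANSAC)
    grid_on_target = cv2.transform(template_pts.reshape(-1, 1, 2), A)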
I'd start by trying to detect the rectangular boundary of the overall sheet, then applying a perspective transform to make it truly rectangular. Crop that portion of the image out. If possible, then try to make the alternating white and grey sub-rectangles have an equal background brightness - maybe try adaptive histogram equalization.
Then the Hough transform might perform better. Alternatively, you could then take an approach that's broadly similar to this demonstration by Robert Bemis on MATLAB Central (it's analysing a DNA microarray image rather than Lotto cards, but it's essentially finding bounding boxes of items arranged in a grid). At a high level, the approach is to calculate the autocorrelation along columns and rows of pixels to detect the periodicity of the items in the grid, and use that to impose a bounding box on each item.
Sorry, the above advice is mostly MATLAB-based; I'm afraid I'm not an OpenCV user, but hopefully it will give you some ideas at least.
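Since the question mentions OpenCV, the rectify-then-equalize idea from the first answer might translate to something roughly like this (the sheet corner coordinates are placeholders, assumed to come from a prior contour/quadrilateral detection step):

    import cv2
    import numpy as np

    img = cv2.imread("lotto_card.png")

    # Placeholder corner coordinates of the sheet in the photo (tl, tr, br, bl).
    src = np.float32([[40, 60], [950, 80], [940, 700], [30, 690]])
    dst = np.float32([[0, 0], [900, 0], [900, 620], [0, 620]])

    # Warp the sheet to a true rectangle and crop it out.
    M = cv2.getPerspectiveTransform(src, dst)
    sheet = cv2.warpPerspective(img, M, (900, 620))

    # Adaptive histogram equalization (CLAHE) to even out the white/grey panels.
    gray = cv2.cvtColor(sheet, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    equalized = clahe.apply(gray)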
