I tore up a piece of paper, and its fragments look like this:
I want to reassemble these fragments to recover the original paper.
I have already done the following using OpenCV:
take pictures of the fragments and save them to disk
read each fragment and extract its contour
calculate the similarity between the contours of the fragments
???
What should I do if I want to reassemble these fragments into one whole "paper", just like the paper before it was torn up?
Any suggestions or methods would be appreciated, thank you very much!
Although this looks like a search problem, the number of states is effectively infinite because of the huge number of candidate points and orientations, so it is impossible to implement something like a plain depth-first search. You need a proper vision algorithm to reduce the number of states.
How did you compare the "contour lines"? Where are these lines? Your contours might be closed (circular) and could have a very large number of "lines". You should instead be asking "how can I create this contour by joining parts of other contours?"
"is this perimeter has a similarity with another one ? which part of the perimeters have highest similarity ?"
Both of these questions lead to one solution that comes to mind, solving it the way a human would: select a contour. Connect another one to it at some point. Rotate it until they touch. If there is black space between them, change the connection point. Try all points. If you find a connection point that yields no black space, merge these contours into one and continue the process. If no such point exists, try another contour.
EDIT: Although this method is still nothing but a search, it reduces the number of states via the black-space check (contour finding) and the hit test (coordinate comparison).
ENHANCEMENT: Use the contour-finding method with "simple approximation" (CV_CHAIN_APPROX_SIMPLE) to deduce the straight edges of the paper. You can then reduce the number of points you are going to try by marking those points as paper sides.
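For example, here is a minimal sketch of that contour-extraction step, assuming OpenCV 4.x Python bindings and a hypothetical fragment.png showing a dark fragment on a light background:

```python
import cv2

# Minimal sketch: extract simplified contours of one fragment photo.
# Assumes the fragment is dark on a light background; adjust the threshold otherwise.
img = cv2.imread("fragment.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# CHAIN_APPROX_SIMPLE collapses straight runs into their endpoints,
# so long paper edges become just a few vertices to try as connection points.
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
fragment = max(contours, key=cv2.contourArea)
print("candidate connection points:", len(fragment))
```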
EDIT2: You may want to look at corner detection and stitching methods:
http://tobw.net/index.php?cat_id=2&project=Panorama%20Stitching%20Demo%20in%20Matlab,
http://docs.opencv.org/modules/stitching/doc/introduction.html
I just wish to receive some ideas on how I can solve this problem.
For a clearer picture, here are examples of some of the images that we are looking at:
I have tried thresholding it (e.g. Otsu), blob detection, etc. However, I am still unable to segment out the books and count them properly. Hardcovers are easy, of course, as the covers clearly separate the books, but when it comes to softcovers, I have not been able to successfully count the number of books.
Does anybody have any suggestions on what I can do? Any help will be greatly appreciated. Thanks.
I ran a Sobel edge detector and used the Hough transform to detect lines in the last image, and it seemed to work okay for me. You can then link the edges in the output of the Sobel edge detector and count the number of horizontal lines. Alternatively, you can do the same on the lines detected by the Hough transform.
You can further narrow down the area of interest by converting the image into a binary image. The outputs of all of these operators can be seen in the following figure (I couldn't upload an image so I had to host it here): http://www.pictureshoster.com/files/v34h8hvvv1no4x13ng6c.jpg
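Here is a rough OpenCV sketch of this Sobel + Hough idea; "books.jpg" and the thresholds are placeholders that will need tuning for your photos:

```python
import cv2
import numpy as np

# Rough sketch of Sobel + Hough for counting the edges between stacked books.
gray = cv2.imread("books.jpg", cv2.IMREAD_GRAYSCALE)

# Vertical gradient (dy) responds to the horizontal edges between books.
sobel = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
sobel = cv2.convertScaleAbs(sobel)
_, edges = cv2.threshold(sobel, 50, 255, cv2.THRESH_BINARY)

lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=gray.shape[1] // 3, maxLineGap=20)

# Keep only nearly horizontal segments; their count roughly tracks book boundaries.
horizontal = [l for l in (lines if lines is not None else [])
              if abs(l[0][3] - l[0][1]) < 5]
print("horizontal line segments:", len(horizontal))
```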
Refer to http://www.mathworks.com/help/images/analyzing-images.html#f11-12512 for some more useful examples on how to do edge, line and corner detection.
Hope this helps.
I think that #audiohead's recommendation is good, but you should be careful when applying the Hough transform to images that contain the library's stamp, as it might be confused with another book (you can see that the letters form some edge lines that will be detected by Sobel).
Consider first applying an edge-preserving smoothing algorithm such as a bilateral filter. When tuned correctly (the kernel settings), it can avoid these sorts of problems.
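A minimal sketch of that smoothing step; the parameter values are only a starting point and will need tuning per image ("books.jpg" is a placeholder):

```python
import cv2

# Edge-preserving smoothing before edge/line detection.
img = cv2.imread("books.jpg")
smoothed = cv2.bilateralFilter(img, d=9, sigmaColor=75, sigmaSpace=75)
cv2.imwrite("books_smoothed.jpg", smoothed)
```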
A Different Solution That Might Work (But can be slow)
Here is a different approach that is based on pixel marking strategy.
a) Based on some very dark threshold, mark all black pixels as visited.
b) While there are unvisited pixels: pick the next unvisited pixel and apply a region-growing algorithm (http://en.wikipedia.org/wiki/Region_growing), marking its pixels with a unique number. At this stage you will need to analyse the geometric shape that this region is forming. A good criterion for detecting a book is that the region forms something like a rectangle where width >> height. This will detect a book and mark all of its pixels with the unique number.
Once there are no more unvisited pixels, the number of unique labels is the number of books, and for each pixel in your image you will now know to which book it belongs.
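Here is a rough sketch of this pixel-marking idea, using connected components as a stand-in for the region-growing step (a simplification of the approach described above); "books.jpg", the threshold, and the width/height ratio are placeholders to be tuned:

```python
import cv2
import numpy as np

gray = cv2.imread("books.jpg", cv2.IMREAD_GRAYSCALE)

# a) very dark pixels become background (0); the rest are candidate regions.
_, mask = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)

# b) label the remaining regions; each label plays the role of a "unique number".
n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)

books = 0
for i in range(1, n_labels):  # label 0 is the background
    x, y, w, h, area = stats[i]
    # A book candidate is roughly a wide, flat rectangle (width >> height).
    if w > 3 * h and area > 0.5 * w * h:
        books += 1
print("estimated number of books:", books)
```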
Do you have to keep the books this way? If you can turn the books so that their back sides face the camera, then I think you can get more information from the different colors used by different books. The lines found by the Hough transform or edge detection will be more prominent this way.
There are more sophisticated methods that are much better at contour detection and segmentation; you can have a look at them here, although they are quite slow: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html
Once you get the ultrametric contour map, you can perform some computation on it to count the number of books.
I would try a completely different approach. With paperbacks, the covers appear as medium-dark lines while the rest (assuming white pages) is fairly white and "bloomed", so I'd try to thicken up the dark edges to make them easy to detect; that would give you edges similar to those of hardbacks, which you say you've already handled.
I'd try something like an erosion to thicken up the edges. This would be a nice, fast operation.
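A minimal sketch of that erosion step (eroding a grayscale image expands its dark regions); the kernel size and iteration count are guesses, and "books.jpg" is a placeholder:

```python
import cv2
import numpy as np

# Erode the grayscale shelf photo so the dark cover edges grow thicker.
gray = cv2.imread("books.jpg", cv2.IMREAD_GRAYSCALE)
kernel = np.ones((3, 3), np.uint8)
thickened = cv2.erode(gray, kernel, iterations=2)
cv2.imwrite("books_thick_edges.jpg", thickened)
```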
First of all, I've only been learning about image processing, neural networks, etc. by myself for a couple of weeks, so I'm really new and really far from being a pro. And sorry for my bad English.
There's an image (a photo of my drawing), and I want to get the coordinates of the objects/shapes (the black dots) and the numbers next to them; each number indicates the sequence number of its dot.
How do I get that? How do I detect the dots? Shape recognition for the dots? Handwritten digit recognition for the numbers? Then segmentation to get the positions? Or should I use template matching? But every dot has a slightly different shape because it is drawn by hand. Should I use a neural network? In a NN, the input neurons usually each hold one pixel to recognize a character, right? Can I have each neuron hold a picture of a character or a drawn dot in order to recognize my whole picture?
I'm very new, so I really need your advice; correct me if I'm wrong! Please tell me what I should learn, what I should do, and what I should use.
Thank you very much. :'D
This is a difficult problem that has no quick solution.
Here is how I would approach it:
Get a better picture. Your image is very noisy and is taken in low light with high ISO. Use a better camera and better lighting conditions so you can get the background to be as white as possible and the dots as black as possible. Try to maximize the contrast.
Threshold the image so that all the background is white and the dots and numbers are black. Maybe you could apply some erosion and/or dilation to help connect the dark edges together.
Detect the rectangle somehow and set your work area to be inside the rectangle (crop the rest of the image so that you are left with the area inside the rectangle). You could do this by detecting the contours in the image and then the contour that has the largest area is the rectangle (because it's the largest object in the image). Of course, this is not the only way. See this: OpenCV find contours
Once you are left with only the dots, circles and numbers, you need to find a way to detect them and discriminate between them. You could again find all contours (or maybe you've already found them all in the previous step). You need to figure out a way to tell whether a certain contour is a circle, a filled circle (dot) or a number. This is a problem in its own right. Maybe you could count the white/black pixels in the contour's bounding box: dots have more black pixels than circles and numbers. You also need to do something about numbers that touch dots (like the number 5 in your image).
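A rough sketch of that pixel-counting idea; "cropped.png" (the area inside the rectangle) and the fill-ratio thresholds are placeholders you would have to tune:

```python
import cv2

# Tell dots, circles and digits apart by how much ink fills their bounding box.
gray = cv2.imread("cropped.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    roi = binary[y:y + h, x:x + w]
    fill = cv2.countNonZero(roi) / float(w * h)   # fraction of ink in the box
    if fill > 0.7:
        label = "dot (filled circle)"
    elif fill < 0.3:
        label = "circle (outline only)"
    else:
        label = "digit (send to OCR)"
    print((x, y, w, h), label)
```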
Once you know what is a dot, circle or number you could use an OCR library (Tesseract or any other OCR lib) to try and recognize the numbers. You could also use a neural network library (maybe trained with the MNIST dataset) to recognize the digits. A good one would be a convolutional neural network similar to LeNet-5.
As you can see, this is a problem that requires many different steps to solve, and many different components are involved. The steps I suggested might not be the best, but with some work I think it can be solved.
What method is suitable for capturing (detecting) the MRZ from a photo of a document? I'm thinking about a cascade classifier (e.g. Viola-Jones), but it seems a bit odd to use it for this problem.
If you know that you will be looking for text in a passport, why not try to find passport model points on it first? Match a template of a passport to it using ASM/AAM (Active Shape Model / Active Appearance Model) techniques. Once you have the passport position information, you can cut out the regions that you are interested in. This will take some time to implement, though.
Consider this approach as a great starting point:
Black top-hat followed by a horizontal derivative highlights long rows of characters.
Morphological closing operation(s) merge the nearby characters and character rows together into a single large blob.
Optional erosion operation(s) remove the small blobs.
Otsu thresholding followed by contour detection and filtering away the contours which are apparently too small, too round, or located in the wrong place will get you a small number of possible locations for the MRZ.
Finally, compute bounding boxes for the locations you found and see whether you can OCR them successfully.
It may not be the most efficient way to solve the problem, but it is surprisingly robust.
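A rough OpenCV sketch of the pipeline above; the kernel sizes, iteration counts, and aspect-ratio thresholds are guesses that will need tuning for your photos, and "passport.jpg" is a placeholder:

```python
import cv2
import numpy as np

gray = cv2.imread("passport.jpg", cv2.IMREAD_GRAYSCALE)
gray = cv2.GaussianBlur(gray, (3, 3), 0)

# Black top-hat emphasizes dark characters on the lighter passport background.
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13, 5))
blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, rect_kernel)

# Horizontal derivative highlights long rows of characters.
grad = np.absolute(cv2.Sobel(blackhat, cv2.CV_32F, 1, 0, ksize=-1))
grad = cv2.normalize(grad, None, 0, 255, cv2.NORM_MINMAX).astype("uint8")

# Closing merges nearby characters into row-shaped blobs, then Otsu thresholds.
grad = cv2.morphologyEx(grad, cv2.MORPH_CLOSE, rect_kernel)
_, thresh = cv2.threshold(grad, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
thresh = cv2.erode(thresh, None, iterations=2)   # optional: drop small blobs

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    # The MRZ is a very wide, short region spanning most of the page width.
    if w / float(h) > 5 and w > 0.6 * gray.shape[1]:
        print("MRZ candidate:", (x, y, w, h))
```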
A better approach would be the use of projection profile methods. A projection profile method is based on the following idea:
Create an array A with an entry for every row in your b/w input document. Now set A[i] to the number of black pixels in the i-th row of your original image.
(You can also create a vertical projection profile by considering columns in the original image instead of rows.)
Now the array A is the projected row/column histogram of your document and the problem of detecting MRZs can be approached by examining the valleys in the A histogram.
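A minimal sketch of computing such a horizontal projection profile; "passport.jpg" and the row threshold are placeholders:

```python
import cv2
import numpy as np

gray = cv2.imread("passport.jpg", cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# A[i] = number of black (here: foreground) pixels in row i of the document.
A = (bw > 0).sum(axis=1)

# Rows with many dark pixels are candidate text lines; the MRZ shows up as a
# band of consecutive high-count rows near the bottom of the page.
text_rows = np.where(A > 0.3 * bw.shape[1])[0]
print("candidate text rows:", text_rows)
```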
This problem, however, is not completely solved, so there are many variations and improvements. Here's some additional documentation:
Projection profiles in Google Scholar: http://scholar.google.com/scholar?q=projection+profile+method
Tesseract-ocr, a great open source OCR library: https://code.google.com/p/tesseract-ocr/
Viola & Jones' Haar-like features generate many (many (many)) features to try to describe an object and are a bit more robust to scale and the like. Their approach was a unique approach to a difficult problem.
Here, however, you have plenty of constraints on the problem, and anything like that seems overkill. Rather than 'optimizing early', I'd say evaluate the standard off-the-shelf OCR tools and see where they get you. I believe you'll be pleasantly surprised.
PS:
You'll want to preprocess the image to isolate the characters on a white background. This can be done quite easily and will help the OCR algorithms significantly.
You might want to consider using the stroke width transform.
You can follow these tips to implement it.
So, my problem is that I have to find common points between two images of a microchip. Here's an example of two images:
Between these two images, we can clearly see some common patterns, like the wires at the bottom right of the first image, which can be found in roughly the same place in the second image. Also, the white Z-like shape in the first image can be seen in the second image; it is a bit harder to spot, but it's there.
I tried to match them with SURF (OpenCV) and found no common points at all. I tried applying some filters to both images, like edge detection, thresholding, and other filters I could find in GIMP, but whatever I tried, no common points were ever found.
I'd like to know if you have any ideas for solving this problem. My fallback right now would be to manually match key features in both images with line segments, but preferably it should be automated.
A solution that uses OpenCV would be preferable, but I'm open to any suggestion. In OpenCV, all the pattern-matching examples I have seen dealt with problems far more obvious than this one, with no differences in color and so on.
Unless real-time performance is required, try a simple approach to test whether the rotation can be automated:
Circuit boards like the ones in the images are often based on perpendicular straight line segments. Hence you can "despeckle" and remove artifacts like coffee stains by finding line segments.
Think about creating a kernel that has a line of dark pixels on one side and bright pixels on the other. Convolve it with the image (or cross-correlate it) to identify all pixels that lie on a nearly vertical or horizontal bright/dark transition.
You may subsample (interlace) to speed things up.
Edges of stains and speckles may survive this if you also allow angles close to 45°!
The resulting image can be interpreted as a sparse point cloud.
You can now use RANSAC or similar approaches to describe many of the remaining correlations as line segments.
* Use a two-point line segment as the input model for RANSAC; down-weight fits that are too small.
* Determine infinite lines that have many inliers.
* Use growing or binning approaches to segment the lines.
Benefits:
High likelihood that the extracted line segments correspond to circuitry actually present in the image. With a two-point description of the segments, possible transforms are easy to compute.
Easy interpretation of the data, as it can be overlaid on the image in OpenCV.
The rotation should be easy to find as the rotation that aligns most of the found lines with the horizontal and/or vertical axes.
Apply the rotation.
Repeat for both images.
Now you can determine the best translation between the images by a simple x,y cross-correlation.
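One way to do that final translation step is phase correlation, which performs the x,y cross-correlation in the Fourier domain. A minimal sketch, assuming both images ("chip_a.png" and "chip_b.png" are placeholders) have already been de-rotated and cropped to the same size:

```python
import cv2
import numpy as np

a = cv2.imread("chip_a.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
b = cv2.imread("chip_b.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# phaseCorrelate requires float images of identical size.
(shift_x, shift_y), response = cv2.phaseCorrelate(a, b)
print("estimated translation:", shift_x, shift_y, "confidence:", response)
```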
If the top image is always of that quality (quasi bilevel patterns, easy edge detection), I would try a good geometric matching algorithm (such as Cognex or Halcon), training with the top image and searching the bottom one.
Maybe it is worth compensating for rotation first (I hope there is no scaling). You could do that by determining the dominant edge direction, possibly using a Hough transform, or, much better, by careful mechanical alignment of the sensors.
Anyway, chances of success are low, this is a difficult problem.
I am currently facing a problem that is, in my opinion, rather common and should be quite easy to solve, but so far all my approaches have failed, so I am turning to you for help.
I think the problem is best explained with some illustrations. I have some patterns like these two:
I also have an image like this (it could probably be better; the photo it originated from was quite poorly lit):
(Note how the template was scaled to roughly fit the size of the image.)
The ultimate goal is a tool that determines whether the user is showing a thumbs-up or thumbs-down gesture, as well as angles in between. So I want to match the patterns against the image and see which one resembles the picture the most (or, more precisely, which angle the hand is showing). I know the direction in which the thumb points in each pattern, so if I find the pattern that looks most similar, I also have the angle.
I am working with OpenCV (with Python bindings) and have already tried cvMatchTemplate and MatchShapes, but so far it's not working reliably.
I can only guess why MatchTemplate fails, but I think that a smaller pattern with a smaller white area fits entirely into the white area of the picture, thus producing the best match score, although it's obvious that they don't really look the same.
Are there methods hidden in OpenCV that I haven't found yet, or is there a known algorithm for this kind of problem that I should reimplement?
Happy New Year.
A few simple techniques could work:
After binarization and segmentation, find Feret's diameter of the blob (a.k.a. the farthest distance between points, or the major axis).
Find the convex hull of the point set, flood fill it, and treat it as a connected region. Subtract the original blob (the hand including the thumb) from this filled hull. The difference will be the area between the thumb and the fist, and the position of that area relative to the center of mass should give you an indication of the rotation.
Use a watershed algorithm on the distances of each point to the blob edge. This can help identify the connected thin region (the thumb).
Fit the largest circle (or largest inscribed polygon) within the blob. Dilate this circle or polygon until some fraction of its edge overlaps the background. Subtract this dilated figure from the original image; only the thumb will remain.
If the size of the hand is consistent (or relatively consistent), then you could also perform N morphological erode operations until the thumb disappears, then N dilate operations to grow the fist back to roughly its original size. Subtract this fist-only blob from the original blob to get the thumb blob. Then use the thumb blob's direction (Feret's diameter) and/or its center of mass relative to the fist blob's center of mass to determine the direction; a rough sketch of this erode/dilate idea follows below.
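A rough sketch of that erode/dilate approach, assuming a binary hand mask ("hand_mask.png" is a placeholder, white hand on black) and an N chosen so that the thumb disappears but the fist survives:

```python
import cv2
import numpy as np

mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

N = 10
kernel = np.ones((3, 3), np.uint8)
fist = cv2.erode(mask, kernel, iterations=N)     # the thin thumb vanishes
fist = cv2.dilate(fist, kernel, iterations=N)    # the fist grows back to roughly its size

thumb = cv2.subtract(mask, fist)                 # what remains is mostly the thumb

# The vector from the fist centroid to the thumb centroid gives the direction.
m_fist, m_thumb = cv2.moments(fist), cv2.moments(thumb)
cx_f, cy_f = m_fist["m10"] / m_fist["m00"], m_fist["m01"] / m_fist["m00"]
cx_t, cy_t = m_thumb["m10"] / m_thumb["m00"], m_thumb["m01"] / m_thumb["m00"]
angle = np.degrees(np.arctan2(cy_f - cy_t, cx_t - cx_f))  # image y-axis points down
print("thumb direction (degrees):", angle)
```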
Techniques for finding critical points (regions of strong direction change) are trickier. At the simplest, you might use corner detectors and then check the distance from one corner to another to identify the place where the inner edge of the thumb meets the fist.
For more complex methods, look into papers about shape decomposition by authors such as Kimia, Siddiqi, and Xiaofing Mi.
MatchTemplate seems like a good fit for the problem you describe. In what way is it failing for you? If you are actually masking the thumbs-up/thumbs-down/thumbs-in-between signs as nicely as you show in your sample image then you have already done the most difficult part.
MatchTemplate does not include rotation and scaling in the search space, so you should generate more templates from your reference image at all rotations you'd like to detect, and you should scale your templates to match the general size of the found thumbs up/thumbs down signs.
[edit]
The result array of MatchTemplate contains, for each location, a value that specifies how well the template fits the image there. If you use CV_TM_SQDIFF, then the lowest value in the result array is the location of the best fit; if you use CV_TM_CCORR or CV_TM_CCOEFF, then it is the highest value. If your scaled and rotated template images all have the same number of white pixels, then you can compare the best-fit values you find for the different template images, and the template image with the best fit overall is the one you want to select.
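A minimal sketch of comparing several rotated templates this way; "hand.png", "thumb_template.png", the angle step, and the use of TM_CCOEFF_NORMED are all placeholder choices:

```python
import cv2
import numpy as np

image = cv2.imread("hand.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("thumb_template.png", cv2.IMREAD_GRAYSCALE)

best = None
h, w = template.shape
center = (w // 2, h // 2)
for angle in range(0, 360, 15):
    # Rotate the template, then score it against the image.
    rot = cv2.warpAffine(template, cv2.getRotationMatrix2D(center, angle, 1.0), (w, h))
    result = cv2.matchTemplate(image, rot, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if best is None or max_val > best[0]:
        best = (max_val, angle, max_loc)

print("best score %.3f at angle %d, location %s" % best)
```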
There are tons of rotation/scaling independent detection functions that could conceivably help you, but normalizing your problem to work with MatchTemplate is by far the easiest.
For the more advanced stuff, check out SIFT, Haar feature based classifiers, or one of the others available in OpenCV
I think you can get excellent results if you just compute the two points whose shortest path through the white region is the longest. The direction in which the thumb is pointing is simply the direction of the line that joins those two points.
You can do this easily by sampling points on the white area and using Floyd-Warshall.
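A rough sketch of that idea, using SciPy's Floyd-Warshall on a grid of sampled white pixels (SciPy, "hand_mask.png", and the sampling step are assumptions; note that Floyd-Warshall is O(n^3), so keep the sampling coarse):

```python
import cv2
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import floyd_warshall

mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE) > 127
step = 8
ys, xs = np.where(mask[::step, ::step])
pts = np.column_stack([xs, ys]) * step            # sampled white pixels

# 8-connect neighbouring samples so paths are forced to stay inside the white area.
n = len(pts)
graph = lil_matrix((n, n))
for i in range(n):
    d = np.abs(pts - pts[i]).max(axis=1)
    for j in np.where((d > 0) & (d <= step))[0]:
        graph[i, j] = np.linalg.norm(pts[i] - pts[j])

dist = floyd_warshall(graph.tocsr(), directed=False)
dist[np.isinf(dist)] = 0                          # ignore disconnected samples
i, j = np.unravel_index(np.argmax(dist), dist.shape)
print("thumb axis endpoints:", pts[i], pts[j])
```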