I am trying to detect a visible edge that separates two different types of texture.
The problem is that, although the texture difference is quite visible, we couldn't manage to detect the edge that separates the two regions with acceptable accuracy (e.g. a few pixels) using GLCM or Gabor filters.
I am questioning whether these algorithms are suitable for this, and I wonder whether anyone has algorithm recommendations. Thanks.
In the sample image, the edge extends from near center top to bottom left.
This is not feasible with the current state of the art. "The texture difference is quite visible" is a false impression.
I have an input image as follows and wish to segment the parts into regions. I also want each segmented part to include not just the pixels that contribute to the solid color but also the edge anti-aliasing between the edge of the region and the next region.
Does there exist any filter or method to segment the image in this way? The important part is that the end result segmented part must contain the edge anti-aliasing between it and the next regions. A correct solution is shown in yellow.
In these two images I zoomed the pixels to be large so the edge anti-aliasing between region edges can be seen clearly.
An example output that I want for the yellow region is shown.
For a definition of "edge anti-aliasing" see https://markpospesel.wordpress.com/2012/03/30/efficient-edge-antialiasing/
I'm not sure what exactly you want. For example, would some pixels belong to two segments? If that is the case, then I'm relatively sure you have to do something on your own. Otherwise, the following might work:
Opening and Closing
Opening and closing are two morphological operations that smooth borders.
Clustering
There are many clustering algorithms. They are what you want for non-semantic segmentation (for semantic segmentation, you might want to read my literature survey). One example is
P. F. Felzenszwalb, “Graph based image segmentation.”
I would simply give those algorithms a try and see if one directly works.
Other clustering algorithms:
K-means
DBSCAN
CLARANS
AGNES
DIANA
I have taken on a project to automatically analyse microscope images of a specific type of micro fracture. The problem is that the camera used was on an "auto" setting, so the micro fractures (which look like pin pricks) are a variety of shades from one photo to the next.
The background is also at various saturation levels and there are some items (which appear very bright in the photos) which look like fractures but are something different which I need to discount.
Could anyone recommend a technique I could investigate to help me solve this issue?
This is quite a normal situation in image recognition -- different lighting conditions, different orientation of objects, different scale, different image resolution. Methods have been developed to extract useful features out of such images. I am not an expert in that area, but I suspect that any general book on the subject contains at least a brief review of image normalization and feature extraction methods.
If the micro fractures are sharp edge transitions, then a combination of simple techniques may allow you to find connected regions of strong edge points that correspond to those fractures. If the fractures also appear dark, then you should be able to distinguish them from the bright fracture-like features.
Briefly:
Generate an edge map
(If necessary) Remove edge pixels corresponding to bright features.
Select an edge strength that separates the fractures from the background
Clean up the edge map image
Find connected regions in the edge map image
If you want to find thin features with strong edges in a background, then one step could be to generate an edge map (or edge image) in which each pixel represents the local edge strength. A medium gray pixel surrounded by other medium gray pixels would have relatively low edge strength, whereas a black pixel surrounded by light gray pixels would have relatively high edge strength. Various edge-finding techniques include Sobel, Prewitt, Canny, Laplacian, and Laplacian of Gaussian (LoG); I won't describe those here since Wikipedia has entries on them if you're not familiar with them.
Once you have an edge map, you could use a binary threshold to convert the edge map into black and white pixels. If you have evidence that fractures have an edge strength of 20, then you would use the value 20 as a binarization threshold on the image. Binarization will then leave you with a black and white edge map with white pixels for strong edges, and black pixels for the background.
Once you have the binarized edge map, you may need to perform a morphological "close" operation to ensure that white pixels that may be close to one another become part of the same connected region.
Once you've performed a close on the binarized edge map you can search for connected components (which may be called "contours" or "blobs"). For most applications it's better to identify 4-connected regions, in which a pixel is considered connected to the pixels at the top, left, bottom, and right, but not to its neighbors at the top left and other corners. If the features are typically single-pixel lines or meandering cracks, and if there isn't much noise, then you might be able to get away with identifying 8-connected regions.
Once you've identified connected regions you can filter based on the area, length of the longest axis, and/or other parameters.
If both dark and light features can have strong edges, and if you want to eliminate the bright features, then there are a few ways to eliminate them. In the original image you might clip the image by setting all values over a threshold brightness to that brightness. If the features you want to keep are darker than the median gray value of the image, then you could ignore all pixels brighter than the median gray value. If the background intensity varies widely, you might calculate a median for some local region.
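The steps described above (optionally clip bright features, build an edge map, binarize it, find connected regions) can be sketched in plain Python on a toy image. This is only a rough illustration under stated assumptions: the function names, the toy image, the threshold of 150 and the clip limit of 100 are all made up for the example, the morphological close is left out to keep it short, and a real implementation would use an image library rather than nested lists.

```python
from collections import deque

def clip_bright(img, limit):
    """Cap every pixel at `limit` so bright features lose their contrast."""
    return [[min(v, limit) for v in row] for row in img]

def sobel_strength(img):
    """Edge strength |gx| + |gy| (Sobel) for interior pixels; the border stays 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out

def label_regions(mask, connectivity=4):
    """Return the connected regions of a binary mask as lists of (y, x) pixels."""
    h, w = len(mask), len(mask[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, region = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    region.append((cy, cx))
                    for dy, dx in steps:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions

def find_regions(img, edge_threshold, bright_limit=None, connectivity=4):
    """Clip (optional), compute the edge map, binarize it, and label regions."""
    if bright_limit is not None:
        img = clip_bright(img, bright_limit)
    edges = sobel_strength(img)
    mask = [[1 if v >= edge_threshold else 0 for v in row] for row in edges]
    return label_regions(mask, connectivity)

# Toy image (assumed values): gray background, one dark pin prick, one bright speck.
image = [[100] * 12 for _ in range(7)]
image[3][3] = 0      # dark "fracture"
image[3][8] = 255    # bright feature that should be discounted
all_regions = find_regions(image, edge_threshold=150)
dark_only = find_regions(image, edge_threshold=150, bright_limit=100)
```

Without clipping, both the dark and the bright pixel produce a ring of strong edge pixels; with the clip limit set to the background level only the dark feature survives. The regions returned can then be filtered by area or extent as described above.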
Once we see your images I'm sure you'll get more suggestions. If the problem you're trying to solve turns out to be similar to one I worked on, which was to find cracks in highly textured surfaces, then I can be more specific about algorithms to try.
I'm trying to understand the Viola-Jones method, and I've mostly got it.
It uses simple Haar-like features boosted into strong classifiers and organized into layers (a cascade) to improve performance by not bothering with obvious 'non-object' regions.
I think I understand the integral image, and I understand how the feature values are computed.
The only thing I can't figure out is how the algorithm deals with variations in face size.
As far as I know, they use a 24x24 subwindow that slides over the image, and within it the algorithm goes through the classifiers and tries to decide whether there is a face/object in it or not.
And my question is: what if one face is 10x10 and another is 100x100? What happens then?
And I'm dying to know what the first two features (in the first layer of the cascade) are and what they look like, keeping in mind that these two features, according to Viola & Jones, will almost never miss a face and will eliminate 60% of the incorrect ones. How?
And how is it possible to construct these features so that these statistics hold for different face sizes in the image?
Am I missing something, or maybe I've figured it all wrong?
If I'm not clear enough, I'll try to explain my confusion better.
Training
The Viola-Jones classifier is trained on 24*24 images. Each of the face images contains a similarly scaled face. This produces a set of feature detectors built out of two, three, or four rectangles optimised for a particular sized face.
Face size
Different face sizes are detected by repeating the classification at different scales. The original paper notes that good results are obtained by trying different scales a factor of 1.25 apart.
Note that the integral image means that it is easy to compute the rectangular features at any scale by simply scaling the coordinates of the corners of the rectangles.
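To make the scaling argument concrete, here is a minimal sketch of an integral image, a constant-time rectangle sum, and the list of window sizes obtained by stepping a factor of 1.25 from the 24x24 base window. The helper names and the upper limit of 100 pixels are assumptions for illustration; this is not the code from the paper.

```python
def integral_image(img):
    """ii[y][x] = sum of img over all rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def scale_rect(rect, scale):
    """Scale an (x, y, w, h) rectangle defined in the 24x24 base window."""
    return tuple(round(v * scale) for v in rect)

def window_sizes(base=24, factor=1.25, max_size=100):
    """Detector window sizes, each `factor` times the previous one."""
    sizes, size = [], float(base)
    while round(size) <= max_size:
        sizes.append(round(size))
        size *= factor
    return sizes

# The cost of rect_sum does not depend on the rectangle size, so evaluating
# a feature at scale 2.0 is as cheap as evaluating it at scale 1.0.
ii = integral_image([[1, 2], [3, 4]])
right_column = rect_sum(ii, 1, 0, 1, 2)          # 2 + 4
doubled = scale_rect((2, 4, 8, 4), 2.0)          # same feature, larger face
sizes = window_sizes()
```

Because only the corner coordinates change, the same trained feature set can be applied to each window size in `sizes` without resampling the image.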
Best features
The original paper contains pictures of the first two features selected in a typical cascade (see page 4).
The first feature detects the wide dark rectangle of the eyes above a wide brighter rectangle of the cheeks.
----------
----------
++++++++++
++++++++++
The second feature detects the bright thin rectangle of the bridge of the nose between the darker rectangles on either side containing the eyes.
---+++---
---+++---
---+++---
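As a rough illustration of how these two features respond, the sketch below evaluates them on toy patches by summing pixels directly. The function names, patch sizes and gray values are assumptions, and the sign convention (bright region minus dark region) simply follows the +/- diagrams above. A real detector would take the sums from the integral image instead of looping over pixels.

```python
def region_sum(img, x, y, w, h):
    """Brute-force sum of the w-by-h rectangle with top-left corner (x, y)."""
    return sum(img[j][i] for j in range(y, y + h) for i in range(x, x + w))

def eyes_over_cheeks(img, x, y, w, h):
    """Two-rectangle feature: bright lower half minus dark upper half."""
    half = h // 2
    return region_sum(img, x, y + half, w, half) - region_sum(img, x, y, w, half)

def nose_bridge(img, x, y, w, h):
    """Three-rectangle feature: centre third minus the two outer thirds."""
    third = w // 3
    centre = region_sum(img, x + third, y, third, h)
    sides = (region_sum(img, x, y, third, h)
             + region_sum(img, x + 2 * third, y, third, h))
    return centre - sides

# Toy patches (assumed values): dark = 10, bright = 200, flat = 100.
eye_patch = [[10] * 10, [10] * 10, [200] * 10, [200] * 10]
flat_patch = [[100] * 10 for _ in range(4)]
nose_patch = [[10] * 3 + [200] * 3 + [10] * 3 for _ in range(3)]

eye_response = eyes_over_cheeks(eye_patch, 0, 0, 10, 4)
flat_response = eyes_over_cheeks(flat_patch, 0, 0, 10, 4)
nose_response = nose_bridge(nose_patch, 0, 0, 9, 3)
```

A patch with the dark-over-bright layout gives a large positive response, while a flat patch gives zero for the two-rectangle feature, which is the kind of contrast a weak classifier thresholds on.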
So what I need to do is measure a foot length from an image taken by an ordinary user. The image will contain a foot wearing a black sock, a coin (or another object of known size), and a sheet of white paper (e.g. A4) on which the other two objects rest.
What I already have?
-I already worked with opencv but just simple projects;
-I already started to read some articles about Camera Calibration ("Learn OpenCv") but still don't know if I have to go so far.
What I need now is some guidance, because I still don't know whether I'm going about this the right way. I have some questions: Will I really need to calibrate the camera to get two or three measurements of the foot? How can I find the points of interest that define the line to measure? Is each picture a separate case, or are there techniques to follow?
Ps: sorry about my english, I really have to improve it :-/
First, some image acquisition things:
Can you count on the black sock and white background? The colors don't matter as much as the high contrast between the sock and background.
Can you standardize the viewing angle? Looking directly down at the foot will reduce perspective distortion.
Can you standardize the lighting of the scene? That will ease a lot of the processing discussed below.
Lastly, you'll get a better estimate if you zoom (or position the camera closer) so that the foot fills more of the image frame.
Analysis. (Note that this discussion will be directed at your question of identifying the axes of the foot. Identifying and analyzing the coin would use a similar process, but some differences would arise.)
The next task is to isolate the region of interest (ROI). If your camera is looking down at the foot, then the ROI can be limited to the white rectangle. My answer to this Stack Overflow post is a good start to square/rectangle identification: What is the simplest *correct* method to detect rectangles in an image?
If the foot lies completely in the white rectangle, you can clip the image to the rect found in step #1. This will limit the image analysis to the region inside the white paper.
"Binarize" the image using a threshold function: http://opencv.willowgarage.com/documentation/cpp/miscellaneous_image_transformations.html#cv-threshold. If you choose the threshold parameters well, you should be able to reduce the image to a black region (sock pixels) and white regions (non-sock pixel).
Now the fun begins: you might try matching contours, but if this were my problem, I would use bounding boxes for a quick solution or moments for a more interesting (and possibly robust) solution.
Use cvFindContours to find the contours of the black (sock) region: http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#findcontours
Use cvApproxPoly to convert the contour to a polygonal shape http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#approxpoly
For the simple solution, use cvMinRect2 to find an arbitrarily oriented bounding box for the sock shape. The short axis of the box should correspond to the line in largura.jpg and the long axis of the box should correspond to the line in comprimento.jpg.
http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#minarearect2
If you want more (possible) accuracy, you might try cvMoments to compute the moments of the shape. http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#moments
Use cvGetSpatialMoment to determine the axes of the foot. More information on the spatial moment may be found here: http://en.wikipedia.org/wiki/Image_moments#Examples_2 and here http://opencv.willowgarage.com/documentation/structural_analysis_and_shape_descriptors.html#getspatialmoment
With the axes known, you can then rotate the image so that the long axis is axis-aligned (i.e. vertical). Then, you can simply count pixels horizontally and vertically to obtain the lengths of the lines. Note that there are several assumptions in this moment-oriented process. It's a fun solution, but it may not provide any more accuracy, especially since the accuracy of your size measurements is largely dependent on the camera positioning issues discussed above.
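The moment-based idea can be sketched without OpenCV: compute the second-order central moments of the binarized sock region, derive the orientation of the major axis, and measure the extent of the region along and across that axis. The function name and the toy masks are assumptions. Note that this sketch projects pixel coordinates onto the axes instead of rotating the image and counting pixels, and that the "+ 1" converts a centre-to-centre distance into a pixel count, which is only exact when the axis is aligned with the pixel grid.

```python
import math

def axis_lengths(mask):
    """Return (angle, length, width) of the foreground pixels of a binary mask.

    angle is the orientation of the major axis in radians, measured from the
    image x axis; length and width are extents in pixels along and across it.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second-order central moments of the region.
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    angle = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    c, s = math.cos(angle), math.sin(angle)
    # Project every pixel onto the major and minor axes.
    along = [(x - cx) * c + (y - cy) * s for x, y in pts]
    across = [-(x - cx) * s + (y - cy) * c for x, y in pts]
    return angle, max(along) - min(along) + 1, max(across) - min(across) + 1

# Toy "sock" masks (assumed): a 10x3 horizontal bar and a 3x10 vertical bar.
horizontal = [[1] * 10 for _ in range(3)]
vertical = [[1] * 3 for _ in range(10)]
h_angle, h_length, h_width = axis_lengths(horizontal)
v_angle, v_length, v_width = axis_lengths(vertical)
```

The lengths are in pixels; multiplying by a millimetres-per-pixel factor derived from the coin or the paper would convert them to physical units.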
Lastly, I've provided links to the older C interface. You might take a look at the new C++ interface (I simply have not gotten around to migrating my code to 2.4)
Antonio Criminisi likely wrote the last word on this subject years ago. See his "Single View Metrology" paper, and his PhD thesis if you have time.
You don't have to calibrate the camera if you have a known-size object in your image. Well... at least if your camera doesn't distort too much and if you're not expecting high quality measurements.
A simple approach would be to detect a white (perspective-distorted) rectangle, mapping the corners to an undistorted rectangle (using e.g. cv::warpPerspective()) and use the known size of that rectangle to determine the size of other objects in the picture. But this only works for objects in the same plane as the paper, preferably not too far away from it.
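A minimal sketch of that approach, working on points rather than warping the whole image: estimate the homography that maps the four detected paper corners to an A4 rectangle in millimetres, then map two image points (for example heel and toe) through it and measure their distance. All names and pixel coordinates here are hypothetical, and the corner detection itself is assumed to have been done already. With OpenCV, `cv2.getPerspectiveTransform` and `cv2.perspectiveTransform` would do the same job.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography(src, dst):
    """3x3 homography (last entry fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_h(hm, pt):
    """Map an image point through the homography (with perspective divide)."""
    x, y = pt
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)

def measure_mm(corners_px, p1_px, p2_px, paper_mm=(210, 297)):
    """Distance in mm between two image points lying in the plane of the paper.

    corners_px: paper corners in the image, ordered top-left, top-right,
    bottom-right, bottom-left.
    """
    pw, ph = paper_mm
    hm = homography(corners_px, [(0, 0), (pw, 0), (pw, ph), (0, ph)])
    (x1, y1), (x2, y2) = apply_h(hm, p1_px), apply_h(hm, p2_px)
    return math.hypot(x2 - x1, y2 - y1)

# Hypothetical pixel coordinates of the A4 corners, heel and toe.
corners = [(100, 50), (520, 50), (520, 644), (100, 644)]
foot_mm = measure_mm(corners, (200, 100), (200, 600))
```

As the answer notes, the result is only meaningful for points in the plane of the paper, and lens distortion is ignored.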
I am not sure whether you need to build this yourself, but if you just need to do it and not code it, you can use KLONK Image Measurement. There are free and paid versions.
I am currently facing what is, in my opinion, a rather common problem that should be quite easy to solve, but so far all my approaches have failed, so I am turning to you for help.
I think the problem is explained best with some illustrations. I have some Patterns like these two:
I also have an Image like (probably better, because the photo this one originated from was quite poorly lit) this:
(Note how the Template was scaled to kinda fit the size of the image)
The ultimate goal is a tool which determines whether the user shows a thumbs up/thumbs down gesture and also some angles in between. So I want to match the patterns against the image and see which one resembles the picture the most (or, to be more precise, the angle the hand is showing). I know the direction in which the thumb is pointing in the pattern, so if I find the pattern that looks identical, I also have the angle.
I am working with OpenCV (with Python bindings) and have already tried cvMatchTemplate and MatchShapes, but so far it's not really working reliably.
I can only guess why MatchTemplate failed, but I think that a smaller pattern with a smaller white area fits fully into the white area of a picture, thus creating the best matching score even though it's obvious that they don't really look the same.
Are there some methods hidden in OpenCV that I haven't found yet, or is there a known algorithm for this kind of problem that I should reimplement?
Happy New Year.
A few simple techniques could work:
After binarization and segmentation, find Feret's diameter of the blob (a.k.a. the farthest distance between points, or the major axis).
Find the convex hull of the point set, flood fill it, and treat it as a connected region. Subtract the original image with the thumb. The difference will be the area between the thumb and fist, and the position of that area relative to the center of mass should give you an indication of rotation.
Use a watershed algorithm on the distances of each point to the blob edge. This can help identify the connected thin region (the thumb).
Fit the largest circle (or largest inscribed polygon) within the blob. Dilate this circle or polygon until some fraction of its edge overlaps the background. Subtract this dilated figure from the original image; only the thumb will remain.
If the size of the hand is consistent (or relatively consistent), then you could also perform N morphological erode operations until the thumb disappears, then N dilate operations to grow the fist back to its original approximate size. Subtract this fist-only blob from the original blob to get the thumb blob. Then use the thumb blob direction (Feret's diameter) and/or center of mass relative to the fist blob center of mass to determine direction.
Techniques to find critical points (regions of strong direction change) are trickier. At the simplest, you might also use corner detectors and then check the distance from one corner to another to identify the place where the inner edge of the thumb meets the fist.
For more complex methods, look into papers about shape decomposition by authors such as Kimia, Siddiqi, and Xiaofing Mi.
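The erode/dilate idea from the list above can be sketched on a toy binary blob: erode N times until the thumb vanishes, dilate N times to restore the fist, subtract to get the thumb, and take the direction from the fist centroid to the thumb centroid. The 3x3 square structuring element, the function names and the toy hand shape are assumptions made for the example.

```python
import math

def _morph(mask, keep):
    """Apply `keep` (all = erode, any = dilate) over each 3x3 neighbourhood."""
    h, w = len(mask), len(mask[0])
    return [[int(keep(mask[y + dy][x + dx]
                      if 0 <= y + dy < h and 0 <= x + dx < w else 0
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def erode(mask):
    return _morph(mask, all)

def dilate(mask):
    return _morph(mask, any)

def split_thumb(mask, n):
    """Return (fist, thumb): fist = mask opened n times, thumb = the remainder."""
    fist = mask
    for _ in range(n):
        fist = erode(fist)
    for _ in range(n):
        fist = dilate(fist)
    thumb = [[int(m and not f) for m, f in zip(mrow, frow)]
             for mrow, frow in zip(mask, fist)]
    return fist, thumb

def centroid(mask):
    """Centre of mass (x, y) of the foreground pixels."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def thumb_angle_deg(mask, n):
    """Direction from fist centroid to thumb centroid; 90 means straight up."""
    fist, thumb = split_thumb(mask, n)
    (fx, fy), (tx, ty) = centroid(fist), centroid(thumb)
    return math.degrees(math.atan2(fy - ty, tx - fx))  # image y axis points down

# Toy hand (assumed): a 7x7 fist with a 2-pixel-wide thumb sticking up.
hand = [[0] * 13 for _ in range(14)]
for y in range(6, 13):
    for x in range(3, 10):
        hand[y][x] = 1
for y in range(1, 6):
    hand[y][5] = hand[y][6] = 1

fist, thumb = split_thumb(hand, 1)
angle = thumb_angle_deg(hand, 1)
```

N has to be chosen so that the thumb disappears but the fist does not, which is why the approach depends on the hand size being roughly consistent.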
MatchTemplate seems like a good fit for the problem you describe. In what way is it failing for you? If you are actually masking the thumbs-up/thumbs-down/thumbs-in-between signs as nicely as you show in your sample image then you have already done the most difficult part.
MatchTemplate does not include rotation and scaling in the search space, so you should generate more templates from your reference image at all rotations you'd like to detect, and you should scale your templates to match the general size of the found thumbs up/thumbs down signs.
[edit]
The result array for MatchTemplate contains a floating-point value for each location that specifies how well the template fits the image there. If you use CV_TM_SQDIFF then the lowest value in the result array is the location of best fit; if you use CV_TM_CCORR or CV_TM_CCOEFF then it is the highest value. If your scaled and rotated template images all have the same number of white pixels, then you can compare the best-fit values you find for all the different template images, and the template image that has the best fit overall is the one you want to select.
There are tons of rotation/scaling independent detection functions that could conceivably help you, but normalizing your problem to work with MatchTemplate is by far the easiest.
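To illustrate the selection step, here is a small pure-Python stand-in for squared-difference matching (the measure CV_TM_SQDIFF is based on) that slides each template over the image, records its lowest score, and picks the template with the best score overall. It is a sketch, not OpenCV's implementation; the function names, template names and toy data are assumptions, and both toy templates deliberately have the same number of white pixels so their scores are comparable.

```python
def match_sqdiff(image, templ):
    """Return (best_score, (x, y)) minimising the sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(templ), len(templ[0])
    best = None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = sum((image[y + j][x + i] - templ[j][i]) ** 2
                        for j in range(th) for i in range(tw))
            if best is None or score < best[0]:
                best = (score, (x, y))
    return best

def best_template(image, templates):
    """templates: dict of name -> 2D list. Return (name, score, location)."""
    results = {name: match_sqdiff(image, t) for name, t in templates.items()}
    name = min(results, key=lambda k: results[k][0])
    return name, results[name][0], results[name][1]

# Toy binary image (assumed): a vertical bar, as if the thumb pointed up.
image = [[0] * 5 for _ in range(5)]
for y in range(1, 4):
    image[y][2] = 1

templates = {
    "up":   [[0, 1, 0], [0, 1, 0], [0, 1, 0]],   # 3 white pixels
    "side": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],   # 3 white pixels
}
name, score, location = best_template(image, templates)
```

In practice the dictionary would hold one rotated (and suitably scaled) template per angle of interest, and the key of the winning template gives the angle.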
For the more advanced stuff, check out SIFT, Haar feature based classifiers, or one of the others available in OpenCV
I think you can get excellent results if you just compute the two points that have the furthest shortest path going through white. The direction in which the thumb is pointing is just the direction of the line that joins the two points.
You can do this easily by sampling points on the white area and using Floyd-Warshall.
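A minimal sketch of this idea on a toy mask follows. One substitution is made and should be stated plainly: instead of Floyd-Warshall, it runs a breadth-first search from every white pixel, which yields the same all-pairs shortest path lengths on an unweighted 4-connected pixel grid. The function names and the toy shape are assumptions, and for a real image you would sample the white area first, as suggested above, to keep the cost manageable.

```python
from collections import deque

def bfs_farthest(mask, start):
    """Return (distance, pixel) of the white pixel farthest from `start`,
    measured along 4-connected paths that stay inside the white area."""
    h, w = len(mask), len(mask[0])
    dist = {start: 0}
    queue = deque([start])
    far = start
    while queue:
        y, x = queue.popleft()
        if dist[(y, x)] > dist[far]:
            far = (y, x)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w
                    and mask[ny][nx] and (ny, nx) not in dist):
                dist[(ny, nx)] = dist[(y, x)] + 1
                queue.append((ny, nx))
    return dist[far], far

def farthest_pair(mask):
    """Return (distance, p, q) for the two white pixels (y, x) whose shortest
    path through white is the longest."""
    best = (-1, None, None)
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                d, q = bfs_farthest(mask, (y, x))
                if d > best[0]:
                    best = (d, (y, x), q)
    return best

# Toy hand (assumed): a 3x3 fist with a one-pixel-wide thumb pointing up.
hand = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
distance, p, q = farthest_pair(hand)
```

The line joining `p` and `q` gives the thumb axis; deciding which end is the thumb tip still needs an extra cue, such as which end lies in the thinner part of the blob.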