I'm getting some info from image with Canny algorithm and findContours function.
Sometimes I get too many noisy points in some images which contains hairs or any other detailed stuff. I wonder how can I merge close enough points with OpenCV. For example I wish I could merge all points which are distanced from each other on less then X. (sqrt(dxdx + dydy) < X I mean).
I heard that OpenCV has it's own wrapper around FLANN, but I'm not sure how do I use it.
And yeah, I want merging to be done on all contours awared of each other, not inside each contour individually.
Use DBScan Clustering. on bounding rectangles.
Related
When reading about classic computer vision I am confused on how multiscale feature matching works.
Suppose we use an image pyramid,
How do you deal with the same feature being detected at multiple scales? How do you decide which to make a deacriptor for?
How do you connected features between scales? For example let's say you have a feature detected and matched to a descriptor at scale .5. Is this location then translated to its location in the initial scale?
I can share something about SIFT that might answer question (1) for you.
I'm not really sure what you mean in your question (2) though, so please clarify?
SIFT (Scale-Invariant Feature Transform) was designed specifically to find features that remains identifiable across different image scales, rotations, and transformations.
When you run SIFT on an image of some object (e.g. a car), SIFT will try to create the same descriptor for the same feature (e.g. the license plate), no matter what image transformation you apply.
Ideally, SIFT will only produce a single descriptor for each feature in an image.
However, this obviously doesn't always happen in practice, as you can see in an OpenCV example here:
OpenCV illustrates each SIFT descriptor as a circle of different size. You can see many cases where the circles overlap. I assume this is what you meant in question (1) by "the same feature being detected at multiple scales".
And to my knowledge, SIFT doesn't really care about this issue. If by scaling the image enough you end up creating multiple descriptors from "the same feature", then those are distinct descriptors to SIFT.
During descriptor matching, you simply brute-force compare your list of descriptors, regardless of what scale it was generated from, and try to find the closest match.
The whole point of SIFT as a function, is to take in some image feature under different transformations, and produce a similar numerical output at the end.
So if you do end up with multiple descriptors of the same feature, you'll just end up having to do more computational work, but you will still essentially match the same pair of feature across two images regardless.
Edit:
If you are asking about how to convert coordinates from the scaled images in the image pyramid back into original image coordinates, then David Lowe's SIFT paper dedicates section 4 on that topic.
The naive approach would be to simply calculate the ratios of the scaled coordinates vs the scaled image dimensions, then extrapolate back to the original image coordinates and dimensions. However, this is inaccurate, and becomes increasingly so as you scale down an image.
Example: You start with a 1000x1000 pixel image, where a feature is located at coordinates (123,456). If you had scaled down the image to 100x100 pixel, then the scaled keypoint coordinate would be something like (12,46). Extrapolating back to the original coordinates naively would give the coordinates (120,460).
So SIFT fits a Taylor expansion of the Difference of Gaussian function, to try and locate the original interesting keypoint down to sub-pixel levels of accuracy; which you can then use to extrapolate back to the original image coordinates.
Unfortunately, the math for this part is quite beyond me. But if you are fluent in math, C programming, and want to know specifically how SIFT is implemented; I suggest you dive into Rob Hess' SIFT implementation, lines 467 through 648 is probably the most detailed you can get.
Using Emgu CV I have extracted a set of closed polygons from the contours in an image of a road network. The polygons represent road outlines. The result is shown below, plotted over an OpenStreetMaps map (the polygons in 'pixel' form from Emgu CV have been converted to latitude/longitude form to be plotted).
Set of polygons representing road outlines:
I would now like to compute the Voronoi diagram of this set of polygons, which will help me find the centerline of the road. But in Emgu CV I can only find a way to get the Voronoi diagram of a set of points. This is done by finding the Delaunay triangulation of the set of points (using the Subdiv2D class) and then computing the voronoi facets with GetVoronoiFacets.
I have tried computing the Voronoi diagram of the points defined by all the polygons in the set (each polygon is a list of points), but this gives me an extremely complicated Voronoi diagram, as one might expect:
Voronoi diagram of set of points:
This image shows a smaller portion of the first picture (for clarity, since it is so convoluted). Indeed some of the lines in the diagram seem to represent the road centerline, but there are so many other lines, it will be tough to find a criterion to extract the "good" lines.
Another potential problem that I am facing is that, as you should be able to tell from the first picture, some polygons are in the interior of others, so we are not in the standard situation of a set of disjoint closed polygons. That is, sometimes the road is between the outer boundary of one polygon and the inner boundary of another.
I'm looking for suggestions as to how to compute the Voronoi graph of the set of polygons using Emgu CV (or Open CV), hopefully overcoming this second problem I've outlined as well. I'm also open to other suggestions for how to acheive this without using Emgu CV.
If you already have polygons, you can try computing the Straight Skeleton.
I haven't tried it, but CGAL has an implementation. Note that this particular function license is GPL.
A possible issue may be:
The current version of this CGAL package can only construct the
straight skeleton in the interior of a simple polygon with holes, that
is it doesn't handle general polygonal figures in the plane.
Probably there are workarounds for that. For example you can include all polygons in a bigger rectangle (this way the original polygons will be holes of the new rectangle). This may not work well if the original polygons have holes. To solve that, you could execute the algorithm for each polygon with holes and then put all polygons in a rectangle, removing all holes and execute the algorithm again.
The problem is to detect a known rectangular object in an image.
Which of the following is computationally less expensive:
Finding homography - For finding homography, we use the template of the known object to do feature matching.
Contour detection - We try to detect the biggest contour in the image. In this particular case we assume that the biggest contour will correspond to the known rectangular object we are trying to find.
In both the cases we do perspective transform after detecting the object to set the perspective.
NOTE: We are using Open-CV functions to find the homography and detecting contour.
You should try finding the biggest contour. It's the simplest and will be far faster. You needs to detect Canny edges then find contours and find the one with the biggest area. However, it can fail if contours are unclear or if there is a bigger object as it doesn't consider shape. You can also apply both of your ideas to get better results.
EDIT:
To reply your comment, you have Canny edge + find contours + find biggest against find features + match features
I think that the first combination is less computationally expensive. Moreover, there is a good implementation of squares/rectangle detection here.
However, if the contours of the rectangle are not clear, and if moreover the rectangle is highly textured, you should get better results with features matching.
I ran findCountours on the following Image:
And got the following contour image (I'm showing only "parent" contours according to the hierarchy):
As you can see, there are many different contours around each object (each one in a different color). Now, I want to unify the contours around the person to obtain one enclosing contour, so I could segment her our from the image.
I'm not sure that it can be done, but I thought I should ask here.
Is there any method to intelligently unify the contours in the image so I could segment different objects out?
Thanks,
Gil.
First, do you want to achieve the result only on this image or any other image where different people will present in different pose and different dresses?
If you want to segment only this image, then with some color thresholding or with some morphology operations you can achieve it. But to make it work for any image with different persons probably you may need to pursue a PhD in computer vision.
But if your task is segmentation only then I would suggest a Semi-Automatic Segmentation technique like Grab Cut or graph cut. These are very popular segmentation algorithms which are readily available in opencv or matlab. They work very well on all kind of images. Here is the result of grab cut algorithm on your image.
There is lots of work on Contour based segmentation in the literature out there.
The Ultrametric contour map produces a hierarchy of contours which are segmentations of objects in an input image.
Pub: Contour Detection and Hierarchical Image SegmentationPablo Arbelaez, Michael Maire, Charless Fowlkes, Jitendra Malik
I am doing something similar to this problem:
Matching a curve pattern to the edges of an image
Basically, I have the same curve in two images, but with some affine transform between the two. Here is an example of two images:
Image1
Image2
So in order to get to Image2, you can apply some translation, rotation, scale, etc. to Image1.
Does anyone know how to solve for this transform?
Phase correlation doesn't work because it's not a translation only. Optical flow doesn't work since there's not enough detail to resolve translation, rotation, scale (It's pretty much a binary image). I'm not sure if Hough Transforms will give me good data.
I think some sort of keypoint matching algorithm like sift or surf would work with this kind of data as well.
The basic idea would be to find a limited number of "interesting" keypoints in each image, then match these keypoints pairwise.
Here is a quick test of your image with an online ASIFT demo:
http://demo.ipol.im/demo/my_affine_sift/result?key=BF9F4E4E006AB5168497709836C39C74#
It is probably more suited for normal greyscale images, but nevertheless it seems to work for this data. It looks like the lines connect roughly the same points around both of the curves; plugging all these pairs into something like the FindHomography function in OpenCv, the small discrepancies should even themselves out and you get the affine transformation matrix between the two images.
For your particular data you might be able to come up with better keypoint descriptors; perhaps something to detect the line ends, line crossings and sharp corners.
Or how about this: It is a little more work, but if you can vectorize your paths into a bezier or b-spline, you can get some natural keypoints from the spline descriptors.
I do not know any vectorisation library, but Inkscape has a basic implementation with which you could test the approach.
Once you have a small set of descriptors instead of a large 2d bitmap, you only need to match these descriptors between the two images, as per FindHomography.
answer to comment:
The points of interest are merely small areas that have certain properties. So the center of those areas might be black or white; the algorithm does not specifically look for white pixels or large-scale shapes such as the curve. What matter is that the lines connect roughly the same points on both curves, at least at first glance.