I am trying to determine the movement and rotation of an object (it can be plain-colored, but does not have to be) on a background that is not completely constant. Here is an example:
Using keypoints to find the transformation as in the tutorials does not work because the objects I am dealing with do not necessarily provide enough edges for this.
Building the difference image and doing a segmentation there also often fails because of the changed background. In this example it is not that bad, but there could be changes in reflections or slight deformations.
Any ideas on how to find the transformation matrix (affine, with only four degrees of freedom) that maps the object (in this example the blue thing) from one image to the other?
Use a binary threshold on the result of the image subtraction (as described here - Foreground Extraction) - it should remove small changes (which are the result of changes in lighting). Before that you may try to use some filter to blur edges; a median filter may be a good option (but try different filters too) - try using this technique on both input images and on the result of the image subtraction.
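A minimal sketch of that pipeline in Python (the file names, kernel size and threshold value are placeholders you would tune):

```python
import cv2

# Load the two frames (paths are placeholders).
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Median blur first, to suppress small lighting changes and noise.
img1 = cv2.medianBlur(img1, 5)
img2 = cv2.medianBlur(img2, 5)

# Subtract and binarize: only real changes should survive the threshold.
diff = cv2.absdiff(img1, img2)
_, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
cv2.imwrite("foreground_mask.png", mask)
```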
//edit:
To determine the transformation you may try to use SURF - http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html
If you don't need to calculate rotation, try an optical flow technique - http://robots.stanford.edu/cs223b05/notes/CS%20223-B%20T1%20stavens_opencv_optical_flow.pdf - or this much simpler method:
1. Calculate the geometric center (just add the positions of all points and divide the result by the number of points) of the marker contour in the first and in the second image (call them contour1 and contour2). Alternatively you may calculate the center of mass of the filled contour - it's up to you.
2. Your transformation is: movementVector = centerOfContour2 - centerOfContour1
If the results are not accurate enough, try to find the biggest contour in the image and draw that contour on an empty image (so you won't have any noise, artifacts etc.). Perform all operations on the new image, as in the sketch below.
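A short sketch of the centroid method, assuming you already have binary masks of the object in both images (image moments give the same result as averaging the points of the filled contour; the findContours return signature below is OpenCV 4.x):

```python
import cv2
import numpy as np

def contour_center(mask):
    # Take the biggest contour in the mask (ignoring noise and artifacts)
    # and return its center of mass from the image moments.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    biggest = max(contours, key=cv2.contourArea)
    m = cv2.moments(biggest)
    return np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])

mask1 = cv2.imread("mask1.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
mask2 = cv2.imread("mask2.png", cv2.IMREAD_GRAYSCALE)
movement_vector = contour_center(mask2) - contour_center(mask1)
print("movementVector:", movement_vector)
```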
I need to track cars on the road from top-view video.
My application contains two main parts:
Detecting cars in the frame (TensorFlow-trained network)
Tracking detected cars (OpenCV trackers)
I have trouble with the OpenCV trackers. Initially I tried different trackers, but only MOSSE is fast enough. This tracker works almost perfectly for the straight-road case, but I faced problems with rotating cars. This situation appears at crossroads.
As I understand it, the bounding box of a rotated object is bigger than the bbox of a horizontal or vertical object. As a result the bbox contains a big part of the static background and the tracker loses the target object.
Are there any alternative trackers which can track contours (not bounding boxes)?
Can I adjust the quality of the existing OpenCV trackers' results through any settings or by adjusting the picture?
Schema:
Real image:
If your camera is stationary, the following approach is feasible (a sketch of the first steps follows):
Use background subtraction methods to separate the background image from the foreground blobs.
Improve the foreground results using morphological operations.
Detect car blobs and remove other blobs.
Track the foreground blobs in the video, i.e. binary tracking (or even apply a Kalman filter).
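A rough sketch of the first three steps using OpenCV's built-in MOG2 subtractor (the video path, kernel size and area threshold are assumptions to tune):

```python
import cv2

cap = cv2.VideoCapture("topview.mp4")  # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Drop shadow pixels (marked as gray 127) and clean up with morphology.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep only blobs large enough to be cars.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    cars = [c for c in contours if cv2.contourArea(c) > 500]
```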
A very basic but effective approach in this scenario might be to track the center coordinates of the bounding box: if the center coordinates only change along one axis (with a small tolerance for either axis), it's linear motion, not a rotation. If both x and y change, the car is moving in the roundabout. A tiny sketch of this check follows below.
This only has the weakness that it will detect diagonal motion, but since you are looking at a centered roundabout, that shouldn't be an issue.
It will also be very efficient memory-wise.
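The sketch (the tolerance value is a guess):

```python
def classify_motion(prev_center, cur_center, tol=2.0):
    # Compare bounding-box centers between frames: movement along only one
    # axis (within the tolerance) is straight-line motion; movement along
    # both axes means the car is turning in the roundabout.
    dx = abs(cur_center[0] - prev_center[0])
    dy = abs(cur_center[1] - prev_center[1])
    return "turning" if dx > tol and dy > tol else "straight"
```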
You could use the PCA method, which can calculate the orientation of a detected object and which way it is facing. You can change the detection threshold to select objects more like the cars in your picture (based upon shape and colour - an HSV conversion, which in your case is red). A short sketch follows the link below.
Link to an introduction to Principal Component Analysis (PCA)
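A minimal sketch of the PCA orientation computation, done by hand with numpy on a blob's contour points (the contour is assumed to come from findContours after an HSV threshold on the car colour; OpenCV's own cv2.PCACompute would work as well):

```python
import numpy as np

def blob_orientation_deg(contour):
    # PCA on the contour points: the eigenvector of the covariance matrix
    # with the largest eigenvalue is the blob's major axis, and its angle
    # tells you which way the object faces (up to a 180-degree ambiguity).
    pts = contour.reshape(-1, 2).astype(np.float64)
    pts -= pts.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(pts.T))
    major = eigvecs[:, np.argmax(eigvals)]
    return np.degrees(np.arctan2(major[1], major[0]))
```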
Method 1:
- Detect bounding boxes and subtract the background to get rotated rectangles around the blobs.
Method 2:
- Implement your own version of the detector with rotated boxes.
Method 3:
- Use segmentation instead... U-Net, for example.
There are no other trackers than the ones found in the library.
Your best bet is to filter the image and use findContours.
Optical flow and background subtraction will help with this. You can combine optical flow with your car detector to rule out false positives.
https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html
https://docs.opencv.org/3.4/d1/dc5/tutorial_background_subtraction.html
I have two binary images of a hand which are almost the same. How should I compare them to know whether they represent almost the same shape or not? I have tried finding the Euclidean distance between the two images, but it does not give the correct answer if the image is slightly changed, moved to the left or right, or slightly decreased in size. I have also tried HOG descriptors in OpenCV, but I am still unable to get the correct answer when I compare more than one image. What is the best way to compare two binary images based on shape, or on any feature, to find nearly matching images without considering the size of the image? Links to the images are http://postimg.org/image/w20tuuzmv/ and http://postimg.org/image/jndr4br9x/
I think that the Generalized Hough transform might be a good solution for you. Here is a tutorial about it.
Alternatively you can try to cut the hand from one image (just use the contour's bounding rect) and then use it as a template and search for it in the second image using the template matching technique - here you can read more about it. When you find the point with the highest correlation value, you need to decide whether it is big enough - you need to find the threshold on your own.
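A sketch of that template matching idea (paths and the acceptance threshold are placeholders; TM_CCOEFF_NORMED is one reasonable choice of correlation measure):

```python
import cv2

img1 = cv2.imread("hand1.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img2 = cv2.imread("hand2.png", cv2.IMREAD_GRAYSCALE)

# Cut the hand out of the first image using the contour's bounding rect.
contours, _ = cv2.findContours(img1, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
template = img1[y:y + h, x:x + w]

# Search for the template in the second image and take the best correlation.
result = cv2.matchTemplate(img2, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)
print("best correlation %.3f at %s -> match: %s" % (max_val, max_loc, max_val > 0.7))
```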
Are the images just rotated, translated and scaled? If so, you could compute the principal components of the images using PCA, then rotate the images so that the first component is in a certain direction (e.g. always vertical). You could then compute the centroids of the images and translate them so they are always in the same position (e.g. the center of the image). To always use the same scale, you could resize the images so that the sum of the distances between each white pixel and the centroid is the same in both images. Now it's easy to compare the images, for example score = np.sum(A==B)
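A rough sketch of that normalization chain for two binary images (the canvas size and the target mean distance are arbitrary constants, and the rotation sign conventions may need flipping for your data):

```python
import cv2
import numpy as np

def normalize(img, size=300, target_mean_dist=60.0):
    # 1) PCA: find the main axis of the white pixels.
    ys, xs = np.nonzero(img)
    pts = np.column_stack([xs, ys]).astype(np.float64)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    major = eigvecs[:, np.argmax(eigvals)]
    angle = np.degrees(np.arctan2(major[0], major[1]))  # angle to vertical
    # 2) Scale so the mean white-pixel-to-centroid distance is constant
    #    (a mean works like the sum when the pixel counts differ).
    scale = target_mean_dist / np.linalg.norm(centered, axis=1).mean()
    # 3) One affine warp: rotate+scale about the centroid, then move the
    #    centroid to the canvas center.
    M = cv2.getRotationMatrix2D((float(centroid[0]), float(centroid[1])), angle, scale)
    M[:, 2] += np.array([size / 2, size / 2]) - centroid
    return cv2.warpAffine(img, M, (size, size), flags=cv2.INTER_NEAREST)

A = normalize(cv2.imread("hand1.png", cv2.IMREAD_GRAYSCALE))  # placeholder paths
B = normalize(cv2.imread("hand2.png", cv2.IMREAD_GRAYSCALE))
score = np.sum(A == B)  # higher means more similar
```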
I have a processed binary image of dimension 300x300. This processed image contains a few objects (person or vehicle).
I also have another RGB image of the same scene, of dimension 640x480. It is taken from a different position.
Note: both cameras are not the same.
I can detect objects to some extent in the first image using background subtraction. I want to detect the corresponding objects in the 2nd image. I went through the OpenCV functions
getAffineTransform
getPerspectiveTransform
findHomography
estimateRigidTransform
All these functions require corresponding points (coordinates) in the two images.
In the 1st binary image, I only have the information that an object is present; it does not have features exactly similar to the second (RGB) image.
I thought conventional feature matching to determine corresponding control points, which could be used to estimate the transformation parameters, is not feasible, because I think I cannot detect and match features between a binary and an RGB image (am I right??).
If I am wrong, what features could I use, and how should I proceed with feature matching, finding corresponding points, and estimating the transformation parameters?
The solution I tried is more of a manual marking to estimate the transformation parameters (please correct me if I am wrong):
Note: neither camera moves.
Manually marked rectangles around objects in the processed (binary) image
Noted down the coordinates of the rectangles
Manually marked rectangles around objects in the 2nd RGB image
Noted down the coordinates of the rectangles
Repeated the above steps for different samples of the 1st binary and 2nd RGB images
Now that I have some 20 corresponding points, I used them in the function as:
findHomography(src_pts, dst_pts, 0);
So once I detect an object in the 1st image,
I draw a bounding box around it,
transform the coordinates of the vertices using the transformation found above,
and finally draw a box in the 2nd RGB image with the transformed coordinates as vertices.
But this doesn't mark the box in the 2nd RGB image exactly over the person/object; instead it is drawn somewhere else. Though I took several sample images of binary and RGB and used several corresponding points to estimate the transformation parameters, it seems that they are not accurate enough.
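For reference, a Python sketch of the pipeline as described (the point coordinates and the box are placeholders; this is how I read the steps above, not a verified fix):

```python
import cv2
import numpy as np

# Manually marked corresponding points (placeholder coordinates; in
# practice you would use all ~20 pairs).
src_pts = np.float32([[50, 60], [120, 80], [200, 150], [90, 220]]).reshape(-1, 1, 2)
dst_pts = np.float32([[110, 95], [260, 130], [430, 250], [190, 370]]).reshape(-1, 1, 2)

H, mask = cv2.findHomography(src_pts, dst_pts, 0)  # method 0: use all points

# Map a detected bounding box from the 300x300 binary image to the RGB image.
x, y, w, h = 100, 120, 40, 80  # example box
box = np.float32([[x, y], [x + w, y], [x + w, y + h], [x, y + h]]).reshape(-1, 1, 2)
box_rgb = cv2.perspectiveTransform(box, H)
```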
What is the meaning of the CV_RANSAC and CV_LMEDS options and of ransacReprojThreshold, and how do I use them?
Is my approach good? What should I modify/do to make the registration accurate?
Is there any alternative approach I could use?
I'm fairly new to OpenCV myself, but my suggestions would be:
Seeing as you have the objects identified in the first image, I shouldn't think it would be hard to get keypoints and extract features (or maybe you have this already?).
Identify features in the 2nd image
Match the features using OpenCV FlannBasedMatcher or similar
Highlight matching features in 2nd image or whatever you want to do.
I'd hope that because all your features in the first image should be positives (you know they are the features you want), it'll be relatively straightforward to get accurate matches.
Like I said, I'm new to this so the ideas may need some elaboration.
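To make the matching step concrete, here is a minimal sketch using SIFT descriptors with FlannBasedMatcher and Lowe's ratio test (cv2.SIFT_create needs OpenCV >= 4.4, or the contrib package on older versions; paths and the ratio are placeholders):

```python
import cv2

img1 = cv2.imread("binary_scene.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img2 = cv2.imread("rgb_scene.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN with a KD-tree index (algorithm=1), then Lowe's ratio test.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
matches = flann.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

out = cv2.drawMatches(img1, kp1, img2, kp2, good, None)
cv2.imwrite("matches.png", out)
```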
It might be a little late to answer this, and the asker might not see it, but if the 1st image is originally grayscale then this could be done:
1.) 2nd image ----> grayscale ------> gray2ndimg
2.) Point-to-point correspondences between gray1stimg and gray2ndimg by matching features.
I'm creating a part scanner in C that pulls all possibilities for scanned parts as images in a directory. My code currently fetches all images from that directory and dumps them into a vector. I then produce groups of contours for all the images. The program then falls into a while loop where it constantly grabs images from a webcam, and generates contours for those as well. I have set up a jig for the part to rest on, so orientation and size are not a concern; however, I don't want to have to calibrate the machine, so there may be movement between the template images and the part images taken.
What is the best way to compare the contours? I have tried several methods including matchTemplate without contours, but if you take a look at the two parts below, you can see that these two are very close to each other, so matchShapes and matchTemplate can't distinguish between them the way I was using them. I'm also not sure how to use cvMatchShapes. It works when loading the images directly into matchShapes, but the results are inconclusive. I think that contours are the way to go, I'm just not sure how to go about implementing the comparison phase. Any help would be great.
You can view the templates here: http://www.cryogendesign.com/partDetection.html
If you are ready for do-it-yourself, one approach could be to compute a "distance image" (assign every pixel the smallest Euclidean distance to the contour taken as the reference). See http://en.wikipedia.org/wiki/Distance_transform.
Using this distance image, you can quickly compute the average distance of a new contour to the reference one (for every contour pixel, get the distance from the distance image). The average distance gives you an indication of the goodness-of-fit and will let you find the best match to a set of reference templates.
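A minimal sketch of that average-distance measure, assuming both parts are already segmented into binary masks of the same size:

```python
import cv2
import numpy as np

def average_contour_distance(reference_mask, new_contour):
    # Build the distance image: distanceTransform measures the distance to
    # the nearest zero pixel, so draw the reference contour as zeros on a
    # white canvas.
    canvas = np.full(reference_mask.shape, 255, np.uint8)
    ref_contours, _ = cv2.findContours(reference_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(canvas, ref_contours, -1, 0, 1)
    dist_img = cv2.distanceTransform(canvas, cv2.DIST_L2, 3)
    # Goodness-of-fit: mean distance of the new contour's pixels to the
    # reference contour (lower is a better match).
    pts = new_contour.reshape(-1, 2)
    return float(np.mean(dist_img[pts[:, 1], pts[:, 0]]))
```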
If the parts have some moving freedom, the situation is a bit harder: before computing the average distance, you must fit the new contour to the reference one. You will need to apply a suitable transform (translation, rotation, possibly scaling), and find the parameters that will minimize... the average distance.
You can calculate the chamfer distance between the two contours:

d_cham(x) = (1/|T|) * Σ_{t ∈ T} DT_E(t + x)

T and E are the sets of edge points of the template and the image, and x is the reference point at which you start comparing the two sets of edges, so for each x you get a different value.
DT is the distance transform of an image. Matlab provides the algorithm here.
If you want a more detailed version of how to calculate the chamfer distance, take a look here.
I am currently facing a, in my opinion, rather common problem which should be quite easy to solve, but so far all my approaches have failed, so I am turning to you for help.
I think the problem is explained best with some illustrations. I have some patterns like these two:
I also have an image like this (probably better, because the photo this one originated from was quite poorly lit):
(Note how the template was scaled to roughly fit the size of the image.)
The ultimate goal is a tool which determines whether the user shows a thumbs-up/thumbs-down gesture, and also some angles in between. So I want to match the patterns against the image and see which one resembles the picture the most (or, to be more precise, the angle the hand is showing). I know the direction in which the thumb points in each pattern, so if I find the pattern which looks identical I also have the angle.
I am working with OpenCV (with Python bindings) and have already tried cvMatchTemplate and MatchShapes, but so far it's not really working reliably.
I can only guess why MatchTemplate failed, but I think that a smaller pattern with a smaller white area fits fully into the white area of a picture, thus creating the best matching factor, although it's obvious that they don't really look the same.
Are there some methods hidden in OpenCV I haven't found yet, or is there a known algorithm for this kind of problem that I should reimplement?
Happy New Year.
A few simple techniques could work:
After binarization and segmentation, find Feret's diameter of the blob (a.k.a. the farthest distance between points, or the major axis).
Find the convex hull of the point set, flood fill it, and treat it as a connected region. Subtract the original image (the one with the thumb) from it. The difference will be the area between the thumb and the fist, and the position of that area relative to the center of mass should give you an indication of the rotation.
Use a watershed algorithm on the distances of each point to the blob edge. This can help identify the connected thin region (the thumb).
Fit the largest circle (or largest inscribed polygon) within the blob. Dilate this circle or polygon until some fraction of its edge overlaps the background. Subtract this dilated figure from the original image; only the thumb will remain.
If the size of the hand is consistent (or relatively consistent), then you could also perform N morphological erode operations until the thumb disappears, then N dilate operations to grow the fist back to approximately its original size. Subtract this fist-only blob from the original blob to get the thumb blob. Then use the thumb blob's direction (Feret's diameter) and/or its center of mass relative to the fist blob's center of mass to determine direction (see the sketch below).
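A short sketch of the erode/dilate idea from the last point (the iteration count n depends on the hand size and is a guess here):

```python
import cv2

def thumb_blob(hand_mask, n=10):
    # Erode until the thin thumb disappears, dilate the remaining fist back
    # to roughly its original size, then subtract to keep only the thumb.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    fist = cv2.erode(hand_mask, kernel, iterations=n)
    fist = cv2.dilate(fist, kernel, iterations=n)
    return cv2.subtract(hand_mask, fist)
```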
Techniques to find critical points (regions of strong direction change) are trickier. At the simplest, you might use corner detectors and then check the distance from one corner to another to identify the place where the inner edge of the thumb meets the fist.
For more complex methods, look into papers about shape decomposition by authors such as Kimia, Siddiqi, and Xiaofing Mi.
MatchTemplate seems like a good fit for the problem you describe. In what way is it failing for you? If you are actually masking the thumbs-up/thumbs-down/thumbs-in-between signs as nicely as you show in your sample image then you have already done the most difficult part.
MatchTemplate does not include rotation and scaling in the search space, so you should generate more templates from your reference image at all rotations you'd like to detect, and you should scale your templates to match the general size of the found thumbs up/thumbs down signs.
[edit]
The result array for MatchTemplate contains a value that specifies how well the template fits the image at that location. If you use CV_TM_SQDIFF then the lowest value in the result array is the location of best fit; if you use CV_TM_CCORR or CV_TM_CCOEFF then it is the highest value. If your scaled and rotated template images all have the same number of white pixels, then you can compare the best-fit values you find for the different template images, and the template image that has the best fit overall is the one you want to select.
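A sketch of that search over rotated templates (angle step, method and paths are placeholders; note that rotating within the same canvas clips the template's corners, which matters less for roughly round masks):

```python
import cv2

img = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)        # placeholder paths
template = cv2.imread("thumb_template.png", cv2.IMREAD_GRAYSCALE)

h, w = template.shape
best = None
for angle in range(0, 360, 15):
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(template, M, (w, h))
    result = cv2.matchTemplate(img, rotated, cv2.TM_CCOEFF)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)  # highest value = best fit
    if best is None or max_val > best[0]:
        best = (max_val, angle, max_loc)

print("best score %.0f at angle %d, location %s" % best)
```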
There are tons of rotation/scaling independent detection functions that could conceivably help you, but normalizing your problem to work with MatchTemplate is by far the easiest.
For the more advanced stuff, check out SIFT, Haar feature-based classifiers, or one of the other methods available in OpenCV.
I think you can get excellent results if you just compute the two points that have the furthest shortest path going through white. The direction in which the thumb is pointing is just the direction of the line that joins the two points.
You can do this easily by sampling points in the white area and using Floyd-Warshall.
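A sketch of that idea with scipy's Floyd-Warshall (the mask is heavily downscaled first, since all-pairs shortest paths is cubic in the number of sampled points; the path and scale factor are placeholders):

```python
import cv2
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import floyd_warshall

mask = cv2.imread("hand_mask.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
small = cv2.resize(mask, None, fx=0.1, fy=0.1, interpolation=cv2.INTER_NEAREST)

# Every white pixel of the downscaled mask is a node; 8-connected white
# neighbours get an edge weighted by their Euclidean distance.
ys, xs = np.nonzero(small)
index = {(x, y): i for i, (x, y) in enumerate(zip(xs, ys))}
graph = lil_matrix((len(index), len(index)))
for (x, y), i in index.items():
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            j = index.get((x + dx, y + dy))
            if j is not None and j != i:
                graph[i, j] = np.hypot(dx, dy)

# The pair with the largest finite shortest-path distance is the two
# endpoints; the line joining them gives the pointing direction.
dist = floyd_warshall(graph.tocsr(), directed=False)
dist[~np.isfinite(dist)] = -1
i, j = np.unravel_index(np.argmax(dist), dist.shape)
p1, p2 = (xs[i], ys[i]), (xs[j], ys[j])
angle = np.degrees(np.arctan2(p2[1] - p1[1], p2[0] - p1[0]))
print("endpoints", p1, p2, "direction %.1f deg" % angle)
```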