How to extract and locate a specific region of an image using OpenCV?

I am a newbie to OpenCV. I would like to work on a small project that tracks the rotation speed of a gear using a webcam. However, so far I have no idea how to approach this.
The posted image shows a machine which contains two 'big' gears. What I am interested in is only the gear on the left-hand side (highlighted with the red line).
My plan is:
Extract the region of the gear of interest.
Mask out all unrelated regions, so the masked image shows only the left gear (the ROI).
.....
The problem is: how can I locate/extract the ROI and build the mask? I have gone through some examples of cvMatchTemplate(), but it doesn't support rotation or scaling. Since I am using a webcam, the captured image may be scaled or rotated. cvFindContours() will extract all contours in the image rather than just the ROI.

If you know the gear beforehand, you can use a picture of it to extract keypoints with SIFT, SURF, FAST, or any other corner detection algorithm. Then do as follows:
1- Apply FAST on every frame to detect keypoints.
2- Extract SIFT descriptors from those keypoints.
3- Match the keypoints detected in the scene against the keypoints you previously extracted from the reference image. You can use the FLANN matcher for this.
4- Those matches will define a region in the scene containing the gear you are looking for.
This is not trivial, so you will need to consult the OpenCV documentation to use all of these functions.
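Below is a minimal Python sketch of this pipeline, assuming an OpenCV build with SIFT available (e.g. OpenCV >= 4.4 or opencv-contrib); the file names are placeholders:

```python
import cv2
import numpy as np

# Placeholder file names: a reference picture of the gear and a webcam frame.
reference = cv2.imread("gear_reference.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("scene_frame.png", cv2.IMREAD_GRAYSCALE)

fast = cv2.FastFeatureDetector_create()
sift = cv2.SIFT_create()

# Steps 1-2: detect FAST keypoints, then compute SIFT descriptors at them.
kp_ref = fast.detect(reference, None)
kp_ref, des_ref = sift.compute(reference, kp_ref)
kp_frame = fast.detect(frame, None)
kp_frame, des_frame = sift.compute(frame, kp_frame)

# Step 3: FLANN matching with a KD-tree index (suited to float descriptors).
flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
matches = flann.knnMatch(des_ref, des_frame, k=2)

# Lowe's ratio test keeps only distinctive matches.
good = [m[0] for m in matches
        if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]

# Step 4: the matched frame points outline the region containing the gear.
pts = np.float32([kp_frame[m.trainIdx].pt for m in good])
x, y, w, h = cv2.boundingRect(pts)
print("gear region:", x, y, w, h)
```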

Related

OpenCV Image Matching

I have two images of the same scene from a stereo camera, taken from slightly different perspectives (imgLeft and imgRight).
Now, I want to find an ROI (the red rectangle in the image below) of the right image in the left one. I need to do this very fast, because I'm doing this on a video. I do not have the nonfree module of OpenCV, but I do have CUDA installed.
This should be your friend http://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#calcopticalflowpyrlk
All you need to do is find feature points inside this rectangle and pass them to cv::calcOpticalFlowPyrLK to get their counterparts in the second image. You may need to filter the points to make sure the tracking was accurate, for example by passing them to cv::findHomography with the CV_RANSAC flag and checking the mask output.
The operation is fast and runs in real time. There is also a CUDA version of this method.
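A hedged Python sketch of that tracking-and-filtering step (file names and ROI coordinates are placeholders, not from the original post):

```python
import cv2
import numpy as np

imgRight = cv2.imread("imgRight.png", cv2.IMREAD_GRAYSCALE)
imgLeft = cv2.imread("imgLeft.png", cv2.IMREAD_GRAYSCALE)

# Hypothetical ROI in the right image: (x, y, width, height).
x, y, w, h = 100, 80, 200, 150
mask = np.zeros_like(imgRight)
mask[y:y + h, x:x + w] = 255

# Find good feature points inside the rectangle only.
pts_right = cv2.goodFeaturesToTrack(imgRight, maxCorners=200,
                                    qualityLevel=0.01, minDistance=7,
                                    mask=mask)

# Track them into the left image with pyramidal Lucas-Kanade.
pts_left, status, err = cv2.calcOpticalFlowPyrLK(imgRight, imgLeft,
                                                 pts_right, None)

# Keep only successfully tracked points.
ok = status.ravel() == 1
src, dst = pts_right[ok], pts_left[ok]

# RANSAC homography: the inlier mask filters out badly tracked points.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
```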

Project image onto notebook using OpenCV

I am trying to implement an application that projects an image onto a page of a notebook, using OpenCV, a webcam and a projector. To achieve that, I am doing the following steps:
I am using a webcam to detect the four corner points of a page.
A homography is learned between the four corner points of the camera image and their projections on my desk, as seen by the camera. By using the inverse transformation, I will know where I should draw something in my camera image so that the projection "ends up" at the desired location.
I am applying the inverse transformation to the four detected corner points of the page.
I am warping the desired image to the new, transformed set of points.
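A minimal sketch of steps 2-4 in Python; every coordinate and file name below is a placeholder for illustration only:

```python
import cv2
import numpy as np

# Hypothetical calibration points: where the projector's image corners land
# on the desk as seen by the camera (calib_cam), versus the corresponding
# projector pixel coordinates (calib_proj).
calib_cam = np.float32([[120, 90], [520, 80], [530, 400], [110, 410]])
calib_proj = np.float32([[0, 0], [800, 0], [800, 600], [0, 600]])

# Step 2: homography from camera coordinates to projector coordinates
# (the "inverse transformation" in the text above).
H_cam_to_proj = cv2.getPerspectiveTransform(calib_cam, calib_proj)

# Step 3: detected page corners in the camera image (placeholder values),
# mapped into projector coordinates.
page_cam = np.float32([[[200, 150]], [[450, 160]], [[440, 380]], [[210, 370]]])
page_proj = cv2.perspectiveTransform(page_cam, H_cam_to_proj)

# Step 4: warp the desired image onto the page region in projector space.
art = cv2.imread("image_to_project.png")
h, w = art.shape[:2]
src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
M = cv2.getPerspectiveTransform(src, page_proj.reshape(4, 2))
output = cv2.warpPerspective(art, M, (800, 600))  # assumed projector size
```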
So far it works well, if the notebook is on my desk and wide open. Like in this picture:
But if I try to close one side (or both), the following happens:
See the problem? In the first picture the image is perfectly aligned with the edges of the page, and remains so if you rotate or translate the notebook while keeping it on the desk. But that doesn't happen in the second image, where the top edge of the image is no longer parallel to the top edge of the page (the image becomes more and more skewed).
Can anyone explain why I get this projection problem, or at least point me to some resources where I can read about it? I should mention that the projector and the webcam are placed above and to the left of the notebook, not directly above it.
Any tips or suggestions are welcome. Thank you!
You want an effect that is called keystone correction. The problem you are experiencing is most probably due to the fact that the optical axes, positions, and focal lengths of the web camera and the projector are different. I suggest calibrating your setup so you know their relative pose, and incorporating that into your inverse homography.
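One way such a calibration could look, sketched under the assumption of a planar desk: project a known chessboard pattern, detect it with the camera, and estimate the projector-to-camera homography. All coordinates, the grid layout, and the file name below are hypothetical:

```python
import cv2
import numpy as np

# A chessboard image is sent to the projector; "camera_view.png" is a
# camera frame of the projected pattern on the desk (hypothetical file).
pattern_size = (9, 6)  # inner corners of the projected chessboard

camera_view = cv2.imread("camera_view.png", cv2.IMREAD_GRAYSCALE)
found, corners_cam = cv2.findChessboardCorners(camera_view, pattern_size)

# The same corners in projector pixel coordinates are known, since we drew
# them; here they are assumed to lie on a regular grid starting at (x0, y0)
# with spacing `step`, in the same row-major order as the detection.
x0, y0, step = 200, 150, 60
corners_proj = np.float32([[x0 + i * step, y0 + j * step]
                           for j in range(pattern_size[1])
                           for i in range(pattern_size[0])])

if found:
    # Homography mapping projector pixels to camera pixels (planar desk).
    H, _ = cv2.findHomography(corners_proj, corners_cam.reshape(-1, 2),
                              cv2.RANSAC, 3.0)
```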

OpenCV Image Comparison for Surface Damage detection

We are planning to create a surface damage detection prototype for ceramic tiles, with surface discoloration as the specific damage type, using OpenCV. We would like to know which method we should consider using. We are new to developing these types of object recognition/object tracking programs. We've read about methods such as the histogram method and one where the hue-saturation value is tracked, but we are still confused.
Also, we would like to know whether it is possible to detect the hue-saturation value of an object without the use of trackbars.
Any relevant and helpful response will be greatly appreciated.
I think you can do it in sequence:
1) Find the tile region. Use a corner detector, Hough lines, etc.
2) Find SIFT (or other) descriptors and recognize which image should be on this tile (find it in your database of tile images).
3) Align the images carefully. For example, find the homography between the image found in the DB and the image of the tile from the camera (using SIFT features).
4) Compute the color distance between every pixel in the tile image from the camera and the tile image from the database.
5) Threshold the differences by some value to get the problematic regions.
And think about lighting. You have to provide equal lighting conditions for your measurements.
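A sketch of steps 4 and 5 in Python, assuming the two tile images are already aligned; the threshold value and file names are assumptions to tune:

```python
import cv2
import numpy as np

# Hypothetical inputs: a reference tile image from the database and the
# camera image of the tile, already aligned (e.g. via the homography above).
reference = cv2.imread("tile_reference.png")
camera = cv2.imread("tile_camera_aligned.png")

# Work in Lab color space, where Euclidean distance tracks perceived color
# difference better than in BGR.
ref_lab = cv2.cvtColor(reference, cv2.COLOR_BGR2LAB).astype(np.float32)
cam_lab = cv2.cvtColor(camera, cv2.COLOR_BGR2LAB).astype(np.float32)

# Per-pixel color distance, then threshold to flag discolored regions.
dist = np.linalg.norm(ref_lab - cam_lab, axis=2)
threshold = 25.0  # assumed value; tune for your lighting and tiles
damage_mask = (dist > threshold).astype(np.uint8) * 255
```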

Finding a grid in an image

Given a match-3 game screenshot (for example http://www.gameplay3.com/images/games/jewel-quest-ii-01S.jpg), what would be the correct way to find the bounding box for the grid (the table with the tiles)? The board doesn't have to be a perfect rectangle (as can be seen in the screenshot), but each cell is completely square.
I've tried several games and found that there are some per-game image transformations that can be done to enhance the tiles inside the grid (for example, in this game it's enough to take the V channel out of the HSV color space). Then I can enlarge the tiles so that they overlap, find the largest contour of the image, and get the bounding box from it.
The problem with the above approach is that every game (or even every level inside the same game) may need a different transformation to get hold of the tiles. So the question is: is there a standard way to enhance either the tiles inside the grid or the grid's lines? (I've tried finding lines with the Hough transform, but, although the grid seems quite visible to the eye, Hough doesn't find it.)
Also, what if the screenshot is obtained using a phone camera instead of taking a screenshot of a desktop? From my experience, captured images have less defined colors (depending on lighting), and can also be slightly distorted, as there is no way to hold the phone exactly in front of the screen.
I would go with the following approach for a screenshot:
Find edges in the image using, for example, a Canny-like edge detector.
Perform a Hough line transform. This should work quite nicely on the edge image.
If you have some information about the size of the tiles, you can eliminate false-positive lines using some sort of spatial model of the grid (e.g. keeping only lines that have a small angle to the x/y axes of the image, and/or that match the expected distance/angle of tile borders).
Identify tile borders among the found Hough lines by looking for Canny edges under or next to the lines.
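A rough Python sketch of the first three steps (the thresholds and file name are assumptions to tune):

```python
import cv2
import numpy as np

img = cv2.imread("screenshot.png")  # hypothetical file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Step 1: edge image.
edges = cv2.Canny(gray, 50, 150)

# Step 2: probabilistic Hough line transform on the edge image.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

# Step 3: keep near-horizontal/vertical lines, the likely grid lines
# of a screenshot.
grid_lines = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1))) % 90
        if angle < 5 or angle > 85:
            grid_lines.append((x1, y1, x2, y2))
```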
Which implementation of the Hough transform did you use? How did you preprocess the image?
Another approach would be to use some sort of machine learning. Since you are working with OpenCV, you could use a Haar-like feature detector (cascade classifier). An example of face detection using Haar-like features can be found here:
OpenCV Haar Face Detector example
Another machine learning approach would be a Histogram of Oriented Gradients (HOG) detector in combination with a Support Vector Machine (SVM). An example is located here:
HOG example
You can find general information about HOG detection at:
HOG detection
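For illustration of the HOG+SVM API only: OpenCV ships a HOG descriptor with a pre-trained SVM for people detection, used as a stand-in below; for grid/tile detection you would train your own SVM on HOG features of tile samples:

```python
import cv2

# OpenCV's built-in HOG descriptor with its default people-detector SVM.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread("photo_of_screen.png")  # hypothetical file name
boxes, weights = hog.detectMultiScale(img, winStride=(8, 8))

# Draw the detections; with a custom-trained SVM these would be tiles.
for (x, y, w, h) in boxes:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```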

Using OpenCV to correct stereo images

I intend to make a program which will take stereo pair images, taken with a single camera, and then correct and crop them so that when the images are viewed side by side with the parallel or cross-eyed viewing method, the best 3D effect is achieved. The left image will be the reference image; the right image will be modified for corrections. I believe OpenCV will be the best software for this purpose. So far I believe the processing will occur something like this:
Correct for rotation between images.
Correct for y axis shift.
Doing so will, I imagine, result in irregular black borders above and below the right image, so:
Crop both images to the same height to remove borders.
Compute stereo-correspondence/disparity
Compute optimal disparity
Correct images for optimal disparity
Okay, so that's my take on what needs doing and the order it occurs in. What I'm asking is: does that seem right? Is there anything I've missed, or anything in the wrong order? Also, which specific OpenCV functions would I need to use for all the necessary steps to complete this project? Or is OpenCV not the way to go? Many thanks.
OpenCV is great for this.
There is a whole chapter on this in the book, and all the sample code for it ships with the OpenCV distribution.
edit: Roughly the steps are:
Remap each image to remove lens distortions and rotate/translate views to image center.
Crop pixels that don't appear in both views (optional)
Find matching objects in each view (stereo block matching) to create a disparity map.
Reproject the disparity map into a 3D model.
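A minimal Python sketch of the disparity step, assuming an already rectified grayscale pair (file names are placeholders):

```python
import cv2

# Rectification (cv2.stereoRectify / cv2.initUndistortRectifyMap /
# cv2.remap) is assumed to have been done already.
grayL = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
grayR = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

# Block matcher: numDisparities must be divisible by 16.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(grayL, grayR)  # 16-bit fixed point, scaled by 16

# Reprojection into 3D needs the 4x4 matrix Q from cv2.stereoRectify:
# points3d = cv2.reprojectImageTo3D(disparity, Q)
```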
