Sample images for feature matching in OpenCV

So, I was following this code sample from OpenCV about SURF and homography, and I was interested in the train sample required for such an experiment. I downloaded the two images at the bottom, box.png and box_in_scene.png, to validate the correctness of the code, and everything was fine. Then I tested the code with my own images: on the left is an image of a flash drive, and on the right is an image of a scissor with a USB drive. I failed to get any rectangular box on the test image (the scissor and USB drive).
However, I know the code works when I take a different train sample, for example this one with a paper box on the left and the paper box mixed in with a bed sheet on the right.
Now my question is: what sort of training images should I rely on to get a good response, or does it have something to do with the scenery I chose as my test sample? Also, had I chosen a video sample as my test case, would I get a more responsive result?
Thanks.

If you think your second test is good, you are mistaken. You can see what normal results look like on their site.
Look at the keypoints on your two pictures: they are matched wrongly. I think the matching is the hardest part of this work. I am currently trying to improve it mathematically, but still with no good results :(
You can google the most popular matching examples, but to get good results you need something better.
About requirements: only one object should be in the scene. It is good if your sample contains only the object, without any background. Although the algorithm is invariant to scale, if the sample is very small and the scene is very big you'll have problems, at least with the number of keypoints.

There is nothing wrong with the sample; however, the scenery to which the sample is matched needs to be dynamic, i.e. a live stream. Drawing the homography is not as simple as that: in order to draw that green rectangle, enough inliers are needed, and they are clearly missing in the USB-and-scissors example.
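To see why the rectangle never appears, it can help to count the RANSAC inliers explicitly. Below is a minimal Python sketch of the same pipeline, using ORB instead of SURF (SURF lives in the non-free module); the filenames are the box.png/box_in_scene.png pair from the tutorial, and the inlier threshold of 10 is just an illustrative value, not a fixed rule.

import cv2
import numpy as np

sample = cv2.imread('box.png', cv2.IMREAD_GRAYSCALE)          # train image
scene = cv2.imread('box_in_scene.png', cv2.IMREAD_GRAYSCALE)  # test image

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(sample, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

if len(matches) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    inliers = int(mask.sum()) if mask is not None else 0
    print('matches:', len(matches), 'RANSAC inliers:', inliers)

    # Only draw the rectangle when enough matches agree on a single homography;
    # with the scissors/USB scene the inlier count typically stays very low.
    if H is not None and inliers >= 10:
        h, w = sample.shape
        corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
        box = cv2.perspectiveTransform(corners, H)
        cv2.polylines(scene, [np.int32(box)], True, 255, 3)

Printing the match and inlier counts for both image pairs makes the difference between the "good" and "bad" samples obvious before any drawing happens.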

Related

OCR: scan specific part of image

I'm quite new to computer vision, and I'm currently learning about the Google Cloud Vision SDK using Go. Right now I have one problem.
I have an image scanned using the DetectTexts() method. The result was great: all of the texts were scanned.
However, I don't actually need all of those texts, only some of them. Below is the image I use as a sample. What I want to get are the two blocks highlighted in red.
[Sample image: a blood pressure monitor display, with two regions highlighted in red.]
Result of DetectTexts(): WE-2, Sam WHO, Time, PM 1:57, SYS, DIA, PUL, mmHg, /MIN, 90, 62, 82, MR AVGA, SET, START, STOP, MEM.
I do not know what the best approach is. What I have in mind right now are these two approaches:
- split out the regions highlighted in red, then perform the OCR scan on those new images
- or, get all of the texts and then use some algorithm (NLP maybe?) to pick out the highlighted texts.
Can somebody please tell me what the correct and best approach is to solve this problem?
You mentioned that you were using Go, which unfortunately I don't have any experience with, but I have approached this problem in other languages like Python and C#. What I would recommend is to just create an ROI, or Region of Interest. Basically, that means you would crop the image to only the highlighted region that you want to detect text from. Like I said, I'm not entirely sure whether you can do that in Go, so you might have to do some raw pixel manipulation rather than just using a member function. I assumed that the position of the regions you want to detect text from stays the same. If you're open to it, you could just create a simple Python script that generates the ROI and pipes the cropped image to Go.
import cv2

img = cv2.imread('inputImg.png')
# NumPy slicing is img[row_start:row_end, col_start:col_end]
r1, c1 = 100, 50   # example top-left corner of the highlighted region
h, w = 25, 25      # example size of the region
output = img[r1:r1 + h, c1:c1 + w]
cv2.imwrite("path/to/output/outputimg.png", output)

TensorFlow video processing, changes detection

I'm a newbie with machine learning, and I have only basic knowledge of neural networks.
I have a pretty clear task:
1. The video stream shows a static picture (a white area with yellow squares); in different videos the squares are located in different places.
2. At some moment the content of the video changes and starts showing the white area without some of the yellow squares.
3. I need to create a mechanism that can detect and somehow indicate that change.
I'm going to use the TensorFlow framework for this task. Could anybody push me in the right direction? I'd also be very happy to see a list of steps to solve the problem.
Thanks in advance.
If you know how the static picture looks beforehand, maybe some background subtraction would work? Basically, you just subtract the static picture from every frame and check the content of the result. If the resulting picture is empty (zeros, or close to it up to some threshold), there is no change to detect. If the resulting picture contains a region that is non-zero (maybe above or below a certain manually tuned threshold), you detected a change in that region.
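For example, a minimal OpenCV/Python sketch of that idea; the filenames, the intensity threshold and the minimum changed-pixel count are placeholder values you would tune for your videos, and the reference image is assumed to have the same resolution as the frames:

import cv2

reference = cv2.imread('reference.png', cv2.IMREAD_GRAYSCALE)  # the known static picture

cap = cv2.VideoCapture('video.mp4')   # or 0 for a live camera
THRESH = 40                           # manually tuned intensity threshold
MIN_CHANGED_PIXELS = 500              # ignore tiny noise blobs

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, reference)                       # |frame - reference|
    _, changed = cv2.threshold(diff, THRESH, 255, cv2.THRESH_BINARY)
    if cv2.countNonZero(changed) > MIN_CHANGED_PIXELS:
        print('change detected')                              # a square appeared or disappeared
cap.release()

If the squares are in different places in every video, you can grab the reference from the first few frames of the stream instead of a stored file.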

OpenCV Issue of Image Subtraction?

I am trying to subtract two images using the function cvAbsDiff(img1, img2, dest);
It works, but sometimes when I bring my hand in front of my head or body, the hand is not clear and the background comes into the picture: the background image (head) overlays my foreground (hand).
It works correctly on plain surfaces, i.e. when the background is even, like a wall.
Please check out my image so that you can better understand my problem:
http://www.2shared.com/photo/hJghiq4b/bg_overlays_foreground.html
If you have any solution/hint, please help me.
There's nothing wrong with your code. Background subtraction is not a preferred way for motion detection or silhouette detection because it is not very robust. The problem arises because the background and the foreground are similar in colour in many regions, which, on subtraction, pushes the foreground to the back. You might try using:
- optical flow for motion detection
- if your task is just detecting a silhouette or a hand, training a HOG classifier over it
In case you do not want to try a new approach, you may try playing with the threshold value (in your case 30). When you subtract images of similar colour, the difference is less than 30, and when you later threshold at 30 it just blacks out. You may also try HSV or some other colourspace, as in the sketch below.
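A rough sketch of that tweak, using the Python API rather than the old C cvAbsDiff; the filenames and the threshold of 20 are placeholders, and the hue wrap-around at red is ignored for simplicity:

import cv2

img1 = cv2.imread('background.png')
img2 = cv2.imread('frame_with_hand.png')

# Compare in HSV: skin against skin often differs more in hue/saturation
# than it does in plain grey level.
hsv1 = cv2.cvtColor(img1, cv2.COLOR_BGR2HSV)
hsv2 = cv2.cvtColor(img2, cv2.COLOR_BGR2HSV)

diff = cv2.absdiff(hsv1, hsv2)
# Take the largest per-channel difference, then threshold; try values other than 30.
mask = cv2.threshold(diff.max(axis=2), 20, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite('foreground_mask.png', mask)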
Putting in the relevant code would help, as would knowing what you're actually trying to achieve.
Which two images are you subtracting? I've done subtraction of subsequent images (images taken with a delay of a fraction of a second), and the background subtraction generally results in the edges of moving objects, for example the edges of a hand, not the entire silhouette of a hand. I'm guessing you're taking the difference of the current frame and a static startup frame. It's possible that parts aren't different enough (skin + skin).
I've got some computer problems tonight; I'll test it out tomorrow (please put up at least the steps you actually carry through) and let you know.
I'm still not sure what your ultimate goal is, although I'm guessing you want to do some gesture recognition (since you have a vector called "fingers").
As Manpreet said, your biggest problem is robustness, and that comes from the subject and the background having similar colours.
I reproduced your image by having my face in the static comparison image, then moving it. If I started with only the background, it was already much more robust and in any case didn't display any "overlaying".
The quick fix is: make sure to have a clean, subject-free static image.
Otherwise, you'll want a dynamic comparison image; the simplest would be comparing frame_n with frame_n-1. This will generally give you just the moving edges though, so if you want the entire silhouette you can either:
1) use a different segmentation algorithm (what I recommend: background subtraction is fast and you can use it to determine a much smaller ROI in which to search, and then use a different algorithm for more robust segmentation), or
2) try to make a compromise between the static and dynamic comparison image, for example an average of the past 10 frames or something like that. I don't know how well this works, but it would be quite simple to implement and worth a try :) (see the sketch after this answer).
Also, try CV_THRESH_OTSU instead of 30 for your threshold value and see if you like that better.
Also, I noticed the output often flares (regions which haven't changed switch from black to white). Checking with the live stream, I'm quite certain it's because the webcam is autofocusing/adjusting the white balance etc. If you're getting that too, turning off the autofocus etc. should help (which, by the way, isn't done through OpenCV but depends on the camera; possibly check this: How to programmatically disable the auto-focus of a webcam?).
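For what it's worth, here is a quick Python sketch of option 2) combined with the Otsu suggestion: each frame is compared against a running average of recent frames (cv2.accumulateWeighted) and thresholded with Otsu instead of a hard-coded 30. The alpha of 0.1 is only an example value, and the camera index is a placeholder.

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
avg = None                                            # running-average background

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    if avg is None:
        avg = gray.copy()
    cv2.accumulateWeighted(gray, avg, 0.1)            # roughly an average of the last ~10 frames
    diff = cv2.convertScaleAbs(cv2.absdiff(gray, avg))
    _, mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imshow('moving regions', mask)
    if cv2.waitKey(1) & 0xFF == 27:                   # Esc to quit
        break
cap.release()
cv2.destroyAllWindows()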

to rotate template image and perform template matching

I want to rotate a given template image at different angles (e.g. 30, 60, 90, ...) and then match the rotated images against a source image to detect objects using OpenCV functions (I'm writing C code).
How can I do this using OpenCV functions? Or is there any other solution?
Yeah, I searched SO, and that function is not returning the rotated image to the main program.
The other code given on SO continuously rotates the image, so we can't do template matching with it.
Is there any other code to solve this problem?
Template matching is not a good choice for matching rotated targets.
You'd better check the OpenCV Features2D module.
You'll want to take a special look at the examples for Feature Matching and Homography; both contain working source code.
For further details and a great explanation of the topic, you can check Innuendo's answer to a similar question here:
scale and rotation Template matching
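If you still want to try the brute-force route from the question (and the linked answer), here is a rough Python sketch: rotate the template in fixed steps, run matchTemplate for each angle, and keep the best score. The filenames and the 30-degree step are placeholders, and warping into the original template size clips the rotated corners, so treat it as an illustration rather than a finished detector.

import cv2

scene = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)

best = (-1.0, None, None)          # (score, top-left location, angle)
h, w = template.shape
center = (w / 2, h / 2)

for angle in range(0, 360, 30):
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(template, M, (w, h))          # corners get clipped
    result = cv2.matchTemplate(scene, rotated, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val > best[0]:
        best = (max_val, max_loc, angle)

print('best score %.2f at %s, angle %d' % best)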

Parsing / Scraping information from an image

I am looking for a library that would help scrape the information from the image below.
I need the current value, so it would have to recognise the values on the left and then estimate the value of the bottom line.
Any ideas if there is a library out there that could do something like this? Language isn't really important but I guess Python would be preferable.
Thanks
I don't know of any "out of the box" solution for this, and I doubt one exists. If all you have is the image, then you'll need to do some image processing. A simple binarization method (like Otsu binarization) would make it easier to process.
The binarization helps because afterwards the pixels are either "on" or "off".
The locations of the lines can be found by searching for some number of pixels that are all on horizontally (say, 5 in a row while iterating along the x axis).
Then a possible solution would be to pass the image to an OCR engine to get the numbers (Tesseract is an open-source OCR engine, written in C++ and hosted at Google). You'd still have to find out where the numbers are in the image by iterating through it.
Then you'd have to find where the lines are relative to the keys on the left, do a little math, and you have your answer.
OpenCV is a beefy computer vision library that has things like the binarization; it is also a C++ library.
Hope that helps.
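As a rough illustration of the binarization-plus-OCR step (not a complete solution): this sketch uses Python with pytesseract as a convenient wrapper around the Tesseract engine, and the input filename is a placeholder.

import cv2
import pytesseract

img = cv2.imread('chart.png', cv2.IMREAD_GRAYSCALE)

# Otsu picks the binarization threshold automatically
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# OCR the binarized image; the bounding boxes tell you where each number sits,
# which you then relate to the detected horizontal lines.
data = pytesseract.image_to_data(binary, output_type=pytesseract.Output.DICT)
for text, x, y in zip(data['text'], data['left'], data['top']):
    if text.strip():
        print(text, 'at', (x, y))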
