OpenCV: goodFeaturesToTrack and calcOpticalFlowPyrLK for moving camera - opencv

I tried the code written here:
http://i-vizon.blogspot.ch/2013/03/optical-flow-using-opencv-library-on.html
It works pretty well, but does not perform well in the case of moving camera, because all features are removed on scene change.
Fundamentally the code is so composed:
First frame: goodFeaturesToTrack(grayFrames,points1,MAX_COUNT,0.01,5,Mat(),3,0,0.04);
Other frames:
calcOpticalFlowPyrLK(prevGrayFrame,grayFrames,points2,points1,status,err,winSize,3,termcrit,0,0.001);
goodFeaturesToTrack(grayFrames,points1,MAX_COUNT,0.01,10,Mat(),3,0,0.04);
with the swapping of points and copy of the current frame in the previous.
Problem:
when I use it with camera, handling it on my hand, when I change the first scenes no optical flow is produced, I suppose because initial features are not more contained in the new frames.
How I can refresh the feature point in this code to continue working?
Which is a good refresh condition? Based for example on the the number of features?
Thank you very much.

Related

How to distinguish between different license plates using OpenCV

Currently working on licentiate detection system and need some guidance on how to proceed.
I can capture (via video playback) and with the help of an open source library called OpenALPR display the license plates directly to the terminal, now the issue is it capture on a frame by frame basis so it capture the same license plate multiple times. I added a frame skip variable and now it skips however many number of frames I want it to but the issue is still there.
Furthermore, I'd like to distinguish between different license plates if possible but don't know how to work around that, I've attempted employing basic object detection and detection but failed miserably.
Below is an image of the program running, as seen it detects a single license plate and display multiple instance of it, now the issue is I expect it to move on to the next car and display Plate#1, unfortunately it does not and continues feeding into Plate #0
Program Running
Program Running
The function that actually helps display the license plate text is below, really the first line does all the work. OpenALPR is a pretty powerful.
results = alpr.recognize_ndarray(frame)
for i, plate in enumerate(results['results']):
best_candidate = plate['candidates'][0]
print('Plate #{}: {:} ({:}%)'.format(i,
best_candidate['plate'].upper(),
best_candidate['confidence']))
I'd like some guidance towards how I can solve this problem? Which is basically distinguish between different license plates.
It is a general problem without general solution, because it highly depends on context. Some thoughts:
If it is a video feed you can track the plate movement, the track will "jump" when it detects another plate. Let say the maximum optical flow velocity is 100 px/frame, if it jumps more than this threshold, you can suppose it is a new plate.
Depending on you video quality and detector, may there be spurious jumps, I would add a Kalman filter or any simple filter.
Perhaps there is a minimum time lapse between a plate goes out the image and the next arrives. You can use a time threshold to trigger the "changed plate alert" event.

How to detect text in a photo

I am researching into the best way to detect test in a photo using open source libraries.
I think the standard way is as follows (note: steps 1 - 4 all use OpenCV):
1) detect outline of document
2) transform document so it's flat and cropped, using said outline
3) Make the background of document white, using a filter
4) Feed resulting image to Tesseract
Is this the optimum process, or is there a better way, or better tools?
Also, what happens for case if the photo doesn't have a document outline (It's possible that step 1 & 2 are redundant)?
Is there anyway to automatically detect document orientation (i.e. portrait / landscape)?
I think your process is fine. I've used a similar process for an Android project.
I think that the only way you can discover if a document is portrait/landscape is to reason with the length of the sides of the bounding box of your outline.
I don't think there's an automatic way to do this, maybe you can find the most external contour approximable with a 4 segment polyline (all doable in opencv). In order to get this you'll have to work with contour hierarchy and contous approximation (see cv2.approxPolyDP).
This is how I would go for automatic outline detection. As I said, the rest of your algorithm seems just fine to me.
PS. I'll leave my Android project GitHub link. I don't know if it can be useful to you, but here I specify the outline by dragging some handles, then transform the image and feed it to Tesseract, using Java and OpenCV. Yeah It's a very bad idea to do that in the main thread of an Android app and yeah, the app is not finished. I just wanted to experiment with OCR, so I didn't care much of performance and usability, since this was not intended to use, but just for studying.
Look up the uniform width transform.
What this does is detect edges which have more or less the same width with respect to their opposite edge. So things like drainpipes (which can be eliminated at a later pass) but also the majority of text. Whilst conceptually it's similar to a distance transform, the published method uses rather ad hoc normal projection methods and Canny edge detection.

OpenCv Issue of Image Subtraction?

i am trying to subtract 2 image using the function cvAbsDiff(img1, img2, dest);
it working but sometimes when i bring my hand before my head or body the hand is not clear and background comes into picture... the background image(head) overlays my foreground.(hand)..
it works correctly on plain surfaces i.e when the background is even like a wall.
please check out my image...so that you can better understand my problem...!!!!
http://www.2shared.com/photo/hJghiq4b/bg_overlays_foreground.html
if you have any solution/hint please help me.......
There's nothing wrong with your code . Background subtraction is not a preffered way for motion detection or silhoutte detection because its not very robust.The problem is coming because both the background and the foreground are similar in colour at many regions which on subtractions pushes the foreground to back . You might try using
- optical flow for motion detection
- If your task is just detecting silhoutte or hand try training a HOG classifier over it
In case you do not want to try a new approach . You may try around playing with the threshold value(in your case 30).So when you subtract similar colour image there difference is less than 30 . And later you threshold with 30 so it just blacks out. Also you may try HSV or some other colourspace as well .
Putting in the relevant code would help. Also knowing what you're actually trying to achieve.
Which two images are you subtracting? I've done subtracting subsequent images (so, images taken with a delay of a fraction of a second), and the background subtraction generally results in the edges of moving objects, for example the edges of a hand, and not the entire silhouette of a hand. I'm guessing you're taking the difference of the current frame and a static startup frame. It's possible that parts aren't different enough (skin+skin).
I've got some computer problems tonight, I'll test it out tomorrow (pls put up at least the steps you actually carry thorough though) and let you know.
I'm still not sure what your ultimate goal is, although I'm guessing you want to do some gesture-recognition (since you have a vector called "fingers").
As Manpreet said, your biggest problem is robustness, and that is from the subjects having similar color.
I reproduced your image by having my face in the static comparison image, then moving it. If I started with only background, it was already much more robust and in anycase didn't display any "overlaying".
Quick fix is, make sure to have a clean subject-free static image.
Otherwise, you'll want to have dynamic comparison image, simplest would be comparing frame_n with frame_n-1. This will generally give you just the moving edges though, so if you want the entire silhouette you can either:
1) Use a different segmenting algorithm (what I recommend. Background subtraction is fast and you can use it to determine a much smaller ROI in which to search, and then use a different algorithm for more robust segmentation.)
2) Try to make a compromise between the static and dynamic comparison image, for example as an average of the past 10 frames or something like that. I don't know how well this works, but would be quite simple to implement, worth a try :).
Also, try with CV_THRESH_OTSU instead of 30 for your threshold value, see if you like that better.
Also, I noticed often the output flares (regions which haven't changed switch from black to white). Checking with the live stream, I'm quite certain it because of the webcam autofocusing/adjusting white balance etc.. If you're getting that too, turning off the autofocus etc. should help (which btw isn't done through openCV but depends on the camera. Possibly check this: How to programatically disable the auto-focus of a webcam?)

openCV: is it possible to time cvQueryFrame to synchronize with a projector?

When I capture camera images of projected patterns using openCV via 'cvQueryFrame', I often end up with an unintended artifact: the projector's scan line. That is, since I'm unable to precisely time when 'cvQueryFrame' captures an image, the image taken does not respect the constant 30Hz refresh of the projector. The result is that typical horizontal band familiar to those who have turned a video camera onto a TV screen.
Short of resorting to hardware sync, has anyone had some success with approximate (e.g., 'good enough') informal projector-camera sync in openCV?
Below are two solutions I'm considering, but was hoping this is a common enough problem that an elegant solution might exist. My less-than-elegant thoughts are:
Add a slider control in the cvWindow displaying the video for the user to control a timing offset from 0 to 1/30th second, then set up a queue timer at this interval. Whenever a frame is needed, rather than calling 'cvQueryFrame' directly, I would request a callback to execute 'cvQueryFrame' at the next firing of the timer. In this way, theoretically the user would be able to use the slider to reduce the scan line artifact, provided that the timer resolution is sufficient.
After receiving a frame via 'cvQueryFrame', examine the frame for the tell-tale horizontal band by looking for a delta in HSV values for a vertical column of pixels. Naturally this would only work when the subject being photographed contains a fiducial strip of uniform color under smoothly varying lighting.
I've used several cameras with OpenCV, most recently a Canon SLR (7D).
I don't think that your proposed solution will work. cvQueryFrame basically copies the next available frame from the camera driver's buffer (or advances a pointer in a memory mapped region, or blah according to your driver implementation).
In any case, the timing of the cvQueryFrame call has no effect on when the image was captured.
So as you suggested, hardware sync is really the only route, unless you have a special camera, like a point grey camera, which gives you explicit software control of the frame integration start trigger.
I know this has nothing to do with synchronizing but, have you tried extending the exposure time? Or doing so by intentionally "blending" two or more images into one?

recognize the moving objects and differentiate them from the background?

iam working in a project that i take a vedio by a camera and convert this vedio to frames (this part of project is done )
what iam facing now is how to detect moving object in these frames and differentiate them from the background so that i can distinguish between them ?
I recently read an awesome CodeProject article about this. It discusses several approaches to the problem and then walks you step by step through one of the solutions, with complete code. It's written at a very accessible level and should be enough to get you started.
One simple way to do this (if little noise is present, I recommend smoothing kernel thought) is to compute the absolute difference of two consecutive frames. You'll get an image of things that have "moved". The background needs to be pretty static in order to work. If you always get the abs diff from the current frame to the nth frame you'll have a grayscale image with the object that moved. The object has to be different from the background color or it will disappear...

Resources