Recognize moving objects and differentiate them from the background? - image-processing

I am working on a project where I take a video with a camera and convert the video to frames (this part of the project is done).
What I am facing now is how to detect the moving objects in these frames and differentiate them from the background, so that I can distinguish between them.

I recently read an awesome CodeProject article about this. It discusses several approaches to the problem and then walks you step by step through one of the solutions, with complete code. It's written at a very accessible level and should be enough to get you started.

One simple way to do this is to compute the absolute difference of two consecutive frames (if there's a little noise, I'd still recommend a smoothing kernel first). You'll get an image of things that have "moved". The background needs to be pretty static for this to work. If you always take the absolute difference between the current frame and the nth frame, you'll have a grayscale image showing the object that moved. The object has to differ from the background colour, or it will disappear...
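A minimal sketch of that frame-differencing idea in Python with OpenCV (the video filename, blur size, and threshold of 30 are placeholder choices, not values given above):

    import cv2

    cap = cv2.VideoCapture("input.avi")  # placeholder path to the recorded video
    ok, prev = cap.read()
    prev = cv2.GaussianBlur(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY), (5, 5), 0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (5, 5), 0)

        diff = cv2.absdiff(gray, prev)                  # what changed since the previous frame
        _, motion = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)

        cv2.imshow("motion", motion)
        if cv2.waitKey(30) & 0xFF == 27:                # Esc quits
            break
        prev = gray

    cap.release()
    cv2.destroyAllWindows()

Anything white in the motion mask changed between the two frames; a static background stays black, as described above.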

Related

Detect if any random object is added to the frame & keep track of it

I am really new to image processing. Currently I am using OpenCV to process my video stream.
I am trying to detect whether something was added to the frame and, if it was, whether there is a way to keep track of it. I have already tried YOLO, but my case is not limited to specific objects; any random object might come into the frame.
Secondly, I tried a background subtraction method, but I have some objects which keep moving.
Thirdly, I tried contours, but they are not accurate enough.
Please guide me. I have already invested a month in this task and have no clue what to do.

How to detect text in a photo

I am researching the best way to detect text in a photo using open-source libraries.
I think the standard way is as follows (note: steps 1 - 4 all use OpenCV):
1) Detect the outline of the document
2) Transform the document so it's flat and cropped, using said outline
3) Make the background of the document white, using a filter
4) Feed the resulting image to Tesseract
Is this the optimal process, or is there a better way, or better tools?
Also, what happens in the case where the photo doesn't have a document outline (it's possible that steps 1 & 2 are redundant)?
Is there any way to automatically detect the document orientation (i.e. portrait / landscape)?
I think your process is fine. I've used a similar process for an Android project.
I think that the only way you can discover whether a document is portrait or landscape is to reason about the lengths of the sides of the bounding box of your outline.
I don't think there's an automatic way to do this; maybe you can find the most external contour that can be approximated with a 4-segment polyline (all doable in OpenCV). To get this you'll have to work with the contour hierarchy and contour approximation (see cv2.approxPolyDP).
This is how I would go about automatic outline detection. As I said, the rest of your algorithm seems just fine to me.
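In case it helps, here is a rough Python sketch of that outline-detection idea together with the warp-and-OCR steps from the question, using OpenCV and pytesseract (the input filename, the 0.02 approximation epsilon, and the 800x1100 output size are assumptions; findContours is called with the OpenCV 4 return signature):

    import cv2
    import numpy as np
    import pytesseract

    img = cv2.imread("document.jpg")  # hypothetical input photo
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 75, 200)

    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key=cv2.contourArea, reverse=True)

    outline = None
    for c in contours:
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4:        # first large contour that looks like a quadrilateral
            outline = approx.reshape(4, 2).astype(np.float32)
            break

    if outline is not None:
        # Order the corners (tl, tr, br, bl), then warp to a flat, cropped page.
        s = outline.sum(axis=1)
        d = np.diff(outline, axis=1).ravel()
        src = np.float32([outline[np.argmin(s)], outline[np.argmin(d)],
                          outline[np.argmax(s)], outline[np.argmax(d)]])
        w, h = 800, 1100            # arbitrary; in practice derive from the outline's side lengths
        dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
        flat = cv2.warpPerspective(img, cv2.getPerspectiveTransform(src, dst), (w, h))
        print(pytesseract.image_to_string(flat))

Comparing the lengths of the outline's sides (or the warped page's width and height) is also how you would decide portrait versus landscape, as mentioned above.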
PS. I'll leave the GitHub link to my Android project. I don't know if it will be useful to you, but there I specify the outline by dragging some handles, then transform the image and feed it to Tesseract, using Java and OpenCV. Yes, it's a very bad idea to do that in the main thread of an Android app, and yes, the app is not finished. I just wanted to experiment with OCR, so I didn't care much about performance and usability; it was not intended for real use, just for studying.
Look up the uniform width transform.
What it does is detect edges which have more or less the same width with respect to their opposite edge, so it picks up things like drainpipes (which can be eliminated in a later pass) but also the majority of text. Whilst conceptually it's similar to a distance transform, the published method uses rather ad hoc normal projection methods and Canny edge detection.
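For intuition only, here is a much-simplified Python approximation of the idea using a distance transform rather than the published ray-casting method (the 0.5 uniformity ratio and the minimum component size are made-up tuning values):

    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical input
    # Binarise so strokes are white on black (Otsu; invert if your text is light on dark).
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Distance to the nearest background pixel is roughly half the local stroke width.
    dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)

    num, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    text_like = []
    for i in range(1, num):
        widths = dist[labels == i]
        if widths.size < 20:
            continue
        mean_w, std_w = widths.mean(), widths.std()
        # Text strokes have a fairly uniform width relative to their mean.
        if mean_w > 0 and std_w / mean_w < 0.5:
            text_like.append(stats[i])  # (x, y, w, h, area) of a candidate glyph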

TensorFlow video processing, changes detection

I'm a newbie in machine learning, and I have only basic knowledge of neural networks.
I have a pretty clear task:
1. A video stream shows a static picture (a white area with yellow squares); in different videos the squares are located in different places.
2. At some moment the content of the video changes and starts to show the white area without some of the yellow squares.
3. I need to create a mechanism which can determine and somehow indicate those changes.
I'm going to use the TensorFlow framework for this task. Could anybody push me in the right direction? Or I'd be very happy to see a list of steps to overcome the problem.
Thanks in advance.
If you know beforehand how the static picture looks, maybe some background subtraction would work? Basically you just subtract the static picture from every frame and check the content of the result. If the resulting picture is empty (zeros, or close to it up to some threshold), there is no change to detect. If the resulting picture contains a region that is non-zero (maybe above or below a certain manually tuned threshold), you have detected a change in that region.
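A minimal sketch of that suggestion, written with OpenCV rather than TensorFlow since it is plain image arithmetic (the reference filename, the threshold of 25, and the minimum area of 50 are assumed tuning values; frames are assumed to have the same resolution as the reference):

    import cv2

    reference = cv2.imread("static_reference.png", cv2.IMREAD_GRAYSCALE)  # the known static picture

    def changed_regions(frame_bgr, thresh=25, min_area=50):
        """Return bounding boxes of regions that differ from the reference picture."""
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray, reference)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

An empty list means nothing changed; any returned box marks a region where a square disappeared (or something else changed).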

OpenCV issue with image subtraction?

I am trying to subtract 2 images using the function cvAbsDiff(img1, img2, dest);
It works, but sometimes when I bring my hand in front of my head or body the hand is not clear and the background comes into the picture... the background image (head) overlays my foreground (hand).
It works correctly on plain surfaces, i.e. when the background is even, like a wall.
Please check out my image so that you can better understand my problem:
http://www.2shared.com/photo/hJghiq4b/bg_overlays_foreground.html
If you have any solution or hint, please help me.
There's nothing wrong with your code. Background subtraction is not a preferred way of doing motion or silhouette detection because it's not very robust. The problem arises because the background and the foreground are similar in colour in many regions, which on subtraction pushes the foreground to the back. You might try using:
- optical flow for motion detection (see the sketch below)
- if your task is just detecting a silhouette or a hand, training a HOG classifier for it
In case you do not want to try a new approach, you may try playing with the threshold value (in your case 30). When you subtract regions of similar colour, the difference is less than 30, and since you later threshold at 30 it just blacks out. You may also try HSV or some other colour space.
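In case it helps, a small sketch of the dense optical flow option using cv2.calcOpticalFlowFarneback (the camera index, the Farneback parameters, and the 1.0 magnitude cutoff are assumptions, not values from the question):

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)        # webcam
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        motion_mask = (mag > 1.0).astype(np.uint8) * 255   # pixels that actually moved
        cv2.imshow("motion", motion_mask)
        if cv2.waitKey(1) & 0xFF == 27:
            break
        prev_gray = gray

    cap.release()
    cv2.destroyAllWindows()

Unlike background subtraction, this responds to motion rather than colour difference, so a hand moving in front of a similarly coloured face still shows up.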
Putting in the relevant code would help, as would knowing what you're actually trying to achieve.
Which two images are you subtracting? I've subtracted subsequent images (i.e. images taken a fraction of a second apart), and that generally results in the edges of moving objects, for example the edges of a hand, not the entire silhouette of the hand. I'm guessing you're taking the difference between the current frame and a static startup frame. It's possible that parts aren't different enough (skin + skin).
I've got some computer problems tonight; I'll test it out tomorrow (please post at least the steps you actually carry out, though) and let you know.
I'm still not sure what your ultimate goal is, although I'm guessing you want to do some gesture recognition (since you have a vector called "fingers").
As Manpreet said, your biggest problem is robustness, and that comes from the subjects having similar colours.
I reproduced your image by having my face in the static comparison image, then moving it. If I started with only the background, it was already much more robust and in any case didn't display any "overlaying".
A quick fix is to make sure you have a clean, subject-free static image.
Otherwise, you'll want a dynamic comparison image; the simplest would be comparing frame_n with frame_n-1. This will generally give you just the moving edges, though, so if you want the entire silhouette you can either:
1) Use a different segmenting algorithm (what I recommend. Background subtraction is fast and you can use it to determine a much smaller ROI in which to search, and then use a different algorithm for more robust segmentation.)
2) Try to make a compromise between the static and dynamic comparison image, for example as an average of the past 10 frames or something like that (see the sketch after this answer). I don't know how well this works, but it would be quite simple to implement, worth a try :).
Also, try CV_THRESH_OTSU instead of 30 for your threshold value and see if you like that better.
Also, I noticed the output often flares (regions which haven't changed switch from black to white). Checking with the live stream, I'm quite certain it's because of the webcam autofocusing / adjusting white balance, etc. If you're getting that too, turning off the autofocus should help (which, by the way, isn't done through OpenCV but depends on the camera; possibly check this: How to programmatically disable the auto-focus of a webcam?).
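Here is a minimal sketch combining option 2 (a slowly updated average background via cv2.accumulateWeighted) with the Otsu threshold suggestion; the 0.1 learning rate and the camera index are assumed values:

    import cv2
    import numpy as np

    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    background = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Blend the current frame slowly into the background model
        # (roughly "an average of the past frames").
        cv2.accumulateWeighted(gray, background, 0.1)
        diff = cv2.absdiff(gray, cv2.convertScaleAbs(background))

        # Otsu picks the threshold automatically instead of hard-coding 30.
        _, mask = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

        cv2.imshow("foreground", mask)
        if cv2.waitKey(1) & 0xFF == 27:
            break

    cap.release()
    cv2.destroyAllWindows()

One caveat: Otsu always splits the diff into two classes, so on frames with no motion it can amplify noise; a minimum-difference guard may help with the flaring mentioned above.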

Extracting slides from video lectures using OpenCV

I would like to extract all the slides from a video lecture, using OpenCV. Here is an example of a lecture: http://www.youtube.com/watch?v=-hxOpz9c0bY.
What approaches would you recommend? So far, I've tried:
Comparing the change in grayscale intensity from frame to frame. This can have problems when an object in the foreground moves around. For example, in this lecture, there's a hand that moves around: http://www.youtube.com/watch?v=mNzu42FrlHo#t=07m00s.
Using SURF features and doing comparisons frame by frame. This approach seems kind of slow.
Does anyone have other ideas?
Most of this work is most likely already done by the video encoder. You just need to extract the key-frames and check how well compressed the frames between them are.
It should also be fairly easy to distinguish still images. You can save a lot of time by examining just the key-frames. Slides are likely to have high contrast, solid shapes, and a solid background. A lecture hall has blurry shapes and low contrast.
What you need is scene change detection. After that, you'll have to classify scenes as "lecture hall" or "presentation". As for the problem with hands - you could use background subtraction with an adaptive background (just make sure you mask the foreground... you don't want the foreground to become part of the background).
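A small sketch of such a scene-change detector using histogram correlation between consecutive frames (the lecture filename, the 64 bins, and the 0.9 similarity cutoff are assumptions); the flagged frames can then be classified as "lecture hall" or "presentation", e.g. by their contrast, as suggested above:

    import cv2

    cap = cv2.VideoCapture("lecture.mp4")   # hypothetical lecture video
    prev_hist = None
    candidate_frames = []
    idx = 0

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        cv2.normalize(hist, hist)

        if prev_hist is not None:
            # Low correlation between consecutive histograms suggests a scene change.
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < 0.9:
                candidate_frames.append(idx)
        prev_hist = hist
        idx += 1

    cap.release()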
You could try edge detection and look for a rectangular object - the slide (above a certain area threshold). You could further reduce false positives by looking for some text within the rectangle.
There are several reasons to extract slides/frames from a video presentation, especially in the case of education- or conference-related videos. It allows you to access the study notes without watching the whole video.
I have faced this issue several times, so I decided to create a solution for it myself using Python. I have made the code open source; you can easily set up this tool and run it in a few simple steps.
Refer to this for a YouTube video tutorial. Steps on how to use this tool:
1. Clone the video2pdfslides project
2. Set up your environment by running "pip install -r requirements.txt"
3. Copy your video path
4. Run "python video2pdfslides.py <video_path>"
5. Boom! The PDF slides will be available in the output folder. Make notes and enjoy!
