Hi, I am currently using OpenCV's implementations of HOG and Haar cascades to perform pedestrian detection and draw bounding boxes on a video feed.
However, I want to assign a unique id (number) to every pedestrian entering the video feed, with the id remaining the same until the pedestrian leaves it. Since frames are processed one after another without regard to the previous frame, I am not sure how to implement this in the simplest yet most effective way possible.
Do I really need to use a tracking algorithm like CamShift or a Kalman filter, which I have no knowledge of and could really use some help with? Or is there a simpler way to achieve what I want?
P/S: This video is what I want to achieve. In fact, I posted a similar question here before, but that one was more about the detection techniques, and this one is about the next step of assigning the unique identifier.
A simple solution:
Keep track of your objects in a vector.
When you process a new frame, for every detected object, search for the nearest object stored in your vector. If the distance between the stored object and the current object is below a certain threshold, it is the same object.
If no match is found, the object is new. At the end, delete all objects in your vector that are not associated with an object in the current frame.
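The matching step above can be sketched in Python; plain (x, y, w, h) tuples stand in for the detector's cv::Rect output, and the distance threshold is an assumption you would tune for your frame size:

```python
# Nearest-neighbour ID assignment between consecutive frames.
# Boxes are (x, y, w, h) tuples; in a real pipeline they would come from
# the HOG/Haar detector. max_dist is a made-up threshold to tune.

def centroid(box):
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def update_tracks(tracks, detections, next_id, max_dist=50.0):
    """tracks: dict id -> box from the previous frame.
    Returns (new_tracks, next_id); unmatched old tracks are dropped."""
    new_tracks = {}
    used = set()
    for det in detections:
        cx, cy = centroid(det)
        best_id, best_d = None, max_dist
        for tid, box in tracks.items():
            if tid in used:
                continue
            tx, ty = centroid(box)
            d = ((cx - tx) ** 2 + (cy - ty) ** 2) ** 0.5
            if d < best_d:
                best_id, best_d = tid, d
        if best_id is None:           # no old track close enough -> new person
            best_id = next_id
            next_id += 1
        used.add(best_id)
        new_tracks[best_id] = det
    return new_tracks, next_id

tracks, nid = update_tracks({}, [(10, 10, 20, 40)], 0)        # first frame: id 0
tracks, nid = update_tracks(tracks, [(14, 12, 20, 40)], nid)  # moved slightly: still id 0
print(sorted(tracks))  # [0]
```

A greedy nearest-match like this is the simplest option; it breaks down when people cross paths, which is where Kalman prediction starts to pay off.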
When you use detectMultiScale to get the matches, you get a std::vector<cv::Rect> containing all the detected pedestrians. While iterating through them for drawing, you can assign a number to each unique cv::Rect being detected (you may need a slightly deeper test here, to check for overlapping rectangles), which you can then draw, say, on top of the corresponding rectangle.
HTH
Related
I have been using the LK algorithm to detect corners and interest points for tracking.
However, I am stuck at the point where I need something like a rectangular box to follow the tracked object. All I have now is a lot of points showing my moving objects.
Are there any methods or suggestions for that? Also, any ideas on adding a counter to the window, so that objects moving in and out of the screen can be counted as well?
Thank you
There are lots of options! Within OpenCV, I'd suggest using CamShift as a starting point, since it is relatively easy to use. CamShift uses mean shift to iteratively search for an object in consecutive frames.
Note that you need to seed the tracker with some kind of input. You could have the user draw a rectangle around the object, or use a detector to get the initial input. If you want to track faces, for example, OpenCV has a cascade classifier and training data for a face detector included.
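For intuition, here is a minimal sketch of the mean-shift step that CamShift iterates, using a plain 2D weight grid in place of the back-projected colour histogram that cv2.CamShift works on (the grid values and window size below are made up for illustration):

```python
# Mean shift: repeatedly move the window to the centroid of the weights
# it currently covers, until it stops moving. Assumes the window stays
# inside the grid. The window is (x, y, w, h).

def mean_shift(weights, window, max_iter=20):
    x, y, w, h = window
    for _ in range(max_iter):
        m = mx = my = 0.0
        for j in range(y, y + h):
            for i in range(x, x + w):
                wt = weights[j][i]
                m += wt
                mx += wt * i
                my += wt * j
        if m == 0:
            break                      # no mass inside the window
        nx = int(round(mx / m - w / 2.0))
        ny = int(round(my / m - h / 2.0))
        if (nx, ny) == (x, y):
            break                      # converged
        x, y = nx, ny
    return (x, y, w, h)

# 20x20 grid with all the mass at (14, 15); start the window off-centre.
grid = [[0.0] * 20 for _ in range(20)]
grid[15][14] = 1.0
print(mean_shift(grid, (8, 8, 10, 10)))  # (9, 10, 10, 10): centred on the mass
```

CamShift adds a step that also adapts the window size and orientation each iteration, which is why it copes with the object changing scale.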
I have been given a task to create an application in which, given an image, I have to detect which object (out of a finite list of objects) is present in that image.
Only one object is present in an image, or no object at all.
The application should be able to identify the object if present (any of the listed objects).
It would also suffice if the application (program) could calculate the probability that a particular object (from the list) is present in the image.
Can anyone suggest how to approach this problem? OpenCV?
The actual task is to identify a logo (of some company like Coke, Pepsi, Dell, etc.) in the image, if any logo from the list (which is finite, say 100) is present.
How can I do this project? Please help!
There are many ways of doing this, but the one I like most is building a feature set for each object and then matching it in the image.
You can use SIFT to build the keypoint vector for each object. By applying SIFT to each picture you will get a set of descriptors per picture (one per object).
When you get the image you want to process, use FAST for detecting points, compute descriptors, and match them against each object's descriptor set (with a brute-force or FLANN matcher; cvMatchTemplate is for pixel templates, not descriptors). The set with the highest match score tells you which object you detected. If all scores are too low, then you probably don't have any of the objects in the image.
This is just one approach I like, but it is fairly state-of-the-art: precise and fast.
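The matching stage can be sketched in Python; the short lists below stand in for real 128-dimensional SIFT descriptors, and in practice you would use OpenCV's feature detectors and matchers rather than this toy nearest-neighbour loop:

```python
# Descriptor matching with Lowe's ratio test: keep a match only when the
# nearest object descriptor is clearly better than the second nearest.
# Descriptor values below are made up for illustration.

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def count_good_matches(query_desc, object_desc, ratio=0.75):
    good = 0
    for q in query_desc:
        ds = sorted(dist(q, o) for o in object_desc)
        if len(ds) >= 2 and ds[0] < ratio * ds[1]:
            good += 1
    return good

def recognize(query_desc, objects, min_matches=1):
    """objects: dict name -> descriptor set. Returns best name or None."""
    scores = {name: count_good_matches(query_desc, d)
              for name, d in objects.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_matches else None

objects = {
    "coke":  [[0.0, 1.0], [1.0, 0.0]],
    "pepsi": [[5.0, 5.0], [6.0, 6.0]],
}
print(recognize([[0.1, 1.0]], objects))  # coke
```

With 100 logos you would set min_matches well above 1 and declare "no object" when no logo clears it.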
I recommend googling and reading up on the subject before trying to implement anything.
You want to perform object recognition, or logo recognition. There are already SO questions about this.
Here is a starting point for OpenCV.
The whole search took me half a minute. Perhaps this is what you should start searching for.
I am using some functions already available in OpenCV, such as color contour tracking and image matching. I am trying to identify a pink duck, more specifically the head of the duck, but these two functions don't give me the outcome I am expecting, for reasons such as:
the color approach doesn't always work perfectly, because changes in lighting change the color seen by the camera.
for image matching, I use one image of the duck taken from a specific position, and it can identify the duck only when it is in that position; I want to identify it even when I rotate the duck or play around with it.
Does anyone have ideas about a better way to track a certain object?
Thank you
Have you tried converting the image into the HSV colourspace? This colourspace tries to separate out the effects of lighting, so it might improve your colour-based segmentation.
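As an illustration, here is a colour test using only the standard library's colorsys module; a real pipeline would convert the whole frame with OpenCV and threshold it, and the pink hue band below is a rough assumption you would tune on your own footage:

```python
# HSV-based colour test: hue stays roughly stable when brightness changes,
# which is the point of segmenting in HSV rather than RGB.

import colorsys

def is_pink(r, g, b):
    """r, g, b in 0..255. Pink sits around hue ~0.9 with decent saturation
    (these bounds are guesses to tune, not canonical values)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (0.83 <= h <= 0.97) and s >= 0.3 and v >= 0.4

bright_pink = (255, 105, 180)
dark_pink   = (180, 60, 120)   # the same pink under dimmer lighting
green       = (40, 200, 40)

print(is_pink(*bright_pink), is_pink(*dark_pink), is_pink(*green))
# True True False: both pinks pass despite the brightness change
```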
To identify the head of the duck, once you have identified the duck as a whole you could perhaps identify the orientation (using template matching with a set of templates from different viewpoints, or haar cascades, or ...) and then use the known orientation and an empirical rule to determine where the head is located. For example, if you detect the duck in an upright position within a certain bounding box, the head is assumed to be located in the top third of that bounding box.
I think it might take a little more than what OpenCV provides out of the box.
Given your specific question, you might want to try shape descriptors of some sort.
Basically, take pictures of the duck's head from various angles and capture its shape from each.
Now you can build a likelihood model (forgive me for not using a very accurate term) that can validate the hypothesis that a given captured shape indeed belongs to the class of duck heads. Color can be an additional feature that might help.
If you are new to this field, try to get hold of Duda and Hart: Pattern Classification. It doesn't contain a solution to the find-the-duck problem, but it will shape your thinking.
As a part of my thesis work, I need to build a program for human tracking from video or image sequences, like the KTH or IXMAS datasets, with the following assumptions:
Illumination remains unchanged
Only one person appear in the scene
The program need to perform in real-time
I have searched a lot but still cannot find a good solution.
Please suggest me a good method or an existing program that is suitable.
Case 1 - If camera static
If the camera is static, it is really simple to track one person.
You can apply a method called background subtraction.
Here, for better results, you need a bare image from the camera with no persons in it: the background. (It can also be done without this background image, but it is better if you have it; I will explain at the end what to do if you don't.)
Now start capturing from the camera. Take the first frame, convert both images to grayscale, and smooth both to reduce noise.
Subtract the background image from the frame.
If the frame has no change with respect to the background image (i.e. no person), you get a black image (of course there will be some noise, which we can remove). If there is a change, i.e. a person walked into the frame, you will get an image with the person, and the background black.
Now threshold the image at a suitable value.
Apply some erosion to remove small granular noise, then apply dilation.
Now find contours. Most probably there will be one contour: the person.
Find the centroid (or whatever point you want) of this person to track.
Now suppose you don't have a background image: you can compute one with the cvRunningAvg function, which finds a running average of the frames in the video you are tracking. But as you can imagine, the first method is better, if you can get a background image.
Here is an implementation of the above method using cvRunningAvg.
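The pipeline above can be sketched on toy 2D lists; a real implementation would use OpenCV's image functions (cv2.absdiff, cv2.threshold, and cv2.accumulateWeighted, the modern counterpart of cvRunningAvg) instead of these hand-rolled ones:

```python
# Background subtraction on tiny hand-made "images" to show the steps:
# subtract, threshold, and a running average for learning the background.

def absdiff(a, b):
    return [[abs(x - y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def threshold(img, t):
    return [[255 if px > t else 0 for px in row] for row in img]

def running_avg(avg, frame, alpha=0.05):
    """avg <- (1 - alpha) * avg + alpha * frame, per pixel."""
    return [[(1 - alpha) * a + alpha * f for a, f in zip(ra, rf)]
            for ra, rf in zip(avg, frame)]

background = [[10, 10, 10],
              [10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 10, 10],
              [10, 200, 10],          # a "person" walked into the middle
              [10, 10, 10]]

mask = threshold(absdiff(frame, background), 30)
print(mask)  # [[0, 0, 0], [0, 255, 0], [0, 0, 0]]: only the person survives
```

Erosion, dilation, and contour finding would then clean up and outline the white region in the mask.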
Case 2 - Camera not static
Here background subtraction won't give good result, since you can't get a fixed background.
OpenCV then comes with a sample for people detection. Use it.
This is the file: peopledetect.cpp
I also recommend visiting this SO question, which deals with almost the same problem: How can I detect and track people using OpenCV?
One possible solution is to use feature points tracking algorithm.
Look at this book:
Laganiere Robert - OpenCV 2 Computer Vision Application Programming Cookbook - 2011
p. 266
The full algorithm is implemented in this book, using OpenCV.
The above method, simple frame differencing followed by erosion and dilation, would work in the case of a simple, clean scene with just the person walking and absolutely no other motion or illumination changes. Note also that you are then doing detection every frame, as opposed to tracking. In this specific scenario, tracking might not be much more difficult either: for movement direction and speed, you can just run Lucas-Kanade on the difference images.
At the core of it, what you need is a person detector followed by a tracker. The tracker can be point-based (Lucas-Kanade or Horn-Schunck), or use a Kalman filter, or any similar kind of tracking for bounding boxes or blobs.
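As an illustration of the Kalman option, here is a minimal 1-D constant-velocity filter for smoothing one tracked coordinate; a real tracker would use cv2.KalmanFilter with a 4-state (x, y, vx, vy) model, and the noise values below are arbitrary:

```python
# 1-D Kalman filter with state (position, velocity), dt = 1 per frame.
# q is process noise, r is measurement noise; both are made-up values.

def kalman_1d(measurements, q=1e-3, r=1.0):
    x, v = float(measurements[0]), 0.0
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0   # state covariance
    out = []
    for z in measurements:
        # predict: position advances by velocity; covariance grows
        x += v
        p00 = p00 + p01 + p10 + p11 + q
        p01 += p11
        p10 += p11
        p11 += q
        # correct with the measured position z
        y = z - x                     # innovation
        k0 = p00 / (p00 + r)          # gain for position
        k1 = p10 / (p00 + r)          # gain for velocity
        x += k0 * y
        v += k1 * y
        op00, op01 = p00, p01
        p00 = (1 - k0) * op00
        p01 = (1 - k0) * op01
        p10 -= k1 * op00
        p11 -= k1 * op01
        out.append(x)
    return out

track = kalman_1d(list(range(10)))    # a point moving one pixel per frame
print(round(track[-1], 2))
```

The filter's predict step is what lets you keep a plausible position through a missed detection or a brief occlusion.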
A lot of vision problems are ill-posed, so some amount of structure/constraints helps to solve them considerably faster. A few questions to ask would be these:
Is the camera moving? No: quite easy. Yes: much harder; exactly what works depends on other conditions.
Is the scene constant except for the person?
Is the person front-facing / side-facing most of the time? Detect using Viola-Jones, or train a detector (AdaBoost with Haar or similar features) for side-facing faces.
How spatially accurate do you need it to be: will a bounding box do, or do you need a contour? Bounding box: just search (intelligently :)) in the neighbourhood with SAD (sum of absolute differences). Contour: tougher; use contour-based trackers.
Do you need the "tracklet" or only the position of the person at each frame? What temporal accuracy?
What resolution are we speaking about here, since you need real time?
Is the scene sparse like those sequences, or would it be cluttered?
Is there any other motion in the sequence?
Offline or online?
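The SAD neighbourhood search mentioned above can be sketched as follows, with a toy frame standing in for real image data:

```python
# Slide the previous frame's patch over a small search window in the new
# frame and keep the offset with the lowest sum of absolute differences.

def sad(a, b):
    """SAD between two equally sized patches (2D lists)."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def crop(img, x, y, w, h):
    return [row[x:x + w] for row in img[y:y + h]]

def best_offset(img, patch, x, y, radius=2):
    """Search +-radius pixels around (x, y) for the best-matching offset."""
    h, w = len(patch), len(patch[0])
    best, best_score = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + w > len(img[0]) or ny + h > len(img):
                continue                # candidate falls outside the frame
            score = sad(crop(img, nx, ny, w, h), patch)
            if score < best_score:
                best, best_score = (dx, dy), score
    return best

# 6x6 frame: a bright 2x2 blob that used to be at (1, 1) is now at (2, 3).
frame = [[0] * 6 for _ in range(6)]
for j in (3, 4):
    for i in (2, 3):
        frame[j][i] = 9
patch = [[9, 9], [9, 9]]                # the blob's previous appearance
print(best_offset(frame, patch, 1, 1))  # (1, 2): right 1 pixel, down 2
```

This is the "intelligent search" in its simplest form; bounding the radius by the person's plausible speed is what keeps it real-time.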
If you develop in .NET, you can use the AForge.NET framework.
http://www.aforgenet.com/
I was a regular visitor of its forums, and I seem to remember there are plenty of people using it for tracking people.
I've also used the framework for other non-related purposes and can say I highly recommend it for its ease of use and powerful features.
I'm looking for the fastest and most efficient method of detecting an object in a moving video. Things to note about this video: it is very grainy and low resolution, and both the background and foreground are moving simultaneously.
Note: I'm trying to detect a moving truck on a road in a moving video.
Methods I've tried:
Training a Haar cascade - I attempted to train the classifiers to identify the object by cropping multiple images of the desired object. This produced either many false detections or no detections at all (the desired object was never detected). I used about 100 positive images and 4000 negatives.
SIFT and SURF keypoints - When attempting either of these feature-based methods, I discovered that the object I wanted to detect was too low in resolution, so there were not enough features to match to make an accurate detection. (The desired object was never detected.)
Template matching - This is probably the best method I've tried. It is the most accurate, although the hackiest of them all. I can detect the object in one specific video using a template cropped from that video. However, there is no guaranteed accuracy, because all that is known is the best match in each frame; no analysis is done on how well the template actually matches the frame. Basically, it only works if the object is always in the video; otherwise it will create a false detection.
So those are the big three methods I've tried, and all have failed. What would work best is something like template matching but with scale and rotation invariance (which led me to try SIFT/SURF), but I have no idea how to modify the template matching function.
Does anyone have any suggestions how to best accomplish this task?
Apply optical flow to the image and then segment it based on the flow field. Background flow is very different from "object" flow (which mainly diverges or converges depending on whether the object is moving towards or away from you, with some lateral component as well).
Here's an older project which worked this way:
http://users.fmrib.ox.ac.uk/~steve/asset/index.html
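The segmentation idea can be sketched once you have a flow field (e.g. from cv2.calcOpticalFlowFarneback); here a tiny hand-made field stands in, and the deviation threshold is an arbitrary assumption:

```python
# Estimate the dominant background flow with a per-component median and
# flag pixels whose flow deviates from it by more than a threshold.

def median(vals):
    s = sorted(vals)
    n = len(s)
    return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

def segment_by_flow(flow, thresh=1.0):
    """flow: 2D grid of (dx, dy) vectors. Returns a 0/1 object mask."""
    bx = median([v[0] for row in flow for v in row])
    by = median([v[1] for row in flow for v in row])
    mask = []
    for row in flow:
        mask.append([1 if ((dx - bx) ** 2 + (dy - by) ** 2) ** 0.5 > thresh
                     else 0 for dx, dy in row])
    return mask

# Camera pans right (background flow ~(2, 0)); the truck moves differently.
flow = [[(2, 0), (2, 0), (2, 0)],
        [(2, 0), (5, 1), (2, 0)],
        [(2, 0), (2, 0), (2, 0)]]
print(segment_by_flow(flow))  # only the centre pixel is flagged
```

The median makes the background estimate robust as long as the object covers a minority of the frame; a panning camera then cancels out instead of triggering detections everywhere.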
This vehicle detection paper uses a Gabor filter bank for low-level detection and then uses the responses to create the feature space in which it trains an SVM classifier.
The technique seems to work well and is at least scale invariant. I am not sure about rotation though.
Not knowing your application, my initial impression is normalized cross-correlation, especially since I remember seeing a purely optical cross-correlator that had vehicle-tracking as the example application. (Tracking a vehicle as it passes using only optical components and an image of the side of the vehicle - I wish I could find the link.) This is similar (if not identical) to "template matching", which you say kind of works, but this won't work if the images are rotated, as you know.
However, there's a related method based on log-polar coordinates that will work regardless of rotation, scale, shear, and translation.
I imagine this would also make it possible to detect that the object has left the scene, since the maximum correlation will decrease.
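For reference, normalized cross-correlation between a template and an equally sized patch can be sketched as below; sliding it over the frame and taking the peak is essentially what cv2.matchTemplate with TM_CCOEFF_NORMED does, and the absolute peak value is what tells you how good the match is:

```python
# NCC of two equally sized patches: subtract each patch's mean, then
# compute the normalized dot product. Result is in [-1, 1].

def ncc(a, b):
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    ma = sum(xs) / len(xs)
    mb = sum(ys) / len(ys)
    num = sum((x - ma) * (y - mb) for x, y in zip(xs, ys))
    da = sum((x - ma) ** 2 for x in xs) ** 0.5
    db = sum((y - mb) ** 2 for y in ys) ** 0.5
    return num / (da * db) if da and db else 0.0

template = [[1, 2], [3, 4]]
brighter = [[11, 12], [13, 14]]   # same pattern under different lighting
flat     = [[5, 5], [5, 5]]

print(round(ncc(template, brighter), 3))  # 1.0: NCC ignores brightness offset
print(ncc(template, flat))                # 0.0: no structure to correlate
```

Because the score is normalized, you can put a fixed threshold on the per-frame peak and declare "truck gone" when it drops, which plain best-match template matching cannot do.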
How low resolution are we talking? Could you also elaborate on the object? Is it a specific color? Does it have a pattern? The answers affect what you should be using.
Also, I might be reading your template matching statement wrong, but it sounds like you are overfitting it (by testing on the same video you extracted the template from?).
A Haar cascade is going to require significant training data on your part, and will perform poorly under changes in orientation.
Your best bet might be to combine template matching with an algorithm similar to CamShift in OpenCV (5.7 MB PDF), along with a probabilistic model (you'll have to figure this one out) of whether the truck is still in the image.