how to recognize the object I detect in video frames is a people or a car - image-processing

I have a problem to detect object in images or video frames.
I have a task that is detect some people or something who enter into the sight of web camera, and then my system will be alarm.
Next step is recognize which kind of thing the object is, in this phase I know use Hough transform to detect line, circle, even rectangle. But when a people come into the sight of camera, people's profile is more complex than line, circle and rectangle. How can i recognize the object is people not a car.
I need help to know that.
thanks in advance

I suggest you look at the paper "Histograms of Oriented Gradients for Human Detection" by Dalal and Triggs. They used Histograms of Oriented Gradients to detect humans in images

I think one method is to use Bayesian analysis on your image and see how that matches with a database of known images. I believe some people run a wavelet transform to emphasize more noticeable features.

Related

Using the coca cola logo as the calibration pattern for camera

I am looking for camera calibration techniques with OpenCV and saw the chessboard and circles methods, but I wanted to calibrate the camera with something that is in the real world and you don't have to print (printers are also not very accurate in what they print).
Is it possible to do calibration with complex shapes like the Coca Cola logo on the cans? Is it a problem that the surface is curved?
Thanks
Depending on what you want to achieve this is not at all necessarily a bad idea, and you are not the first one who had it. There was a technology that uses a CD, which is a strongly standardised object which at least used to exist on most households, for a simple camera calibration task. (There is little technical to be found online about this, as the technology was proprietary. This is business document, where the use of the CD is mentioned. Algorithmically, however, it is not difficult if you know camera calibration.)
The question is whether the precision you get is sufficient for your application. Don't expect any miracles here. Generally you can use almost any object you like to learn something about a camera, as long as you can detect it reliably and you know its geometry. Almost certainly you will have to take several pictures of the object. Curved surfaces are no problem per see. I regularly used a cylinder (larger than a beverage can, though, with a simple to detect pattern) to calibrate a complete camera rig of 12 SLRs.
Don't expect to find out of the box solutions and don't expect implementation to be trivial. You will have to work your way through the math. I recommend the book by Hartley and Zisserman, Multiple View Geometry for Computer vision. This paper describes an analysis-by-synthesis approach to calibration, which is the way to go for here (it does not describe exactly what you want, but the approach should generalise to arbitrary objects as long as you can detect them).
i can understand your wish, but it's a bad idea.
the calibration algorithm works by comparing real world points from the cam with a synthetical model ( yes, you have to supply that , too! ). so, while it's easy to calculate a 2d chessboard grid on the fly and use that, it will be very hard to do for your tin can, or any arbitrary household item you grab.
just give in, and print a rectangular chessbord grid to a piece of paper
(opencv comes with a pdf for that already).
don't use a real-life chessboard, a quadratic one is ambiguous to 90° rotation.
interesting idea.
What about displaying a checkerboard pattern (or sth else) on an lcd screen display and use that display as calibration pattern?? You would have to know the displaying size of the pattern though.
Googling I found this paper:
CAMERA CALIBRATION BASED ON LIQUID CRYSTAL DISPLAY (LCD)
ZHAN Zongqian
http://www.isprs.org/proceedings/XXXVII/congress/3b_pdf/04.pdf
comment: this doesn't answer the question about the coca-cola can but gives and idea for a solution to the grounding problem: camera calibration with a common object.

Image Processing - Determine if someone looking into camera

with image processing libraries like opencv you can determine if there are faces recognized in an image or even check if those faces have a smile on it.
Would it be possible to somehow determine, if the person is looking directly into the camera? As it is hard even for the human eye to determine is someone is looking into the camera or to a close point, i think that this will be very tricky.
Can someone agree?
thanks
You can try using an eye detection program, I remember doing back a few years ago, and it wasn't that strong, so when we tilt our head slightly off the camera, or close our eyes, the eyes can't be detected.
Is it is not clear, what I really meant was our face must be facing straight at the camera with our eyes open before it can detect our eyes. You can try doing something similar with a bit of tweaks here and there.
Off the top of my head, split the image to different sections, for each ROI, there are different eye classifiers, for example, upper half of the image, u can train the a specific classifiers of how eyes look like when they look downwards, lower half of image, train classifiers of how eyes look like when they look upwards. and for the whole image, apply the normal eye detection in case the user move their head along while looking at the camera.
But of course, this will be based on extremely strong classifiers and ultra clear quality images, video due to when the eye is looking at. Making detection time, extremely slow even if my method is successful.
There maybe other ideas available too that u can explore. It's slightly tricky, but it not totally impossible. If openCV can't satisfy, openGL? so many libraries, etc available. I wish you best of luck!

Histogram comparision

Is it possible to compare two intensity histograms (derived from gray-scale images) and obtain a likeness factor? In other words, I'm trying to detect the presence or absence of an soccer ball in an image. I've tried feature detection algorithms (such as SIFT/SURF) but they are not reliable enough for my application. I need something very reliable and robust.
Many thanks for your thoughts everyone.
This answer (Comparing two histograms) might help you. Generally, intensity comparisons are quite sensitive as e.g. white during day time is different from white during night time.
I think you should be able to derive something from compareHist() in openCV (http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_comparison/histogram_comparison.html) to suit your needs if compareHist() does fit your purpose.
If not, this paper http://www.researchgate.net/publication/222417618_Tracking_the_soccer_ball_using_multiple_fixed_cameras/file/32bfe512f7e5c13133.pdf
tracks the ball from multiple cameras and you might get some more ideas from that even though you might not be using multiple cameras.
As kkuilla have mentioned, there is an available method to compare histogram, such as compareHist() in opencv
But I am not certain if it's really applicable for your program. I think you will like to use HoughTransfrom to detect circles.
More details can be seen in this paper:
https://files.nyu.edu/jb4457/public/files/research/bristol/hough-report.pdf
Look for the part with coins for the circle detection in the paper. I did recall reading up somewhere before of how to do ball detection using Hough Transform too. Can't find it now. But it should be similar to your soccer ball.
This method should work. Hope this helps. Good luck(:

Object recognition methods in OpenCV

I am using some functions such as color contour tracking and image matching which are already available in OpenCV .. I am trying to identify a pink duck, more specifically the head of the duck, but these two functions don't give me the outcome I am expecting for some reasons such as :
the color thing don't always work perfect because the change in the lightning , which accordingly would change the color seen by the camera.
when I use the image matching thing, I use one image of the duck which I took from a specific position and it can identify the duck only when he is in that position, but I want to identify it even when I rotate the duck or play around with it.
Does anyone have an ideas about a better way to track a certain object ?
Thank you
Have you tried converting the image into the hsv colourspace? This colourspace tries to remove the effects of lighting so might be able to improve your colour-based segmentation.
To identify the head of the duck, once you have identified the duck as a whole you could perhaps identify the orientation (using template matching with a set of templates from different viewpoints, or haar cascades, or ...) and then use the known orientation and an empirical rule to determine where the head is located. For example, if you detect the duck in an upright position within a certain bounding box, the head is assumed to be located in the top third of that bounding box.
I think it might just take little more than what OpenCV provides straight forward way.
Given your specific question, you might just want to try shape descriptors of some sort.
Basically, try to take Duck's head's pictures shape from various angles and capture the shapes from it.
Now, you can find a likelihood model (forgive me for not a very accurate term) that can validate the hypothesis that a given captured shape indeed belongs to the class of Duck's head or not. Color can just be an additional feature that might help.
If you are a new person in this field - try catch hold of Duda and Hart: Pattern Classification. This doesn't have solution to find-the-duck-problem but will shape your thinking.

single person tracking from video sequence

As a part of my thesis work, I need to build a program for human tracking from video or image sequence like the KTH or IXMAS dataset with the assumptions:
Illumination remains unchanged
Only one person appear in the scene
The program need to perform in real-time
I have searched a lot but still can not find a good solution.
Please suggest me a good method or an existing program that is suitable.
Case 1 - If camera static
If the camera is static, it is really simple to track one person.
You can apply a method called background subtraction.
Here, for better results, you need a bare image from camera, with no persons in it. It is the background. ( It can also be done, even if you don't have this background image. But if you have it, better. I will tell at end what to do if no background image)
Now start capture from camera. Take first frame,convert both to grayscale, smooth both images to avoid noise.
Subtract background image from frame.
If the frame has no change wrt background image (ie no person), you get a black image ( Of course there will be some noise, we can remove it). If there is change, ie person walked into frame, you will get an image with person and background as black.
Now threshold the image for a suitable value.
Apply some erosion to remove small granular noise. Apply dilation after that.
Now find contours. Most probably there will be one contour,ie the person.
Find centroid or whatever you want for this person to track.
Now suppose you don't have a background image, you can find it using cvRunningAvg function. It finds running average of frames from your video which you use to track. But you can obviously understand, first method is better, if you get background image.
Here is the implementation of above method using cvRunningAvg.
Case 2 - Camera not static
Here background subtraction won't give good result, since you can't get a fixed background.
Then OpenCV come with a sample for people detection sample. Use it.
This is the file: peopledetect.cpp
I also recommend you to visit this SOF which deals with almost same problem: How can I detect and track people using OpenCV?
One possible solution is to use feature points tracking algorithm.
Look at this book:
Laganiere Robert - OpenCV 2 Computer Vision Application Programming Cookbook - 2011
p. 266
Full algorithm is already implemented in this book, using opencv.
The above method : a simple frame differencing followed by dilation and erosion would work, in case of a simple clean scene with just the motion of the person walking with absolutely no other motion or illumination changes. Also you are doing a detection every frame, as opposed to tracking. In this specific scenario, it might not be much more difficult to track either. Movement direction and speed : you can just run Lucas Kanade on the difference images.
At the core of it, what you need is a person detector followed by a tracker. Tracker can be either point based (Lucas Kanade or Horn and Schunck) or using Kalman filter or any of those kind of tracking for bounding boxes or blobs.
A lot of vision problems are ill-posed, some some amount of structure/constraints, helps to solve it considerably faster. Few questions to ask would be these :
Is the camera moving : No quite easy, Yes : Much harder, exactly what works depends on other conditions.
Is the scene constant except for the person
Is the person front facing / side-facing most of the time : Detect using viola jones or train one (adaboost with Haar or similar features) for side-facing face.
How spatially accurate do you need it to be, will a bounding box do it, or do you need a contour : Bounding box : just search (intelligently :)) in neighbourhood with SAD (sum of Abs Differences). Contour : Tougher, use contour based trackers.
Do you need the "tracklet" or only just the position of the person at each frame, temporal accuracy ?
What resolution are we speaking about here, since you need real time :
Is the scene sparse like the sequences or would it be cluttered ?
Is there any other motion in the sequence ?
Offline or Online ?
If you develop in .NET you can use the Aforge.NET framework.
http://www.aforgenet.com/
I was a regular visitor of the forums and I seem to remember there are plenty of people using it for tracking people.
I've also used the framework for other non-related purposes and can say I highly recommend it for its ease of use and powerful features.

Resources