I and my partner decided to implement a traffic light recognition program as a student project.
But we are absolute beginners with computer vision and have no idea how to start with this. (What only we know is to use OpenCV)
Should we firstly learn image recognition or just start with object tracking?
Our ideal production is to recognize traffic light in a video but not just an image.
In my opinion, you should take a serious course about computer vision before going deeper.
The video is just a sequence of picture. So you could use opencv to read each image then process them.
For you current project, a simple object detection using hog feature should be more than enough.
There's tutorial at http://www.hackevolve.com/create-your-own-object-detector/ . It's very easy to understand and source code is also available, so you can move quick.
Good luck.
Related
I would like to use computer vision to do the following:
A camera is mounted outside a building, capturing a videostream of the street below. The camera is installed approximately 5-6 meters above the street.
Whenever a person wearing a certain kind of hat(white, round) is captured by the camera, an event should be triggered.
Which algorithm should I look into to implement this kind of behavior ?
Is this best achieved through training the algorithm with sample data or is there another way to tell it to look for this type of hat ?
Also, how do I use multiple frames of video to increase the quality of detection ?
Edit: Added a picture of the hat
Before we do everything in comments I will start an answer here.
The first link you posted describes a simple color-based detection. You can try that, but it will fail if there are other pixel clusters of similar color in the image. Your idea of combining it with tracking is good: Identify clusters, build trajectories over several images, and only accept plausible trajectories as a hit. For robust tracking you may want to look into Kalman filtering. A problem you will most likely encounter is that a "white" hat will hardly be "white" in the images your camera delivers.
The second link you refer to - boosted Classifiers Based on Haar-like Features - is for detection of more complex objects. It probably won't help you find white blobs. Invest your time and energy in learning about tracking.
I'm happy to repeat myself here: "Solving a computer vision problem" is not something like "sorting an array". OpenCV is not the C++ Standard Library. You can use an std::map without knowing anything about a red-black tree. But (IMHO) you can't use Vision APIs without knowing a good deal of the math and theory. Working solutions Computer Vision are typically heavily tuned towards the specific problem scenario. Sorry if that sounds pedantic, but it explains why your question got beaten.
I went through the Kinect SDK and Toolkit provided by Microsoft. Tested the Face Detection Sample, it worked successfully. But, how to recognize the faces ? I know the basics of OpenCV (VS2010). Is there any Kinect Libraries for face recognition? if no, what are the possible solutions? Are there, any tutorials available for face recognition using Kinect?
I've been working on this myself. At first I just used the Kinect as a webcam and passed the data into a recognizer modeled after this code (which uses Emgu CV to do PCA):
http://www.codeproject.com/Articles/239849/Multiple-face-detection-and-recognition-in-real-ti
While that worked OK, I thought I could do better since the Kinect has such awesome face tracking. I ended up using the Kinect to find the face boundaries, crop it, and pass it into that library for recognition. I've cleaned up the code and put it out on github, hopefully it'll help someone else:
https://github.com/mrosack/Sacknet.KinectFacialRecognition
I've found project which could be a good source for you - http://code.google.com/p/i-recognize-you/ but unfortunetly(for you) its homepage is not in english. The most important parts:
-project(with source code) is at http://code.google.com/p/i-recognize-you/downloads/list
-in bibliography author mentioned this site - http://www.shervinemami.info/faceRecognition.html. This seems to be a good start point for you.
There are no built in functionality for the Kinect that will provide face recognition. I'm not aware of any tutorials out there that will do it, but someone I'm sure has tried. It is on my short list; hopefully time will allow soon.
I would try saving the face tracking information and doing a comparison with that for recognition. You would have a "setup" function that would ask the user the stare at the Kinect, and would save the points the face tracker returns to you. When you wish to recognize a face, the user would look at the screen and you would compare the face tracker points to a database of faces. This is roughly how the Xbox does it.
The big trick is confidence levels. Numbers will not come back exactly as they did previously, so you will need to include buffers of values for each feature -- the code would then come back with "I'm 93% sure this is Bob".
I want to build a video based tracking software. I can manage the control and display quite easily but the actual object tracking in a video stream is very difficult (color tracking is not an option).
Solutions like openCV would probably require a very long learning curve which I can't afford ATM.
Are there professional packages which expose a simple API for object tracking? C# and C++ are the preferred languages but other would be fine as well. Price is also less of an issue.
Computer Vision System Toolbox for MATLAB provides tracking functionality. Please check out the following examples:
Tracking a face
Tracking multiple objects
Generally, a lot depends on the specific problem you are trying to solve. Is the camera moving or stationary? Do you need to track a single object or multiple objects? Does your object have a distinctive color or texture? Does your object move in some predictable way?
Use OpenTLD. It tracks almost anything, but at a time, track only one thing. And code is in matlab.
I'm planning on doing my Final Year Project of my degree on Augmented Reality. It will be using markers and there will also be interaction between virtual objects. (sort of a simulation).
Do you recommend using libraries like ARToolkit, NyARToolkit, osgART for such project since they come with all the functions for tracking, detection, calibration etc? Will there be much work left from the programmers point of view?
What do you think if I use OpenCV and do the marker detection, recognition, calibration and other steps from scratch? Will that be too hard to handle?
I don't know how familiar you are with image or video processing, but writing a tracker from scratch will be very time-consuming if want it to return reliable results. The effort also depends on which kind of markers you plan to use. Artoolkit e.g. compares the marker's content detected from the video stream to images you earlier defined as markers. Hence it tries to match images and returns a value of probability that a certain part of the video stream is a predefined marker. Depending on the threshold you are going to use and the lighting situation, markers are not always recognized correctly. Then there are other markers like datamatrix, qrcode, framemarkers (used by QCAR) that encode an id optically. So there is no image matching required, all necessary data can be retrieved from the video stream. Then there are more complex approaches like natural feature tracking, where you can use predefined images, given that they offer enough contrast and points of interest so they can be recognized later by the tracker.
So if you are more interested in the actual application or interaction than in understanding how trackers work, you should base your work on an existing library.
I suggest you to use OpenCV, you will find high quality algorithms and it is fast. They are continuously developing new methods so soon it will be possible to run it real-time in mobiles.
You can start with this tutorial here.
Mastering OpenCV with Practical Computer Vision Projects
I did the exact same thing and found Chapter 2 of this book immensely helpful. They provide source code for the marker tracking project and I've written a framemarker generator tool. There is still quite a lot to figure out in terms of OpenGL, camera calibration, projection matrices, markers and extending it, but it is a great foundation for the marker tracking portion.
I want to detect one kind of object such as person in the picture,
who can tell me how to training a kind of people classifier for use,so we can use the classifier to detect people in any picture.
It's a pretty vague question but i think you're looking for a good computer vision library. The gold standard is OpenCV (Open Source Computer Vision). That'll get you started, and there's lots of people that have done facial recognition with it.
If you want to tell two people apart, that's a hugely more complicated problem. You'll likely use some of the same tools, but you'll need much more complicated algos.
you can take a look at viola-jones framework: Viola-Jones Object Detection Framework at Wikipedia
it's cvHaarDetectObjects() in OpenCV.