Short introduction: this is an augmented reality project.
Goal: render 3D hairstyle templates on a person's head.
I am using OpenCV to track the person's face, and then I have to track the user's cap (we assume the user wears a cap, and we can choose whatever landmark or marking we need on the cap to detect it). Once I have detected the landmark, I get its coordinates and send them to the 3D engine to place and update the 3D object.
To detect the cap's landmark(s) precisely, I first tested a few methods:
cvFindChessboardCorners ... very effective on a planar surface, but not on a cap (http://dsynflo.blogspot.com/2010/06/simplar-augmented-reality-for-opencv.html)
color detection (http://www.aishack.in/2010/07/tracking-colored-objects-in-opencv/) ... not very reliable: if the lighting changes, the colors change too ...
I came here today to think it through with you. Do I need a special landmark on the cap? (If yes, which one? If not, how should I proceed?)
Is it a good idea to combine color detection with shape detection?
.... Am I on the right track? ^^
I would appreciate any advice on using a cap to target the user's head, and on the different functions of the OpenCV library I should use.
Sorry if my English is not perfect.
Thanks a lot!
A quick approach, off the top of my head, is to combine both methods.
Color tracking using histograms and mean shift
Here is an alternative color detection method using histograms:
Robust Hand Detection via Computer Vision
The idea is this:
For a cap of known color, say bright green/blue (like the kind of colors you see for image matting screen), you can pre-compute a histogram using just the hue and saturation color channels. We deliberately exclude the lightness channel to make it more robust to lighting variations. Now, with the histogram, you can create a back projection map i.e. a mask with a probability value at each pixel in the image indicating the probability that the color there is the color of the cap.
Now, after obtaining the probability map, you can run the meanshift or camshift algorithms (available in OpenCV) on this probability map (NOT the image), with the initial window placed somewhere above the face you detected using OpenCV's algorithm. This window will eventually end up at the mode of the probability distribution i.e. the cap.
Details are in the link on Robust Hand Detection I gave above. For more details, consider getting the official OpenCV book or borrowing it from your local library; there is a very nice chapter on using meanshift and camshift for tracking objects. Alternatively, just search the web with queries along the lines of meanshift/camshift for object tracking.
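For concreteness, here is a minimal sketch of the idea in OpenCV's Python API. It assumes a BGR webcam image frame, a small patch cap_sample cropped from the cap, and a face rectangle (x, y, w, h) from the face detector; all of these names are illustrative.

import cv2

# 1. Hue-saturation histogram of the cap color (lightness excluded).
hsv_sample = cv2.cvtColor(cap_sample, cv2.COLOR_BGR2HSV)
hist = cv2.calcHist([hsv_sample], [0, 1], None, [30, 32], [0, 180, 0, 256])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

# 2. Back-project the histogram onto the frame: a per-pixel probability map.
hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
prob_map = cv2.calcBackProject([hsv_frame], [0, 1], hist, [0, 180, 0, 256], 1)

# 3. Run CamShift on the probability map (NOT the image), starting from a
#    window placed above the detected face.
track_window = (x, max(0, y - h // 2), w, h // 2)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
rot_rect, track_window = cv2.CamShift(prob_map, track_window, term_crit)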
Detect squares/circles to get orientation of head
If, in addition, you wish to further confirm this final location, you can add 4 small squares/circles on the front of the cap and use OpenCV's built-in algorithms to detect them only in this region of interest (ROI). It is somewhat like detecting the locator squares in QR codes. This step also gives you information on the orientation of the cap and hence the head, which might be useful when you render the hair. E.g., after locating 2 adjacent squares/circles, you can compute the angle between them and the horizontal/vertical line.
You can detect squares/corners using the standard corner detectors, etc., in OpenCV.
For circles, you can try using the HoughCircle algorithm: http://docs.opencv.org/modules/imgproc/doc/feature_detection.html#houghcircles
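A hedged sketch of the circle-marker detection on a grayscale ROI (roi_gray is an assumed crop above the detected face; the parameter values below are illustrative and would need tuning):

import cv2
import numpy as np

roi_blur = cv2.medianBlur(roi_gray, 5)  # HoughCircles is sensitive to noise
circles = cv2.HoughCircles(roi_blur, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                           param1=100, param2=30, minRadius=3, maxRadius=15)
if circles is not None:
    circles = np.round(circles[0]).astype(int)  # rows of (cx, cy, radius)
    if len(circles) >= 2:
        # The angle between two adjacent markers and the horizontal gives
        # the roll of the cap (and hence the head).
        (x1, y1, _), (x2, y2, _) = circles[:2]
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))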
Speeding this up
Make extensive use of Region of Interests (ROIs)
To speed things up, you should, as often as possible, run your algorithms on small regions of the image (ROIs), including the probability map. You can extract an ROI from an OpenCV image; ROIs are themselves images, and you can run OpenCV's algorithms on them the same way you would on whole images. For example, you could compute the probability map only for an ROI around the detected face. Similarly, the meanshift/camshift algorithm should only be run on this smaller map, and likewise for the additional step of detecting squares or circles. Details can be found in the OpenCV book as well as via a quick search online.
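In the Python API this is particularly cheap, since an ROI is just a NumPy slice (a view, not a copy) and every OpenCV function accepts it like a full image. A minimal sketch, with (x, y, w, h) assumed to come from the face detector:

import cv2

roi = frame[max(0, y - h):y, x:x + w]           # region just above the face
roi_hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)  # runs on the small ROI only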
Compile OpenCV with TBB and CUDA
A number of OpenCV's algorithms can achieve significant speed-ups WITHOUT any additional work from the programmer, simply by compiling your OpenCV library with TBB (Threading Building Blocks) and CUDA support switched on. In particular, the face detection algorithm in OpenCV (Viola-Jones) will run a couple of times faster.
You can switch on these options only after you have installed the packages for TBB and CUDA.
TBB: http://threadingbuildingblocks.org/download
CUDA: https://developer.nvidia.com/cuda-downloads
And then compile OpenCV from source: http://docs.opencv.org/doc/tutorials/introduction/windows_install/windows_install.html#windows-installation
Lastly, I'm not sure whether you are using the "C version" of OpenCV. Unless strictly necessary (for compatibility issues etc.), I recommend using the C++ interface of OpenCV simply because it is more convenient (from my personal experience at least). Now let me state upfront that I don't intend this statement to start a flame war on the merits of C vs C++.
Hope this helps.
Related
I am working on a hand detection project. There are many good projects on the web that do this, but what I need is a specific hand pose detection: it requires a fully open palm, with the whole palm facing outwards, like in the image below:
The first hand faces inwards, so it should not be detected, while the right one faces outwards and should be detected. I can already detect hands with OpenCV, but how can I tell the hand's orientation?
Distinguishing the palm from the back of the hand is a texture classification task, a classic pattern recognition problem. I suggest you try one of the following methods:
Gabor filters: these are good for capturing orientation and pixel intensities (the back of the hand has different texture features). OpenCV has a getGaborKernel function; its most important parameters are theta (orientation) and lambd (frequency). To keep it simple, you can apply this process to a cropped zone of the palm (since you have already detected the hand, it is easy to crop, for example, the thumb, or a rectangular zone around the center of gravity, etc.). Then you can convolve it against a small database of images of the same zone to get a matching score, or use an SVM classifier, which you train on a set of images by constructing the training matrix needed for the SVM (check this question and this paper).
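A minimal sketch of a Gabor filter bank in OpenCV's Python API, assuming palm_gray is a grayscale crop of the detected hand (all parameter values are illustrative and would need tuning):

import cv2
import numpy as np

responses = []
for theta in np.arange(0, np.pi, np.pi / 8):    # 8 orientations
    kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                lambd=10.0, gamma=0.5, psi=0)
    filtered = cv2.filter2D(palm_gray, cv2.CV_32F, kernel)
    responses.append(filtered.mean())           # one crude feature per angle
feature_vector = np.array(responses)            # e.g. feed this to an SVM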
Local Binary Patterns (LBP): an important feature descriptor used for texture matching. You can apply it to the whole palm image, or to a cropped zone or finger. It is easy to use with OpenCV, and many tutorials with code are available for this method. I recommend reading this paper on Invariant Texture Classification with Local Binary Patterns. Here is a good tutorial.
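As a sketch, one way to compute an LBP histogram descriptor is with scikit-image (one possible library choice, not the only one); palm_gray is an assumed grayscale palm crop:

import numpy as np
from skimage.feature import local_binary_pattern

P, R = 8, 1                                   # 8 neighbours, radius 1
lbp = local_binary_pattern(palm_gray, P, R, method="uniform")
n_bins = P + 2                                # uniform patterns + "other" bin
hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
# hist is the texture descriptor; compare two of them (e.g. chi-squared
# distance) or feed them to a classifier such as an SVM.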
Haralick Texture: I've read that it works very well when a set of features quantifies the entire image (global feature descriptors). It is not implemented in OpenCV but is easy to implement yourself; check this useful tutorial.
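For illustration, here is one way to compute Haralick-style features from a grey-level co-occurrence matrix using scikit-image (spelled greycomatrix in older versions); palm_gray is an assumed 8-bit grayscale image:

import numpy as np
from skimage.feature import graycomatrix, graycoprops

glcm = graycomatrix(palm_gray, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)
features = np.hstack([graycoprops(glcm, prop).ravel()
                      for prop in ("contrast", "homogeneity",
                                   "energy", "correlation")])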
Training Models: I've already suggested an SVM classifier, coupled with one of the descriptors above; that can work very well.
OpenCV also has an interesting FaceRecognizer class for face recognition. It could be an interesting idea to use it with palm images in place of face images (resize and rotate them to get a consistent palm pose). This class offers three models; one of them is Local Binary Patterns Histograms, which is recommended for texture recognition, but why not also try the other two (Eigenfaces and Fisherfaces)? Check this tutorial.
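A minimal sketch of that idea (requires the opencv-contrib package; train_images, labels and test_palm_gray are assumed inputs: same-size grayscale palm crops, integer class labels, and a query image):

import cv2
import numpy as np

recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.train(train_images, np.array(labels))        # images + class ids
label, confidence = recognizer.predict(test_palm_gray)  # lower = closer match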
Well, if you go the MacGyver way, you can notice that the left hand has bones sticking out in a certain direction, while the right one shows all the finger lines and a few lines in the palm.
These lines are always roughly the same, so you could try to detect them with OpenCV's edge detection or Hough lines; due to their dark color, you might even be able to threshold them out directly. Then gather information from those lines (angles, regressions...), see which features you can collect, and train a simple decision tree.
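A rough sketch of that idea, assuming palm_gray is a grayscale hand crop (thresholds are illustrative):

import cv2
import numpy as np

edges = cv2.Canny(palm_gray, 50, 150)
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=30,
                        minLineLength=20, maxLineGap=5)
angles = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angles.append(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
# Statistics of angles (count, mean, spread...) become the features
# for a simple decision tree.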
That was assuming you do not have much data. If you do, go the deep learning route: take a basic InceptionV3 model and retrain the last dense layer, either to classify between the two classes with a softmax, or to predict the probability of the palm facing outwards with a sigmoid. Check this link; TensorFlow has your back on the training of this one, with ready-to-run code.
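A hedged Keras sketch of that retraining setup (one way to do it, not a copy of the linked tutorial):

import tensorflow as tf

base = tf.keras.applications.InceptionV3(include_top=False,
                                         weights="imagenet",
                                         input_shape=(299, 299, 3),
                                         pooling="avg")
base.trainable = False                        # retrain only the new head
out = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
model = tf.keras.Model(base.input, out)       # P(palm faces outwards)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)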
Questions? Ask away
Take a look at what Leap Motion has done with the Oculus Rift. I'm not sure what they use internally to segment hand poses, but there is another paper whose method produces hand poses effectively. If you have a stereo camera setup, you can use this paper's methods: https://arxiv.org/pdf/1610.07214.pdf.
The only promising mono-camera solutions I've seen are trained on large datasets.
Use a Haar cascade classifier; you can get the classifier model file and then use it here.
Just search Google for 'Haar cascade palm detection' or use the code below.
import cv2

cam = cv2.VideoCapture(0)
# Load a palm Haar cascade (e.g. from the haar-cascade-files-master repo).
ccfr2 = cv2.CascadeClassifier('haar-cascade-files-master/palm.xml')

while True:
    retval, image = cam.read()
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    palms = ccfr2.detectMultiScale(grey, scaleFactor=1.05, minNeighbors=3)
    for (x, y, w, h) in palms:
        # Draw a white box around each detected palm.
        image = cv2.rectangle(image, (x, y), (x + w, y + h),
                              (255, 255, 255), 2)
    cv2.imshow("Window", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # quit on 'q'
        cv2.destroyAllWindows()
        break

cam.release()
Best of luck with the Haar cascade approach.
For a project of mine, I need to process image differences with OpenCV. The goal is to detect an intrusion in a zone.
To be a little more clear, here are the inputs and outputs:
Inputs:
A reference image
A second image taken from approximately the same point of view (some margin of error is possible)
Outputs:
Detection of new objects in the scene.
Bonus:
Recognition of those objects.
For me, the most difficult part is filtering out the insignificant differences (lighting, small errors in camera position, movement of trees...).
I have already read a lot about OpenCV image processing (subtraction, erosion, thresholding, SIFT, SURF...) and have had some good results.
What I would like is a list of the steps you think are best for achieving good detection (humans, cars...), and the algorithms to use at each step.
Many thanks for your help.
Track-by-Detect, human tracker:
Apply the HOG detector to detect humans (a sketch of the first two steps follows below).
Draw a corresponding rectangle as a foreground area on the foreground mask.
Pass this mask to "The OpenCV Video Surveillance / Blob Tracker Facility".
You can now group the passing humans into public/restricted areas based on their blob {x, y} values.
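The promised sketch of the first two steps, with OpenCV's built-in HOG pedestrian detector (frame is an assumed BGR image):

import cv2
import numpy as np

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                      padding=(8, 8), scale=1.05)
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
for (x, y, w, h) in rects:
    cv2.rectangle(mask, (x, y), (x + w, y + h), 255, -1)  # filled rectangle
# mask can now be handed to the blob tracker.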
I had to deal with this problem last year.
I suggest an adaptive background-foreground estimation algorithm which produces a foreground mask.
On top of that, you add a blob detector and tracker, and then calculate whether an intersection takes place between the blobs and your intrusion area.
OpenCV has samples of all of these within its legacy code. Of course, you can also use your own or other versions of these.
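The legacy samples use older APIs; with current OpenCV, a hedged sketch of the same idea looks like this (frame is the current BGR image and intrusion_zone an assumed (x, y, w, h) rectangle):

import cv2

# Create the subtractor once, then call apply() on every frame.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)
fg_mask = subtractor.apply(frame)
fg_mask = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, None)  # remove specks
contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)    # OpenCV 4.x API
zx, zy, zw, zh = intrusion_zone
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    # Axis-aligned rectangle intersection test against the zone:
    if x < zx + zw and zx < x + w and y < zy + zh and zy < y + h:
        print("intrusion detected")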
Links:
http://opencv.willowgarage.com/wiki/VideoSurveillance
http://experienceopencv.blogspot.gr/2011/03/blob-tracking-video-surveillance-demo.html
I would definitely start with a running average background subtraction if the camera is static. Then you can use findContours() to find the intruding object's location and size. If you want to detect humans that are walking around in a scene, I would recommend looking at using the built-in haar classifier:
http://docs.opencv.org/doc/tutorials/objdetect/cascade_classifier/cascade_classifier.html#cascade-classifier
where you would just replace the xml with the upperbody classifier.
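A minimal sketch of the running-average approach, assuming frame is the current BGR frame and avg a float32 accumulator initialised from the first frame (avg = np.float32(first_frame)):

import cv2
import numpy as np

cv2.accumulateWeighted(frame, avg, 0.05)      # slowly updated background
background = cv2.convertScaleAbs(avg)
diff = cv2.absdiff(frame, background)
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    if cv2.contourArea(c) > 500:              # ignore tiny changes
        x, y, w, h = cv2.boundingRect(c)      # intruder location and size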
I've built an imaging system with a webcam and feature matching such that, as I move the camera around, I can track the camera's motion. I am doing something similar to here, except with the webcam frames as the input.
It works really well for "good" images, but when taking images in very low light a lot of noise appears (high camera gain), and that interferes with the feature detection and matching. Basically, it doesn't detect any good features, and when it does, it cannot match them correctly between frames.
Does anyone know a good solution for this? What other methods are used for finding and matching features?
Here are two example images with very low features:
I think phase correlation is going to be your best bet here. It is designed to tell you the phase shift (i.e., translation) between two images. It is much more resilient (but not immune) to noise than feature detection because it operates in frequency space, whereas feature detectors operate spatially. Another benefit is that it is very fast compared with feature detection methods. I have a sub-pixel accurate implementation available in the OpenCV trunk, located here.
However, your images are pretty much "featureless" with the exception of the crease in the middle, so even phase correlation may have some trouble with it. Think of it like trying to detect translation in a snow storm. If all you can see is white, you can't tell that you have translated at all, thus the term whiteout. In your case, the algorithm might suffer from "greenout" :)
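In current OpenCV this is available as cv2.phaseCorrelate; a minimal sketch, assuming prev_frame and curr_frame are consecutive BGR frames:

import cv2
import numpy as np

prev_f = np.float32(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY))
curr_f = np.float32(cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY))
window = cv2.createHanningWindow(prev_f.shape[::-1], cv2.CV_32F)  # (w, h)
(dx, dy), response = cv2.phaseCorrelate(prev_f, curr_f, window)
# (dx, dy) is the sub-pixel translation between the frames; a response
# near 0 means the estimate is unreliable (the "greenout" case).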
Can you adjust the camera settings to work better in low-light conditions? Have you fully opened the iris? Can you live with lower framerates? Setting a longer exposure time will allow the camera to gather more light, giving you more features at the cost of added motion blur. Or, if low light is your default environment, you probably want something designed for it, like an IR camera, but those can be expensive. Other than that, a big lens and long exposures are your friends :)
Histogram equalization may be of interest in improving the image contrast. But, sometimes it can just enhance the noise. OpenCV has a global histogram equalization function called equalizeHist. For a more localized implementation, you'll want to look at Contrast Limited Adaptive Histogram Equalization or CLAHE for short. Here is a good article on it. This page has some nice examples, and some code.
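A short sketch of both options on an assumed grayscale image gray:

import cv2

global_eq = cv2.equalizeHist(gray)                   # global equalization
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
local_eq = clahe.apply(gray)                         # localized (CLAHE)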
Does OpenCV have an implementation of shape context matching? I've found only the matchShapes() function, which does not work for me. I want to get a set of corresponding features out of shape context matching. Is it a good idea to use it to find the rotation and displacement of a detected contour across two different images?
Some example code would also be very helpful.
I want to detect, for example, a pink square, and in the second case a pen. Other examples could be squares with holes, stars, etc.
The basic steps of Image Processing is
Image Acquisition > Preprocessing > Segmentation > Representation > Recognition
And what you are asking for seems to lie within the representation part of this general pipeline. You want features that describe the objects you are interested in, right? Before sharing what I've done for simple hand-gesture recognition, I would like you to consider what you actually need; often, simplicity makes things much easier. Consider a fixed color for your objects, and consider background subtraction (these two mainly tie into preprocessing and segmentation). As for representation: which features are you interested in, and can you eliminate the need for some of them?
My project group and I took a simple approach to preprocessing and segmentation, choosing a green glove for our hand. Here's an example of the glove, the camera, and the detection on screen:
We used a threshold on convexity defects, tuned to find the defects between fingers, and we calculated the aspect ratio of a rotated rectangular bounding box to see how square our blob is. With only four different hand gestures chosen, we are able to distinguish them with just these two features.
The functions we used and the measurements are all available in the OpenCV documentation on structural analysis, and information on accessing values in vectors (which we used a lot) can be found in the documentation for C++ vectors.
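A rough sketch of the two features in OpenCV's Python API, for a contour cnt extracted from the thresholded glove mask (the depth threshold is illustrative):

import cv2
import numpy as np

hull_idx = cv2.convexHull(cnt, returnPoints=False)
defects = cv2.convexityDefects(cnt, hull_idx)
finger_gaps = 0
if defects is not None:
    for s, e, f, depth in defects[:, 0]:
        if depth / 256.0 > 20:        # depth is fixed-point (1/256 pixel)
            finger_gaps += 1          # deep defects = gaps between fingers

(_, _), (bw, bh), _ = cv2.minAreaRect(cnt)  # rotated bounding box
ratio = min(bw, bh) / max(bw, bh)           # 1.0 = perfectly square blob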
I hope you can use the train of thought put into this; if you want more specific info, I'll be happy to comment. Enjoy.
I'm new to OpenCV (I'm actually using the Emgu CV C# wrapper) and am attempting to do some object detection.
I'm attempting to determine if an object matches a predefined set of objects (that I will have to define). The background is well lit and does not move. My objects that I am starting with are bottles and cans.
My current approach is as follows (a rough code sketch comes after this list):
Do absDiff with a previously taken background image to separate the background.
Then dilate 4x to make the lighter areas (in labels) shrink.
Then I do a binary threshold to get a big blob, followed by finding contours in this image.
I then take the largest contour and draw it, which becomes my shape to either save to the accepted set or compare with the accepted set.
Currently I'm using cvMatchShapes, but the double return value seems to vary widely. I'm guessing it is because it doesn't take into account rotation.
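Roughly, in OpenCV's Python API (names illustrative; Emgu CV has equivalent calls):

import cv2

diff = cv2.absdiff(background_img, current_img)     # separate background
gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
dilated = cv2.dilate(gray, None, iterations=4)      # shrink label areas
_, binary = cv2.threshold(dilated, 40, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)        # shape to save/compare
score = cv2.matchShapes(largest, accepted_contour,
                        cv2.CONTOURS_MATCH_I1, 0)   # the varying double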
Is this approach a good one? It isn't working well for glass bottles since the edges are hard to find...
I've read about haar classifiers, but thinking that might be overkill for my task.
Maybe this link is useful too. You have the code and library for SIFT; you just need to compile it. Good luck.
http://blogs.oregonstate.edu/hess/sift-library-places-2nd-in-acm-mm-10-ossc/#more-176