I would like to know more about different ways of detecting smile on image. As far as I know, there are many libraries that are allowing to detect face and smile. The ones that I've tried are:
FaceSDK from Luxand
OpenCV
OpenIMAJ
Instead of just using them, I'm curious about how they are working. I know that OpenCV and OpenIMAJ are working based on Haar classifiers. I don't really follow, how FaceSDK is doing face and smile detection though.
I can imagine, that you can get two different ways of smile detection:
Perform fully emotional detection. You can find eyes, nose, mouth and other features on face and then compute emotion based on those informations. If you get "happy" emotion, you can assume, that there is smile (or something like that, just finding mouth and checking curve on lower lip?).
Similiar to Haar cascades, search image and try to find object that is similiar to the one that you are searching (having many negative and positive samples). This one seems to be faster, but less trustworthy if not used with some "helpers".
Is there any other way? Do you guys have some articles on one of those ways?
Related
So I want to count how many people appear in a facebook profile picture.
Typically there are 0-2 people (sometimes there are 4-5+ but that's more rare).
A sample dataset (and a few tries using python) can be found here:
https://github.com/yoniker/FaceDetect
I've tried different methods, none of them give reasonable results (all of those methods are wrong most of the time),I've tried the following:
-Face detection- http://docs.opencv.org/trunk/d7/d8b/tutorial_py_face_detection.html
It usually doesn't find anyone (that happens at around 75% of the pictures)- and I have tried different Haar filters and parameters.
-Pedestrian Detection http://www.pyimagesearch.com/2015/11/09/pedestrian-detection-opencv/
Again it doesn't find people most of the time.
OpenFace:Probably this face recognition algo doesn't truly help with face detection (see https://groups.google.com/forum/#!topic/cmu-openface/X6erXKckk0Q).
And finally I've looked at different StackOverflow questions such as
Count the number of people in the video but none of them are relevant!
I've tried for half a day now- so help will be super appreciated!!
For me, dlib has given better results than using OpenCV's haar face detector. It has python bindings too. You can find quick-start code to do face detection here.
It would be possible to help better if you post an Image in which faces are not detected properly.
Having said that, to improve face detection apart from using dlib, you can experiment with these ideas:
Use histogram equalisation(equalizeHist on opencv) on gray scale image before passing it to face detector. (i.e preprocess your images)
If faces are tilted to left or right, more often face detection fails. To solve this rotate the images in steps of 5 degrees upto 30 degrees and apply face detection. At each rotation you might detect new faces.
Most face detectors which are not using deep learning detect mostly frontal faces. Not much could be done about this apart from using deep learning or train your own side profile face detector using HOG or HAAR features.
Hope this helps you to improve your face detection.
There's always the cascade classifier in OpenCV for all your face detection needs. If you could feed it with some nice features it would give you all the results.
I am trying to detect and track hand in real time using opencv. I thought haar cascade classifiers would yield a fair result. After training with 10k and 20k positive and negative images respectively, I obtained a classifier xml file. Unfortunately, it detects hand only in certain positions, proving that it works best only for rigid objects. So I am now thinking of adopting another algorithm that can track hand, once detected through haar classifier.
My question is,if I make sure that haar classifier detects hand in a certain frame, certain position, what method would yield robust tracking of hand further?
I searched web a bit, and have understood I can go for optical flow of the detected hand , or kalman filter or particle filter, but also have come across their own disadvantages.
also, If I incorporate stereo vision, would it help me, as I can possibly reconstruct hand in 3d.
You concluded rightly about Haar features - they aren't that useful when it comes to non-rigid objects.
Take a look at the following papers which use skin colour to detect hands.
Interaction between hands and wearable cameras
Markerless inspection of augmented reality objects
and this paper that uses KLT features to track the hand after the first detection:
Fast 2D hand tracking with flocks of features and multi-cue integration
I would say that a stereo camera will not help your cause much, as 3D reconstruction of non-rigid objects isn't straightforward and would require a whole lot of innovation and development. However, you can take a look at the papers in the hand pose estimation section of this page if you wish to pursue 3D tracking.
EDIT: Also take a look at this recent paper, which seems to get good results.
Zhang et al.'s Real-time Compressive Tracking does a reasonable job of tracking an object, once it has been detected by some other method, provided that the motion is not too fast. They have an OpenCV implementation (but it would need a bit of work to reuse).
This research paper describes a method to track hands, without using gloves by using a stereo camera setup.
there have been similar questions on stack overflow...
have a look at my answer and that of others: https://stackoverflow.com/a/17375647/1463143
you can for certain get better results by avoiding haar training and detection for deformable entities.
CamShift algorithm is generally fast and accurate, if you want to track the hand as a single entity. OpenCV documentation contains a good, easy-to-understand demo program that you can easily modify.
If you need to track fingers etc., however, further modeling will be needed.
Information:
I would like to use OpenCV's HOG detection to identify objects that can be seen in a variety of orientations. The only problem is, I can't seem to find a reasonable feature detector or classifier to detect this in a rotation and scale invaraint way (as is needed by objects such as forearms).
Prior Work:
Lets focus on forearms for this discussion. A forearm can have multiple orientations, the primary distinct features probably being its contour edges. It is possible to have images of forearms that are pointing in any direction in an image, thus the complexity. So far I have done some in depth research on using HOG descriptors to solve this problem, but I am finding that the variety of poses produced by forearms in my positives training set is producing very low detection scores in actual images. I suspect the issue is that the gradients produced by each positive image do not produce very consistent results when saved into the Histogram. I have reviewed many research papers on the topic trying to resolve or improvie this, including the original from Dalal & Triggs [Link]: http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf It also seems that the assumptions made for detecting whole humans do not necessary apply to detecting individual features (particularly the assumption that all humans are standing up seems to suggest HOG is not a good route for rotation invariant detection like that of forearms).
Note:
If possible, I would like to steer clear of any non-free solutions such as those pertaining to Sift, Surf, or Haar.
Question:
What is a good solution to detecting rotation and scale invariant objects in an image? Particularly for this example, what would be a good solution to detecting all orientations of forearms in an image?
I use hog to detect human heads and shoulders. To train particular part you have to give the location of it. If you use opencv, you can clip samples containing only the training part you want, and make sure all training samples share the same size. For example, I clip images to contain only head and shoulder and resize all them to 64x64. Other opensource codes may require you to pass the location as the input parameter, essentially the same.
Are you trying the Discriminatively trained deformable part model ?http://www.cs.berkeley.edu/~rbg/latent/
you may find answers there.
Can (or maybe should) I use facial recognition software to detect non-faces? For example, let's say I'm trying to find a can of soda in a picture of a bedroom. If I use the haartraining algorithms on many pictures of cans of soda, will the facial recognition feature then find the can of soda in a picture of a bedroom?
Yes you can. Those algorithms are based on extracting features from training sets. If you pass them faces the will be good at detecting features from faces. If you pass them cans they will be good at detecting cans.
A different idea is when you specialize an algorithm on a specific object by restricting the search, but haar are not the case.
I think you are referring to face-detection not face-recognition.
In general, face detection works on discriminating appearance or textures within the face area. They also implicitly assume a mostly planar object.
You can use the same classifiers to detect other mostly-flat objects which are discriminated by appearance, for example, facial features like eyes and mouth, book covers etc.
Soda cans are cylindrical, so you may have to use multiple overlapping rotated view of each can to get it to work.
I'm attempting to implement an easter egg in a mobile app I'm working on. These easter egg will be triggered when a logo is detected in the camera view. The logo I'm trying to detect is this one: .
I'm not quite sure what the best way to approach this is as I'm pretty new to computer vision. I'm currently finding horizontal edges using the Canny algorithm. I then find line segments using the probabilistic Hough transform. The output of this looks as follows (blue lines represent the line segments detected by the probabilistic Hough transform):
The next step I was going to take would be to look for a group of around 24 lines (fitting within a nearly square rectangle), each line would have to be approximately the same length. I'd use these two signals to indicate the potential presence of the logo. I realise that this is probably a very naive approach and would welcome suggestions as to how to better detect this logo in a more reliable manner?
Thanks
You may want to go with SIFT using Rob Hess' SIFT Library. It's using OpenCV and also pretty fast. I guess that easier than your current way of approaching the logo detection :)
Try also looking for SURF, which claims to be faster & robuster than SIFT. This Feature Detection tutorial will help you.
You may just want to use LogoGrab's technology. It's the best out there and offers all sorts of APIs (both mobile and HTTP). http://www.logograb.com/technologyteam/
I'm not quite sure if you would find such features in the logo to go with a SIFT/SURF approach. As an alternative you can try training a Haar-like feature classifier and use it for detecting the logo, just like opencv does for face detection.
You could also try the Tensorflow's object detection API here:
https://github.com/tensorflow/models/tree/master/research/object_detection
The good thing about this API is that it contains State-of-the-art models in Object Detection & Classification. These models that tensorflow provide are free to train and some of them promise quite astonishing results. I have already trained a model for the company I am working on, that does quite amazing job in LOGO detection from Images & Video Streams. You can check more about my work here:
https://github.com/kochlisGit/LogoLens