I am trying to detect a hand using OpenCV and C++.
I am able to find the contour of the hand when a person's hand is present in the image (a positive image). Basically, I find the largest contour and treat it as the hand contour. The problem is that when no hand is present in the image, I still end up taking some contour and treating it as the hand.
So I started wondering whether I could use a Haar cascade to determine the rectangle around the hand and focus on that area. I searched online for a cascade XML file, but it does not seem to be readily available the way the face detection ones are.
So, given an image, how can I determine which contour in the set belongs to the hand?
You can find a trained cascade XML file for this on GitHub:
https://github.com/Aravindlivewire/Opencv/blob/master/haarcascade/aGest.xml
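One possible way to combine a cascade with the largest-contour idea from the question is sketched below. It assumes you already have the contour bounding boxes (for example from cv2.boundingRect) and the cascade detections (from detectMultiScale), both as (x, y, w, h) tuples. The function names and the 0.5 overlap threshold are made up for illustration. The point is that when the cascade finds nothing, no contour is accepted at all.

```python
def overlap_fraction(box, region):
    """Fraction of box's area that lies inside region; both are (x, y, w, h)."""
    bx, by, bw, bh = box
    rx, ry, rw, rh = region
    iw = min(bx + bw, rx + rw) - max(bx, rx)
    ih = min(by + bh, ry + rh) - max(by, ry)
    if iw <= 0 or ih <= 0 or bw <= 0 or bh <= 0:
        return 0.0
    return (iw * ih) / float(bw * bh)


def pick_hand_box(contour_boxes, hand_rects, min_overlap=0.5):
    """Return the largest contour box that mostly lies inside a detected
    hand rectangle, or None if no contour qualifies."""
    best, best_area = None, 0
    for box in contour_boxes:
        if any(overlap_fraction(box, r) >= min_overlap for r in hand_rects):
            area = box[2] * box[3]
            if area > best_area:
                best, best_area = box, area
    return best
```

With no detections, pick_hand_box returns None, so an image without a hand no longer yields a false "hand" contour.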
Related
I got an image of an individual with a beard:
Using a mask, I was able to extract the beard:
I want to move the beard on another person's face, such as this one:
I want to do this by getting the nose location of the first person, the nose location of the 2nd person and position the beard accordingly.
What are some ways to accomplish this goal? How can I do this by getting the facial landmarks? Are there any non-Deep Learning methods of doing this?
You could accomplish the goal by training a conv-net that takes as input images of the person's face, and returns the co-ordinates of the tip of the nose for positioning the person's beard. To train a conv-net, you will need data. You can potentially create your own data using an image annotation tool such as this, or find a relevant dataset online such as this one, to train your model. Here is a keypoint detection tutorial to get you started.
There may be non-deep learning methods of doing this. Maybe you could create SIFT descriptors for a person's nose, find the closest matching descriptors in the images you want to detect the tip of the nose in, and use their location to determine the position of the tip of the nose in your image of interest. However, I would recommend you go with the deep learning approach. Keypoint detection is a popular task in deep learning and you are likely to find many resources online for help.
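Once you have the two nose locations, from whichever landmark method you pick, the positioning step itself is just a translation. Here is a rough sketch with NumPy. The function name is made up, points are (x, y) pixel coordinates, and it assumes both faces are roughly the same scale and orientation (no resizing or rotation is done).

```python
import numpy as np


def paste_at_landmark(dst, src, mask, src_pt, dst_pt):
    """Copy the masked pixels of src into a copy of dst, shifted so that
    src_pt (e.g. the first person's nose) lands on dst_pt.

    dst, src: H x W x 3 arrays; mask: H x W boolean array aligned with src.
    """
    out = dst.copy()
    dx = dst_pt[0] - src_pt[0]
    dy = dst_pt[1] - src_pt[1]
    ys, xs = np.nonzero(mask)            # beard pixel coordinates in src
    ty, tx = ys + dy, xs + dx            # where they land in dst
    # Drop pixels that would fall outside the destination image.
    keep = (ty >= 0) & (ty < out.shape[0]) & (tx >= 0) & (tx < out.shape[1])
    out[ty[keep], tx[keep]] = src[ys[keep], xs[keep]]
    return out
```

If the faces differ in size, you would have to rescale the beard and its mask first, for example by the ratio of some distance between landmarks on the two faces.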
I am working on a hand detection project. There are many good projects on the web that do this, but what I need is detection of a specific hand pose: a fully open palm, with the whole palm facing outwards, like in the image below:
The first hand faces inwards, so it should not be detected; the right one faces outwards, so it should be detected. I can already detect the hand with OpenCV, but how can I tell its orientation?
Matching the forehand (palm side) is a texture classification problem, which is a classic pattern recognition task. I suggest trying one of the following methods:
Gabor filters: these are good at capturing orientation and pixel intensity patterns (the forehand has distinctive features). OpenCV has the getGaborKernel function; its most important parameters are theta (orientation) and lambd (wavelength, i.e. the inverse of the spatial frequency). To keep it simple, you can apply this to a cropped zone of the palm (since you have already detected the hand, it would be easy to crop, for example, the thumb or a rectangular zone around the center of gravity). Then you can convolve it with a small database of images of the same zone to get a matching rate, or you can use an SVM classifier, which you train on a set of images by constructing the training matrix the SVM needs (check this question and this paper).
Local Binary Patterns (LBP): an important feature descriptor for texture matching. You can apply it to the whole palm image, or to a cropped zone or a single finger. It is easy to use in OpenCV, and many tutorials with code are available for this method. I recommend reading this paper on invariant texture classification with Local Binary Patterns. Here is a good tutorial.
Haralick texture: I have read that it works very well when a set of features quantifies the entire image (global feature descriptors). It is not implemented in OpenCV but is easy to implement; check this useful tutorial.
Training models: I have already suggested an SVM classifier coupled with some descriptor, which can work very well.
OpenCV has an interesting FaceRecognizer class for face recognition. It could be interesting to use it with palm images in place of face images (resize and rotate them to get a single canonical palm pose). This class offers three methods; one of them, Local Binary Patterns Histograms, is recommended for texture recognition. You could also try the other models (Eigenfaces and Fisherfaces); check this tutorial.
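To give an idea of what the LBP descriptor mentioned above computes, here is a minimal NumPy sketch of the basic 3x3 variant (not the rotation-invariant one from the paper). The function names are my own, and the input is assumed to be a 2-D grayscale array such as a cropped palm region. The resulting 256-bin histogram is the kind of feature vector you could feed to an SVM.

```python
import numpy as np


def lbp_image(gray):
    """Basic 3x3 LBP code for every interior pixel of a 2-D grayscale array."""
    g = gray.astype(np.int32)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Eight neighbours, clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)


def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes, usable as a texture feature."""
    codes = lbp_image(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

Comparing the histograms of a palm and a back-of-hand crop (for example with a chi-square distance) is one simple way to check whether the texture is separable before training anything.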
Well, if you go the MacGyver way, notice that the left hand has bones sticking out in a certain direction, while the right hand shows all the finger lines and a few lines in the palm.
These lines are always more or less the same, so you could try to detect them with OpenCV edge detection or Hough lines. Because the lines are dark, you might even be able to threshold them out. Then gather information from those lines, such as angles and regressions, see which features you can collect, and train a simple decision tree.
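As a rough illustration of the "gather information from those lines" step, the sketch below turns line segments into a small feature vector. It assumes segments in the (x1, y1, x2, y2) form that cv2.HoughLinesP produces; the function name, the choice of features and the number of bins are only assumptions, not a tested recipe.

```python
import math


def line_features(segments, bins=4):
    """Return [count, mean length, orientation histogram...] for line segments.

    segments: iterable of (x1, y1, x2, y2). Orientation is folded into
    [0, 180) degrees so that the direction of a segment does not matter.
    """
    hist = [0.0] * bins
    lengths = []
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue  # skip degenerate segments
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        idx = min(int(angle / (180.0 / bins)), bins - 1)
        hist[idx] += 1
        lengths.append(length)
    n = len(lengths)
    if n == 0:
        return [0, 0.0] + hist
    return [n, sum(lengths) / n] + [c / n for c in hist]
```

One such vector per image, together with a palm/back label, would be the training input for the decision tree.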
That assumes you do not have enough data. If you do, go for deep learning: take a basic InceptionV3 model and retrain the last dense layer to classify between two classes with a softmax, or to predict the probability of the hand being up/down with a sigmoid. Check this link; TensorFlow has you covered for the training, with ready-made code to execute.
Questions? Ask away
Take a look at what Leap Motion has done with the Oculus Rift. I'm not sure what they're using internally to segment hand poses, but there is another paper that produces hand poses effectively. If you have a stereo camera setup, you can use this paper's methods: https://arxiv.org/pdf/1610.07214.pdf.
The only promising solutions I've seen for mono camera train on large datasets.
Use a Haar cascade classifier.
You can get the classifier model file and then use it as shown here.
Just search Google for 'Haarcascade detection of Palm', or use the code below.
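import cv2

# Open the default camera and load the palm cascade.
cam = cv2.VideoCapture(0)
ccfr2 = cv2.CascadeClassifier('haar-cascade-files-master/palm.xml')

while True:
    retval, image = cam.read()
    if not retval:  # stop if no frame could be read
        break
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    palm = ccfr2.detectMultiScale(grey, scaleFactor=1.05, minNeighbors=3)
    # Draw a white box around every detected palm.
    for x, y, w, h in palm:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 255, 255), 2)
    cv2.imshow("Window", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break

# Release the camera and close the window.
cam.release()
cv2.destroyAllWindows()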
Best of luck with your experiments using Haar cascades.
I have a collection of face images, with one or sometimes two faces in each image. What I want to do is find the face in each image and then crop it.
I've tested a couple of methods, which are implemented in Python using OpenCV, but the results weren't that good. These methods are:
1- Implementation 1
2- Implementation 2
There's one more model that I've tested, but I'm not allowed to post more than two links.
The problem is that these Haar-feature-based algorithms are not robust to face size, and when I tried them on images taken close to the face, they couldn't find any faces.
Someone suggested trying deep-learning-based algorithms, but I couldn't find one that does what I want. Basically, I guess I need a pre-trained model that gives me the coordinates of the face bounding box in the image or, better, a pre-trained model that outputs the cropped face image.
You don't need machine learning algorithms; graph algorithms are enough. For example, Snapchat's face recognition algorithm works as follows:
Create a graph with nodes and edges from a typical face (a "standard face").
Deform that graph, i.e. move the nodes to the matching pixels in the input image.
Voilà, you have recognized the face in the input image.
Easily said, but harder to code. At our university we implemented Dijkstra's algorithm, for example, and I can give you my "Graph" class if you need it, but I wrote it in C++.
With these graph algorithms you can crop out the faces more efficiently.
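Whichever method ends up giving you the face location, the cropping step itself is simple. Here is a small sketch, assuming the detector returns a box as (x, y, w, h) in pixels; the helper name and the 20% margin are arbitrary choices for illustration. It pads the box a little and clamps it to the image so that faces near the border do not produce invalid slices.

```python
def expand_and_clip(box, img_w, img_h, margin=0.2):
    """Pad an (x, y, w, h) box by a fraction of its size and clamp it to the
    image. Returns (x0, y0, x1, y1) suitable for array slicing."""
    x, y, w, h = box
    mx = int(round(w * margin))
    my = int(round(h * margin))
    x0 = max(0, x - mx)
    y0 = max(0, y - my)
    x1 = min(img_w, x + w + mx)
    y1 = min(img_h, y + h + my)
    return x0, y0, x1, y1
```

With an image loaded as a NumPy array (as OpenCV does), the crop is then image[y0:y1, x0:x1].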
I am able to detect the eyes, nose, and mouth in a given face using MATLAB. Now I want four more points, i.e. the corners of the eyes and nose. How do I get these points?
This is the image for the corner points of the nose.
The red dot marks the point I'm looking for (it is only there for illustration; there is no such point in the original image).
Active Appearance Model (AAM) could be useful in your case.
AAM is normally used for matching a statistical model of object shape and appearance to a new image, and is widely used for extracting facial features and for head pose estimation.
I believe this could be helpful for you to start with.
You can try using the corner detectors included in the Computer Vision System Toolbox, such as detectHarrisFeatures, detectMinEigenFeatures, or detectFASTFeatures. However, they may give you more points than you want, so you will have to do some parameter tweaking.
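Since the detectors return more points than needed, one simple way to narrow them down is to keep only the points inside the eye box you already detected and take the two horizontal extremes as the eye corners. Below is a sketch of that filtering step, written in Python rather than MATLAB; the function name is made up, points are (x, y) pairs and the box is (x, y, w, h). Taking the extremes is a heuristic of mine and not something the toolbox does for you.

```python
def eye_corner_points(points, eye_box):
    """Return the leftmost and rightmost detected corner points that fall
    inside eye_box, or None if no point lies inside it."""
    x, y, w, h = eye_box
    inside = [(px, py) for px, py in points
              if x <= px <= x + w and y <= py <= y + h]
    if not inside:
        return None
    left = min(inside, key=lambda p: p[0])
    right = max(inside, key=lambda p: p[0])
    return left, right
```

The same idea, applied to the nose box, would give candidate points for the nose corners.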
Given a match-3 game screenshot (for example http://www.gameplay3.com/images/games/jewel-quest-ii-01S.jpg), what would be the correct way to find the bounding box of the grid (the table with tiles)? The board doesn't have to be a perfect rectangle (as can be seen in the screenshot), but each cell is completely square.
I've tried several games and found that there are some per-game image transformations that can be done to enhance the tiles inside the grid (for example, in this game it's enough to take the V channel of the HSV color space). Then I can enlarge the tiles so that they overlap, find the largest contour in the image, and get the bounding box from it.
The problem with above approach is that every game (or even level inside the same game) may need a different transformation to get hold of the tiles. So the question is - is there a standard way to enhance either tiles inside the grid or grid's lines (I've tried finding lines with Hough transform, but, although the grid seems pretty visible to the eye, Hough doesn't find it)?
Also, what if the screenshot is obtained using the phone camera instead of taking a screenshot of a desktop? From my experience, captured images have less defined colors (which depends on lighting), and also can be distorted a little, as there is no way to hold the phone exactly in front of the screen.
I would go with the following approach for a screenshot:
Find edges in the image using, for example, a Canny-like edge detector.
Perform a Hough line transform. This should work quite nicely on the edge image.
If you have some information about the size of the tiles, you could eliminate false-positive lines using some sort of spatial model of the grid (e.g. lines having only a small angle to the x/y axes of the image and/or the distance/angle of tile borders).
Identify tile borders under the found Hough lines by looking for edges found by Canny under or next to the lines.
Which implementation of the hough transform did you use? How did you preprocess the image?
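The line-filtering step above could look roughly like this. The sketch assumes lines in the (rho, theta) form that cv2.HoughLines returns, with theta in radians, where theta near 0 (or pi) means a vertical line and theta near pi/2 a horizontal one. The function name and the 2 degree tolerance are assumptions, and it only covers the "small angle to the axes" part of the spatial model, not the tile spacing.

```python
import math


def grid_bounds(lines, max_dev_deg=2.0):
    """Keep only near-axis-aligned Hough lines and return the bounding box
    (left, top, right, bottom) they span, or None if fewer than two vertical
    or two horizontal lines remain."""
    tol = math.radians(max_dev_deg)
    xs, ys = [], []
    for rho, theta in lines:
        if theta < tol or theta > math.pi - tol:
            xs.append(abs(rho))   # near-vertical line: x is roughly |rho|
        elif abs(theta - math.pi / 2) < tol:
            ys.append(abs(rho))   # near-horizontal line: y is roughly |rho|
        # everything else is treated as a false positive and ignored
    if len(xs) < 2 or len(ys) < 2:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

For a photo taken with a phone the lines will not be axis-aligned, so this simple filter only makes sense for true screenshots or after correcting the perspective.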
Another approach would be to use machine learning. As you are working in OpenCV, you could use a Haar-like feature detector. An example of face detection using Haar-like features can be found here:
OpenCV Haar Face Detector example
Another machine learning approach would be to use Histogram of Oriented Gradients (HOG) features in combination with a Support Vector Machine (SVM). An example is located here:
HOG example
You can find general information about HOG detection at:
Hog detection