Logo detection using OpenCV - opencv

I'm attempting to implement an easter egg in a mobile app I'm working on. These easter egg will be triggered when a logo is detected in the camera view. The logo I'm trying to detect is this one: .
I'm not quite sure what the best way to approach this is as I'm pretty new to computer vision. I'm currently finding horizontal edges using the Canny algorithm. I then find line segments using the probabilistic Hough transform. The output of this looks as follows (blue lines represent the line segments detected by the probabilistic Hough transform):
The next step I was going to take would be to look for a group of around 24 lines (fitting within a nearly square rectangle), each line would have to be approximately the same length. I'd use these two signals to indicate the potential presence of the logo. I realise that this is probably a very naive approach and would welcome suggestions as to how to better detect this logo in a more reliable manner?
Thanks

You may want to go with SIFT using Rob Hess' SIFT Library. It's using OpenCV and also pretty fast. I guess that easier than your current way of approaching the logo detection :)
Try also looking for SURF, which claims to be faster & robuster than SIFT. This Feature Detection tutorial will help you.

You may just want to use LogoGrab's technology. It's the best out there and offers all sorts of APIs (both mobile and HTTP). http://www.logograb.com/technologyteam/

I'm not quite sure if you would find such features in the logo to go with a SIFT/SURF approach. As an alternative you can try training a Haar-like feature classifier and use it for detecting the logo, just like opencv does for face detection.

You could also try the Tensorflow's object detection API here:
https://github.com/tensorflow/models/tree/master/research/object_detection
The good thing about this API is that it contains State-of-the-art models in Object Detection & Classification. These models that tensorflow provide are free to train and some of them promise quite astonishing results. I have already trained a model for the company I am working on, that does quite amazing job in LOGO detection from Images & Video Streams. You can check more about my work here:
https://github.com/kochlisGit/LogoLens

Related

How to detect hand palm and its orientation (like facing outwards)?

I am working on a hand detection project. There are many good project on web to do this, but what I need is a specific hand pose detection. It needs a totally open palm and the whole palm face to outwards, like the image below:
The first hand faces to inwards, so it will not be detected, and the right one faces to outwards, it will be detected. Now I can detect hand with OpenCV. but how to tell the hand orientation?
Problem of matching with the forehand belongs to the texture classification, it's a classic pattern recognition problem. I suggest you to try one of the following methods:
Gabor filters: it is good to detect the orientation and pixel intensities (as forehand has different features), opencv has getGaborKernel function, the very important params of this function is theta (orientation) and lambd: (frequencies). To make it simple you can apply this process on a cropped zone of palm (as you have already detected it, it would be easy to crop for example the thumb, or a rectangular zone around the gravity center..etc). Then you can convolute it with a small database of images of the same zone to get the a rate of matching, or you can use the SVM classifier, where you have to train your SVM on a set of images by constructing the training matrix needed for SVM (check this question), this paper
Local Binary Patterns (LBP): it's an important feature descriptor used for texture matching, you can apply it on whole palm image or on a cropped zone or finger of image, it's easy to use in opencv, a lot of tutorials with codes are available for this method. I recommend you to read this paper talking about Invariant Texture Classification
with Local Binary Patterns. here is a good tutorial
Haralick Texture: I've read that it works perfectly when a set of features quantifies the entire image (Global Feature Descriptors). it's not implemented in opencv but easy to be implemented, check this useful tutorial
Training Models: I've already suggested a SVM classifier, to be coupled with some descriptor, that can works perfectly.
Opencv has an interesting FaceRecognizer class for face recognition, it could be an interesting idea to use it replacing the face images by the palm ones, (do resizing and rotation to get an unique pose of palm), this class has three methods can be used, one of them is Local Binary Patterns Histograms, which is recommended for texture recognition. and why not to try the other models (Eigenfaces and Fisherfaces ) , check this tutorial
well if you go for a MacGyver way you can notice that the left hand has bones sticking out in a certain direction, while the right hand has all finger lines and a few lines in the hand palms.
These lines are always sort of the same, so you could try to detect them with opencv edge detection or hough lines. Due to the dark color of the lines, you might even be able to threshold them out of it. Then gather the information from those lines, like angles, regressions, see which features you can collect and train a simple decision tree.
That was assuming you do not have enough data, if you have then you go into deeplearning, just take a basic inceptionV3 model and retrain the last dense layer to classify between two classes with a softmax, or to predict the probablity if the hand being up/down with sigmoid. Check this link, Tensorflow got your back on the training of this one, pure already ready code to execute.
Questions? Ask away
Take a look at what leap frog has done with the oculus rift. I'm not sure what they're using internally to segment hand poses, but there is another paper that produces hand poses effectively. If you have a stereo camera setup, you can use this paper's methods: https://arxiv.org/pdf/1610.07214.pdf.
The only promising solutions I've seen for mono camera train on large datasets.
use Haar-Cascade classifier,
you can get the classifier model file then use it here.
Just search for 'Haarcascade detection of Palm in Google' or use below code.
import cv2
cam=cv2.VideoCapture(0)
ccfr2=cv2.CascadeClassifier('haar-cascade-files-master/palm.xml')
while True:
retval,image=cam.read()
grey=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
palm=ccfr2.detectMultiScale(grey,scaleFactor=1.05,minNeighbors=3)
for x,y,w,h in palm:
image=cv2.rectangle(image,(x,y),(x+w,y+h),(256,256,256),2)
cv2.imshow("Window",image)
if cv2.waitKey(1) & 0xFF==ord('q'):
cv2.destroyAllWindows()
break
del(cam)
Best of Luck for your experience using HaarCascade.

How can i match gestures and compare them?

I am developing a gesture recognition project. My goal is that the webcam captures my gestures and matches them with the existing gestures in my database. I have been able to capture hand gestures and store them in my project folder. Now, how exactly do i compare them? I am clueless about this part. I have gone through so many youtube links and most of them just show them how it works and none of them explains what algorithm they have used. I am completely stuck and all i want is some ideas or any possible link which can help me understand this matching part. Thanks
There are many different approaches that you can follow here.
If your images are of good quality, then you could detect feature points in your input image, and then match them with a "prior/template" representation of a similar gesture. This would be a brute-force search. Here, you can use SIFT to detect keypoints and generate descriptors for each image, and then match them based on the BFMatcher or FLANN. All of the above are implemented in OpenCV. Just read the documentation.
Docs here: detect/match
On the other hand, you could use a Bag-Of-Words approach. A good primer for that approach is here: BoW
You can use a classification machine learning algorithm like logistic regression.
This algorithm tries to minimize the cost function to predict a picture input similarity to all classes (all gestures in your case) and it'll pick the most similar class and give you that. for pictures you should use each pixel as a feature for your data.
After feeding your algorithm with enough training set it can classify your picture into one of the gestures, and as you said you are working with webcam images the running time wouldn't be that much.
Here is a great video for learning logistic regression by professor Andrew Ng of Stanford.

Detecting "city" background versus "desert" background in images using image processing/computer vision

I'm searching for algorithms/methods that are used to classify or differentiate between two outdoor environments. Given an image with vehicles, I need to be able to detect whether the vehicles are in a natural desert landscape, or whether they're in the city.
I've searched but can't seem to find relevant work on this. Perhaps because I'm new at computer vision, I'm using the wrong search terms.
Any ideas? Is there any work (or related) available in this direction?
I'd suggest reading Prince's Computer Vision: Models, Learning, and Inference (free PDF available). It covers image classification, as well as many other areas of CV. I was fortunate enough to take the Machine Vision course at UCL which the book was designed for and it's an excellent reference.
Addressing your problem specifically, a simple MAP or MLE model on pixel colours will probably provide a reasonable benchmark. From there you could look at more involved models and feature engineering.
Seemingly complex classifications similar to "civilization" vs "nature" might be able to be solved simply with the help of certain heuristics along with classification based on color. Like Gilevi said, city scenes are sure to contain many flat lines and right angles, while desert scenes are dominated by rolling dunes and so on.
To address this directly, you could use OpenCV's hough - lines algorithm on the images (tuned for this problem of course) and look at:
a) how many lines are fit to the image at a given threshold
b) of the lines that are fit what is the expected angle between two of them; if the angles are uniformly distributed then chances are its nature, but if the angles are clumped up around multiples of pi/2 (more right angles and straight lines) then it is more likely to be a cityscape.
Color components, textures, and degree of smoothness(variation or gradient of image) may differentiate the desert and city background. You may also try Hough transform, which is used for line detection that can be viewed as city feature (building, road, bridge, cars,,,etc).
I would recommend you this research very similar with your project. This article presents a comparison of different classification techniques to obtain the scene classifier (urban, highway, and rural) based on images.
See my answer here: How to match texture similarity in images?
You can use the same method. I already solved in the past problems like the one you described with this method.
The problem you are describing is that of scene categorization. Search for works that use the SUN database.
However, you only working with two relatively different categories, so I don't think you need to kill yourself implementing state-of-the-art algorithms. I think taking GIST features + color features and training a non-linear SVM would do the trick.
Urban environments is usually characterized with a lot of horizontal and vertical lines, GIST captures that information.

Hand sign detection

I am trying to identify static hand signs. Confused with the libraries and algorithms I can use for the project.
What need to it identify hand signs and convert in to text. I managed to get the hand contour.
Can you please tell me what is the best method to classify hand signs.
Is it haar classifier, adaboost classifier, convex hull, orientation histograms, SVM, shift algorithm, or any thing else.
And also pls give me some examples as well.
I tried opencv and emugcv both for image processing. what is best c++ or c# for a real time system.
Any help is highly appreciated.
Thanks
I have implemented a handtracking for web applications in my master deggree. Basically, you should follow those steps:
1 - Detect features of skin color in a Region of Interest. Basically, put a frame in the screen and ask for the user put the hand.
2 - You should have a implementation of a lucas kanade tracker method. Basically, this alghorithm will ensure that your features are not lost through the frames.
3 - Try get more features for each 3 frames interval.
The people use many approaches, so I cannot give a unique. You could make some research using Google Scholar and use the keywords "hand sign", "recognition" and "detection".
Maybe you find some code with the help of Google. An example, the HandVu: http://www.movesinstitute.org/~kolsch/HandVu/HandVu.html
The haar classifier (method of Viola-Jones) help to detect hand, not to recognize them.
Good luck in your research!
I have made the following with OpenCV. Algorithm:
Skin detection made in HSV
Thinning (if pixel has zero neighbor than set zero)
Thicking (if pixel has neighbor nonzero then set it nonzero)
See this Wikipedia page for the details of these.
You can find the best trained cascade to detect hand using OpenCV from the GitHub...
https://github.com/Aravindlivewire/Opencv/blob/master/haarcascade/aGest.xml
Good luck...

fingertip detection and tracking

i am working on a project detecting and tracking fingers. Though i find there is quiet a lot resource on this task, i haven't found a effective one yet :(.
So far i have thought of methods to detect hands as follow:
Haar training. But firstly we don't have a trained set(xml) as that in the face detection. Secondly, if we do the training ourselves, we don't have enough samples (i am still a college student)
skin color detection in HSV space. I have tried this one but the result has a lot of noises so cannot helps me continue the further detection on fingertip.
3.use Handvu. But i have heart that this lib is hard to set up and used in windows...
So in a word, can anyone give me any suggestions on how to detect hands effectively? (After that i may consider about detecting fingertips..)
Thanks!!
Here is a pretty in-depth paper on finger segmentation using Zernike moments. Here is a good paper on using Zernike moments for image recognition as a basis for the first paper.
Can you explain more about your experimental setup? Are you trying to track fingers against a cluttered background, or a plain cardboard sheet?
Haar like features perform very well for face recognition (the Viola Jones paper being a classic example) however I would not recommend them for your task. Although they can be computed fast using the integral image, they work well using a CASCADED Adaboost classification framework.
For skin colour detection, it depends on your setup. As a first step you could try doing background subtraction: simply learn the distribution (histogram) of pixels for foreground (ie. the hand) and the background and use these to do image segmentation.
I don't know what Handvu is
Zernike moments are also very good shape descriptors that are rotation invariant and can be made to be both scale and translation invariant.
I hope this helps!

Resources