ORB feature descriptor - opencv

I need to program my bot so that it can find an object it is asked to pick up and bring it to a commanded position. I have tried simple image processing techniques like filtering and contour finding, but they don't seem to work well. I want to use the ORB feature extractor. Here are some sample images; the object of interest is the ball. In short, how do I train my bot to pick up balls or other objects? How do I use ORB? Any sample program or example would be helpful. Thanks in advance.
http://i.stack.imgur.com/spobV.jpg
http://i.stack.imgur.com/JNH1T.jpg

You can try learning-based algorithms like a Haar classifier to detect any object. Thanks to OpenCV, the whole training process is quite streamlined. All you have to do is train your classifier with some positive images (images of the object) and negative images (any images not containing the object).
Below are some links for your reference.
Haar trainer for Ball-Pen detection: http://opencvuser.blogspot.com/2011/08/creating-haar-cascade-classifier-aka.html
Haar trainer for Banana Detection :) : http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
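If you still want to try ORB as originally asked, here is a minimal keypoint-matching sketch with OpenCV's Python bindings. The template filename is a placeholder for a cropped image of the ball, and this assumes OpenCV 3.x or later; it illustrates the ORB + BFMatcher workflow rather than a complete pick-up pipeline.

```python
import cv2

# Load a cropped template of the object (placeholder filename) and a scene image.
template = cv2.imread("ball_template.jpg", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute binary descriptors for both images.
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# ORB descriptors are binary, so match them with Hamming distance.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# Draw the best matches for visual inspection.
vis = cv2.drawMatches(template, kp1, scene, kp2, matches[:20], None)
cv2.imwrite("matches.jpg", vis)
```

Keep in mind that ORB matching relies on the object having enough texture; a plain, uniformly coloured ball may produce very few distinctive keypoints, which is one reason a trained classifier, as suggested above, can work better here.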

Related

Need advice for object detection and motion classification on real time video

I'm doing research for my final project. I want to build object detection and motion classification like Amazon Go. I have read a lot of research, such as object detection with SSD or YOLO and video classification using CNN+LSTM, and I want to propose a pipeline like this:
Real-time detection of multiple objects (in my case: persons) with SSD/YOLO
Get each object's bounding box and crop the frame
Feed the cropped frames into a CNN+LSTM to predict the motion (whether the person is walking or taking items)
Is it possible to make this work in a real-time environment?
Or is there a better method for real-time detection and motion classification?
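For illustration, the first two steps of such a pipeline could be sketched with OpenCV's DNN module roughly as follows; the MobileNet-SSD model files, the person class index, and the confidence threshold are assumptions made for the sketch, not part of the original question.

```python
import cv2

# Placeholder model files: a pretrained MobileNet-SSD (Caffe) detector is assumed.
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt", "MobileNetSSD_deploy.caffemodel")
PERSON_CLASS_ID = 15  # "person" index in the standard MobileNet-SSD label map

frame = cv2.imread("frame.jpg")
h, w = frame.shape[:2]

# Step 1: run the detector on the frame.
blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 0.007843, (300, 300), 127.5)
net.setInput(blob)
detections = net.forward()

# Step 2: crop each detected person for the CNN+LSTM motion classifier.
crops = []
for i in range(detections.shape[2]):
    class_id = int(detections[0, 0, i, 1])
    confidence = detections[0, 0, i, 2]
    if class_id == PERSON_CLASS_ID and confidence > 0.5:
        x1, y1, x2, y2 = (detections[0, 0, i, 3:7] * [w, h, w, h]).astype(int)
        crops.append(frame[max(0, y1):y2, max(0, x1):x2])
# Crops from consecutive frames would then be stacked into a clip and fed to the CNN+LSTM.
```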
If you want to use it in a real-time application, several other things must be considered that only show up once the algorithm is deployed in a real environment.
Regarding your 3-step proposal: it could already result in a good method, but the first step would have to be very accurate. I think it is better to combine the three steps into one. The way a person moves is itself a good feature for recognizing that person's activity, so all the steps could be gathered into a single model.
My idea is as follows:
1. a video classification dataset that simply tags the movement of the person or object
2. a CNN-LSTM based video classification method
This would solve your project properly.
This answer could use more detail; if you are interested, I can expand on it.
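As a rough illustration of the one-step idea, here is a minimal CNN+LSTM video classifier sketched with Keras. TensorFlow/Keras is assumed to be available, and the layer sizes, clip length, and number of motion classes are placeholder values, not recommendations.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 3      # placeholder: e.g. walking, standing, taking an item
CLIP_LEN = 16        # placeholder number of frames per clip
FRAME_SHAPE = (112, 112, 3)

# A small CNN applied to every frame of the clip (TimeDistributed),
# followed by an LSTM that models the motion across frames.
model = models.Sequential([
    layers.Input(shape=(CLIP_LEN, *FRAME_SHAPE)),
    layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D()),
    layers.TimeDistributed(layers.Conv2D(64, 3, activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D()),
    layers.TimeDistributed(layers.Flatten()),
    layers.LSTM(128),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```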
Had pretty much the same problem. Motion prediction does not work that well in complex real-life situations.
I'm building a 4K video processing tool (some examples). Current approach looks like the following:
do rough but super fast segmentation
extract bounding box and shape
apply some "meta vision magic"
do precise segmentation within identified area
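A very rough OpenCV sketch of the first, second, and fourth steps above (this is only an illustration, not the author's actual tool: colour thresholding stands in for the rough segmentation, GrabCut for the precise pass, the HSV range is a placeholder, and OpenCV 4.x is assumed):

```python
import cv2
import numpy as np

frame = cv2.imread("frame.jpg")

# 1. Rough but fast segmentation (placeholder: a simple HSV colour threshold).
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (35, 60, 60), (85, 255, 255))

# 2. Extract the bounding box and shape of the largest blob.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(largest)

# 4. Precise segmentation restricted to the identified area (GrabCut as an example).
gc_mask = np.zeros(frame.shape[:2], np.uint8)
bgd, fgd = np.zeros((1, 65), np.float64), np.zeros((1, 65), np.float64)
cv2.grabCut(frame, gc_mask, (x, y, w, h), bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
precise = np.where((gc_mask == cv2.GC_FGD) | (gc_mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
cv2.imwrite("precise_mask.png", precise)
```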
As of now, this approach looks far more flexible compared to motion tracking.
"Meta vision" intended to properly track shape evolution:

people detection with haar cascade

I am working on a project at my school to detect how many students are in the classroom, like in this picture.
I have been trying to use Haar cascades in OpenCV for face detection to detect people, but the result is very bad. Like this:
I took thousands of pictures in the classroom and cropped out the people manually. There are about 4000 positive samples and 12000 negative samples. I was wondering what I did wrong?
When I crop the images, should I crop only the head, like this?
Or like this, with the body?
I think I had enough training samples, and I followed the exact procedure in this post:
http://note.sonots.com/SciSoftware/haartraining.html#v6f077ba
which should work.
Or should I use a different algorithm like HOG or SVM? Any suggestion would be great; I have been stuck on this for months and don't have any clue. Thanks a lot!
Haar is better suited to faces. HOG with SVM is the classic approach for human detection; there are lots of sources and blog posts about it, and it's not hard to train a classifier. For your scene, I think "head and shoulders" is better than "head alone", but your multi-view samples increase the difficulty; a front-facing camera would be better. Add more hard negative samples if you keep getting many false positives.
This paper may help:
http://irip.buaa.edu.cn/~zxzhang/papers/icip2009-1.pdf
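For reference, OpenCV ships a pretrained HOG+SVM people detector; a minimal sketch of using it looks like this (the parameter values are just reasonable defaults, not tuned for a classroom scene):

```python
import cv2

img = cv2.imread("classroom.jpg")

# OpenCV's built-in HOG descriptor with the default pretrained people detector (SVM weights).
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Detect people; winStride/padding/scale trade speed against recall.
rects, weights = hog.detectMultiScale(img, winStride=(8, 8), padding=(16, 16), scale=1.05)

for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
print(f"Detected {len(rects)} people")
cv2.imwrite("detections.jpg", img)
```

Note that the default detector was trained on full, standing pedestrians, so it may do poorly on seated students; for that scene you would likely still need to train your own head-and-shoulders model.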
Normally, with a Haar cascade, the result varies a lot when you change the training parameters. In my case the object was very simple, yet it could not be detected either:
opencv haar cascade for Varikont detection
When I changed the parameters, it was detected very well.
You can have a look here: https://inspirit.github.io/jsfeat/sample_haar_face.html
Or, if you want something more specialised and more advanced, you can look into Bag of Visual Words (SIFT/SURF + SVM or SGD).
Personally, though, I think you don't need such a complex method for person detection.
Regards,
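To illustrate the point about parameters, here is a hedged sketch of running a trained cascade where scaleFactor and minNeighbors, the two detection-time knobs that most often change the results, are made explicit; the cascade filename is a placeholder for your own trained classifier.

```python
import cv2

# Placeholder path: use your own trained cascade XML here.
cascade = cv2.CascadeClassifier("my_trained_cascade.xml")
gray = cv2.cvtColor(cv2.imread("classroom.jpg"), cv2.COLOR_BGR2GRAY)

# scaleFactor: how much the image is shrunk at each pyramid level (smaller = slower, more thorough).
# minNeighbors: how many overlapping detections are needed to keep a hit (higher = fewer false positives).
for scale_factor, min_neighbors in [(1.05, 3), (1.1, 5), (1.3, 8)]:
    hits = cascade.detectMultiScale(gray, scaleFactor=scale_factor,
                                    minNeighbors=min_neighbors, minSize=(24, 24))
    print(f"scaleFactor={scale_factor}, minNeighbors={min_neighbors}: {len(hits)} detections")
```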

HOG for "detecting object" opencv

I would like to know if there is any code or good documentation available for implementing HOG features. I tried to read the documentation here, but it's quite difficult to understand and it needs an SVM.
What I need is just to implement a HOG detector for objects, like what SIFT or SURF does.
By the way, I'm not interested in this work.
Thank you.
You can take a look at
http://szproxy.blogspot.com/2010/12/testtest.html
He also published a "tutorial" for HOG on SourceForge here:
http://sourceforge.net/projects/hogtrainingtuto/?_test=beta
I know this since I'm having the same problem as you. The tutorial, though, isn't what I would call a tutorial; it's a bunch of source code with no documentation, but I assume it works and can at least get you somewhere.
In the end, and simplifying a bit, all you need to detect specific objects in an image is:
Localize "points of interest" to extract patches:
To get points of interest, you can use algorithms like the Harris corner detector, random sampling, or something as simple as a sliding window.
From these points, get patches:
You will have to decide on the patch size.
From these patches, compute the feature descriptor (HOG, for example):
Instead of HOG you can use another feature descriptor like SIFT or SURF...
HOG's implementation is not too hard. You have to compute the gradients of the extracted patch by applying Sobel X and Y kernels; after that, divide the patch into NxM cells (8x8, for instance) and compute a histogram of gradient angles and magnitudes. The following link gives a more detailed explanation:
HOG Person Detector Tutorial
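A minimal sketch of computing a HOG descriptor for a single patch with OpenCV (the patch filename is a placeholder, and the window/cell/block sizes below are just the classic defaults, not the only valid choice):

```python
import cv2

# Load a patch and resize it to the descriptor window size.
patch = cv2.imread("patch.jpg", cv2.IMREAD_GRAYSCALE)
patch = cv2.resize(patch, (64, 128))

# 64x128 window, 16x16 blocks, 8x8 block stride, 8x8 cells, 9 orientation bins
# (the classic Dalal-Triggs configuration for pedestrians).
hog = cv2.HOGDescriptor((64, 128), (16, 16), (8, 8), (8, 8), 9)
descriptor = hog.compute(patch)
print(len(descriptor))  # one long feature vector, 3780 values for this configuration
```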
Check your feature vector against a previously trained classifier:
Once you have this vector, check whether it is the desired object or not with a previously trained classifier such as an SVM. Instead of an SVM you could use, for instance, a neural network.
An SVM implementation is more difficult, but there are libraries like OpenCV that you can use.
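A hedged sketch of that last step using OpenCV's built-in SVM; the random arrays stand in for HOG descriptors you would actually compute from labelled positive and negative patches.

```python
import cv2
import numpy as np

# Assumed inputs: one HOG descriptor per row, with label 1 (object) or 0 (background).
train_descriptors = np.float32(np.random.rand(200, 3780))        # placeholder training data
labels = np.random.randint(0, 2, (200, 1)).astype(np.int32)      # placeholder labels

# Train a linear SVM on the HOG vectors.
svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.train(train_descriptors, cv2.ml.ROW_SAMPLE, labels)

# Classify a new patch's HOG descriptor.
query = np.float32(np.random.rand(1, 3780))                      # placeholder query descriptor
_, prediction = svm.predict(query)
print("object" if int(prediction[0][0]) == 1 else "background")
```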
There is a function extractHOGFeatures in the Computer Vision System Toolbox for MATLAB.

Head (and shoulder) detection using OpenCV

(Apologies in advance if I am asking too newbie a question. I am a beginner with OpenCV; I have done some tutorials but still don't have a good grasp of its concepts.)
Question: How do I do head detection (not face detection) using OpenCV - for example, in a photo of the inside of a bus or a room?
Note: I do not want to do face detection, just head detection, to figure out the number of people in the photo. Unfortunately for me, the tutorials and documents that I've found are about face detection, not head detection.
Thank you
Look at all the Haar boosted classifiers that come with OpenCV and the dedicated CascadeClassifier class used to run them. Here is the list of classifiers available locally:
haarcascade_eye.xml
haarcascade_lefteye_2splits.xml
haarcascade_mcs_righteye.xml
haarcascade_eye_tree_eyeglasses.xml
haarcascade_lowerbody.xml
haarcascade_mcs_upperbody.xml
haarcascade_frontalface_alt.xml
haarcascade_mcs_eyepair_big.xml
haarcascade_profileface.xml
haarcascade_frontalface_alt2.xml
haarcascade_mcs_eyepair_small.xml
haarcascade_righteye_2splits.xml
haarcascade_frontalface_alt_tree.xml
haarcascade_mcs_lefteye.xml
haarcascade_upperbody.xml
haarcascade_frontalface_default.xml
haarcascade_mcs_mouth.xml
haarcascade_fullbody.xml
haarcascade_mcs_nose.xml
The two upper-body classifiers (haarcascade_mcs_upperbody.xml and haarcascade_upperbody.xml) may be of special interest to you. Try those as a start for your project. As Alessandro Vermeulen commented, head-detection classifiers may also be interesting, as what they find is usually connected to shoulders :-)
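A minimal sketch of counting people with one of those cascades (the image path is a placeholder, and note that haarcascade_mcs_upperbody.xml is not bundled with recent OpenCV releases, so you may need to download the XML file separately):

```python
import cv2

# Placeholder path to the upper-body cascade XML file.
cascade = cv2.CascadeClassifier("haarcascade_mcs_upperbody.xml")

img = cv2.imread("bus_interior.jpg")
gray = cv2.equalizeHist(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

# Each hit is a head-and-shoulders region; the count approximates the number of people.
people = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4, minSize=(30, 30))
print(f"Found {len(people)} head/shoulder regions")
```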
You can also create your own cascade classifier to detect heads. The upper body is not the head at all, but heads alone are not that accurate either. You need to crop a number of positive samples and negative ones, prepare a list of these examples in a text file, run opencv_createsamples (to prepare the input vector for training), and then use the opencv_traincascade command-line utility to create an OpenCV classifier usable with detectMultiScale.
It is easy, but creating the dataset is time-consuming. My head LBP cascade is available for free download on my blog, which also has head, car and people cascades. It is compatible with detectMultiScale, but not that accurate.

How to search the image for an object with SIFT and OpenCV?

I am working on a simple playing-card detection program.
For now I have a working SIFT algorithm from here.
And I have created some bounding boxes around the cards.
Then I used SIFT on the card to be searched for and saved the descriptors.
But what do I do next? Do I have to make a mask of the object and slide it through the bounding box, running SIFT at every step?
Couldn't find any tutorial on how to do that exactly.
Hope someone can help me!
Greets Max
Edit: I want to recognize every card, so I can say, for example: it's a seven of hearts.
SIFT is just the beginning.
SIFT is a routine to obtain interest points on an object. You have to use a Bag of Words approach: cluster the SIFT features you collected and represent each feature in terms of its nearest cluster mean. Represent each card as a histogram of these cluster means (a.k.a. bag of words).
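A hedged sketch of that bag-of-visual-words step with OpenCV and NumPy; the vocabulary size and file names are placeholders, and SIFT is assumed to be available via cv2.SIFT_create (opencv-python 4.4 or later).

```python
import cv2
import numpy as np

K = 50  # placeholder vocabulary size (number of visual words)
sift = cv2.SIFT_create()

# 1. Collect SIFT descriptors from all training card images (placeholder file list).
train_images = ["card_01.jpg", "card_02.jpg"]
all_descs = []
for path in train_images:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = sift.detectAndCompute(img, None)
    all_descs.append(desc)
stacked = np.float32(np.vstack(all_descs))

# 2. Cluster the descriptors into K visual words.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, _, vocab = cv2.kmeans(stacked, K, None, criteria, 5, cv2.KMEANS_PP_CENTERS)

# 3. Represent one card as a normalized histogram over the visual words.
def bow_histogram(desc, vocab):
    # Assign each descriptor to its nearest cluster centre and count assignments.
    dists = np.linalg.norm(desc[:, None, :] - vocab[None, :, :], axis=2)
    words = np.argmin(dists, axis=1)
    hist, _ = np.histogram(words, bins=np.arange(K + 1))
    return hist / hist.sum()

histograms = [bow_histogram(np.float32(d), vocab) for d in all_descs]
# These histograms are the card representations you feed to a nearest-neighbour or SVM classifier.
```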
Once you have the representation of the cards ready (what @nimcap says), you then need to do the recognition itself. You can try nearest neighbours, an SVM, etc.
Also, for a better (more technical) description of what to do, you might want to look at Lowe's original 2004 SIFT paper.
Is SIFT the best approach for something like this?
As opposed to Haar classifiers or just simple template matching,
e.g. http://digital.liby.waikato.ac.nz/conferences/ivcnz07/papers/ivcnz07-paper51.pdf
