Can Haar Cascade be too accurate to be useful in this situation? - opencv

I'm making a program to detect shapes from an r/c plane for a competition. I have no real images of the targets, but I do have computer generated examples of them on the rules.
My question is, can I train my program to detect real world objects based on computer generated shapes or should I find a different method to complete this task?
I would like to know before I foolishly generate 5k samples and find them useless in the end.
EDIT: I also don't know the exact color of the objects. If I feed the program samples of varying color, will it be a problem?
Thanks in advance!!
Edit2: Here's what groups from my school detected in previous years
As you can see, the detected images are not nearly as flawless as what would appear in real life. If you can suggest a better method, that would help.

If you think that the real images will have unique colors with simple geometric shapes then you could probably try to create a normalized Hue-histogram. Use it to train SVM classifier. The benefit of using Hue-histogram is that it will be rotational and scale invariant.
You can take the few precautions in mind:
Don't forget to remove the illumination affects.
Sometimes, White and black pixels create some problem in hue-histogram calculation so try to remove them from calculation by considering only those pixel which have S>0 and V>0 in S & V channels of HSV image.

I would rather suggest you to use the real world images because the performance is largely dependent upon training (my personal experience). And why don't you try to use SIFT/SURF descriptors for training to SVM (support vector machine) as SIFT/SURF are scale as well as rotational invariant.

Related

Tips on building a program detecting pupil in images

I am working on a project that aims to build a program which automatically gives a relatively accurate detection of pupil region in eye pictures. I am currently using simplecv in Python, given that Python is easier to experiment with. Since I just started, the eye pictures I am working with are fairly standardized. However, the size of iris and pupil as well as the color of iris can vary. And the position of the eye can shift a little among pictures. Here's a picture from wikipedia that is similar to the pictures I am using:
"MyStrangeIris.JPG" by Epicstessie is licensed under CC BY-SA 3.0
I have tried simple thresholding. Since different eyes have different iris colors, a fixed thresholding would not work on all pictures.
In addition, I tried simplecv's build-in sobel and canny edge detection, it's not working especially for eyes with darker iris. I also doubt that sobel or canny alone can solve the problem, given sometimes there are noises on the edge of the pupil (e.g., reflection of eyelash)
I have entry-level knowledge about image processing and machine learning. Right now, I am thinking about three possibilities:
Do a regression on the threshold value base on some variables
Make a specific mask only for edge detection for the pupil
classification on each pixel (this looks like lots of work to build the training set)
Am I on the right track? I would like to reach out to anyone with more experience on this type of problem. Any tips/suggestions are more than welcome. Thanks!
I think that for start you should put aside the machine learning. You have so much more to try in "regular" computer vision.
You need to try and describe a model for your problem. A good way to do this is to sit and think how you as a person detect iris. For example, i can think of:
It is near the center of image.
It is is Brown/green/blue circle, with distinct black center, surrounded by mostly white ellipse.
You have a skin color around the white ellipse.
It can't be too small or too large (depends on your images..)
After you build your model, try to find better ways to find these features. Hard to point on specific stuff, but you can start from: HSV color space, Correlation, Hough transform, Morphological operations..
Only after you feel you have exhausted all conventional tools, start thinking on features extraction and machine learning..
And BTW, because you are not the first person that try to detect iris, you can look at other projects for ideas.
I have written a small matlab code for image (link you have provided), function which i have used is hough transform for circle detection, which has also implemented in opencv, so porting will not create problem, i just want to know that i am on write way or not.
my result and code is as follows:
clc
clear all
close all
im = imresize(imread('irisdet.JPG'),0.5);
gray = rgb2gray(im);
Rmin = 50; Rmax = 100;
[centersDark, radiiDark] = imfindcircles(gray,[Rmin Rmax],'ObjectPolarity','dark');
figure,imshow(im,[])
viscircles(centersDark, radiiDark,'EdgeColor','b');
Input Image:
Result of Algorithm:
Thank You
Not sure about iris classification, but I've done written digit recognition from photos. I would recommend tuning up the contrast and saturation, then use a k-nearest neighbour algorithm to classify your images. Depending on your training set, you can get as high as 90% accuracy.
I think you are on the right track. Do image preprocessing to make classification easier, then train an algorithm of your choice. You would want to treat each image as one input vector though, instead of classifying each pixel!
I think you can try Active Shape Modelling or if you want a really feature rich modelling and do not care about the time it takes execute the algorithm you can try Active appearance modelling. You might want to look into these papers for better understanding:
Active Shape Models: Their Training and Application
Statistical Models of Appearance for Computer Vision - In Depth

How can HOG be used to detect individual body parts

Information:
I would like to use OpenCV's HOG detection to identify objects that can be seen in a variety of orientations. The only problem is, I can't seem to find a reasonable feature detector or classifier to detect this in a rotation and scale invaraint way (as is needed by objects such as forearms).
Prior Work:
Lets focus on forearms for this discussion. A forearm can have multiple orientations, the primary distinct features probably being its contour edges. It is possible to have images of forearms that are pointing in any direction in an image, thus the complexity. So far I have done some in depth research on using HOG descriptors to solve this problem, but I am finding that the variety of poses produced by forearms in my positives training set is producing very low detection scores in actual images. I suspect the issue is that the gradients produced by each positive image do not produce very consistent results when saved into the Histogram. I have reviewed many research papers on the topic trying to resolve or improvie this, including the original from Dalal & Triggs [Link]: http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf It also seems that the assumptions made for detecting whole humans do not necessary apply to detecting individual features (particularly the assumption that all humans are standing up seems to suggest HOG is not a good route for rotation invariant detection like that of forearms).
Note:
If possible, I would like to steer clear of any non-free solutions such as those pertaining to Sift, Surf, or Haar.
Question:
What is a good solution to detecting rotation and scale invariant objects in an image? Particularly for this example, what would be a good solution to detecting all orientations of forearms in an image?
I use hog to detect human heads and shoulders. To train particular part you have to give the location of it. If you use opencv, you can clip samples containing only the training part you want, and make sure all training samples share the same size. For example, I clip images to contain only head and shoulder and resize all them to 64x64. Other opensource codes may require you to pass the location as the input parameter, essentially the same.
Are you trying the Discriminatively trained deformable part model ?http://www.cs.berkeley.edu/~rbg/latent/
you may find answers there.

find mosquitos' head in the image

I have images of mosquitos similar to these ones and I would like to automatically circle around the head of each mosquito in the images. They are obviously in different orientations and there are random number of them in different images. some error is fine. Any ideas of algorithms to do this?
This problem resembles a face detection problem, so you could try a naïve approach first and refine it if necessary.
First you would need to recreate your training set. For this you would like to extract small images with examples of what is a mosquito head or what is not.
Then you can use those images to train a classification algorithm, be careful to have a balanced training set, since if your data is skewed to one class it would hit the performance of the algorithm. Since images are 2D and algorithms usually just take 1D arrays as input, you will need to arrange your images to that format as well (for instance: http://en.wikipedia.org/wiki/Row-major_order).
I normally use support vector machines, but other algorithms such as logistic regression could make the trick too. If you decide to use support vector machines I strongly recommend you to check libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/), since it's a very mature library with bindings to several programming languages. Also they have a very easy to follow guide targeted to beginners (http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf).
If you have enough data, you should be able to avoid tolerance to orientation. If you don't have enough data, then you could create more training rows with some samples rotated, so you would have a more representative training set.
As for the prediction what you could do is given an image, cut it using a grid where each cell has the same dimension that the ones you used on your training set. Then you pass each of this image to the classifier and mark those squares where the classifier gave you a positive output. If you really need circles then take the center of the given square and the radius would be the half of the square side size (sorry for stating the obvious).
So after you do this you might have problems with sizes (some mosquitos might appear closer to the camera than others) , since we are not trained the algorithm to be tolerant to scale. Moreover, even with all mosquitos in the same scale, we still might miss some of them just because they didn't fit in our grid perfectly. To address this, we will need to repeat this procedure (grid cut and predict) rescaling the given image to different sizes. How many sizes? well here you would have to determine that through experimentation.
This approach is sensitive to the size of the "window" that you are using, that is also something I would recommend you to experiment with.
There are some research may be useful:
A Multistep Approach for Shape Similarity Search in Image Databases
Representation and Detection of Shapes in Images
From the pictures you provided this seems to be an extremely hard image recognition problem, and I doubt you will get anywhere near acceptable recognition rates.
I would recommend a simpler approach:
First, if you have any control over the images, separate the mosquitoes before taking the picture, and use a white unmarked underground, perhaps even something illuminated from below. This will make separating the mosquitoes much easier.
Then threshold the image. For example here i did a quick try taking the red channel, then substracting the blue channel*5, then applying a threshold of 80:
Use morphological dilation and erosion to get rid of the small leg structures.
Identify blobs of the right size to be moquitoes by Connected Component Labeling. If a blob is large enough to be two mosquitoes, cut it out, and apply some more dilation/erosion to it.
Once you have a single blob like this
you can find the direction of the body using Principal Component Analysis. The head should be the part of the body where the cross-section is the thickest.

Using flipped images for machine learning dataset

I'v got a binary classification problem. I'm trying to train a neural network to recognize objects from images. Currently I've about 1500 50x50 images.
The question is whether extending my current training set by the same images flipped horizontally is a good idea or not? (images are not symetric)
Thanks
I think you can do this to a much larger extent, not just flipping the images horizontally, but changing the angle of the image by 1 degree. This will result in 360 samples for every instance that you have in your training set. Depending on how fast your algorithm is, this may be a pretty good way to ensure that the algorithm isn't only trained to recognize images and their mirrors.
It's possible that it's a good idea, but then again, I don't know what's the goal or the domain of the image recognition. Let's say the images contain characters and you're asking the image recognition software to determine if an image contains a forward slash / or a back slash \ then flipping the image will make your training data useless. If your domain doesn't suffer from such issues, then I'd think it's a good idea to flip them and even rotate with varying degrees.
I have used flipped images in AdaBoost with great success in the course: http://www.csc.kth.se/utbildning/kth/kurser/DD2427/bik12/Schedule.php
from the zip "TrainingImages.tar.gz".
I know there are some information on pros/cons with using flipped images somewhere in the slides (at the homepage) but I can't find it. Also a great resource is http://www.csc.kth.se/utbildning/kth/kurser/DD2427/bik12/DownloadMaterial/FaceLab/Manual.pdf (together with the slides) going thru things like finding things in different scales and orientation.
If the images patches are not symmetric I don't think its a good idea to flip. Better idea is to do some similarity transforms to the training set with some limits. Another way to increase the dataset is to add gaussian smoothed templates to it. Make sure that the number of positive and negative samples are proportional. Too many positive and too less negative might skew the classifier and give bad performance on testing set.
It depends on what your NN is based on. If you are extracting rotation invariant features or features that do not depend on the spatial position within the the image (like histograms or whatever) and train your NN with these features, then rotating will not be a good idea.
If you are training directly on pixel values, then it might be a good idea.
Some more details might be useful.

how to recognize an same image with different size ?

We as human, could recognize these two images as same image :
In computer, it will be easy to recognize these two image if they are in the same size, so we have to make Preprocessing stage or step before recognize it, like scaling, but if we look deeply to scaling process, we will know that it's not an efficient way.
Now, could you help me to find some way to convert images into objects that doesn't deal with size or pixel location, to be input for recognition method ?
Thanks advance.
I have several ideas:
Let the image have several color thresholds. This way you get large
areas of the same color. The shapes of those areas can be traced with
curves which are math. If you do this for the larger and the smaller
one and see if the curves match.
Try to define key spots in the area. I don't know for sure how
this works but you can look up face detection algoritms. In such
an algoritm there is a math equation for how a face should look.
If you define enough object in such algorithms you can define
multiple objects in the images to see if the object match on the
same spots.
And you could see if the predator algorithm can accept images
of multiple size. If so your problem is solved.
It looks like you assume that human's brain recognize image in computationally effective way, which is rather not true. this algorithm is so complicated that we did not find it. It also takes a large part of your brain to deal with visual data.
When it comes to software there are some scale(or affine) invariant algorithms. One of such algorithms is LeNet 5 neural network.

Resources