I classified a large number of images captured by the same sensor, so the images should have a similar spectral response.
However, the result I get is different for the same class in two neighboring images (see the images).
(image: classification result)
I don't know whether the problem lies with the classifier or with the pixel values.
I checked the pixel values, which look consistent, and the classifier is well tested, but the problem is still there. Any idea where it might come from?
Thank you
Related
I'm trying to detect trees in a picture and I have tried different approaches. In the end I chose a BOW+SVM approach, building a dictionary and then feeding it into the .train(...) of my SVM. I found out that this isn't enough for what I need, since the classification gives me wrong results. So I tried thresholding my test images by clustering on color, but of course grass is detected too, which isn't very helpful.

My training images (for the dictionary and for SVM training) are all .png images (trees without background), and that's why I'm trying to segment my test images (which do have background) to isolate the trees. I have tried some basic OpenCV segmentation techniques, but they don't seem to work. One method I tried is to cluster the image by color, keeping only the green part, and then eliminate the grass by deleting all rows where more than 70% of the pixels are colored; but this strategy only works for the test images that contain a large area of grass. Is there any other strategy I can follow? Thank you
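For reference, this is roughly the green-masking and row-removal step I mean, as a Python/OpenCV sketch (the HSV bounds and the 70% row threshold are placeholders I would still have to tune):

import cv2
import numpy as np

def keep_green_drop_grass_rows(bgr_img, row_fraction=0.7):
    # Keep only "green" pixels, then blank rows that are mostly green
    # (assumed to be grass). Bounds and threshold are illustrative only.
    hsv = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))  # rough green range
    row_green = mask.mean(axis=1) / 255.0    # fraction of green pixels per row
    mask[row_green > row_fraction, :] = 0    # drop rows that are mostly grass
    return cv2.bitwise_and(bgr_img, bgr_img, mask=mask)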
I've done feature extraction using GLCM and k-NN for classification. What I need to do now is troubleshooting, to analyse why some images have been classified wrongly. I want to display the nearest neighbors of the test data, but not just as points like below:
I want to display the images that are nearest to each test image, so it is easy to see visually why those images are near each other. My problem is that I don't know how to get back to the images whose features were extracted, since they are now represented only as arrays of numbers.
What should I do?
The KNeighborsClassifier of Scikit-Learn has a kneighbors function which returns the distances and indices of the k nearest neighbors. It can help you find the nearest images for each test image.
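As a rough sketch, assuming you kept the list of training image paths in the same order as the feature rows you passed to fit (the variable names here are just illustrative):

import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# X_train: GLCM feature rows, y_train: labels,
# train_paths: image file paths in the same order as X_train (assumed to exist).
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

distances, indices = knn.kneighbors(X_test)   # shapes: (n_test, 3)

# Show the 3 training images nearest to the first test sample.
for rank, idx in enumerate(indices[0]):
    plt.subplot(1, 3, rank + 1)
    plt.imshow(plt.imread(train_paths[idx]), cmap='gray')
    plt.title('d = %.3f' % distances[0][rank])
plt.show()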
I'm implementing, for the first time, software for object detection in static images. My first goal is to detect simple circles; then I'll move on to more complex objects. Unfortunately, it seems I have a problem when validating my classifier.
My choice was to use a HOG descriptor (using OpenCV) and an SVM as classifier (using svmlight). The code compiles and works, but there is something that seems odd to me, probably concerning the SVM.
I have:
a training set composed of 5 images (48x48px) of different circles and 5 images (48x48px) of non-circles (I know these are too few to get a solid classifier, but up to now it's just to test that everything works)
a test set composed of 4 images (48x48px, with circles as big as the ones used for training) and 1 much bigger image (765x600px) with circles of multiple sizes and other geometric shapes.
What happens is that:
the circles in the test set are not detected when the images are 48x48px, even though the test set contains some images that were used in the training phase.
in the 765x800 image (which contains circles of all sizes), the circles that are the same size as the training ones, or bigger, are correctly identified.
I'm using the following parameters:
hog: winSize=48x48px, winStride=4x4px, cellSize=4px, blockSize=8px, blockStride=4x4px
classifier: svmlight SVM regression with a linear kernel and C=0.01 (RBF results are worse than linear)
This is the API call that performs the detection, with the parameters I'm using.
vector<Rect> found;
double hitThreshold = 0.;   // detection tolerance
Size winStride(4, 4);       // 4x4px window stride, as listed above
Size padding(32, 32);       // padding added around each window
double scale = 1.05;        // pyramid scale step
int groupThreshold = 2;     // minimum rectangles per detection group
hog.detectMultiScale(testImg, found, hitThreshold, winStride, padding, scale, groupThreshold);
Is there any reason why the circles in the 48x48px images are not detected, while the circles in the bigger image are? I expected the 48x48px images to be correctly classified in order to validate the classifier. I only added the bigger image after nothing was detected in the 48x48px images.
Besides, what seems even stranger is that the 48x48px test set contains some images that were used in the training set, and I would expect those to be identified, yet they are not! (I know that the training set and the test set should be different, but I did that only after nothing was detected.)
This is my first experience with HOG descriptors and SVMs, so it might not work because of a configuration error or a bad choice of images.
Any help is welcome!
Thanks in advance :)
I have images of mosquitos similar to these and I would like to automatically draw a circle around the head of each mosquito in the images. They are obviously in different orientations, and there is a random number of them in different images. Some error is fine. Any ideas for algorithms to do this?
This problem resembles a face detection problem, so you could try a naïve approach first and refine it if necessary.
First you would need to create your training set. For this you would extract small image patches that are examples of what is a mosquito head and what is not.
Then you can use those images to train a classification algorithm. Be careful to have a balanced training set, since data skewed towards one class will hurt the algorithm's performance. Since images are 2D and algorithms usually take 1D arrays as input, you will need to arrange your images into that format as well (for instance in row-major order: http://en.wikipedia.org/wiki/Row-major_order).
I normally use support vector machines, but other algorithms such as logistic regression could do the trick too. If you decide to use support vector machines, I strongly recommend you check out libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvm/), since it's a very mature library with bindings for several programming languages. They also have an easy-to-follow guide targeted at beginners (http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf).
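A minimal sketch of those training steps in Python with scikit-learn (whose SVC wraps libsvm); the patch size, file layout and normalization are assumptions, not requirements:

import numpy as np
import cv2
from glob import glob
from sklearn.svm import SVC

def load_patches(pattern, label, size=(24, 24)):
    # Load grayscale patches and flatten them row-major into 1D feature rows.
    rows, labels = [], []
    for path in glob(pattern):
        patch = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        patch = cv2.resize(patch, size)
        rows.append(patch.reshape(-1))   # row-major flattening
        labels.append(label)
    return rows, labels

pos, pos_y = load_patches('heads/*.png', 1)       # mosquito-head examples (path assumed)
neg, neg_y = load_patches('not_heads/*.png', 0)   # background examples (path assumed)

X = np.array(pos + neg, dtype=np.float32) / 255.0  # keep the two classes balanced
y = np.array(pos_y + neg_y)

clf = SVC(kernel='linear')  # libsvm under the hood
clf.fit(X, y)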
If you have enough data, the classifier should become tolerant to orientation on its own. If you don't have enough data, you can create more training rows from rotated copies of your samples, so you have a more representative training set.
As for the prediction: given an image, cut it using a grid where each cell has the same dimensions as the patches you used in your training set. Then pass each of these cells to the classifier and mark the squares for which the classifier gives a positive output. If you really need circles, take the center of each such square; the radius would be half the side of the square (sorry for stating the obvious).
After you do this you might have problems with sizes (some mosquitos might appear closer to the camera than others), since we have not trained the algorithm to be tolerant to scale. Moreover, even with all mosquitos at the same scale, we might still miss some of them just because they don't fit our grid perfectly. To address this, repeat the procedure (grid cut and predict) while rescaling the given image to different sizes. How many sizes? Well, that is something you will have to determine through experimentation.
This approach is also sensitive to the size of the "window" you are using, which is something else I would recommend you experiment with.
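And a rough sketch of the grid-cut-and-predict loop over several scales, reusing the classifier and 24x24 window from the sketch above (the step size and scale factors are just placeholders to experiment with):

import cv2
import numpy as np

def detect_heads(gray, clf, win=24, step=12, scales=(1.0, 0.75, 0.5)):
    # Slide a win x win window over rescaled copies of the image and
    # return (center_x, center_y, radius) circles for positive windows.
    circles = []
    for s in scales:
        resized = cv2.resize(gray, None, fx=s, fy=s)
        h, w = resized.shape
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                cell = resized[y:y + win, x:x + win]
                feat = cell.reshape(1, -1).astype(np.float32) / 255.0
                if clf.predict(feat)[0] == 1:
                    # Map the window center back to the original image scale.
                    circles.append((int((x + win / 2) / s),
                                    int((y + win / 2) / s),
                                    int(win / (2 * s))))
    return circles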
There is some research that may be useful:
A Multistep Approach for Shape Similarity Search in Image Databases
Representation and Detection of Shapes in Images
From the pictures you provided this seems to be an extremely hard image recognition problem, and I doubt you will get anywhere near acceptable recognition rates.
I would recommend a simpler approach:
First, if you have any control over the images, separate the mosquitoes before taking the picture, and use a white unmarked background, perhaps even something illuminated from below. This will make separating the mosquitoes much easier.
Then threshold the image. For example, here I did a quick attempt by taking the red channel, subtracting the blue channel times 5, and applying a threshold of 80:
Use morphological dilation and erosion to get rid of the small leg structures.
Identify blobs of the right size to be mosquitoes by connected component labeling. If a blob is large enough to be two mosquitoes, cut it out and apply some more dilation/erosion to it.
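A rough OpenCV/Python sketch of the thresholding, morphology and connected-component steps above (the threshold of 80 is the one mentioned; the kernel size and area bounds are assumptions to tune):

import cv2
import numpy as np

img = cv2.imread('mosquitos.png')         # file name is illustrative
b = img[:, :, 0].astype(np.int32)         # OpenCV loads images as BGR
r = img[:, :, 2].astype(np.int32)

# Red channel minus 5x the blue channel, then a fixed threshold of 80.
diff = np.clip(r - 5 * b, 0, 255).astype(np.uint8)
_, binary = cv2.threshold(diff, 80, 255, cv2.THRESH_BINARY)

# Morphological opening/closing to remove the thin leg structures.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

# Connected component labeling; keep blobs in a plausible body-size range.
n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
bodies = [i for i in range(1, n)
          if 500 < stats[i, cv2.CC_STAT_AREA] < 5000]  # area bounds assumed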
Once you have a single blob like this
you can find the direction of the body using Principal Component Analysis. The head should be the part of the body where the cross-section is the thickest.
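Once a single blob has been isolated, its main axis can be obtained from the blob's pixel coordinates; here is a hedged NumPy sketch (the thickest-cross-section search for the head is only indicated in a comment):

import numpy as np

def body_axis(binary_blob):
    # Return the centroid and the unit vector of the blob's main axis via PCA.
    ys, xs = np.nonzero(binary_blob)              # coordinates of blob pixels
    pts = np.column_stack((xs, ys)).astype(np.float64)
    centroid = pts.mean(axis=0)
    cov = np.cov((pts - centroid).T)              # 2x2 covariance of the coordinates
    eigvals, eigvecs = np.linalg.eigh(cov)
    axis = eigvecs[:, np.argmax(eigvals)]         # direction of largest variance
    # To find the head, walk along 'axis' and measure the blob width
    # perpendicular to it; the head should sit near the thickest cross-section.
    return centroid, axis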
I've got a binary classification problem. I'm trying to train a neural network to recognize objects in images. Currently I have about 1500 images of 50x50 pixels.
The question is whether extending my current training set with the same images flipped horizontally is a good idea or not. (The images are not symmetric.)
Thanks
I think you can take this much further: not just flipping the images horizontally, but also rotating each image in 1-degree steps. This will give you 360 samples for every instance in your training set. Depending on how fast your algorithm is, this may be a pretty good way to ensure that the algorithm isn't trained only on the images and their mirrors.
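For what it's worth, a small sketch of that kind of rotation augmentation with OpenCV in Python (the 1-degree step comes from the suggestion above; the border handling is an assumption):

import cv2

def rotations(img, step_deg=1):
    # Yield copies of a training image rotated in step_deg-degree increments.
    h, w = img.shape[:2]
    center = (w / 2.0, h / 2.0)
    for angle in range(0, 360, step_deg):
        M = cv2.getRotationMatrix2D(center, angle, 1.0)
        yield cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REPLICATE)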
It may be a good idea, but then again, I don't know what the goal or domain of the image recognition is. Say the images contain characters and you are asking the image recognition software to determine whether an image contains a forward slash / or a backslash \; then flipping the images will make your training data useless. If your domain doesn't suffer from such issues, then I'd say it's a good idea to flip them and even rotate them by varying degrees.
I have used flipped images with AdaBoost with great success in this course: http://www.csc.kth.se/utbildning/kth/kurser/DD2427/bik12/Schedule.php
using the images from the zip "TrainingImages.tar.gz".
I know there is some information on the pros/cons of using flipped images somewhere in the slides (on the homepage), but I can't find it. Another great resource is http://www.csc.kth.se/utbildning/kth/kurser/DD2427/bik12/DownloadMaterial/FaceLab/Manual.pdf (together with the slides), which goes through things like finding objects at different scales and orientations.
If the image patches are not symmetric, I don't think it's a good idea to flip them. A better idea is to apply some similarity transforms to the training set, within limits. Another way to increase the dataset is to add Gaussian-smoothed templates to it. Make sure that the numbers of positive and negative samples stay proportional; too many positives and too few negatives might skew the classifier and give bad performance on the test set.
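As a small illustration of those two ideas, a hedged sketch in Python with OpenCV; the transform limits and the blur sigma are assumptions to tune:

import cv2
import numpy as np

def augment(img, max_angle=10, scale_range=(0.9, 1.1), blur_sigma=1.0):
    # Return a mildly transformed copy (limited similarity transform)
    # plus a Gaussian-smoothed copy of the same training patch.
    h, w = img.shape[:2]
    angle = np.random.uniform(-max_angle, max_angle)
    scale = np.random.uniform(*scale_range)
    M = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, scale)
    transformed = cv2.warpAffine(img, M, (w, h), borderMode=cv2.BORDER_REPLICATE)
    smoothed = cv2.GaussianBlur(img, (0, 0), blur_sigma)
    return transformed, smoothed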
It depends on what your NN is based on. If you are extracting rotation-invariant features, or features that do not depend on the spatial position within the image (like histograms or whatever), and train your NN on these features, then rotating will not be a good idea.
If you are training directly on pixel values, then it might be a good idea.
Some more details might be useful.