I used SIFT keypoint descriptors for detecting objects in an image. For that, I took the best matches and calculated the homography matrix.
Using this homography matrix, I found where the object lies in the test image.
Now, for samples where the object could not be found and which have to be checked manually, what measure could help to distinguish between negative and positive samples?
At present we separate the samples using the determinant of the homography matrix. Is there a better measure?
You may use the number of (filtered) point correspondences as a measure to distinguish between negative and positive samples, because positive samples consistently have many more point correspondences than negative ones.
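For example, a minimal OpenCV sketch of that measure (the image paths, the Lowe ratio of 0.75, the RANSAC reprojection threshold and the final cut-off of 10 inliers are all illustrative assumptions):

```python
import cv2
import numpy as np

template = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)   # reference object image
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)       # test image

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(template, None)
kp2, des2 = sift.detectAndCompute(scene, None)

# KNN matching + Lowe's ratio test keeps only reasonably unambiguous matches
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]

inlier_count = 0
if len(good) >= 4:                                            # homography needs 4+ correspondences
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if mask is not None:
        inlier_count = int(mask.sum())                        # RANSAC-filtered correspondences

is_positive = inlier_count >= 10                              # threshold tuned on labeled samples
```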
1) In the eigenface approach, each eigenface is a combination of elements from different faces. What are these elements?
2) The output face is an image composed of different eigenfaces with different weights. What exactly do the weights of the eigenfaces mean? I know that a weight is the percentage of an eigenface in the image, but what does that mean exactly; is it the number of selected pixels?
Please study PCA to understand the physical meaning of eigenfaces when PCA is applied to an image. The answer lies in understanding the eigenvectors and eigenvalues associated with PCA.
EigenFaces is based on Principal Component Analysis (PCA).
PCA performs dimensionality reduction: it finds the distinctive features in the training images and discards the features the face images have in common.
With only the distinctive features, the recognition task becomes simpler.
Using PCA you calculate the eigenvectors of your face image data.
From these eigenvectors you calculate an EigenFace for every training subject, i.e. an EigenFace for every class in your data.
So if you have 9 classes, then the number of EigenFaces will be 9.
The weight usually means how important something is.
In EigenFaces, the weight of a particular EigenFace is a vector that tells you how important that EigenFace is in contributing to the MeanFace.
Now, if you have 9 EigenFaces, then for every EigenFace you get exactly one weight vector of dimension N, where N is the number of eigenvectors.
So every element out of the N elements in a weight vector tells you how important that particular eigenvector is for the corresponding EigenFace.
Face recognition with EigenFaces is done by comparing the weights of the training images and the test images with some kind of distance function.
You can refer to this GitHub link: https://github.com/jayshah19949596/Computer-Vision-Course-Assignments/blob/master/EigenFaces/EigenFaces.ipynb
The code at that link is well documented, so if you know the basics you will understand it.
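If it helps, here is a rough NumPy-only sketch of the recognition pipeline (the array names and the use of plain Euclidean distance on the weight vectors are my assumptions, not taken from the linked notebook):

```python
# Minimal eigenfaces sketch: PCA on flattened face images, weights are the
# projections onto the eigenvectors, recognition picks the nearest weight vector.
import numpy as np

def fit_eigenfaces(train_faces, num_components=9):
    # train_faces: (num_images, height * width) array of flattened grayscale faces
    mean_face = train_faces.mean(axis=0)
    centered = train_faces - mean_face
    # Rows of Vt are the eigenvectors (principal directions) of the face data
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = Vt[:num_components]            # each row is one eigenface
    weights = centered @ eigenfaces.T           # one weight vector per training image
    return mean_face, eigenfaces, weights

def recognize(test_face, mean_face, eigenfaces, train_weights, train_labels):
    w = (test_face - mean_face) @ eigenfaces.T            # weights of the test face
    dists = np.linalg.norm(train_weights - w, axis=1)     # distance function
    return train_labels[np.argmin(dists)]                 # label of the closest training face
```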
I have a feature vector in matrix notation, and I have data points in the 2D plane. How do I find out whether my data points are linearly separable with that feature vector?
In 2D one can check whether there exists a line that divides the data points into two groups. If there isn't such a line, how do I check for linear separability in higher dimensions?
A theoretical answer
If we assume the samples of the two classes are distributed according to Gaussians, the decision boundary is in general described by a quadratic function.
If the covariance matrices are identical we get a linear decision boundary.
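To make that concrete, a sketch of the two-class Gaussian discriminant (c and c' just collect the constant terms):

```latex
% Boundary between N(\mu_1, \Sigma_1) and N(\mu_2, \Sigma_2): quadratic in x in general
g(\mathbf{x}) = -\tfrac{1}{2}\,\mathbf{x}^\top\!\left(\Sigma_1^{-1} - \Sigma_2^{-1}\right)\mathbf{x}
              + \left(\mu_1^\top \Sigma_1^{-1} - \mu_2^\top \Sigma_2^{-1}\right)\mathbf{x} + c = 0

% With identical covariances \Sigma_1 = \Sigma_2 = \Sigma the quadratic term cancels,
% leaving a linear (hyperplane) boundary:
\left(\mu_1 - \mu_2\right)^\top \Sigma^{-1}\,\mathbf{x} + c' = 0
```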
A practical answer
See the SO discussion here.
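A practical sketch of that idea (assuming scikit-learn; X, y and the hard-margin C value are illustrative, not taken from that discussion): fit a linear SVM with a very large C and check whether it classifies every training point correctly, which works in 2D and in higher dimensions alike.

```python
import numpy as np
from sklearn.svm import LinearSVC

def looks_linearly_separable(X, y, C=1e6):
    # A (nearly) hard-margin linear SVM: if the data is linearly separable,
    # it should reach perfect training accuracy.
    clf = LinearSVC(C=C, max_iter=100000)
    clf.fit(X, y)
    return clf.score(X, y) == 1.0

# Toy 2D example
X = np.array([[0, 0], [0, 1], [2, 2], [2, 3]], dtype=float)
y = np.array([0, 0, 1, 1])
print(looks_linearly_separable(X, y))  # True for this toy data
```

This is a numerical check, so for data that is separable only by a tiny margin the optimizer may not reach exactly 100% training accuracy; a linear program over the separation constraints gives an exact answer if you need one.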
I am using the SIFT feature detector and descriptor and I am matching points between two images. I am using OpenCV's findHomography() function with the RANSAC method.
When I read about the RANSAC algorithm, it is said that adjusting a threshold parameter for RANSAC can improve the results, but I don't want to hardcode any parameter.
I know RANSAC removes outliers among the matches. Can anyone tell me whether removing outliers (not all of them) with basic methods before computing the homography improves the resulting homography?
If so, what operation can we apply before RANSAC to remove outliers?
What is your definition of a good result? RANSAC is about a trade-off between the number of points and their precision, so there is no single definition of good: with a looser threshold you get more inliers but their accuracy is worse, and vice versa.
The parameter you are talking about is probably the outlier threshold, and it may simply be badly tuned, so you end up with too many approximate inliers or too few very accurate ones. Now, if you pre-filter your outliers you will just speed up your RANSAC, but you are unlikely to improve the solution. Ultimately the speed of RANSAC with a homography comes down to the probability of selecting 4 inliers, and the higher their proportion, the faster the convergence.
Other ways to sort out outliers before applying RANSAC are to look at simpler constraints, such as the ordering of points, straight lines remaining straight lines, cross-ratios and other invariants of a homography transformation. Finally, you may want to use higher-level features such as lines to calculate the homography. Note that in homogeneous coordinates, when points transform as p2 = H * p1, lines transform as l2 = H^(-T) * l1. This can actually increase accuracy (since lines are macro features and less noisy than interest points), and straight lines can be detected via a Hough transform.
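As a quick numerical sanity check of that line-transformation rule (H and l1 below are arbitrary illustrative values, not from any particular image pair):

```python
# If p1 lies on line l1 (l1 . p1 = 0) and points map as p2 = H p1,
# then the line maps as l2 = H^(-T) l1, so l2 . p2 must also be 0.
import numpy as np

H = np.array([[1.2, 0.1, 5.0],
              [-0.2, 0.9, 3.0],
              [1e-3, 2e-3, 1.0]])
l1 = np.array([1.0, -2.0, 4.0])          # line a*x + b*y + c = 0

x = 2.0
y = (l1[0] * x + l1[2]) / -l1[1]         # pick a point on l1
p1 = np.array([x, y, 1.0])
assert abs(l1 @ p1) < 1e-9               # p1 lies on l1

p2 = H @ p1                              # transformed point
l2 = np.linalg.inv(H).T @ l1             # transformed line: l2 = H^(-T) l1
print(l2 @ p2)                           # ~0, so p2 lies on the transformed line
```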
No, the whole point of RANSAC and related algorithms is to remove outliers.
However, it is possible to refine the algorithm in ways that avoid defining a somewhat arbitrary threshold.
A good starting point is Torr's old MLESAC paper.
I'm trying to implement the original and circular Local Binary Pattern (LBP) with uniform pattern mapping for a face recognition application.
So far I've finished the LBP descriptor extraction and spatial histogram construction steps. Now I have to work on the face classification and recognition phases. As the original paper on the subject suggests, the simplest classifier uses the Chi-square statistic as a dissimilarity measure between the histograms of 2 face images. The formula seems straightforward, but I don't know how to decide, from the resulting Chi-square value, whether 2 histograms represent the same face or different faces. So my question is: what is the optimal threshold value to use as the borderline between same faces and different faces, and how can I determine that value?
I've come across some source code on the internet that sets the LBP threshold to 180.0, and I have no idea where this value came from.
I would greatly appreciate your help. Thanks for reading.
In the same/not-same setting, you learn the optimal threshold from the training set. Given, say, 1000 same and 1000 not-same pairs for training, run a for loop over the threshold. For each threshold value, calculate the accuracy as 0.5 * (percentage of same pairs with distance < current threshold) + 0.5 * (percentage of not-same pairs with distance >= current threshold), and keep track of the threshold that gives the best value.
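A minimal sketch of that loop (the chi_square helper and the synthetic distances at the bottom are only illustrative; plug in the distances computed from your own LBP histograms):

```python
import numpy as np

def chi_square(h1, h2, eps=1e-10):
    # Chi-square dissimilarity between two histograms
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def learn_threshold(dists_same, dists_diff):
    # Sweep every observed distance as a candidate threshold and keep the best one
    candidates = np.sort(np.concatenate([dists_same, dists_diff]))
    best_t, best_acc = None, -1.0
    for t in candidates:
        acc = 0.5 * np.mean(dists_same < t) + 0.5 * np.mean(dists_diff >= t)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Usage with synthetic distances, just to show the call:
rng = np.random.default_rng(0)
same = rng.normal(100.0, 20.0, 1000)     # same-face pairs tend to have smaller distances
diff = rng.normal(200.0, 40.0, 1000)
print(learn_threshold(same, diff))
```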
By the way, for the same/not-same setting, I would also recommend considering one-shot similarity.
In the original HOG (Histogram of Oriented Gradients) paper, http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf, there are some images showing the HOG representation of an image (Figure 6). In this figure, parts (f) and (g) are captioned "HOG descriptor weighted by respectively the positive and the negative SVM weights".
I don't understand what this means. I understand that when I train an SVM I get a weight vector, and to classify I have to use the features (HOG descriptors) as the input of the decision function. So what do they mean by positive and negative weights? And how would I plot them like in the paper?
Thanks in advance.
The weights tell you how significant a specific element of the feature vector is for a given class. That means that if you see a high value in your feature vector, you can look up the corresponding weight:
If the weight is a large positive number, it's more likely that your object is of the class.
If the weight is a large negative number, it's more likely that your object is NOT of the class.
If the weight is close to zero, this position is mostly irrelevant for the classification.
Now those weights are used to scale the feature vector, where the gradient magnitudes are mapped to color intensity. Because you can't display negative color intensities, they decided to split the visualization into a positive and a negative part. In the visualizations you can then see which parts of the input image contribute to the class (positive) and which don't (negative).
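A rough sketch of how you could produce such maps yourself (assuming scikit-image's HOG and a linear SVM trained on the same feature layout; the 1x1 block setting is a simplification so the feature vector maps cleanly back onto cells, and all names are illustrative):

```python
import numpy as np
from skimage.feature import hog

def svm_weighted_hog_maps(image, clf, ppc=(8, 8), bins=9):
    # image: 2D grayscale array whose size is a multiple of ppc;
    # clf: a fitted sklearn LinearSVC trained on HOG features of the same layout.
    feat = hog(image, orientations=bins, pixels_per_cell=ppc,
               cells_per_block=(1, 1), feature_vector=True)
    w = clf.coef_.ravel()
    w_pos = np.clip(w, 0, None)          # weights voting "for" the class
    w_neg = np.clip(-w, 0, None)         # weights voting "against" the class

    cells_y = image.shape[0] // ppc[0]
    cells_x = image.shape[1] // ppc[1]
    # With 1x1 blocks the flattened descriptor is (cells_y, cells_x, bins);
    # weight each orientation bin and sum per cell to get an image-shaped map.
    pos_map = (feat * w_pos).reshape(cells_y, cells_x, bins).sum(axis=2)
    neg_map = (feat * w_neg).reshape(cells_y, cells_x, bins).sum(axis=2)
    return pos_map, neg_map              # show each with e.g. matplotlib's imshow
```

The positive map highlights the cells that push the SVM score towards the class and the negative map the cells that push it away, which is essentially what panels (f) and (g) in the paper visualize (the paper draws the weighted orientation glyphs rather than per-cell sums).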