Using SVM with OpenCV?

I have feature vectors of size 100. There are 500 training samples in total, with 10 samples per class. I want to train a separate SVM classifier for each class. That is, the classifier for each class will be fed 10 positive instances (for that class) and 490 negative instances.
My OpenCV code is as follows.
For training:
Mat trainingDataMat(500, 100, CV_32FC1, trainingData); // trainingData is a 2D matrix
Mat labelsMat(500, 1, CV_32FC1, labels); // 10 positive and 490 negative labels
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;
CvSVM svm;
svm.train_auto(trainingDataMat, labelsMat, Mat(), Mat(), params, 5); // 5-fold cross-validation
svm.save(name);
For testing:
Mat sampleMat(1, size, CV_32FC1, testing_vector); // testing_vector is a 1D vector
CvSVM svm;
svm.load(name);
float response = svm.predict(sampleMat);
The problem is that the classifier for a class outputs -1 even when I give it a positive testing sample from the training set, and the same happens for the other testing samples.
I also tried a ONE_CLASS SVM, but it gives 0 for every testing sample.
Where am I going wrong, or what SVM type should I use? Please explain with code if possible.
Thank you in advance.

It seems you've missed the normalization step. The SVM classifier in OpenCV is based on libsvm, and the libsvm documentation says you should scale your training data to the interval [-1, 1] and keep the scale parameters. Then use those same scale parameters to scale your test data. This might be one problem. Another possible cause is the unequal number of positive and negative samples. Did you try classifying your training data as a sanity check after training the SVM?
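The scaling step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not OpenCV or libsvm code: the helper names fit_scaler and scale_row and the toy data are assumptions made up for this example. The point it shows is that the per-feature minimum and maximum come from the training set only and are then reused unchanged for test vectors.

```python
def fit_scaler(train_rows):
    """Return per-feature (min, max) pairs computed from the training rows only."""
    columns = list(zip(*train_rows))
    return [(min(col), max(col)) for col in columns]


def scale_row(row, scale_params, lo=-1.0, hi=1.0):
    """Map each feature into [lo, hi] using the stored training min/max."""
    scaled = []
    for value, (fmin, fmax) in zip(row, scale_params):
        if fmax == fmin:
            # Constant feature in the training set: map it to the middle of the range.
            scaled.append((lo + hi) / 2.0)
        else:
            scaled.append(lo + (hi - lo) * (value - fmin) / (fmax - fmin))
    return scaled


# Hypothetical toy data: 3 training samples with 2 features each.
train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
params = fit_scaler(train)
train_scaled = [scale_row(r, params) for r in train]

# The test vector reuses the training parameters; it is never rescaled on its own.
test_scaled = scale_row([2.5, 40.0], params)
```

Note that a test value outside the training range (40.0 here) lands outside [-1, 1] after scaling. That is expected when the training parameters are reused.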

Try using a linear kernel and approximately equal numbers of positives and negatives for each class. You can adjust precision/recall by setting the values of the gamma and cost parameters. Take a look at: The gamma and cost parameter of SVM
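One way to get roughly equal positives and negatives, as suggested above, is to keep every positive sample and draw a random subset of the negatives. The sketch below is plain Python under stated assumptions: the function name balance_by_subsampling, the fixed seed, and the stand-in data are all hypothetical. Discarding negatives is only one option. In OpenCV 2.4, CvSVMParams also has a class_weights field for C_SVC, which is not shown here.

```python
import random


def balance_by_subsampling(samples, labels, positive_label=1, seed=0):
    """Keep all positives and an equally sized random subset of negatives."""
    positives = [i for i, y in enumerate(labels) if y == positive_label]
    negatives = [i for i, y in enumerate(labels) if y != positive_label]
    rng = random.Random(seed)
    kept_negatives = rng.sample(negatives, min(len(positives), len(negatives)))
    keep = sorted(positives + kept_negatives)
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Hypothetical stand-in for the question's setup: 10 positives, 490 negatives.
labels = [1] * 10 + [-1] * 490
samples = [[float(i)] for i in range(500)]
balanced_samples, balanced_labels = balance_by_subsampling(samples, labels)
```

With the question's numbers this leaves 20 training rows per classifier, which is very little data, so it may be worth trying a milder ratio as well.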

Related

TensorFlow: Implementing a class-wise weighted cross entropy loss?

Assuming after performing median frequency balancing for images used for segmentation, we have these class weights:
class_weights = {0: 0.2595,
1: 0.1826,
2: 4.5640,
3: 0.1417,
4: 0.9051,
5: 0.3826,
6: 9.6446,
7: 1.8418,
8: 0.6823,
9: 6.2478,
10: 7.3614,
11: 0.0}
The idea is to create a weight_mask such that it could be multiplied by the cross entropy output of both classes. To create this weight mask, we can broadcast the values based on the ground_truth labels or the predictions. Some mathematics in my implementation:
Both labels and logits are of shape [batch_size, height, width, num_classes]
The weight mask is of shape [batch_size, height, width, 1]
The weight mask is broadcast across the num_classes channels of the product of the softmax of the logits and the labels, giving an output shape of [batch_size, height, width, num_classes]. In this case, num_classes is 12.
Reduce sum for each example in a batch, then perform reduce mean for all examples in one batch to get a single scalar value of loss.
In this case, should we create the weight mask based on the predictions or the ground truth?
If we build it based on the ground_truth, then it means no matter what the predicted pixel labels are, they get penalized based on the actual labels of the class, which doesn't seem to guide the training in a sensible way.
But if we build it based on the predictions, then for whatever logit predictions are produced, if the predicted label (from taking the argmax of the logits) is dominant, the logit values for that pixel will all be reduced by a significant amount.
Although the maximum logit will still be the maximum, since all of the logits in the 12 channels are scaled by the same value, the final softmax probability of the predicted label (which is the same label before and after scaling) will be lower than before scaling (I did some simple math to estimate this), so a lower loss is predicted.
But the problem is this: if a lower loss is predicted as a result of this weighting, then wouldn't it contradict the idea that predicting dominant labels should give you a greater loss?
The impression I get in total for this method is that:
Dominant labels are penalized and rewarded much less.
Less dominant labels are rewarded highly if the predictions are correct, but they are also penalized heavily for a wrong prediction.
So how does this help to tackle the issue of class-balancing? I don't quite get the logic here.
IMPLEMENTATION
Here is my current implementation for calculating the weighted cross entropy loss, although I'm not sure if it is correct.
def weighted_cross_entropy(logits, onehot_labels, class_weights):
    if not logits.dtype == tf.float32:
        logits = tf.cast(logits, tf.float32)
    if not onehot_labels.dtype == tf.float32:
        onehot_labels = tf.cast(onehot_labels, tf.float32)

    # Obtain the logit label predictions and form a skeleton weight mask with the same shape
    logit_predictions = tf.argmax(logits, -1)
    weight_mask = tf.zeros_like(logit_predictions, dtype=tf.float32)

    # Obtain the number of class weights to add to the weight mask
    num_classes = logits.get_shape().as_list()[3]

    # Form the weight mask mapping for each pixel prediction
    for i in range(num_classes):
        binary_mask = tf.equal(logit_predictions, i)  # Positions where class i is predicted
        binary_mask = tf.cast(binary_mask, tf.float32)  # Convert boolean to ones and zeros
        class_mask = tf.multiply(binary_mask, class_weights[i])  # Scale the ones by the class weight
        weight_mask = tf.add(weight_mask, class_mask)  # Add to the weight mask

    # Multiply the logits by the weight mask, then perform cross entropy
    weight_mask = tf.expand_dims(weight_mask, 3)  # Expand the fourth dimension to 1 for broadcasting
    logits_scaled = tf.multiply(logits, weight_mask)
    return tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits_scaled)
Could anyone verify whether my concept of this weighted loss is correct, and whether my implementation is correct? This is my first time getting acquainted with a dataset with imbalanced class, and so I would really appreciate it if anyone could verify this.
TESTING RESULTS: After doing some tests, I found the implementation above results in a greater loss. Is this supposed to be the case? i.e. Would this make the training harder but produce a more accurate model eventually?
SIMILAR THREADS
Note that I have checked a similar thread here: How can I implement a weighted cross entropy loss in tensorflow using sparse_softmax_cross_entropy_with_logits
But it seems that TF only has a sample-wise weighting for loss but not a class-wise one.
Many thanks to all of you.
Here is my own implementation in Keras using the TensorFlow backend:
import pickle
import tensorflow as tf

def class_weighted_pixelwise_crossentropy(target, output):
    # Clip so that we never take the log of 0
    output = tf.clip_by_value(output, 10e-8, 1. - 10e-8)
    with open('class_weights.pickle', 'rb') as f:
        weight = pickle.load(f)
    return -tf.reduce_sum(target * weight * tf.log(output))
where weight is just a standard Python list with the indexes of the weights matched to those of the corresponding class in the one-hot vectors. I store the weights as a pickle file to avoid having to recalculate them. It is an adaptation of the Keras categorical_crossentropy loss function. The first line simply clips the value to make sure we never take the log of 0.
I am unsure why one would calculate the weights using the predictions rather than the ground truth; if you provide further explanation I can update my answer in response.
Edit: Play around with this numpy code to understand how this works. Also review the definition of cross entropy.
import numpy as np

weights = [1, 2]
target = np.array([[[0.0, 1.0], [1.0, 0.0]],
                   [[0.0, 1.0], [1.0, 0.0]]])
output = np.array([[[0.5, 0.5], [0.9, 0.1]],
                   [[0.9, 0.1], [0.4, 0.6]]])
crossentropy_matrix = -np.sum(target * np.log(output), axis=-1)
crossentropy = -np.sum(target * np.log(output))
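To see what weighting by the ground truth does to these numbers, here is a rough sketch in plain Python that reuses the same targets, outputs and weights, flattened to four pixels. The helper pixel_loss is a name made up for this illustration, and this is a hand computation of the weighted sum rather than the Keras loss itself.

```python
import math

weights = [1.0, 2.0]  # one weight per class, as in the NumPy snippet
# Four pixels, flattened; each entry is a one-hot target or a softmax output.
targets = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs = [[0.5, 0.5], [0.9, 0.1], [0.9, 0.1], [0.4, 0.6]]


def pixel_loss(target, output, class_weights):
    """Cross entropy for one pixel, with each class term scaled by its weight."""
    return -sum(t * w * math.log(o) for t, w, o in zip(target, class_weights, output))


unweighted = sum(pixel_loss(t, o, [1.0, 1.0]) for t, o in zip(targets, outputs))
weighted = sum(pixel_loss(t, o, weights) for t, o in zip(targets, outputs))
per_pixel = [pixel_loss(t, o, weights) for t, o in zip(targets, outputs)]
```

Because the target is one-hot, only the weight of the true class survives the product. Pixels whose true class is 1 have their loss doubled, while pixels whose true class is 0 are unchanged, regardless of what the network predicted.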

CvSVM does not train when svm_type is NU_SVC

I am extracting SURF feature descriptors from face images.
Number of classes = 29 (so the class labels in the training data run from 1 to 29)
Number of training images = 3000+
I want to use CvSVM::NU_SVC in OpenCV 2.4.8 to train an SVM classifier on the 64-D feature vectors.
But when I call the train function, it returns false. Basically, it doesn't do the training.
I followed a procedure very similar to
http://docs.opencv.org/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html
Am I missing anything?

How to classify text with scikit's SVM?

I have a text classification task. So far I have only tagged a corpus and extracted some features in bigram format (i.e. bigram = [('word', 'word'), ..., ('word', 'word')]). I would like to classify some text. As I understand it, an SVM can only take vectors as input, so I use a vectorizer from scikit-learn as follows:
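from sklearn.feature_extraction import FeatureHasher

bigram = [[('load', 'superior'),
           ('point', 'medium'), ('color', 'white'),
           ('the load', 'tower')]]
fh = FeatureHasher(input_type='string')
# Join each bigram into a single string token, e.g. 'load superior'
X = fh.transform(((' '.join(x) for x in sample)
                  for sample in bigram))
print(X)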
the output is a sparse matrix:
(0, 226456) -1.0
(0, 607603) -1.0
(0, 668514) 1.0
(0, 715910) -1.0
How can I use the sparse matrix X above to classify with SVC, assuming that I have two classes and train and test sets?
As others have pointed out, your matrix is just a list of feature vectors for the documents in your corpus. Use these vectors as features for classification. You just need classification labels y and then you can use SVC().fit(X, y).
But... the way that you have asked this makes me think that maybe you don't have any classification labels. In this case, I think you want to be doing clustering rather than classification. You could use one of the clustering algorithms to do this. I suggest sklearn.cluster.MiniBatchKMeans to start. You can then output the top 5-10 words for each cluster and form labels from those.

Label Propagation in sklearn is classifying every vector as 1

I have 2000 labelled data (7 different labels) and about 100K unlabeled data and I am trying to use sklearn.semi_supervised.LabelPropagation. The data has 1024 dimensions. My problem is that the classifier is labeling everything as 1. My code looks like this:
X_unlabeled = X_unlabeled[:10000, :]
X_both = np.vstack((X_train, X_unlabeled))
y_both = np.append(y_train, -np.ones((X_unlabeled.shape[0],)))
clf = LabelPropagation(max_iter=100).fit(X_both, y_both)
y_pred = clf.predict(X_test)
y_pred is all ones. Also, X_train is 2000x1024 and X_unlabeled is a subset of the unlabeled data which is 10000x1024.
I also get this warning when calling fit on the classifier:
/usr/local/lib/python2.7/site-packages/sklearn/semi_supervised/label_propagation.py:255: RuntimeWarning: invalid value encountered in divide
self.label_distributions_ /= normalizer
Have you tried different values for the gamma parameter? The graph is constructed by computing an RBF kernel, so the computation includes an exponential, and Python's exponential functions return 0 when the argument is too large a negative number (see http://computer-programming-forum.com/56-python/ef71e144330ffbc2.htm). If the graph is filled with 0s, label_distributions_ is filled with NaN (because of the normalization) and a warning appears. (Be careful: the gamma value in the scikit-learn implementation is multiplied by the Euclidean distance, which is not the same thing as in the Zhu paper.)
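The underflow described above is easy to reproduce with the standard library. This is a small sketch, not scikit-learn's actual code: the helper rbf_weight, the squared distance of 1500 and the two gamma values are assumptions picked for illustration, and the weight is written as exp(-gamma * d^2).

```python
import math


def rbf_weight(squared_distance, gamma):
    """RBF edge weight exp(-gamma * d^2) between two points."""
    return math.exp(-gamma * squared_distance)


# Hypothetical squared distance between two 1024-dimensional vectors.
squared_distance = 1500.0

large_gamma_weight = rbf_weight(squared_distance, gamma=20.0)   # exponent is -30000
small_gamma_weight = rbf_weight(squared_distance, gamma=0.001)  # exponent is -1.5

# If every edge weight in a row underflows to 0, normalizing the row divides 0 by 0.
row = [large_gamma_weight] * 3
row_sum = sum(row)
normalized = [w / row_sum if row_sum != 0.0 else float("nan") for w in row]
```

With the large gamma every weight is exactly 0.0 and the normalized row is all NaN, while the small gamma gives a usable weight of about 0.22. For high-dimensional data with large distances, trying much smaller gamma values (or rescaling the features) is therefore a reasonable first experiment.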
The LabelPropagation will finally be fixed in version 0.19

Feeding extracted HoG features into CvSVM's train function

This may be a silly question, since I'm quite new to SVMs.
I've managed to extract features and locations using OpenCV's HOGDescriptor:
vector< float > features;
vector< Point > locations;
hog_descriptors.compute( image, features, Size(0, 0), Size(0, 0), locations );
Then I proceed to use CvSVM to train the SVM based on the features I've extracted.
Mat training_data( features );
CvSVM svm;
svm.train( training_data, labels, Mat(), Mat(), params );
Which gave me an error:
OpenCV Error: Bad argument (There is only a single class) in cvPreprocessCategoricalResponses, file /opt/local/var/macports/build/
My question is: how do I convert the features vector into an appropriate matrix to be fed into CvSVM? Obviously I am doing something wrong. The OpenCV tutorial shows that a 2D matrix containing the training data is fed into the SVM. So how do I convert the features vector into a 2D matrix, and what are the values in the second dimension?
What are these features exactly? Are they the 9 bins consisting of normalized magnitude histograms?
I found the issue. Since I was only testing whether it is correct to pass feature vectors into the SVM in order to train it, I didn't bother to prepare both negative and positive samples.
However, CvSVM requires at least two different classes for training, which is why it threw the error.
Thanks a lot anyway!
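For reference, the 2D layout asked about in the question can be sketched as follows: one row per training image and one column per HoG value, with one label per row. This is a plain Python illustration rather than OpenCV code. The function build_training_matrix and the tiny 4-value descriptors are hypothetical, and the checks only mirror the "single class" error above.

```python
def build_training_matrix(descriptors, labels):
    """Stack one descriptor per row and check the inputs CvSVM-style training needs."""
    if len(descriptors) != len(labels):
        raise ValueError("need exactly one label per descriptor")
    if len(set(labels)) < 2:
        raise ValueError("training needs at least two different classes")
    width = len(descriptors[0])
    if any(len(d) != width for d in descriptors):
        raise ValueError("all descriptors must have the same length")
    # Row i is sample i; column j is the j-th HoG value of that sample.
    return [list(d) for d in descriptors]


# Hypothetical 4-value descriptors; real HoG vectors are much longer.
positive = [0.1, 0.3, 0.2, 0.4]
negative = [0.5, 0.1, 0.1, 0.3]
matrix = build_training_matrix([positive, negative], [1, -1])
shape = (len(matrix), len(matrix[0]))
```

Passing two samples with the same label raises the ValueError, which corresponds to the situation that triggered the OpenCV error in this thread.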
