I'm using libSVM to train a binary classifier on 38 training instances consisting of ~250 features.
Before training I scale the data to [0,1] and perform a grid search to find the best parameters. However, I get the same accuracy results for all combinations of settings. I'm wondering if this indicates a problem, and if so what problem?
Thanks in advance!
Related
I am working on a Time Series Dataset where i want to do forcasting and prediction both. So, if you have any suggestion please share. Thank You!
T-Smote
This allows one to both impute fully missing observations to allow uniform time series classification across the entire data and, in special cases, to impute individually missing features. To do so, we slightly generalize the well-known class imbalance algorithm SMOTE to allow component wise nearest neighbor interpolation that preserves correlations when there are no missing features. We visualize the method in the simplified setting of 2-dimensional uncoupled harmonic oscillators. Next, we use tSMOTE to train an Encoder/Decoder long-short term memory (LSTM) model with Logistic Regression for predicting and classifying distinct trajectories of different 2D oscillators.
I have train dataset and test dataset from two different sources. I mean they are from two different experiments but the results of both of them are same biological images. I want to do binary classification using deep CNN and I have following results on test accuracy and train accuracy. The blue line shows train accuracy and the red line shows test accuracy after almost 250 epochs. Why the test accuracy is almost constant and not raising? Is that because Test and Train dataset are come from different distributions?
Edited:
After I have add dropout layer, reguralization terms and mean subtraction I still get following strange results which says the model is overfitting from the beginning!
There could be 2 reasons. First you overfit on the training data. This can be validated by using the validation score as a comparison metric to the test data. If so you can use standard techniques to combat overfitting, like weight decay and dropout.
The second one is that your data is too different to be learned like this. This is harder to solve. You should first look at the value spread of both images. Are they both normalized. Matplotlib normalizes automatically for plotted images. If this still does not work you might want to look into augmentation to make your training data more similar to the test data. Here I can not tell you what to use, without seeing both the trainset and the testset.
Edit:
For normalization the test set and the training set should have a similar value spread. If you do dataset normalization you calculate mean and std on training set. But you also need to use those calculated values on the test set and not calculate the test set values from the test set. This only makes sense if the value spread is similar for both the training and test set. If this is not the case you might want to do per sample normalization first.
Other augmentation that are commonly used for every dataset are oversampling, random channel shifts, random rotations, random translation and random zoom. This makes you invariante to those operations.
I am new to Text Mining. I am working on Spam filter. I did text cleaning, removed stop words. n-grams are my features. So I build a frequency matrix and build model using Naive Bayes. I have very limited set of training data, so I am facing the following problem.
When a sentence comes to me for classification and if none of its features match with the existing features in training then my frequency vector has only zeros.
When I send this vector for classification, I obviously get a useless result.
What can be ideal size of training data to expect better results?
Generally, the more data you have, the better. You will get diminishing returns at some point. It is often a good idea to see if your training set size is a problem by plotting the cross validation performance while varying the size of the training set. In scikit-learn has an example of this type of "learning curve."
Scikit-learn Learning Curve Example
You may consider bringing in outside sample posts to increase the size of your training set.
As you grow your training set, you may want to try reducing the bias of your classifier. This could be done by adding n-gram features, or switching to a logistic regression or SVM model.
When a sentence comes to me for classification and if none of its features match with the existing features in training then my frequency vector has only zeros.
You should normalize your input so that it forms some kind of rough distribution around 0. A common method is to do this tranformation:
input_signal = (feature - feature_mean) / feature_stddev
Then all zeroes would only happen if all features were exactly at the mean.
I am currently working on age estimation and extracting features using Gabor Filters. I have decided to classify ages using a SVM, the training is successful and it does not take a long time since the feature vectors are only about 3000 in dimensions by a 1000 training samples.
The problem is that when I want to predict an image, the result returned is always 18 (I pass the feature vector of the testing image to the predict function of the SVM). But when I predict an age from the training set the result is always correct. I do not know why this is happening. Any help will be appreciated.
Also, an observation is that whether or not I change and/ or include the SVMParams the prediction still outputs the same value.
I am using CvSVM to classify only two types of facial expression. I used LBP(Local Binary Pattern) based histogram to extract features from the images, and trained using cvSVM::train(data_mat,labels_mat,Mat(),Mat(),params), where,
data_mat is of size 200x3452, containing normalized(0-1) feature histogram of 200 samples in row major form, with 3452 features each(depends on number of neighbourhood points)
labels_mat is corresponding label matrix containing only two value 0 and 1.
The parameters are:
CvSVMParams params;
params.svm_type =CvSVM::C_SVC;
params.kernel_type =CvSVM::LINEAR;
params.C =0.01;
params.term_crit=cvTermCriteria(CV_TERMCRIT_ITER,(int)1e7,1e-7);
The problem is that:-
while testing I get very bad result (around 10%-30% accuracy), even after applying with different kernel and train_auto() function.
CvSVM::predict(test_data_mat,true) gives 'NaN' output
I will greatly appreciate any help with this, it's got me stumped.
I suppose, that your classes linearly hard/non-separable in feature space you use.
May be it will be better to apply PCA to your dataset before classifier training step
and estimate effective dimensionality of this problem.
Also I think it will be userful test your dataset with other classifiers.
You can adapt for this purpose standard opencv example points_classifier.cpp.
It includes a lot of different classifiers with similar interface you can play with.
The SVM generalization power is low.In the first reduce your data dimension by principal component analysis then change your SVM kerenl type to RBF.