Feature selection - r-caret

I am trying to find a useful feature selection method for a set of 20,000 genes from an expression set (microarray), so that I end up with a model built on the useful genes only.
I tried using RFE from caret, but I got a stack overflow error, since backward selection does not support data where n(predictors) > n(samples).
Could anyone suggest a reasonable method for this, or a workaround for the RFE approach?
Thanks in advance.

Did you try using genetic algorithms for feature selection? There are several R packages for this: GA, genalg, and caret.
Take a look at this page, where feature selection with genetic algorithms is explained with an example: http://topepo.github.io/caret/GA.html
Hope it helps.
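For what it's worth, here is a minimal sketch of caret's GA wrapper (gafs). It assumes x is your expression matrix (samples in rows, genes in columns) and y is the outcome; the iteration and population settings are deliberately tiny for illustration, not tuned values:

    library(caret)

    # x: expression matrix (samples x genes), y: outcome (factor or numeric)
    ga_ctrl <- gafsControl(functions = rfGA,   # random-forest fitness helpers shipped with caret
                           method    = "cv",   # resampling used to score each candidate subset
                           number    = 5)

    set.seed(1)
    ga_fit <- gafs(x = x, y = y,
                   iters = 10,                 # GA generations (illustrative only)
                   popSize = 20,               # subsets per generation (illustrative only)
                   gafsControl = ga_ctrl)

    ga_fit$optVariables                        # genes in the best subset found

Be warned that a GA wrapper over 20,000 genes is very expensive; a common compromise is to pre-filter with a univariate score (for example caret's sbf) and run the GA only on the survivors.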

Related

Feature Selection Process For Regression

I am trying to solve a regression problem (predicting next month's expected revenue). I have come across different feature selection techniques such as:
Filter Method
Wrapper Method
Embedded Method
Q1: The problem is, I think those methods are for classification-type problems. So how can we use feature selection for a regression problem?
Q2: I have also come across "Regularization". Is it the only way to do feature selection for a regression problem?
I don't know the exact filter methods you mentioned, but you can use:
sklearn.feature_selection.RFE (Recursive Feature Elimination)
or
sklearn.decomposition.PCA (Principal Component Analysis)
I'm pretty sure you can use them for classification or regression.
Here's an example using RFE with LinearRegression: https://towardsdatascience.com/feature-selection-with-pandas-e3690ad8504b
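Wrapper methods do work for regression as long as the model being wrapped is a regression model. Since the thread above uses caret, here is a minimal R sketch (not taken from the linked post) of RFE wrapped around a linear model; the data are made up purely for illustration:

    library(caret)

    set.seed(1)
    # Made-up regression data: 100 observations, 20 candidate predictors
    X <- as.data.frame(matrix(rnorm(100 * 20), ncol = 20))
    y <- 3 * X$V1 - 2 * X$V5 + rnorm(100)      # only V1 and V5 really matter

    ctrl <- rfeControl(functions = lmFuncs,    # linear regression as the wrapped model
                       method = "cv", number = 5)

    rfe_fit <- rfe(X, y, sizes = c(2, 5, 10), rfeControl = ctrl)
    predictors(rfe_fit)                        # predictors in the best subset

As for Q2: regularization (e.g. the lasso) is an embedded method for regression, but it is not the only option; filter and wrapper methods also apply whenever the scoring statistic or wrapped model handles a numeric target.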

Maxima: Linear fit on data

I am new to Maxima. I have a set of data, (x, y, error), and I want to fit a straight line to it. I found some examples in "Maxima by Example, Chapter 5: 2D Plots and Graphics using qdraw", but honestly I don't know how to download and use the qdraw package.
Can anyone help?
I see that qdraw.mac is linked from the page you mentioned; try searching that page for "qdraw.mac" to find the download link.
Maxima has some capability to work with linear regression models, but other packages which are specifically devoted to statistics might be more suitable. Have you tried R? (http://www.r-project.org)
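If you do go the R route, a minimal sketch of a straight-line fit is below. The file name is made up, and weighting by 1/error^2 is an assumption about what your error column represents (per-point standard deviations):

    # Hypothetical file with three columns: x, y, error
    d <- read.table("data.txt", col.names = c("x", "y", "err"))

    # Weighted least squares, assuming err is the standard deviation of each y
    fit <- lm(y ~ x, data = d, weights = 1 / d$err^2)
    summary(fit)      # slope, intercept, and their standard errors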

What is Weka's InfoGainAttributeEval formula for evaluating Entropy with continuous values?

I'm using Weka's attribute selection function for information gain, and I'm trying to figure out the specific formula Weka uses when dealing with continuous data.
I understand the usual formula for entropy when the values in the data are discrete, and that when dealing with continuous data one can either use differential entropy or discretize the values. I've looked at Weka's explanation of InfoGainAttributeEval and searched through many other references, but I can't find anything.
Maybe it's just me, but would anyone know how Weka handles this case?
Thanks!
I asked the author Mark Hall and he said:
It uses the supervised MDL-based discretization method of Fayyad and Irani. See the javadocs:
http://weka.sourceforge.net/doc.stable-3-8/weka/attributeSelection/InfoGainAttributeEval.html
You can also see this link for the discretization method:
http://weka.sourceforge.net/doc.stable-3-8/weka/filters/supervised/attribute/Discretize.html
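In other words, once the Fayyad-Irani MDL discretization has turned the numeric attribute X into bins, the attribute is scored with the ordinary discrete information gain:

    H(Y)           = -\sum_{y} p(y) \log_2 p(y)
    H(Y \mid X)    = \sum_{x} p(x) \, H(Y \mid X = x)
    InfoGain(Y; X) = H(Y) - H(Y \mid X)

where Y is the class and the sums run over the class values and the discretized bins of X.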

How to remove redundant features using weka

I have around 300 features and I want to find the best subset of features by using feature selection techniques in weka. Can someone please tell me what method to use to remove redundant features in weka :)
There are mainly two types of feature selection techniques that you can use in Weka:
Feature selection with wrapper method:
"Wrapper methods consider the selection of a set of features as a search problem, where different combinations are prepared, evaluated and compared to other combinations. A predictive model us used to evaluate a combination of features and assign a score based on model accuracy.
The search process may be methodical such as a best-first search, it may stochastic such as a random hill-climbing algorithm, or it may use heuristics, like forward and backward passes to add and remove features.
An example if a wrapper method is the recursive feature elimination algorithm." [From http://machinelearningmastery.com/an-introduction-to-feature-selection/]
Feature selection with filter method:
"Filter feature selection methods apply a statistical measure to assign a scoring to each feature. The features are ranked by the score and either selected to be kept or removed from the dataset. The methods are often univariate and consider the feature independently, or with regard to the dependent variable.
Example of some filter methods include the Chi squared test, information gain and correlation coefficient scores." [From http://machinelearningmastery.com/an-introduction-to-feature-selection/]
If you are using Weka GUI, then you can take a look at two of my video casts here and here.
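In the Weka Explorer, both routes live under the "Select attributes" tab (pick an evaluator such as InfoGainAttributeEval with a Ranker search, or a subset evaluator with a BestFirst search). If you also have R handy, a quick filter-style sketch for the "redundant" part specifically is caret's findCorrelation, which flags one feature from each highly correlated pair; the 0.9 cutoff is an arbitrary illustrative choice:

    library(caret)

    # X: data frame holding the ~300 numeric features
    corr_mat <- cor(X)                                    # pairwise correlations
    drop_idx <- findCorrelation(corr_mat, cutoff = 0.90)  # columns flagged as redundant
    X_reduced <- if (length(drop_idx) > 0) X[, -drop_idx, drop = FALSE] else X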

How to extract features from image for classification and object recognition?

I'm confused about how I should build the "feature extraction" step.
I want to use SVMs for object recognition in images.
There's a sample in Emgu's examples that includes an XML file containing the features of a cat!
I've been trying for a week to figure out how they did it and what methods they used,
and I came across this page:
http://experienceopencv.blogspot.com/2011/02/learning-deformable-models-with-latent.html
which lays out the steps. It's quite complicated, and I couldn't do it myself.
I'm lost! Can anyone suggest an appropriate feature extraction method that is compatible with SVM learning?
Accord has an SVM example, but it's for handwriting and doesn't deal with color images.
Any helpful links?
Thanks.
All feature extraction methods are compatible with SVMs; you just need to choose one. Pick a method, extract the features, and then feed those features into the SVM. An explanation of feature extraction is here: http://en.wikipedia.org/wiki/Feature_extraction
You could concentrate on the Gabor filter, which is an advanced extractor used for face recognition and object recognition.
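Whichever extractor you end up with (Gabor responses, HOG, colour histograms, ...), the SVM side looks the same: a numeric feature matrix plus one label per image, and the same pipeline applies in Emgu or Accord. As a minimal sketch in R with the e1071 package, where feats, labels and new_feats are hypothetical objects standing in for your extracted features:

    library(e1071)

    # feats:  numeric matrix, one row of extracted features per training image (hypothetical)
    # labels: factor of object classes, one entry per training image (hypothetical)
    model <- svm(x = feats, y = labels, kernel = "radial", cost = 1)

    # Run the same extractor on new images, then classify the resulting feature rows
    pred <- predict(model, new_feats)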
