Different usage of Machine Learning classifiers [closed] - machine-learning

I have learned about several classifiers in machine learning: decision trees, neural networks, SVMs, Bayesian classifiers, k-NN, etc.
Can anyone please help me understand when I should prefer one classifier over another? For example, in which situations (nature of the data set, etc.) should I prefer a decision tree over a neural net, or in which situations might an SVM work better than a Bayesian classifier?
Sorry if this is not a good place to post this question.
Thanks.

This depends EXTREMELY on the nature of the dataset. There are several meta-learning approaches that will tell you which classifier to use, but generally there isn't a golden rule.
If your data is easily separable (entries from different classes are easy to distinguish), decision trees or SVMs (with a linear kernel) may be good enough. However, if your data needs to be transformed into other [higher-]dimensional spaces, kernel-based classifiers such as RBF SVMs might work well. SVMs also work better with non-redundant, independent features. When combinations of features are needed, artificial neural networks and Bayesian classifiers work well too.
Yet again, this is highly subjective and strongly depends on your feature set. For instance, having a single feature that is highly correlated with the class might determine which classifier works best. That said, overall, the no-free-lunch theorem says that no classifier is best at everything, but SVMs are generally regarded as the current best bet for binary classification.
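As an illustration of the separability point, here is a minimal scikit-learn sketch; the toy datasets are assumptions, chosen to make the contrast visible:

    # Compare a linear and an RBF SVM on linearly separable vs. ring-shaped data.
    from sklearn.datasets import make_blobs, make_circles
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X_easy, y_easy = make_blobs(n_samples=200, centers=2, random_state=0)
    X_hard, y_hard = make_circles(n_samples=200, noise=0.1, factor=0.4,
                                  random_state=0)

    for name, X, y in (("separable", X_easy, y_easy), ("circles", X_hard, y_hard)):
        for kernel in ("linear", "rbf"):
            # The linear kernel suffices on the separable blobs; the ring-shaped
            # "circles" data needs the RBF kernel's implicit feature space.
            score = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
            print(name, kernel, round(score, 3))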

Related

Which supervised machine learning classification method suits for randomly spread classes? [closed]

If classes are randomly spread, or the data has a lot of noise, which type of supervised ML classification model will give better results, and why?
It is difficult to say which classifier will perform best on general problems. It often requires testing of a variety of algorithms on a given problem in order to determine which classifier performs best.
Best performance also depends on the nature of the problem. There is a great answer in this Stack Overflow question which looks at various scoring metrics. For each problem, one needs to understand and consider which scoring metric will be best.
All of that said, neural networks, Random Forest classifiers, Support Vector Machines, and a variety of others are all candidates for creating useful models given that classes are, as you indicated, equally distributed. When classes are imbalanced, the rules shift slightly, as most ML algorithms assume balance.
My suggestion would be to try a few different algorithms, tune the hyperparameters, and compare them for your specific application. You will often find one algorithm is better, but not remarkably so. In my experience, how your data are preprocessed and how your features are prepared is often of far greater importance. Once again, this is a highly generic answer, as it depends greatly on your given application.
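For instance, a minimal sketch of such a comparison with scikit-learn; the synthetic dataset and the model list are illustrative assumptions:

    # Cross-validate a few default classifiers on noisy synthetic data.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    # flip_y=0.2 randomly flips 20% of the labels to simulate class noise.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0,
                               flip_y=0.2)

    for model in (RandomForestClassifier(random_state=0),
                  SVC(),
                  MLPClassifier(max_iter=1000, random_state=0)):
        scores = cross_val_score(model, X, y, cv=5)
        print(type(model).__name__, round(scores.mean(), 3))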

Neural Networks - Difference between deep autoencoder and stacked autoencoder [closed]

Disclaimer: I also posted this question on CrossValidated but it is not receiving any attention. If this is not the place for it I will gladly remove it.
As I understand it, the only difference between them is the way the two networks are trained. Deep autoencoders are trained in the same way as a single-layer neural network, while stacked autoencoders are trained with a greedy, layer-wise approach. Hugo Larochelle confirms this in a comment on this video. I wonder: is this the ONLY difference? Any pointers?
The terminology in the field isn't fixed or clearly defined, and different researchers can mean different things by, or add different aspects to, the same terms. Example discussions:
What is the difference between Deep Learning and traditional Artificial Neural Network machine learning? (some people think that 2 layers is deep enough, some mean 10+ or 100+ layers).
Multi-layer perceptron vs deep neural network (mostly synonyms, but there are researchers who prefer one over the other).
As for AEs, according to various sources, deep autoencoder and stacked autoencoder are exact synonyms; e.g., here's a quote from "Hands-On Machine Learning with Scikit-Learn and TensorFlow":
Just like other neural networks we have discussed, autoencoders can have multiple hidden layers. In this case they are called stacked autoencoders (or deep autoencoders).
Later on, the author discusses two methods of training an autoencoder and uses both terms interchangeably.
I would agree that the perception of the term "stacked" is that an autoencoder can be extended with new layers without retraining, but this is actually true regardless of how the existing layers have been trained (jointly or separately). Also, regardless of the training method, researchers may or may not call the result deep. So I wouldn't focus too much on the terminology; it may stabilize some day, but not right now.
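For concreteness, here is a minimal sketch of the two training regimes discussed above, assuming Keras; the layer sizes and toy data are illustrative:

    # Joint ("deep") vs. greedy layer-wise ("stacked") autoencoder training.
    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    X = np.random.rand(1000, 64).astype("float32")  # toy inputs

    # Joint training: build the full autoencoder and train it end to end.
    deep_ae = keras.Sequential([
        keras.Input(shape=(64,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(64, activation="sigmoid"),
    ])
    deep_ae.compile(optimizer="adam", loss="mse")
    deep_ae.fit(X, X, epochs=5, verbose=0)

    # Greedy layer-wise training: train one shallow autoencoder at a time,
    # then feed its codes to the next one.
    codes = X
    for size in (32, 16):
        in_dim = codes.shape[1]
        shallow = keras.Sequential([
            keras.Input(shape=(in_dim,)),
            layers.Dense(size, activation="relu"),
            layers.Dense(in_dim, activation="sigmoid"),
        ])
        shallow.compile(optimizer="adam", loss="mse")
        shallow.fit(codes, codes, epochs=5, verbose=0)
        # Keep only the trained encoder half to produce the next layer's inputs.
        encoder = keras.Sequential([keras.Input(shape=(in_dim,)),
                                    shallow.layers[0]])
        codes = encoder.predict(codes, verbose=0)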

Dimensionality / noise reduction techniques for regression problems [closed]

What are some techniques for dimensionality reduction in regression problems? I have tried the only unsupervised techniques I know, PCA and kernel PCA (using the scikit-learn library), but I have not seen any improvements using these. Perhaps these are only suitable for classification problems? What are some other techniques I can try? Preferably ones that are implemented in sklearn.
This is a very general question, and the suitability of the techniques (or combinations of them) really depends on your problem's specifics.
In general, there are several categories of dimension reduction (aside from those you mentioned):
Perhaps the simplest form of dimension reduction is to just use some of the features, in which case we are really talking about feature selection (see sklearn's module).
Another way would be to cluster (sklearn's), and replace each cluster by an aggregate of its components.
Finally, some regressors use l1 penalization and properties of convex optimization to simultaneously select a subset of features; in sklearn, see the lasso and elastic net.
Once again, this is a very broad problem. There are entire books, and even competitions, devoted to feature selection, which is a subset of dimension reduction.
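As a minimal sketch of the l1-penalization route mentioned above (the synthetic dataset and the alpha value are illustrative assumptions):

    # Select features via Lasso's l1 penalty, then shrink the design matrix.
    from sklearn.datasets import make_regression
    from sklearn.feature_selection import SelectFromModel
    from sklearn.linear_model import Lasso

    X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

    lasso = Lasso(alpha=1.0).fit(X, y)  # l1 drives weak coefficients to zero
    selector = SelectFromModel(lasso, prefit=True)
    X_reduced = selector.transform(X)   # keep only non-zero-coefficient features
    print(X.shape, "->", X_reduced.shape)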
Adding to @AmiTavory's good answer: PCA (principal component analysis) can be used here. If you do not wish to perform dimensionality reduction, simply retain the same number of eigenvectors from the PCA as the size of the input matrix: in your case, 20.
The resulting output will be orthogonal eigenvectors; you may consider them to provide the "transformation" you are seeking, as follows: the vectors are ranked by the amount of variance they explain with respect to the inputs.
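A minimal sketch of this use of PCA with scikit-learn, keeping all 20 components as in the example above (the toy data is an assumption):

    # Use PCA as an orthogonal rotation: keep as many components as inputs.
    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(100, 20)  # toy data with 20 features
    pca = PCA(n_components=20).fit(X)
    X_rotated = pca.transform(X)  # same shape, new orthogonal axes
    print(pca.explained_variance_ratio_)  # components ranked by explained variance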

ANN and SVM classification [closed]

Where is ANN classification (regression) better than SVM? Some real-world examples?
There are many applications where they're better, many where they're comparable, and many where they're worse. It also depends on who you ask. It is hard to say that one suits this type of data and the other suits that type of data or application.
An example where ANNs, in particular convolutional neural networks, work better than SVMs is digit classification on MNIST. Another such case is the work of Geoff Hinton's group on speech recognition using Deep Belief Networks.
Recently I read a paper proving a theoretical equivalence between ANNs and SVMs. However, ANNs are usually slower than SVMs.
I am just finishing some out-of-the-box comparisons between support vector machines and neural networks on several popular regression and classification datasets. First results, in short: SVMs learn fast and predict slowly; neural networks learn slowly but predict fast and have very lightweight models. Concerning accuracy/loss, both methods seem to be on par.
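A rough sketch of that kind of out-of-the-box comparison, using scikit-learn defaults; the dataset choice and models are illustrative assumptions:

    # Time fitting and prediction for a default SVM and a small MLP.
    import time
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    for model in (SVC(), MLPClassifier(max_iter=1000, random_state=0)):
        t0 = time.time()
        model.fit(X_train, y_train)
        fit_time = time.time() - t0
        t0 = time.time()
        acc = model.score(X_test, y_test)   # predicts on the test set
        predict_time = time.time() - t0
        print(type(model).__name__, f"fit={fit_time:.2f}s",
              f"predict={predict_time:.2f}s", f"acc={acc:.3f}")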
It will largely depend, as both have different tradeoffs and design criteria. There has been some work showing a relationship, and some say an equivalence, as seen in other answers to this question. Below is another reference that draws links between these two techniques in machine learning:
Ronan Collobert and Samy Bengio. 2004. Links between perceptrons, MLPs and SVMs. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML '04). ACM, New York, NY, USA, 23. DOI: https://doi.org/10.1145/1015330.1015415

Why is Bayesian filtering better than Neural Networks when classifying spam? [closed]

According to several people on Stack Overflow, Bayesian filtering is better than neural networks for detecting spam.
According to the literature I've read, that shouldn't be the case. Please explain!
There is no mathematical proof or explanation for why applications of neural networks have not been as good at detecting spam as Bayesian filters. This does not mean that neural networks would not produce similar or better results, but the time it would take to tweak the neural network topology and train it to get even approximately the same results as a Bayesian filter is simply not justified. At the end of the day, people care about results and about minimizing the time/effort spent achieving those results. When it comes to spam detection, Bayesian filters get you the best results with the least amount of effort and time. If a spam detection system using Bayesian filters detects 99% of the spam correctly, then there is very little incentive for people to spend a lot of time adjusting neural networks just so they can eke out an extra 0.5% or so.
"According to the literature I've read that shouldn't be the case."
It's technically correct. If properly configured, a neural network would get results as good as or even better than Bayesian filters, but it's the cost/benefit ratio that makes the difference, and ultimately the trend.
Neural networks work mostly as a black-box approach. You determine your inputs and outputs; after that, finding a suitable architecture (a two-hidden-layer multilayer perceptron, an RBF network, etc.) is done mostly empirically. There are suggestions for determining the architecture, but they are, well, suggestions.
This is good for some problems, since we, the domain analysts, do not have enough information about the problem itself. The ability of an NN to find an answer on its own is a desirable thing.
A Bayesian network, on the other hand, is designed mostly by the domain analyst. Since spam classification is a well-known problem, a domain analyst can tweak the architecture more easily, and a Bayesian network can get better results more easily this way.
Also, most NNs are not very good with changing features and therefore almost always need to be RE-trained, an expensive operation. A Bayesian network, on the other hand, may only need its probabilities updated.
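For illustration, here is a minimal sketch of the Bayesian-filter idea using scikit-learn's multinomial naive Bayes; the tiny corpus and labels are made up for demonstration:

    # A toy spam filter: word counts fed into naive Bayes.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    messages = ["win free money now", "meeting at noon tomorrow",
                "free prize claim now", "lunch with the team"]
    labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

    spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
    spam_filter.fit(messages, labels)
    print(spam_filter.predict(["claim your free prize"]))  # likely [1]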
