I recently learned about the Bayesian linear regression model, but I'm confused about when we should use ordinary linear regression and when to use the Bayesian version. How do the two compare in performance?
Also, are Bayesian logistic regression and ordinary logistic regression the same? I read a paper about using Bayesian probit regression to predict ad CTR, and I wonder why the Bayesian version was used.
In both of your cases, linear regression and logistic regression, the Bayesian version carries out the same statistical analysis within the framework of Bayesian inference, e.g., Bayesian linear regression.
Per Wikipedia:
This (ordinary linear regression) is a frequentist approach, and it assumes that there are enough measurements to say something meaningful. In the Bayesian approach, the data are supplemented with additional information in the form of a prior probability distribution. The prior belief about the parameters is combined with the data's likelihood function according to Bayes theorem to yield the posterior belief about the parameters.
The usual workflow of a Bayesian analysis (a code sketch follows the steps below):
Figure out the likelihood function of the data.
Choose a prior distribution over all unknown parameters.
Use Bayes' theorem to find the posterior distribution over all parameters.
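As a concrete illustration of these three steps, here is a minimal sketch of Bayesian linear regression in the conjugate Gaussian case, where the posterior has a closed form. The synthetic data, and the assumption that the noise precision beta and the prior precision alpha are known, are illustrative simplifications:

```python
import numpy as np

# Synthetic data: y = 1 + 2x + Gaussian noise (an illustrative assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
X = np.column_stack([np.ones_like(x), x])          # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, size=50)

# Step 1: Gaussian likelihood y ~ N(Xw, beta^-1), with known noise precision.
beta = 1.0 / 0.3**2
# Step 2: Gaussian prior over the weights, w ~ N(0, alpha^-1 I).
alpha = 1.0
# Step 3: Bayes' theorem gives the Gaussian posterior in closed form:
#   S_N = (alpha*I + beta*X^T X)^-1,   m_N = beta * S_N X^T y
S_N = np.linalg.inv(alpha * np.eye(X.shape[1]) + beta * X.T @ X)
m_N = beta * S_N @ X.T @ y

print("posterior mean of the weights:", m_N)
print("posterior covariance:", S_N)
```

Instead of a single point estimate of the weights, the output is a full distribution over them; the covariance quantifies how uncertain the model still is.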
Why use the Bayesian version? [1]
Bayesian models are more flexible and can handle more complex problems.
Bayesian model selection is arguably superior to frequentist criteria such as BIC/AIC.
Bayesian hierarchical models are easier to extend to many levels.
There are philosophical differences (compared to frequentist analysis).
Bayesian analysis can be more accurate in small samples (though the results may then depend on the choice of priors).
Bayesian models can incorporate prior information.
The linked reference [1] hosts some good lecture slides about Bayesian analysis.
I have a dataset on which I need to apply binary classification to predict a target value. I trained five algorithms on the training set: logistic regression, Naive Bayes, KNN, SVM, and decision trees. Logistic regression gives me the highest accuracy, but I did not preprocess my dataset. Should I train my model with all five algorithms again, or is it certain that logistic regression will again give the highest accuracy after preprocessing the training data?
No one can tell you whether the results will be different after pre-processing, since we don't know what pre-processing you will be doing. But best practice is to pre-process the dataset uniformly, so that all algorithms are trained on the same data.
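As a minimal sketch of that practice with scikit-learn (the synthetic dataset and default hyperparameters are illustrative assumptions, not your setup), each algorithm can be wrapped in a Pipeline so identical pre-processing is applied to every model:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# A stand-in for the asker's data.
X, y = make_classification(n_samples=1000, random_state=0)

models = {
    "logistic_regression": LogisticRegression(),
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "decision_tree": DecisionTreeClassifier(),
}
for name, clf in models.items():
    # The scaler is re-fit inside each cross-validation fold, and every
    # algorithm sees identically pre-processed data.
    pipe = make_pipeline(StandardScaler(), clf)
    print(name, cross_val_score(pipe, X, y, cv=5).mean())
```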
What are some techniques for dimensionality reduction in regression problems? I have tried the only unsupervised techniques I know, PCA and kernel PCA (using the scikit-learn library), but I have not seen any improvement from them. Perhaps they are only suitable for classification problems? What are some other techniques I can try? Preferably ones that are implemented in sklearn.
This is a very general question, and the suitability of these techniques (or combinations of them) really depends on your problem specifics.
In general, there are several categories of dimensionality reduction (aside from those you mentioned).
Perhaps the simplest form of dimensionality reduction is to just use some of the features, in which case we are really talking about feature selection (see sklearn's feature_selection module).
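For instance, a minimal feature-selection sketch (the dataset and the choice of k are illustrative assumptions):

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

# A stand-in regression dataset with 50 features, 5 of them informative.
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       random_state=0)

# Keep the 5 features with the strongest univariate relationship to y.
X_reduced = SelectKBest(score_func=f_regression, k=5).fit_transform(X, y)
print(X_reduced.shape)  # (200, 5)
```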
Another way would be to cluster the features (see sklearn's cluster module) and replace each cluster by an aggregate of its components.
Finally, some regressors use l1 penalization and properties of convex optimization to select a subset of features as part of the fit; in sklearn, see Lasso and ElasticNet.
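A sketch of the l1 route (the alpha value and dataset are illustrative assumptions):

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       random_state=0)

# The l1 penalty drives most coefficients to exactly zero;
# SelectFromModel then keeps only the features with nonzero weight.
lasso = Lasso(alpha=1.0).fit(X, y)
X_reduced = SelectFromModel(lasso, prefit=True).transform(X)
print(X_reduced.shape)
```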
Once again, this is a very broad problem. There are entire books, and even competitions, on feature selection alone, which is a subset of dimensionality reduction.
Adding to @AmiTavory's good answer: PCA (principal component analysis) can be used here. If you do not wish to reduce dimensionality, simply retain as many eigenvectors from the PCA as the size of the input matrix: in your case, 20.
The resulting output will be a set of orthogonal eigenvectors; you may consider them to provide the "transformation" you are seeking. The vectors are ranked by the amount of variance each one explains with respect to the inputs.
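A minimal sketch (assuming 20 input features, as in your case; the random data is a stand-in):

```python
import numpy as np
from sklearn.decomposition import PCA

# A stand-in for a dataset with 20 input features.
X = np.random.RandomState(0).rand(100, 20)

pca = PCA(n_components=20)   # keep all 20 components: a rotation, not a reduction
X_transformed = pca.fit_transform(X)

# Components are ordered by the fraction of input variance each explains.
print(pca.explained_variance_ratio_)
```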
I have a pretty simple question; however, I have searched extensively and am unable to find the answer. Is a genetic algorithm considered to be a form of unsupervised learning? I know that the algorithm evolves independently; however, the fitness of each individual in the population is regularly measured (supervised?).
The objective of my algorithm is to optimize a set of heuristic weights via a genetic algorithm.
Thank you for your help!
Genetic Algorithms can be used for both supervised and unsupervised learning, e.g.:
Zorana Banković, Slobodan Bojanić, Octavio Nieto, and Atta Badii. "Unsupervised Genetic Algorithm Deployed for Intrusion Detection" (2008).
If you have labeled training data or tagged examples, then you are using supervised training.
From http://en.wikipedia.org/wiki/Unsupervised_learning
In machine learning, the problem of unsupervised learning is that of trying to find hidden structure in unlabeled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution. This distinguishes unsupervised learning from supervised learning and reinforcement learning.
From which it is pretty clear that genetic algorithms are not unsupervised, as they are measured against a fitness criterion. Individual mutations may not be supervised, but the system as a whole is supervised, as mutations are either removed or built upon based on the fitness they give the algorithm.
From http://en.wikipedia.org/wiki/Reinforcement_learning
Reinforcement learning is an area of machine learning inspired by behaviorist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. The problem, due to its generality, is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, statistics, and genetic algorithms.
This would suggest that genetic algorithms are considered to fall under reinforcement learning.
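To make the role of the fitness function concrete, here is a minimal sketch of a genetic algorithm optimizing a weight vector. The toy objective (distance to a fixed target vector) and all hyperparameters are illustrative assumptions; in the asker's setting, the fitness would score the heuristic weights on the actual task:

```python
import random

# Hypothetical objective: in the asker's setting this would score the
# heuristic weights on the real task; here it is a toy target to match.
TARGET = [0.2, 0.5, 0.9, 0.1]

def fitness(weights):
    # Negative squared distance to the target: higher is better.
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

def mutate(weights, sigma=0.1):
    return [w + random.gauss(0.0, sigma) for w in weights]

def evolve(pop_size=50, generations=100):
    population = [[random.random() for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        # The fitness evaluation is the feedback signal: it decides which
        # individuals survive and reproduce.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [mutate(random.choice(survivors))
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    return max(population, key=fitness)

print(evolve())  # should end up close to TARGET
```

Note that nothing here is a labeled example; the feedback comes from the fitness score, which is exactly the reward-like structure that makes the reinforcement-learning framing natural.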
I have learned about several classifiers in machine learning: decision trees, neural networks, SVMs, Bayesian classifiers, k-NN, etc.
Can anyone please help me understand when I should prefer one classifier over another? For example, in which situations (nature of the data set, etc.) should I prefer a decision tree over a neural net, or in which situations might an SVM work better than a Bayesian classifier?
Sorry if this is not a good place to post this question.
Thanks.
This is EXTREMELY dependent on the nature of the dataset. There are several meta-learning approaches that will suggest which classifier to use, but generally there isn't a golden rule.
If your data is easily separable (it is easy to distinguish entries from different classes), decision trees or SVMs with a linear kernel may be good enough. However, if your data needs to be transformed into other (higher-dimensional) spaces, kernel-based classifiers such as RBF SVMs might work well. SVMs also work better with non-redundant, independent features. When combinations of features are needed, artificial neural networks and Bayesian classifiers work well too.
Yet again, this is highly subjective and strongly depends on your feature set. For instance, having a single feature that is highly correlated with the class might determine which classifier works best. That said, the no-free-lunch theorem says that no classifier is best for everything, but SVMs are generally regarded as the current best bet for binary classification.
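As a small illustration of the linear-versus-kernel point above, here is a minimal sketch on a toy dataset that is not linearly separable (the two-moons set; the dataset and hyperparameters are illustrative assumptions):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Two interleaving half-circles: not separable by a straight line.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

for kernel in ("linear", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print(kernel, scores.mean())
# The RBF kernel typically scores higher here, since the classes only
# become separable after a nonlinear transformation.
```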
In which settings is ANN classification (or regression) better than SVM? What are some real-world examples?
There are many applications where ANNs are better, many where they're comparable, and many where they are worse. It also depends on who you ask. It is hard to say that one is better for this type of data or that type of application.
An example where ANNs, in particular convolutional neural networks, work better than SVMs is digit classification on MNIST. Another such case is the work of Geoff Hinton's group on speech recognition using Deep Belief Networks.
I recently read a paper proving a theoretical equivalence between ANNs and SVMs. However, ANNs are usually slower than SVMs.
I am just finishing an out-of-the-box comparison between support vector machines and neural networks on several popular regression and classification datasets. First results, in short: SVMs learn fast and predict slowly; neural networks learn slowly but predict fast and have very lightweight models. Concerning accuracy/loss, the two methods seem to be on par.
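A minimal sketch of that kind of timing comparison with scikit-learn (the synthetic dataset and model sizes are illustrative assumptions; exact timings will vary by machine):

```python
import time

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

for name, clf in [("SVM", SVC()), ("neural net", MLPClassifier(max_iter=500))]:
    t0 = time.time()
    clf.fit(X, y)                     # training time
    fit_time = time.time() - t0
    t0 = time.time()
    clf.predict(X)                    # prediction time
    predict_time = time.time() - t0
    print(f"{name}: fit {fit_time:.2f}s, predict {predict_time:.2f}s")
```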
It largely depends, as the two have different tradeoffs and design criteria. There has been some work showing a relationship, and some say an equivalence, as seen in other answers to this question. Below is another reference which draws links between these two techniques in machine learning:
Ronan Collobert and Samy Bengio. 2004. Links between perceptrons, MLPs and SVMs. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML '04). ACM, New York, NY, USA, 23. DOI: https://doi.org/10.1145/1015330.1015415