I am studying machine learning and have been pondering the following question:
Which is more accurate when the utility function is unknown: generative classification or the discriminant-function model, and why?
In the documentation of Scikit-Learn Decision Trees, it is stated that:
Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
What is meant by non-parametric supervised learning?
Non-parametric is the opposite of parametric. In a parametric learning model, you can describe the hypothesis set (or learning model) by a finite number of parameters; a linear SVM is one example.
Hence, a non-parametric model can be seen as a model that would need an infinite number of parameters to be described, i.e., the distribution of the data cannot be defined by a finite set of parameters [1].
[2] An easy-to-understand non-parametric model is the k-nearest neighbors algorithm, which makes predictions for a new data instance based on the k most similar training patterns. The method assumes nothing about the form of the mapping function other than that patterns which are close are likely to have similar output values.
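To make that concrete, here is a minimal from-scratch sketch of k-nearest neighbors (the toy data, labels, and k are made up for illustration). Note that the "model" is the entire training set: there is no fixed, finite parameter vector, which is exactly the non-parametric property described above.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs. The whole training
    set is retained, so the model's complexity grows with the data rather
    than being fixed in advance.
    """
    # Sort training points by squared Euclidean distance to the query.
    nearest = sorted(
        train,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    # Majority label among the k closest points.
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data: two clusters labelled "neg" and "pos".
train = [((0.0, 0.0), "neg"), ((0.1, 0.2), "neg"), ((0.2, 0.1), "neg"),
         ((1.0, 1.0), "pos"), ((0.9, 1.1), "pos"), ((1.1, 0.9), "pos")]

print(knn_predict(train, (0.95, 1.0)))  # → pos
```

Contrast this with a parametric model, where after training you could throw the data away and keep only a fixed set of weights.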
I am a bit confused about the topic of deep learning.
My question: let's assume we've got a task to solve: reviews should be classified as positive or negative using a Keras deep learning model.
Now: does this task belong to supervised or unsupervised learning, and why? How do deep learning and neural networks work here? How do they learn? Would it be better to use a (classical) machine learning algorithm for this task?
Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples.
Unsupervised learning is a type of machine learning algorithm used to draw inferences from datasets consisting of input data without labeled responses. The most common unsupervised learning method is cluster analysis, which is used for exploratory data analysis to find hidden patterns or grouping in data.
(Definitions from Wikipedia and MathWorks)
There are already labeled datasets for the task you mentioned (with the actual sentiment label for each review), hence you can always model it as a supervised learning problem and use a machine learning model such as an SVM, a Random Forest, or an MLP to solve it.
https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data
https://www.kaggle.com/snap/amazon-fine-food-reviews
https://www.kaggle.com/jessicali9530/kuc-hackathon-winter-2018
https://www.kaggle.com/nicapotato/womens-ecommerce-clothing-reviews
https://www.kaggle.com/utathya/imdb-review-dataset
https://www.kaggle.com/datafiniti/hotel-reviews
https://www.kaggle.com/sid321axn/amazon-alexa-reviews
https://www.kaggle.com/bittlingmayer/amazonreviews
https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews
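To show how the supervised framing answers the question, here is a deliberately tiny from-scratch sketch (the reviews, vocabulary, and training loop are made up for illustration): each example is an input-output pair, and the learner adjusts its weights whenever it mislabels an example. A real solution would train an SVM, Random Forest, or Keras MLP on one of the datasets above in exactly the same labeled-pairs setup.

```python
# Toy supervised sentiment classifier: bag-of-words features + a perceptron.
# Each training example is an (input, label) pair -- the "supervisory signal".
train = [
    ("great movie loved it", 1),
    ("wonderful acting great plot", 1),
    ("terrible boring waste", 0),
    ("awful plot boring acting", 0),
]

vocab = sorted({w for text, _ in train for w in text.split()})

def featurize(text):
    # Binary bag-of-words vector over the training vocabulary.
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

weights = [0.0] * len(vocab)
bias = 0.0

# Perceptron training: nudge the weights whenever an example is misclassified.
for _ in range(10):
    for text, label in train:
        x = featurize(text)
        pred = 1 if sum(wi * xi for wi, xi in zip(weights, x)) + bias > 0 else 0
        if pred != label:
            step = 1.0 if label == 1 else -1.0
            weights = [wi + step * xi for wi, xi in zip(weights, x)]
            bias += step

def predict(text):
    score = sum(wi * xi for wi, xi in zip(weights, featurize(text))) + bias
    return "positive" if score > 0 else "negative"

print(predict("loved the great acting"))  # → positive
print(predict("boring terrible movie"))   # → negative
```

A Keras model for the same task learns the same way in principle: it compares its predictions against the provided labels and adjusts its weights (by gradient descent rather than perceptron updates) to reduce the error.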
I have the task of building a multiple linear regression model for a prediction problem (the input parameters are a combination of numerical and categorical variables).
If I use an Artificial Neural Network (ANN) to build a model that does the prediction, can that be a multiple linear regression model, or will that be a deep learning model?
I am confused about whether I can use an ANN to build a multiple linear regression model.
If you want to build a multiple linear regression model with a neural network, you can. That is just a network with no non-linearities/activation functions (no ReLU, no sigmoid).
As such, it is fully linear and thus only one layer deep (additional linear layers would be superfluous, since a composition of linear maps is itself linear), so it does not qualify as deep learning.
If you look at how linear regression is done in TensorFlow or Keras, it is really a single dense layer with no activation.
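A from-scratch sketch of the same idea (the data and hyperparameters are made up): a single "dense layer" computing w·x + b with no activation function, trained by gradient descent on the squared error, is exactly multiple linear regression.

```python
# One "dense layer" with no activation: pred = w1*x1 + w2*x2 + b.
# Fitting it by gradient descent on squared error IS multiple linear regression.
# Toy data generated from y = 2*x1 + 3*x2 + 1.
data = [((1.0, 2.0), 9.0), ((2.0, 0.0), 5.0), ((0.0, 1.0), 4.0), ((3.0, 3.0), 16.0)]

w = [0.0, 0.0]
b = 0.0
lr = 0.05

for _ in range(2000):
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = pred - y
        # Gradient of the squared error 0.5*err^2 w.r.t. each weight is err * x_i.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err

print([round(v, 2) for v in w], round(b, 2))  # ≈ [2.0, 3.0] and 1.0
```

Stacking more such layers would change nothing, since composing linear maps just gives another linear map; non-linear activations are what make extra depth useful.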
I am working on a document which should contain the key differences between using Naive Bayes (generative) and Logistic Regression (discriminative) models for text classification.
During my research, I ran into this definition for Naive Bayes model: https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html
The probability of a document d being in class c is computed as ... where p(tk|c) is the conditional probability of term tk occurring in a document of class c...
When I got to the part of comparing Generative and Discriminative models, I found this explanation on StackOverflow as accepted: What is the difference between a Generative and Discriminative Algorithm?
A generative model learns the joint probability distribution p(x,y) and a discriminative model learns the conditional probability distribution p(y|x) - which you should read as "the probability of y given x".
At this point I got confused: Naive Bayes is a generative model and uses conditional probabilities, but at the same time the discriminative models were described as if they learned the conditional probabilities as opposed to the joint probabilities of the generative models.
Can someone shed some light on this please?
Thank you!
It is generative in the sense that you don't directly model the posterior p(y|x); rather, you learn a model of the joint probability p(x,y), which can also be expressed as p(x|y) * p(y) (likelihood times prior), and then through Bayes' rule you seek the most probable y.
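A toy numeric sketch of that statement (the prior and likelihood values are made up): from the generative ingredients p(x|y) and p(y) we can form the joint p(x,y), and Bayes' rule then recovers the posterior p(y|x) that a discriminative model such as logistic regression would instead fit directly.

```python
# Generative ingredients: a class prior p(y) and per-class likelihoods p(x|y)
# for one binary feature x (e.g. "document contains the word 'great'").
prior = {"pos": 0.6, "neg": 0.4}                # p(y)
likelihood = {"pos": {1: 0.7, 0: 0.3},          # p(x | y = pos)
              "neg": {1: 0.2, 0: 0.8}}          # p(x | y = neg)

def joint(x, y):
    # p(x, y) = p(x | y) * p(y): likelihood times prior.
    return likelihood[y][x] * prior[y]

def posterior(y, x):
    # Bayes' rule: p(y | x) = p(x, y) / sum over classes of p(x, y').
    evidence = sum(joint(x, c) for c in prior)
    return joint(x, y) / evidence

# If the word occurs (x = 1), which class is most probable?
print(max(prior, key=lambda c: posterior(c, 1)))   # → pos
print(round(posterior("pos", 1), 2))               # → 0.84
```

The conditional probabilities p(tk|c) in the Naive Bayes formula are the likelihood pieces of this joint model, which is why using them does not make the model discriminative.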
A good read I can recommend in this context is "On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes" (Ng & Jordan, NIPS 2001).
I would like to load a model I trained before and then update this model with new training data. But I found this task hard to accomplish.
I have learnt from Weka Wiki that
Classifiers implementing the weka.classifiers.UpdateableClassifier interface can be trained incrementally.
However, the regression model I trained uses the weka.classifiers.functions.MultilayerPerceptron classifier, which does not implement UpdateableClassifier.
Then I checked the Weka API and it turns out that no regression classifier implements UpdateableClassifier.
How can I train a regression model in Weka, and then update the model later with new training data after loading the model?
I have some data mining experience in Weka as well as in scikit-learn and R, and as far as I know updatable regression models do not exist in Weka or scikit-learn. Some R libraries do support updating regression models, however (take a look at this linear regression model, for example: http://stat.ethz.ch/R-manual/R-devel/library/stats/html/update.html), so if you are free to switch data mining tools this might help you out.
If you need to stick to Weka, then I'm afraid you would probably need to implement such a model yourself; but since I'm not a complete Weka expert, please check with the people on the Weka mailing list (http://weka.wikispaces.com/Weka+Mailing+List).
The SGD classifier implementation in Weka supports multiple loss functions, among them two that are meant for linear regression: the epsilon-insensitive and Huber loss functions.
Therefore one can train a linear regression model incrementally with SGD, as long as one of these two loss functions is used to minimize the training error.
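As an illustration of how that works (a from-scratch sketch, not Weka's actual implementation; the data and hyperparameters are made up): stochastic gradient descent on the epsilon-insensitive loss updates the weights only when a prediction falls outside the epsilon tube, and because updates happen one example at a time, the fitted model can later be refined with new data without retraining from scratch.

```python
def sgd_epsilon_insensitive(data, w=0.0, b=0.0, lr=0.01, eps=0.1, epochs=300):
    """One-feature linear regression trained by SGD on the epsilon-insensitive loss.

    The loss max(0, |y - pred| - eps) has zero gradient inside the eps tube,
    so only examples predicted badly enough cause an update. Because training
    is per-example, the returned (w, b) can be passed back in later to update
    the model incrementally with new data.
    """
    for _ in range(epochs):
        for x, y in data:
            err = y - (w * x + b)
            if abs(err) > eps:                  # outside the tube: take a step
                step = lr if err > 0 else -lr   # sub-gradient is +/- 1
                w += step * x
                b += step
    return w, b

# Initial batch drawn from y = 2x + 1.
batch1 = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = sgd_epsilon_insensitive(batch1)

# Later, update the same model with a new batch from the same relationship.
batch2 = [(4.0, 9.0), (5.0, 11.0)]
w, b = sgd_epsilon_insensitive(batch2, w, b)

print(round(w, 1), round(b, 1))  # should land near w = 2, b = 1
```

Weka's SGD class follows the same principle, which is why it can serve as an updatable regression model despite MultilayerPerceptron not implementing UpdateableClassifier.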