Multinomial Naive Bayes in Text Classification - machine-learning

I have a set of features, one of which takes negative values. I want to use Multinomial Naive Bayes to classify my text documents, but the negative feature throws an error. Can I use Gaussian Naive Bayes for text classification in this setting?
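As a rough illustration of why the negative feature is a problem: a multinomial model treats each feature value as an event count, so a negative value has no meaning, whereas a Gaussian density is defined for any real number. The sketch below is plain Python and not any library's API; the helper names multinomial_log_likelihood and gaussian_log_density, and all the numbers, are made up for illustration.

```python
import math

def multinomial_log_likelihood(counts, probs):
    """Log-likelihood of term counts given per-class term probabilities.

    The multinomial coefficient is dropped because it does not depend on the class.
    """
    if any(c < 0 for c in counts):
        # A count of -2 occurrences is not a valid multinomial observation.
        raise ValueError("multinomial model needs non-negative counts")
    return sum(c * math.log(p) for c, p in zip(counts, probs))

def gaussian_log_density(x, mean, var):
    """Log of the normal density; defined for any real x, including negatives."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

# Hypothetical term counts and class-conditional term probabilities.
ll = multinomial_log_likelihood([3, 0, 1], [0.5, 0.3, 0.2])

# A negative feature value is unproblematic for the Gaussian model.
neg = gaussian_log_density(-1.5, mean=-1.0, var=0.25)
```

Whether a Gaussian model suits the remaining text features is a separate question that this sketch does not settle.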

Related

Should I use multinomial logistic regression or linear discriminant analysis?

In a classification problem, I cannot use a simple logit model if my data label (i.e., the dependent variable) has more than two categories. That leaves me with multinomial regression, Linear Discriminant Analysis (LDA), and the like. Why is multinomial logit not as popular as LDA in machine learning? What particular advantage does LDA offer?

How to use Naive Bayes Binary Classification in CoreML?

I am trying to build a number classification model in CoreML and want to use a Naive Bayes classifier, but I cannot find how to use it. My algorithm uses Naive Bayes.
At the moment, coremltools supports only the following types of classifiers:
SVMs (scikit-learn)
Neural networks (Keras, Caffe)
Decision trees and their ensembles (scikit-learn, xgboost)
Linear and logistic regression (scikit-learn)
However, implementing Naive Bayes in Swift yourself is not that hard; check this implementation, for example.

Calculating Probabilities in Naive Bayes Classification

I have a data set consisting of both categorical and continuous attributes. I want to apply the Naive Bayes classification method to classify the data.
How do I calculate probabilities for both of these types?
Should I use counting for the categorical data, and assume some distribution and calculate from it for the continuous data?
Because Naive Bayes assumes that the features are independent of each other given the class label, you have
P(cat1, con1|y) = P(cat1|y)P(con1|y)
where cat1 is a categorical variable and con1 is a continuous one, so you model each of these probabilities completely independently. As you suggested, for the categorical variable you can use a simple empirical estimator (but remember to apply some smoothing so that you do not get zero probabilities), and for the continuous variable you need a more sophisticated estimator (such as maximum likelihood estimation within a fixed family of distributions, for example Gaussians, or something more complex, such as any probabilistic classifier or model).
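The factorisation above could be sketched roughly as follows, assuming one categorical and one continuous feature, Laplace smoothing for the categorical part, and a Gaussian fitted by maximum likelihood for the continuous part. The function names (fit_mixed_nb, predict), the alpha parameter, and the toy data are all hypothetical and only serve the illustration.

```python
import math
from collections import Counter

def fit_mixed_nb(rows, labels, alpha=1.0):
    """rows: list of (category, value) pairs; labels: one class label per row."""
    categories = sorted({cat for cat, _ in rows})
    model = {}
    for y in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == y]
        n = len(members)
        counts = Counter(cat for cat, _ in members)
        # P(cat | y): empirical frequency with Laplace smoothing (alpha).
        cat_prob = {c: (counts[c] + alpha) / (n + alpha * len(categories))
                    for c in categories}
        # P(con | y): Gaussian with maximum likelihood mean and variance.
        values = [v for _, v in members]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n + 1e-9  # small floor
        model[y] = {"prior": n / len(rows), "cat_prob": cat_prob,
                    "mean": mean, "var": var}
    return model

def predict(model, cat, value):
    """Return the class maximising log P(y) + log P(cat|y) + log P(con|y).

    Assumes cat was seen during fitting; an unseen category raises KeyError.
    """
    scores = {}
    for y, p in model.items():
        log_gauss = (-0.5 * math.log(2 * math.pi * p["var"])
                     - (value - p["mean"]) ** 2 / (2 * p["var"]))
        scores[y] = math.log(p["prior"]) + math.log(p["cat_prob"][cat]) + log_gauss
    return max(scores, key=scores.get)

# Toy data, invented for this sketch.
rows = [("red", 1.0), ("red", 1.2), ("blue", 0.8),
        ("blue", 3.0), ("blue", 3.2), ("red", 2.9)]
labels = ["a", "a", "a", "b", "b", "b"]
model = fit_mixed_nb(rows, labels)
```

Working in log space avoids underflow when many per-feature probabilities are multiplied; with more features, each one simply contributes one more log term to the score.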

Difference between Data Mining algorithms and methods

I have often read in the literature that there are several Data Mining methods (for example: decision trees, k-nearest neighbour, SVM, Bayes classification), and likewise several Data Mining algorithms (the k-nearest neighbour algorithm, the Naive Bayes algorithm).
Does a DM method use different DM algorithms, or are the two the same thing?
An example to clarify - is there any difference between the below?
I'm using the Naive Bayes classification method.
I'm using the Naive Bayes classification algorithm.
Or is "Bayes" the method and "Naive Bayes" the algorithm?

Naive Bayes and Neural Network similarities and choice

I have a large dataset with 10 different inputs and 1 output. All the inputs and the output are discrete (LOW, MEDIUM, HIGH). I was thinking about creating a neural network for this problem; however, when I design the network to have 3 outputs (LOW, MEDIUM, HIGH) with a softmax output layer, I basically get a 'probability'. Am I correct?
That made me think it might be better to try a Naive Bayes classifier, which ignores possible correlations between the input variables; on large datasets, however, Naive Bayes shows promising results.
Is there a reason to pick neural networks over Naive Bayes in this case? What is the reason to pick neural networks when you want a probability as output (using a softmax function)?
Yes, with softmax activations in the output layer, you can interpret the outputs as probabilities.
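To make the softmax point concrete, here is a minimal plain-Python sketch. The function name softmax and the three example scores (standing in for the raw LOW, MEDIUM, HIGH outputs) are assumptions for illustration, not taken from any particular network.

```python
import math

def softmax(logits):
    """Map raw scores to positive values that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw output scores for LOW, MEDIUM, HIGH.
probs = softmax([2.0, 1.0, 0.1])
```

The outputs are positive, sum to 1, and preserve the ordering of the raw scores, which is what allows them to be read as class probabilities.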
A potential reason to pick artificial neural networks (ANN) over Naive Bayes is the possibility you mentioned: correlations between input variables. Naive Bayes assumes that all input variables are independent given the class. If that assumption does not hold, the accuracy of the Naive Bayes classifier can suffer. An ANN with an appropriate network structure can handle the correlation or dependence between input variables.
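One way to see the effect of a violated independence assumption is an extreme case: the same binary feature is fed to Naive Bayes several times, so the copies are perfectly correlated but are treated as independent evidence. The sketch below is only an illustration; the function name nb_posterior and the probabilities 0.8, 0.4, and 0.5 are made up.

```python
def nb_posterior(p_x_given_pos, p_x_given_neg, prior_pos, copies):
    """Naive Bayes posterior P(y=pos | x) when one observed feature is
    duplicated `copies` times and each copy is treated as independent."""
    pos = prior_pos * p_x_given_pos ** copies
    neg = (1 - prior_pos) * p_x_given_neg ** copies
    return pos / (pos + neg)

# With a single copy the posterior is exact: 0.8 / (0.8 + 0.4) = 2/3.
exact = nb_posterior(0.8, 0.4, 0.5, copies=1)

# Each redundant copy adds no information, yet the estimate keeps growing.
inflated = [nb_posterior(0.8, 0.4, 0.5, copies=k) for k in (2, 3, 4)]
```

The duplicated feature carries no new information, but the estimated probability drifts away from 2/3 towards 1. Whether this kind of distortion changes the predicted class depends on the data.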
