I am using weka for classification.
If I use Naive Bayes to classify datasets, how can I see the backend code of the Naive Bayes algorithm in Weka? Is there any way?
Weka is open source, so you can see the code in their repository, as stated on their website. The Naive Bayes part is here.
Related
I am using scikit-learn to build a multiclass classification model. To this extent, I have tried the RandomForestClassifier, KNeighborsClassifier, LogisticRegression, MultinomialNB, and SVC algorithms. I am satisfied with the generated output. However, I do have a question about the default mechanism used by the algorithms for multiclass classification. I read that all scikit-learn classifiers are capable of multiclass classification, but I could not find any information about the default mechanism used by the algorithms.
One-vs-the-rest (also called one-vs-all) is the most commonly used and a fair default strategy for multiclass classification. One binary classifier is fitted per class, with that class fitted against all the other classes. Check here for more information: https://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html
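As a minimal sketch of the one-vs-rest strategy described above (the dataset and base estimator here are just illustrative choices, not something from the question):

```python
# Sketch: wrapping a binary classifier in scikit-learn's OneVsRestClassifier
# to get multiclass predictions. Iris has 3 classes, so OvR fits 3 binary
# classifiers, one per class.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)  # 3 target classes
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(len(clf.estimators_))  # one binary classifier per class -> 3
```

Note that most scikit-learn classifiers handle the multiclass case natively, so the explicit wrapper is mainly useful when you want to inspect or control the per-class binary problems.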
I don't have much knowledge about this, but is there a way to use a sparse autoencoder in Weka? So far I've only used MLPAutoencoder, and I'm not certain whether it can be configured for sparsity too. Thank you.
I've asked the author of MLPAutoencoder, and there isn't an implementation of a sparse autoencoder in Weka yet.
I am trying to implement Multiclass classification in WEKA.
I have a lot of rows, say bank transactions, each tagged as Food, Medicine, Rent, etc. I want to build a classifier that can be trained on the data I already have and predict the class of future transactions. If I am right, this is multiclass and not multilabel, since each transaction can belong to only one class.
Below are a few algorithms I am considering
Naive Bayes
Multinomial Logistic Regression
Multiclass SVM
Max Entropy
Neural Networks (if possible)
In my data, the number of features is much smaller than the number of transactions, and hence I am considering a one-vs-rest binary classifier instead of one-vs-one.
Are there any other algorithms I should look into that will help with my goal?
Are any of the algorithms I listed unsuitable for my goal?
Also, I found that scikit-learn in Python is better than WEKA, but that I can run scikit-learn on only one processor. Is this true?
Answers to any question would be helpful.
Thanks!
You can look at RandomForest, which is a well-known and quite efficient classifier.
In scikit-learn, some classes can run over several cores, like RandomForestClassifier. Its constructor has a parameter that sets the number of cores to use, or a value that uses every available core. Look at the documentation: any estimator whose constructor accepts an n_jobs parameter can be run over several cores.
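A small sketch of the n_jobs usage described above (the synthetic dataset and hyperparameters are illustrative assumptions):

```python
# Sketch: parallelizing RandomForestClassifier across cores with n_jobs.
# n_jobs=-1 tells scikit-learn to use every available core for both
# fitting the trees and predicting.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
clf = RandomForestClassifier(n_estimators=50,
                             n_jobs=-1,      # -1 = use all available cores
                             random_state=0).fit(X, y)
print(clf.score(X, y))  # accuracy on the training data
```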
I am trying to implement sentiment analysis on Google+ posts using a Naive Bayes classification.
I have been looking for a dataset, and the only one I found was made for Twitter. A Google+ post and a tweet have a lot in common, but maybe not the length. I'd like to know if that changes anything for an NB classification.
Also, Naive Bayes takes smileys into account, but suppose we want to give them more weight in the text (since they express the emotion of a tweet unambiguously). Is there a way to do that with Naive Bayes?
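One way to up-weight smileys in a bag-of-words multinomial Naive Bayes model is to repeat each smiley token before vectorizing, which multiplies its count. This is a hedged sketch, not a standard recipe; the smiley regex, repeat factor, and tiny dataset are all illustrative assumptions:

```python
# Sketch: boost smiley weight for MultinomialNB by repeating smiley tokens.
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

SMILEYS = re.compile(r"[:;][-']?[)(DP]")  # tiny illustrative pattern

def boost_smileys(text, factor=3):
    # Repeat every smiley `factor` times so its token count is multiplied.
    return SMILEYS.sub(lambda m: (" " + m.group(0)) * factor, text)

texts = ["great day :)", "awful service :(", "love it :)", "so bad :("]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# token_pattern must also match smileys, or they'd be dropped entirely.
vec = CountVectorizer(preprocessor=boost_smileys,
                      token_pattern=r"[:;][-']?[)(DP]|\w+")
clf = MultinomialNB().fit(vec.fit_transform(texts), labels)
print(clf.predict(vec.transform([":) what a day"])))  # -> [1]
```

Passing a custom `preprocessor` replaces the default preprocessing (including lowercasing), so the smiley characters survive intact.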
I'm a newbie to Machine Learning. I have a question about how Normal Bayes is implemented in OpenCV.
I have a misunderstanding regarding the terms Normal Bayes and Naive Bayes.
This site says that Normal Bayes and Naive Bayes mean the same thing.
The NormalBayes documentation on the OpenCV website specifies that the features are assumed to be normally distributed but not necessarily independent.
The Wikipedia article on the Naive Bayes classifier says that the features are assumed to be independent; therefore, a covariance matrix need not be computed.
However, when I look at the source of the Normal Bayes implementation, it does compute a covariance matrix.
I also found a similar question over here which wasn't answered.
Am I missing something here, or is the Normal Bayes classifier in OpenCV not a standard Naive Bayes classifier?
Theoretically, the Naive Bayes model assumes "complete independence between causes of an effect", while the Normal model assumes that "feature vectors from each class are normally distributed (though, not necessarily independently distributed)". Note that both use mean vectors and covariance matrices; however, the model assumptions differ.
In OpenCV, the "data distribution function is assumed to be a Gaussian mixture, one component per class", and the model does not make an assumption regarding the independence of the features.