What is the difference if we use a Decision Tree as the base estimator in the AdaBoost algorithm?
Is Random Forest a special case of AdaBoost?
Most certainly not; Random Forest is a bagging ensemble algorithm (bagging is short for bootstrap aggregating), which is fundamentally different from boosting.
What is the difference if we use a Decision Tree as the base estimator in the AdaBoost algorithm?
You don't get a Random Forest, but a Gradient Tree Boosting Machine, available in several packages such as xgboost (R/Python), gbm (R), and scikit-learn (Python).
Check chapter 8 of the excellent (and freely available) book An Introduction to Statistical Learning for more, or The Elements of Statistical Learning (heavy in math and theory, not for the faint-hearted).
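As a concrete illustration, here is a minimal sketch (using scikit-learn; the toy data and all parameter values are my own choices) contrasting AdaBoost built on decision stumps with a Random Forest on the same data:

```python
# Boosting vs. bagging on the same toy data: AdaBoost with shallow
# decision trees as the base estimator, next to a Random Forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Boosting: trees are grown sequentially, each one focusing on the
# errors of its predecessors (depth-1 "stumps" are the classic choice).
# Note: this keyword is 'estimator' in recent scikit-learn releases;
# older versions called it 'base_estimator'.
boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    random_state=0,
)

# Bagging plus random feature selection: trees are grown independently
# on bootstrap samples. That is what Random Forest does, and it is
# not what boosting a decision tree gives you.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("AdaBoost     CV accuracy:", cross_val_score(boost, X, y, cv=5).mean())
print("RandomForest CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```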
The Naive Bayes algorithm assumes independence among features. What are some text classification algorithms that are not naive, i.e., that do not assume independence among their features?
The answer is very straightforward, since nearly every classifier (besides Naive Bayes) is not naive. Feature independence is a very rare assumption, and it is not made by (among a long list of others):
logistic regression (known in the NLP community as the maximum entropy model)
linear discriminant analysis (Fisher's linear discriminant)
kNN
support vector machines
decision trees / random forests
neural nets
...
You are asking about text classification, but there is nothing really special about text, and you can use any existing classifier for such data.
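For example, here is a minimal non-naive text classifier sketched with scikit-learn (the toy texts and labels are made up for illustration): TF-IDF features fed into logistic regression, which makes no feature-independence assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: 1 = positive sentiment, 0 = negative sentiment.
texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
labels = [1, 0, 1, 0]

# TF-IDF turns text into numeric features; logistic regression
# (maximum entropy) then learns weights without assuming the
# features are independent of one another.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["what a great plot"]))
```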
Looking for the best (clearest, shortest, brightest) concise distinction between the ML terms "Decision Forest" and "Random Forest"?
Note the similar and also unanswered question:
Multiclass Decision Forest vs Random Forest
Random forests, or random decision forests, are an extension of decision forests (ensembles of decision trees), combining bagging with random selection of features to construct a collection of decision trees with controlled variance.
Random forest is an extension of the random decision forest that includes bagging. For details, check the original paper by Breiman or the more lightweight description on Wikipedia. The majority of well-known machine learning libraries, like Python's scikit-learn, implement Random Forest.
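The two ingredients named above can be seen directly in scikit-learn's API; a minimal sketch (dataset and parameter values chosen only for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)  # a small multiclass problem

forest = RandomForestClassifier(
    n_estimators=200,
    bootstrap=True,       # bagging: each tree sees a bootstrap sample
    max_features="sqrt",  # random subset of features tried at each split
    random_state=0,
).fit(X, y)

print(forest.score(X, y))  # training accuracy of the ensemble
```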
How does Multiclass Decision Forest differ from Random Forest? What do they have in common? It appears there is no clear answer on the web regarding this matter.
There is a very good paper from Microsoft Research that you may consider looking at.
I want to build a machine learning model for regression on a continuous output given binary-valued features (0/1). The dimension of my problem is around 200.
Which of the following methods seems suitable for this kind of problem?
SVR with different Kernels
Regression random forest
MARS
Gradient boosting with regression tree
Kernel regression (Nadaraya-Watson kernel regression)
LSR and LARS
Stochastic gradient boosting
Intuitively speaking, anything requiring the calculation of a gradient is going to struggle on binary values. From your list, SVR and Forests would be the first place I'd look for a benchmark solution.
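To make that benchmark concrete, here is a minimal sketch of a random forest regressor on synthetic binary features of dimension 200 (the data-generating process is invented purely for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 200)).astype(float)  # binary features
y = X[:, :10].sum(axis=1) + rng.normal(scale=0.5, size=1000)  # toy continuous target

reg = RandomForestRegressor(n_estimators=200, random_state=0)
print(cross_val_score(reg, X, y, cv=5, scoring="r2").mean())
```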
You can also look at expectation maximization for Bernoulli mixture models.
It deals with binary input sets. You can find the theory in the book:
Christopher M. Bishop, "Pattern Recognition and Machine Learning".
What is the exact difference between a Support Vector Machine classifier and a Support Vector Machine regression machine?
The one-sentence answer is that an SVM classifier performs binary classification, while SVM regression performs regression.
While performing very different tasks, they are both characterized by the following points:
usage of kernels
absence of local minima
sparseness of the solution
capacity control obtained by acting on the margin
number of support vectors, etc.
For SVM classification the hinge loss is used; for SVM regression the epsilon-insensitive loss function is used.
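For reference, the standard definitions (with labels y in {-1, +1} for classification and prediction f(x)):

hinge loss: L(y, f(x)) = max(0, 1 - y * f(x))
epsilon-insensitive loss: L(y, f(x)) = max(0, |y - f(x)| - epsilon)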
SVM classification is more widely used and in my opinion better understood than SVM regression.
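A minimal sketch of the two side by side in scikit-learn (toy data and parameter values chosen only for illustration):

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.svm import SVC, SVR

# Classification: SVC optimizes a hinge-loss objective.
Xc, yc = make_classification(n_samples=200, random_state=0)
clf = SVC(kernel="rbf").fit(Xc, yc)
print("support vectors (SVC):", clf.n_support_.sum())

# Regression: SVR optimizes an epsilon-insensitive objective.
Xr, yr = make_regression(n_samples=200, random_state=0)
reg = SVR(kernel="rbf", epsilon=0.1).fit(Xr, yr)
print("support vectors (SVR):", len(reg.support_))
```

Both expose the same kernel machinery; only the loss function, and therefore the task, differs.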