why dont we need feature scaling in multiple linear regression [closed] - machine-learning

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
Improve this question
I am currently learning ML
and I notice that in multiple linear regression we don't need scaling for our independent variable
and I didn't know why?

Whether feature scaling is useful or not depends on the training algorithm you are using.
For example, to find the best parameter values of a linear regression model, there is a closed-form solution, called the Normal Equation. If your implementation makes use of that equation, there is no stepwise optimization process, so feature scaling is not necessary.
However, you could also find the best parameter values with a gradient descent algorithm. This could be a better choice in terms of speed if you have many training instances. If you use gradient descent, feature scaling is recommended, because otherwise the algorithm might take much longer to converge.

Related

Which supervised machine learning classification method suits for randomly spread classes? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
Improve this question
If classes are randomly spread or it is having more noise, which type of supervised ML classification model will give better results, and why?
It is difficult to say which classifier will perform best on general problems. It often requires testing of a variety of algorithms on a given problem in order to determine which classifier performs best.
Best performance is also dependent on the nature of the problem. There is a great answer in this stackoverflow question which looks at various scoring metrics. For each problem, one needs to understand and consider which scoring metric will be best.
All of that said, neural networks, Random Forest classifiers, Support Vector Machines, and a variety of others are all candidates for creating useful models given that classes are, as you indicated, equally distributed. When classes are imbalanced, the rules shift slightly, as most ML algorithms assume balance.
My suggestion would be to try a few different algorithms, and tune the hyper parameters, to compare them for your specific application. You will often find one algorithm is better, but not remarkably so. In my experience, often of far greater importance, is how your data are preprocessed and how your features are prepared. Once again this is a highly generic answer as it depends greatly on your given application.

How to choose which model to fit to data? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
Improve this question
My question is given a particular dataset and a binary classification task, is there a way we can choose a particular type of model that is likely to work best? e.g. consider the titanic dataset on kaggle here: https://www.kaggle.com/c/titanic. Just by analyzing graphs and plots, are there any general rules of thumb to pick Random Forest vs KNNs vs Neural Nets or do I just need to test them out and then pick the best performing one?
Note: I'm not talking about image data since CNNs are obv best for those.
No, you need to test different models to see how they perform.
The top algorithms based on the papers and kaggle seem to be boosting algorithms, XGBoost, LightGBM, AdaBoost, stack of all of those together, or just Random Forests in general. But there are instances where Logistic Regression can outperform them.
So just try them all. If the dataset is >100k, you're not gonna lose that much time, and you might learn something valuable about your data.

anomaly detection with clustering? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 4 years ago.
Improve this question
One of anomaly detection algorithms is to use multivariate Gaussian to construct a probability density, according to Andrew Ng's coursera lecture.
What if data show cluster structures (not a single chunk)? In this case do we resort to unsupervised clustering to construct the density? If yes, how to do it? Are there other systematic ways to discover if such a case exists?
You can just use regular GMM and use a threshold on the likelihood to identify outliers. Points that don't fit the model well are outliers.
This works okay as long as your data really is composed of Gaussians.
Furthermore, clustering is fairly expensive. Usually it will be faster to directly use a nonparametric outlier model like KNN or LOF or LOOP.

When to use regression trees/forests? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
Improve this question
As I was looking for a fine regression algorithm for my problem. I found out one can do that with simple decision trees as well, which is usually used for classification. The output would be something like:
The red noise would be the prediction states of such a tree or forest.
Now my question is, why at all to use this method, when there are alternatives, that really try to figure out the underlying equation (such as the famous support vector machines SVM). Are there any positive / unique aspects, or was a regression tree more a nice-to-have-algorithm?
The image you posted conveys a smooth function of y in x. A regression tree is certainly not the best technique to estimate such a function and I probably wouldn't use SVMs either. This looks like a good application for splines, e.g., by using a GAM (generalized additive model).
A regression tree on the other hand is a handy tool if you haven't got such smooth functions and if you don't know which explanatory variable will have which effect on the response. It will be particularly useful if there are jumps in the response or interactions - especially if the jump points and interaction patterns are not known in advance.

machine learning (unsupervised method) [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Improve this question
I have a question about Reinforcement learning. If we use a mechanism to find the response of the environment in an unsupervised method to improve its performance, is the method still unsupervised?
In other word, using the response of the environment, is a method supervised or we can do it in an unsupervised manner? If so, how?
I have to disagree with #phs. Reinforcment learning is treated in the literature either as:
completely separate, third method of training -- so it is not supervised or unsupervised, it is simply reinforcment
it is sometimes marked as supervised due to its much stronger similarities to this paradigm
So, if the algorithm is trained in the reinforcment fashion and unsupervised, you can call it a unsupervised-reinforcment hybrid or something similar, but no longer "unsupervised", as reinforcment learning requires some additional knowledge about the world, than the one encoded in the data representation (feedbacks are not stored in data representation, they are much more like "true labels").
Unsupervised learning describes a class of problems where the model is not provided "answers" during its training phase, whatever that might mean in the current context.
Clustering is a canonical example. In a clustering problem one is only looking for inherent structure or grouping in the training data, and not seeking to distinguish "right" data points from "wrong" ones.
Your question is vague, but I believe you are asking whether we can call a training method unsupervised even if we have a proscribed algorithm for performing the training. The answer is yes; the word is just a word. All learning algorithms have inherent proscribed structure (the algorithm) and so are in some sense "supervised".

Resources