Deep learning techniques (deep neural networks, deep belief networks, deep stacking networks, ...) are very effective in some areas. They take a very long time to train, but that is a one-time cost.
I have read several papers about different techniques, and they focus only on accuracy and on the time needed to train the models. How fast are they at producing an answer in practice, once trained?
Is there any data available on benchmarking deep networks with, say, millions of parameters?
I would expect them to be quite fast, since all the weights are fixed, but the functions can be quite complex and the number of parameters quite high, so I'm not sure how they really perform in practice.
The speed is highly dependent on the size of the network. Assuming your network is a dense feed-forward network, each layer is represented by a (usually very rectangular) matrix. Pushing an input through the network requires one matrix-vector product per layer, so a network with 8 layers takes 8 matrix products. How long each of those takes depends on the dimensionality of the input and on the sizes of the layers.
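As a rough illustration, here is a minimal NumPy sketch (the layer widths are made up, not taken from any benchmark) that times such a forward pass:

    import time
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical layer widths; a real network's sizes will differ.
    sizes = [4096, 2048, 1024, 512, 256, 128, 64, 32, 2]
    # One (m, n) weight matrix per layer.
    weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

    def forward(x):
        # One matrix-vector product (plus a nonlinearity) per layer.
        for W in weights:
            x = np.maximum(W @ x, 0.0)  # ReLU, chosen arbitrarily
        return x

    x = rng.standard_normal(sizes[0])
    t0 = time.perf_counter()
    for _ in range(100):
        forward(x)
    print("ms per forward pass:", (time.perf_counter() - t0) * 1000 / 100)

On ordinary hardware a pass like this takes on the order of milliseconds; the inference cost scales with the sizes of the weight matrices, not with how long training took.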
If the classes are randomly spread, or the data are noisy, which type of supervised ML classification model will give better results, and why?
It is difficult to say which classifier will perform best on a general problem. It usually requires testing a variety of algorithms on the given problem to determine which one performs best.
Best performance also depends on the nature of the problem. There is a great answer in another Stack Overflow question that looks at various scoring metrics. For each problem, one needs to understand and consider which scoring metric will be best.
All of that said, neural networks, Random Forest classifiers, Support Vector Machines, and a variety of others are all candidates for building useful models, given that the classes are, as you indicated, equally distributed. When classes are imbalanced, the rules shift slightly, as most ML algorithms assume balanced classes.
My suggestion would be to try a few different algorithms and tune their hyperparameters to compare them for your specific application. You will often find that one algorithm is better, but not remarkably so. In my experience, how your data are preprocessed and how your features are prepared is often of far greater importance. Once again, this is a highly generic answer, as it depends greatly on your application.
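A minimal sketch of that workflow (the dataset, models, and metric here are placeholders, not from this answer), using a scikit-learn pipeline so the preprocessing is applied inside each cross-validation fold:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic stand-in for your data; flip_y adds label noise.
    X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.2,
                               random_state=0)

    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "svm": SVC(),
    }
    for name, model in models.items():
        # Scaling inside the pipeline keeps preprocessing out of the test folds.
        pipe = make_pipeline(StandardScaler(), model)
        scores = cross_val_score(pipe, X, y, cv=5, scoring="f1")
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

Pick the scoring metric to match your problem, as noted above.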
My question is: given a particular dataset and a binary classification task, is there a way to choose the type of model that is likely to work best? For example, consider the Titanic dataset on Kaggle: https://www.kaggle.com/c/titanic. Just by analyzing graphs and plots, are there any general rules of thumb for picking Random Forests vs. KNNs vs. neural nets, or do I just need to test them out and then pick the best performing one?
Note: I'm not talking about image data, since CNNs are obviously best for those.
No, you need to test different models to see how they perform.
The top algorithms, judging from papers and Kaggle, seem to be boosting algorithms (XGBoost, LightGBM, AdaBoost), a stack of all of those together, or just Random Forests in general. But there are instances where logistic regression can outperform them.
So just try them all. Unless the dataset is huge (say, well over 100k rows), you're not going to lose that much time, and you might learn something valuable about your data.
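A minimal sketch of "try them all" (assuming the xgboost and lightgbm packages are installed; the dataset is a synthetic placeholder):

    from lightgbm import LGBMClassifier
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    # Synthetic stand-in for a tabular dataset like Titanic.
    X, y = make_classification(n_samples=5000, n_features=30, random_state=0)

    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(random_state=0),
        "xgboost": XGBClassifier(eval_metric="logloss"),
        "lightgbm": LGBMClassifier(),
    }
    for name, model in candidates.items():
        auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
        print(f"{name}: mean AUC = {auc:.3f}")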
When using kernels to delimit non-linear domains in SVMs, we introduce new features based on the training examples, so we end up with as many features as training examples. But having as many features as examples increases the chances of overfitting, right? Should we drop some of these new features?
You really can't drop any of the kernel-generated features; in many cases you don't know which features are being used or what weight is being given to them. In addition to kernels, SVMs use regularization, and this regularization reduces the risk of overfitting.
You can read about the connection between the formulation of SVMs and statistical learning theory, but the high-level summary is that an SVM doesn't just find a separating hyperplane: it finds the one that maximizes the margin.
The Wikipedia article on SVMs is very good and provides excellent links on regularization, parameter search, and many other important topics.
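To see the regularization at work, here is a minimal sketch (synthetic data, illustrative values) that sweeps the SVM's C parameter; larger C means weaker regularization, so the gap between training accuracy and cross-validated accuracy grows:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Noisy synthetic data (flip_y mislabels 20% of points).
    X, y = make_classification(n_samples=500, n_features=10, flip_y=0.2,
                               random_state=0)

    for C in (0.01, 1.0, 100.0):
        clf = SVC(kernel="rbf", C=C).fit(X, y)
        cv_acc = cross_val_score(SVC(kernel="rbf", C=C), X, y, cv=5).mean()
        print(f"C={C}: train={clf.score(X, y):.3f}, cv={cv_acc:.3f}")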
Increasing the number of features does increase the chances of overfitting. You could use cross-validation (libsvm includes it) to test whether the model you trained is overfitting, and use feature-selection tools to select features: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/fselect/fselect.py
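For illustration, here is a minimal sketch of F-score-style feature selection, similar in spirit to the linked fselect.py, using scikit-learn as a stand-in:

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif

    # Synthetic data: 50 features, only 5 of them informative.
    X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                               random_state=0)

    # Keep the 10 features with the highest ANOVA F-scores.
    selector = SelectKBest(f_classif, k=10).fit(X, y)
    print("selected feature indices:", selector.get_support(indices=True))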
According to several people on Stack Overflow, Bayesian filtering is better than neural networks for detecting spam.
According to the literature I've read, that shouldn't be the case. Please explain!
There is no mathematical proof or explanation of why neural networks have not done as well as Bayesian filters at detecting spam. This does not mean that neural networks could not produce similar or better results, but the time it would take to tweak a neural network's topology and train it to get even approximately the same results as a Bayesian filter is simply not justified. At the end of the day, people care about results and about minimizing the time and effort spent achieving them. When it comes to spam detection, Bayesian filters get you the best results with the least amount of effort and time. If a Bayesian filter detects 99% of spam correctly, there is very little incentive to spend a lot of time adjusting neural networks just to eke out an extra 0.5% or so.
"According to the literature I've read that shouldn't be the case."
It's technically correct. If properly configured, a Neural Network would get as good or even better results than the Bayesian filters, but its the cost/benefit ratio that makes the difference and ultimately the trend.
Neural networks work mostly as a black-box approach. You determine your inputs and outputs; after that, finding a suitable architecture (a two-hidden-layer multilayer perceptron, an RBF network, etc.) is done mostly empirically. There are guidelines for determining the architecture, but they are, well, guidelines.
This is good for some problems, where we, as domain analysts, do not have enough information about the problem itself; the NN's ability to find an answer on its own is exactly what we want.
A Bayesian network, on the other hand, is designed mostly by a domain analyst. Since spam classification is a well-known problem, a domain analyst can tweak its structure more easily, and the Bayesian network will reach good results with less effort this way.
Also, most NNs do not cope well with changing features and almost always need to be retrained, an expensive operation. A Bayesian network, on the other hand, may only need its probabilities updated.
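As an illustration of that last point, here is a minimal sketch (the messages are made up) of a naive Bayes spam filter that absorbs new mail by updating its counts via partial_fit, with no full retraining:

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # alternate_sign=False keeps counts non-negative for MultinomialNB.
    vec = HashingVectorizer(n_features=2**16, alternate_sign=False)
    clf = MultinomialNB()

    # Initial batch of (hypothetical) labeled mail: 1 = spam, 0 = ham.
    X = vec.transform(["cheap pills buy now", "meeting at noon tomorrow"])
    clf.partial_fit(X, [1, 0], classes=[0, 1])

    # Later mail just updates the word counts / probabilities.
    clf.partial_fit(vec.transform(["win a free prize now"]), [1])

    print(clf.predict(vec.transform(["free pills now"])))  # likely [1]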
Can someone predict :) or guess how the Google Prediction API works under the hood?
I know of several machine learning techniques: decision trees, neural networks, naive Bayesian classification, etc.
Which technique do you think Google is using?
The single answer to this question on Stats SE is good, given the limited information from Google itself. It concludes with the same thought I had: Google isn't telling what goes on inside the Google Prediction API.
There was a Reddit discussion about this too. The most helpful response was from a user who is credible (in my opinion) because of his prior work in the field. He wasn't certain what the Google Prediction API was using, but he had some ideas about what it was NOT using, based on discussions on the Google Group for the Prediction API:
"the current implementation is not able to deal correctly with non-linear separable data sets (XOR and Circular). That probably means that they are fitting linear models such as regularized logistic regression or SVMs but not neural networks or kernel SVMs. Fitting linear models is very scalable to both wide problems (many features) and long problems (many samples) provided that you use... stochastic gradient descent with truncated gradients to handle sparsity inducing regularizers."
There was a little more, and of course some other responses. Note that Google has since released a new version of the Prediction API, but it is not any more obvious (to me) how it works under the hood.
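For what it's worth, here is a minimal sketch of the kind of model the quote describes (a sparsity-regularized linear classifier trained with SGD; everything here is illustrative, not Google's actual implementation). It handles a linearly separable problem but tops out below 100% on XOR:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y_or = np.array([0, 1, 1, 1])   # OR: linearly separable
    y_xor = np.array([0, 1, 1, 0])  # XOR: not linearly separable

    for name, y in (("OR", y_or), ("XOR", y_xor)):
        # L1 penalty is a sparsity-inducing regularizer, as in the quote.
        clf = SGDClassifier(penalty="l1", max_iter=1000, tol=None, random_state=0)
        clf.fit(X, y)
        print(name, "training accuracy:", clf.score(X, y))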