I have built my ML model for a classification task, and I want to compare the machine's predictions with predictions made by humans. How do I measure the accuracy of the machine's predictions against the humans'? And what is the minimum number of people I need for the comparison with the machine?
I'm following the Machine Learning course on Coursera and I just had a question.
Multiple classifiers making an XOR classifier
In this picture, we can see that in order to build an XOR classifier, we compose smaller classifiers, each trained on a linearly separable gate.
So each classifier has a defined job (for example AND, OR, etc.), and the network must be trained for that task.
But in a bigger neural net, it's impossible to define a task for each neuron (or classifier).
So my question is: is this the task of the back-propagation algorithm (in addition to the fact that it is used to update the weights)?
If someone is wondering the same thing: yes, it is.
The backprop algorithm effectively carves out a "smaller, linearly solvable" sub-problem for each neuron (or classifier), as the sketch below illustrates.
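To make this concrete, here is a minimal Python sketch (an illustration, not from the original post) that builds XOR from three hand-weighted perceptrons, one per linearly separable gate; back-propagation would discover equivalent weights automatically instead of having them specified by hand.

    # Each gate is a single perceptron: a linear function followed by a step.
    def perceptron(w1, w2, bias):
        return lambda a, b: int(w1 * a + w2 * b + bias > 0)

    OR   = perceptron(1, 1, -0.5)   # fires if at least one input is 1
    NAND = perceptron(-1, -1, 1.5)  # fires unless both inputs are 1
    AND  = perceptron(1, 1, -1.5)   # fires only if both inputs are 1

    def XOR(a, b):
        # The second layer combines the two first-layer outputs.
        return AND(OR(a, b), NAND(a, b))

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, '->', XOR(a, b))  # prints the 0 1 1 0 truth table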
I'm working on a colorectal cancer stage multiclass classification project using gene expression data. My dataset contains 11 biomarkers. The classification accuracy is around 40%. I have tried different classification models (KNN, SVM, neural networks, ...) as well as ensemble machine learning algorithms. Does anyone have any idea what I can do with the dataset to improve the results?
To decide what to do next, you will need some metrics:
How well can a team of human experts classify the data?
What is the model accuracy on the training dataset?
What is the model accuracy on the testing dataset?
If the training accuracy is much worse than the human experts', you should increase the complexity of the model until the training results approach or exceed theirs. You can do this by increasing the number of input features, choosing a different machine learning model, or increasing the number of layers in the neural network. If the training accuracy is poor, improve it first before spending time on the testing accuracy.
If the training accuracy is good but the testing accuracy is much worse, you are probably overfitting: get or create more training data, and use regularization. A rough sketch of this diagnostic follows.
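Here is a hedged Python sketch of the diagnostic on synthetic scikit-learn data; the 0.85 human-expert baseline and the 0.1 train/test gap threshold are arbitrary placeholders, not values from the answer above.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    HUMAN_BASELINE = 0.85  # hypothetical accuracy of a team of human experts

    # Toy stand-in for a real labeled dataset.
    X, y = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    train_acc = model.score(X_tr, y_tr)
    test_acc = model.score(X_te, y_te)

    if train_acc < HUMAN_BASELINE:
        print("Underfitting: add features, layers, or model complexity.")
    elif test_acc < train_acc - 0.1:
        print("Overfitting: get more data and/or add regularization.")
    else:
        print("Training and testing accuracies look balanced.")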
I am new to machine learning and I am currently working on a classification problem. I am able to train the model and predict on test data sets. I want to know whether there is some way to get scores along with the predictions; by scores, I mean confidence (proximity) scores attached to each prediction. For example, in the standard age-salary-buy problem (predicting from age and salary whether a customer will buy a product), I want to know, in addition to the yes/no prediction, a score out of 100 for how likely it is that the customer will buy the product.
Currently I am using the LibSVM algorithm. Is there some algorithm that provides the above?
Thanks.
What you are looking for is the support behind the decision. In other words, many classifiers base their decision about the class of x over labels Y on:
cl(x) = arg max_{y \in Y} p(y|x)
where p(y|x) is their internal estimate of the probability that x has label y. Such classifiers include:
neural networks (with sigmoid output)
logistic regression
naive Bayes
voting ensembles (such as RF)
...
These methods can easily be converted to your 0-100 scale, as a probability lies in the 0-1 range.
Others, such as SVM, use a measure that is proportional to the probability but unbounded. Here you can still get the value (often called the decision function), but you cannot convert it to a 0-100 score, as there is no "maximum" value. This is a significant drawback, so some modifications have been proposed. In particular, for SVM there is Platt scaling, which fits a logistic regression on top of the SVM so that you get a probability estimate. In libSVM you can set -b to get probability estimates.
From the libSVM website:
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
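For example, here is a minimal Python sketch using scikit-learn's SVC, which wraps libSVM; setting probability=True enables the same Platt-scaled probability estimates as libSVM's -b 1. The toy dataset and the 0-100 rescaling are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, random_state=0)  # toy data

    clf = SVC(probability=True)  # fits a logistic model on the SVM outputs
    clf.fit(X, y)

    proba = clf.predict_proba(X[:5])       # class probabilities in [0, 1]
    scores = (proba[:, 1] * 100).round(1)  # rescale to a 0-100 "score"
    print(scores)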
I tried to use a neural network to predict some data. I used the MATLAB neural network fitting toolbox and was able to make some test predictions.
The problem is that the accuracy is not good enough for my purposes.
I tried changing the number of neurons to improve the accuracy, but that did not help much.
I wanted to change the training function, but I couldn't find how.
For example, I want to tell MATLAB's toolbox to keep training until the error is below 0.1.
What should I do?
You can set net.trainParam.goal to specify the performance goal, or increase net.trainParam.epochs to raise the maximum number of training epochs.
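As a sketch in MATLAB, assuming a fitting network created with fitnet and input/target matrices x and t (placeholders for your own data):

    % Hedged sketch: x is the input matrix, t the target matrix.
    net = fitnet(10);              % 10 hidden neurons; adjust as needed
    net.trainFcn = 'trainlm';      % training function (Levenberg-Marquardt)
    net.trainParam.goal = 0.1;     % stop once performance (MSE) drops below 0.1
    net.trainParam.epochs = 2000;  % raise the maximum number of training epochs
    [net, tr] = train(net, x, t);  % retrain with the new stopping criteria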