How to create ROC curve from multiple classification models - machine-learning

How can I create a ROC curve from several classification models in order to compare them with each other? I'm using the KNIME Analytics Platform.

In order to compare classification models on the basis of ROC curves, the best approach is to create a separate ROC curve for each classification model.
Then compare the area under the ROC curve (AUC) of each model, since the AUC summarizes how well a model ranks positives above negatives across all thresholds. The model with the higher AUC is the better classifier by this criterion.

It is quite easy. You just need to compute the probabilities (normalized class distribution values) for each model and put them in the same table. In the ROC view nodes you can then select these columns for the positive class and see the ROC curves together.
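Outside of KNIME, the same idea can be sketched with scikit-learn: score each model's positive-class probabilities on the same test set, then overlay the ROC curves and report the AUC of each. The dataset and the three classifiers below are placeholders chosen for illustration, not the models from the question.

```python
# Minimal sketch (not KNIME-specific): overlay ROC curves of several models.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class, analogous to the normalized
    # class distribution column used in KNIME.
    scores = model.predict_proba(X_test)[:, 1]
    fpr, tpr, _ = roc_curve(y_test, scores)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.3f})")

plt.plot([0, 1], [0, 1], "k--", label="Chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```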

Related

Measuring ROC and AUC

I have read plenty of articles about ROC and AUC, and I found out that we need to measure TPR and FPR for different classification thresholds. Does it mean that ROC and AUC can be measured only for probabilistic classifiers and not for discrete ones (like trees)?
Yes, in order to calculate AUC, you need to have predicted probabilities. AUC is the area under the ROC curve. To make a ROC curve you need to calculate the true positive rate and false positive rate for different decision thresholds - and in order to use different decision thresholds, you need to have probabilities as your model's output (because it makes no sense to apply a threshold to a binary label of 0 or 1). For more information about how to calculate AUC, when to use AUC, and the strengths and weaknesses of AUC as a performance metric, you can read this article.
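A minimal sketch of that point, assuming a placeholder logistic regression model: the ROC curve and AUC are computed from predicted probabilities, and feeding hard 0/1 labels instead collapses the curve to a single operating point.

```python
# Minimal sketch: AUC needs scores/probabilities, not hard labels.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]   # probability of the positive class
labels = clf.predict(X_test)              # hard 0/1 predictions

fpr, tpr, thresholds = roc_curve(y_test, proba)
print("AUC from probabilities:", roc_auc_score(y_test, proba))
# Using hard labels collapses the "curve" to a single operating point,
# so the resulting number is no longer a meaningful ROC AUC.
print("AUC from hard labels:  ", roc_auc_score(y_test, labels))
```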

Determine accuracy of model which estimates probability of one of the classes

I'm modeling an event with two outcomes, 0 (rejection) and 1 (acceptance). I have created a model which estimates the probability that 1 (acceptance) will happen (i.e. the model calculates that '1' will happen with an 80% chance, or in other words the probability of acceptance is 0.8).
Now I have a large record of trial outcomes together with the estimates from the model (for example: predicted probability of acceptance = 0.8, actual class = 1, acceptance). I would like to quantify or validate how accurate the model is. Is this possible, and if so, how?
Note: I am just predicting the probability of class 1. Let's say the prediction for class 1 is 0.8 and the actual class value is 1. Now I want to measure the performance of my model.
You simply need to convert your probability into one of the two discrete classes by thresholding, i.e. if p(y=1|x) > 0.5, predict 1, else predict 0. Then all of the usual classification metrics are applicable. The threshold can be chosen by inspecting the ROC curve and/or the precision-recall trade-off, or can simply be set to 0.5.
Sort the objects by prediction.
Then compute the ROC AUC of the resulting curve.
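A rough sketch tying both suggestions together, using hypothetical probabilities and observed outcomes: threshold the probabilities to get class labels for the standard metrics, or rank by probability and compute the ROC AUC directly.

```python
# Minimal sketch: turn predicted probabilities into class labels with a
# threshold, then apply the usual classification metrics.
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Hypothetical model outputs P(y=1|x) and the actually observed outcomes.
proba = np.array([0.80, 0.15, 0.65, 0.40, 0.92, 0.55])
actual = np.array([1, 0, 1, 0, 1, 0])

threshold = 0.5
predicted = (proba > threshold).astype(int)

print("accuracy :", accuracy_score(actual, predicted))
print("precision:", precision_score(actual, predicted))
print("recall   :", recall_score(actual, predicted))
print("F1       :", f1_score(actual, predicted))
# Threshold-free alternative: rank by probability and compute ROC AUC.
print("ROC AUC  :", roc_auc_score(actual, proba))
```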

what is the score in plot_learning_curve of scikit learn?

In scikit-learn, I run a regression on the Boston house price dataset and get the following learning curve. But what is the meaning of the score (y-axis) in regression?
The graph visualizes the learning curves of the model for both training and validation as the size of the training set is increased. The shaded region of a learning curve denotes the uncertainty of that curve (measured as the standard deviation). The model is scored on both the training and validation sets using R², the coefficient of determination.
It depends on what you want to measure; you can choose any of the scorers from the scikit-learn model evaluation table referenced below (or any other metric not listed there).
Reference:
http://scikit-learn.org/stable/modules/model_evaluation.html
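For instance, a minimal sketch of generating a learning curve with an explicitly chosen scorer (R² here, to match the plot in the question; any scorer from the page above can be substituted). A synthetic regression dataset stands in for the Boston housing data, which has been removed from recent scikit-learn releases.

```python
# Minimal sketch: learning curve with an explicit scoring metric.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import learning_curve

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

train_sizes, train_scores, valid_scores = learning_curve(
    Ridge(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="r2",  # swap in any scorer from the model_evaluation page above
)

print("train sizes:        ", train_sizes)
print("mean train R^2:     ", train_scores.mean(axis=1))
print("mean validation R^2:", valid_scores.mean(axis=1))
```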

Learning curve for noisy data

I am doing a supervised classification of small texts, and the data is very noisy. I plotted a learning curve: the x-axis is the number of training instances and the y-axis is the F-measure. The curve is falling: the more instances I use, the lower the F-measure score. Is this typical for noisy data, or is there some other reason for this behavior?
Did you calculate the F-measure on the training set or on the test set?
If you calculated it on the training set, then a falling learning curve is pretty normal.
If you calculated it on the test set, then there may be many causes; the most probable is that the training and test sets are not i.i.d.
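To check which of the two cases applies, a rough sketch along these lines can help: compute the F-measure learning curve on both the training and the validation folds and see where the drop occurs. The classifier and the synthetic noisy dataset below are placeholders, not the text data from the question.

```python
# Minimal sketch: F-measure learning curve on training vs. validation folds;
# flip_y injects label noise as a stand-in for the noisy text data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, n_features=30, flip_y=0.2, random_state=0)

sizes, train_scores, valid_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 8),
    cv=5, scoring="f1", shuffle=True, random_state=0,
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), valid_scores.mean(axis=1)):
    print(f"n={n:4d}  train F1={tr:.3f}  validation F1={va:.3f}")
```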

ROC curves cross validation

How to generate a ROC curve for a cross validation?
For a single test set I think I should threshold the classification scores of the SVM to generate the ROC curve.
But I am unclear about how to generate it for a cross validation.
After a complete round of cross validation, all observations have been classified once (although by different models) and have been given an estimated probability of belonging to the class of interest, or a similar statistic. These probabilities can be used to generate a ROC curve in exactly the same way as probabilities obtained on an external test set. Just calculate the classwise error rates as you vary the classification threshold from 0 to 1 and you are all set.
However, typically you would like to perform more than one round of cross validation, as the performance varies depending on how the folds are divided. It is not obvious to me how to calculate the mean ROC curve of all rounds; I suggest plotting them all and calculating the mean AUC.
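A minimal sketch of the pooling approach described above, assuming a placeholder SVM and dataset: scikit-learn's cross_val_predict collects the out-of-fold probability for every observation, and a single ROC curve is built from them.

```python
# Minimal sketch: pool out-of-fold probabilities from cross validation and
# build a single ROC curve from them.
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Every observation is scored exactly once, by a model that never saw it.
proba = cross_val_predict(SVC(probability=True), X, y,
                          cv=cv, method="predict_proba")[:, 1]

fpr, tpr, thresholds = roc_curve(y, proba)
print("pooled cross-validated AUC:", roc_auc_score(y, proba))
```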
As a follow-up to Backlin's answer:
The variation in the results across different runs of k-fold or leave-n-out cross validation shows the instability of the models. This is valuable information.
Of course you can pool the results and just generate one ROC curve. But you can also plot the set of curves (see e.g. the R package ROCR), or calculate e.g. the median and IQR at different thresholds and construct a band depicting these variations.
Here's an example: the shaded areas are the interquartile ranges observed over 125 iterations of 8-fold cross validation. The thin black areas contain half of the observed specificity-sensitivity pairs for one particular threshold, with the median marked by an x (ignore the + marks).
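The figure itself is not reproduced here, but the same kind of band can be sketched as follows (the answer above points to the R package ROCR; this sketch uses scikit-learn instead): repeat the cross validation several times, interpolate each repeat's ROC curve onto a common grid of false positive rates, and plot the median with an interquartile band. The classifier, dataset, and number of repeats are placeholders.

```python
# Minimal sketch: variability band over repeated cross validation.
# Each repeat yields one pooled ROC curve; curves are interpolated onto a
# common FPR grid so the median and IQR can be taken pointwise.
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import StratifiedKFold, cross_val_predict

X, y = make_classification(n_samples=600, n_features=15, flip_y=0.1, random_state=0)

fpr_grid = np.linspace(0, 1, 101)
tprs = []
for seed in range(25):  # number of CV repeats; the answer above used 125
    cv = StratifiedKFold(n_splits=8, shuffle=True, random_state=seed)
    proba = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                              cv=cv, method="predict_proba")[:, 1]
    fpr, tpr, _ = roc_curve(y, proba)
    tprs.append(np.interp(fpr_grid, fpr, tpr))

lo, med, hi = np.percentile(np.array(tprs), [25, 50, 75], axis=0)

plt.fill_between(fpr_grid, lo, hi, alpha=0.3, label="IQR over repeats")
plt.plot(fpr_grid, med, label="median ROC")
plt.plot([0, 1], [0, 1], "k--")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```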

Resources