Difference between score and predict - machine-learning

What is the difference between .score and .predict? As I recall, score takes two arguments (features, target), while predict takes only the features (X_train). What is the main difference, and when should each be used?

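score is used for evaluating the quality of a model's predictions: it takes the features and the true targets and returns a metric (in scikit-learn, mean accuracy for classifiers and R² for regressors).
predict is used to predict the output for a given input: it takes only the features and returns the predictions.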
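A minimal sketch of the two call shapes, written in plain Python rather than with a real library. ThresholdClassifier is a hypothetical toy class that only mimics the scikit-learn convention in which a classifier's score(X, y) is the mean accuracy of predict(X); the data is made up.

```python
class ThresholdClassifier:
    """Hypothetical toy model: predicts 1 when the single feature exceeds a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def predict(self, X):
        # Takes features only and returns one predicted label per row.
        return [1 if row[0] > self.threshold else 0 for row in X]

    def score(self, X, y):
        # Takes features AND true labels; returns the mean accuracy of predict(X).
        predictions = self.predict(X)
        correct = sum(p == t for p, t in zip(predictions, y))
        return correct / len(y)


X_test = [[0.1], [0.4], [0.6], [0.9]]   # made-up features
y_test = [0, 1, 1, 1]                   # made-up true labels

model = ThresholdClassifier()
y_pred = model.predict(X_test)            # predictions, no labels needed
accuracy = model.score(X_test, y_test)    # a single number summarising quality
```

Use predict when you need the outputs themselves (for example on unlabeled data) and score when you have the true labels and want a quick quality estimate.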


SVM model predicts instances with probability scores greater than 0.1 (default threshold 0.5) as positives

I'm working on a binary classification problem. I used the logistic regression and support vector machine models from sklearn. Both were fit on the same imbalanced training data, with class weights adjusted, and they achieved comparable performance. When I used these two trained models to predict on a new dataset, the LR and SVM models predicted a similar number of instances as positives, and the predicted instances overlap heavily.
However, when I looked at the probability scores of being classified as positive, the distribution for LR ranges from 0.5 to 1, while for the SVM it starts at around 0.1. I called model.predict(prediction_data) to find the instances predicted as each class and model.predict_proba(prediction_data) to get the probability scores for classes 0 (neg) and 1 (pos), and I assumed both use a default threshold of 0.5.
There is no error in my code and I have no idea why the SVM predicted instances with probability scores < 0.5 as positives as well. Any thoughts on how to interpret this situation?
That's a known behavior of SVC() in sklearn for binary classification problems. It has been reported in several GitHub issues, and it is also documented in the User Guide, which says:
In addition, the probability estimates may be inconsistent with the scores:
the “argmax” of the scores may not be the argmax of the probabilities; in binary classification, a sample may be labeled by predict as belonging to the positive class even if the output of predict_proba is less than 0.5; and similarly, it could be labeled as negative even if the output of predict_proba is more than 0.5.
The libsvm FAQ says the same:
Let's just consider two-class classification here. After probability information is obtained in training, we do not have prob > = 0.5 if and only if decision value >= 0.
All in all, the point is that:
on one side, predictions are based on decision_function values: if the decision value computed on a new instance is positive, the predicted class is the positive class, and vice versa.
on the other side, as stated in one of the GitHub issues, np.argmax(self.predict_proba(X), axis=1) != self.predict(X), which is where the inconsistency comes from. In other words, to always have consistency on binary classification problems you would need a classifier whose predictions are based on the output of predict_proba() (which is, by the way, what you get when using calibrators), like so:
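import numpy as np

def predict(self, X):
    # Pick the class with the highest predicted probability and map it back to its label
    y_proba = self.predict_proba(X)
    return self.classes_[np.argmax(y_proba, axis=1)]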
I'd also suggest this post on the topic.
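To see how a positive decision value can coexist with a probability below 0.5, here is a minimal sketch of the sigmoid that libsvm-style Platt scaling applies to decision values. The parameters A and B are made-up values chosen for illustration; in practice they are fitted on the training data, and platt_probability is a hypothetical helper, not a library function.

```python
import math

def platt_probability(decision_value, A, B):
    # libsvm-style sigmoid: P(y=1 | f) = 1 / (1 + exp(A*f + B))
    return 1.0 / (1.0 + math.exp(A * decision_value + B))

# Hypothetical fitted parameters (assumption for illustration only).
A, B = -2.0, 1.0

decision_value = 0.2                         # positive, so predict() picks class 1
label = 1 if decision_value > 0 else 0
proba = platt_probability(decision_value, A, B)   # falls below 0.5

# The probability crosses 0.5 where A*f + B = 0, i.e. at f = -B/A, not at f = 0.
crossover = -B / A
```

Whenever the fitted B is not zero, the 0.5 probability point and the zero of the decision function sit at different places, so some instances fall between the two and show exactly the mismatch described in the question.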

Can a probability score from a machine learning model XGBClassifier or Neural Network be treated as a confidence score?

If there are 4 classes and the output probabilities from the model are A=0.30, B=0.40, C=0.20, D=0.10, can I say that the output from the model is class B with 40% confidence? If not, why?
Although a softmax activation ensures that the outputs superficially satisfy the Kolmogorov axioms (the probabilities sum to one, and none is below zero or above one) and the individual values can be seen as a measure of the network's confidence, you would need to calibrate the model (train it not as a classifier but rather as a probability predictor) or use a Bayesian network before you could formally claim that the output values are your per-class prediction confidences. (https://arxiv.org/pdf/1706.04599.pdf)
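One calibration method described in the linked paper is temperature scaling: the logits are divided by a scalar T before the softmax. The sketch below is plain Python under stated assumptions: the logits are reconstructed from the probabilities in the question, and T = 2.0 is a made-up value (in practice T is fitted on a held-out validation set).

```python
import math

def softmax(logits, temperature=1.0):
    # Divide logits by the temperature, then apply a numerically stable softmax.
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Logits chosen so that the plain softmax reproduces A=0.30, B=0.40, C=0.20, D=0.10.
logits = [math.log(p) for p in (0.30, 0.40, 0.20, 0.10)]

raw = softmax(logits)                           # uncalibrated probabilities
calibrated = softmax(logits, temperature=2.0)   # hypothetical T > 1 softens them

predicted_raw = raw.index(max(raw))
predicted_calibrated = calibrated.index(max(calibrated))
```

The predicted class stays B in both cases, but the number attached to it changes, which is why the raw 0.40 should not be read as a confidence until the model has been checked for calibration.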

What is the difference between Keras model.evaluate() and model.predict()?

I used Keras for biomedical image segmentation to segment brain neurons. model.evaluate() gave me a Dice coefficient of 0.916. However, when I used model.predict() and then looped through the predicted images, calculating the Dice coefficient myself, I got 0.82. Why are these two values different?
The model.evaluate function predicts the output for the given input, computes the metrics specified in model.compile from y_true and y_pred, and returns the computed metric values.
The model.predict function just returns y_pred.
So if you use model.predict and then compute the metrics yourself, the computed metric value should in principle turn out to be the same as the one from model.evaluate.
For example, one would use model.predict instead of model.evaluate when evaluating RNN/LSTM-based models, where the output needs to be fed back as input at the next time step.
The problem lies in the fact that every metric in Keras is evaluated in the following manner:
For each batch, a metric value is computed.
The current value (after k batches) is the mean of the metric across those k batches.
The final result is the mean of the values computed for all batches.
Most popular metrics (such as mse, categorical_crossentropy and mae) are means of a per-example loss value, so this batch-wise averaging ends up with the proper result. The Dice coefficient is different: the mean of its values across all batches is not equal to the value computed on the whole dataset. Since model.evaluate() computes it batch by batch, this is the direct cause of your problem.
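To illustrate why batch-wise averaging changes a ratio metric like Dice, here is a minimal sketch in plain Python, not Keras code. The masks are made-up flat binary lists, and dice is a hypothetical helper without the smoothing term that many implementations add.

```python
def dice(y_true, y_pred):
    # Dice = 2 * |A ∩ B| / (|A| + |B|) for flat binary masks
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return 2 * intersection / (sum(y_true) + sum(y_pred))

# Batch 1: large objects, well segmented (9 of 10 foreground pixels overlap).
batch1_true = [1] * 10 + [0] * 2
batch1_pred = [0] + [1] * 10 + [0]

# Batch 2: small objects, poorly segmented (1 of 5 foreground pixels overlaps).
batch2_true = [1] * 5 + [0] * 7
batch2_pred = [0] * 4 + [1] * 5 + [0] * 3

per_batch = [dice(batch1_true, batch1_pred), dice(batch2_true, batch2_pred)]
mean_of_batches = sum(per_batch) / len(per_batch)

# Same pixels, but intersection and sizes are pooled before dividing.
whole_dataset = dice(batch1_true + batch2_true, batch1_pred + batch2_pred)
```

Here the per-batch values are 0.9 and 0.2, so their mean is 0.55, while pooling all pixels gives about 0.67. The same effect appears at any grouping level (per batch, per image, whole dataset), so two Dice values only agree if they are aggregated the same way.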
The model.evaluate() function gives you the loss value (and any compiled metrics), computed batch by batch. The model.predict() function gives you the actual predictions for all samples in a batch, for all batches. So even if you use the same data, the outputs will differ, because the value of a loss function is almost always different from the predicted values. These are two different things.
It is about regularization. model.predict() returns the final output of the model, i.e. the answer, while model.evaluate() returns the loss. The loss is used to train the model (via backpropagation); it is not the answer.
This video of ML Tokyo should help to understand the difference between model.evaluate() and model.predict().

Likelihood of a sample prediction in machine learning

I know some machine learning algorithms can output the probability of predicted labels of an input sample.
For example, given a sample with three possible labels, a probability tuple (0.2,0.3,0.5) can be output by some probabilistic learning algorithms, such as logistic regression or a probability estimation tree. Then the label with the maximum probability (here 0.5) is output as the final prediction.
My question is: given a new sample with the predicted probability tuple (0.3,0.4,0.3), how can I quantitatively determine the likelihood that the predicted label (here the second label) is correct?
Many thanks
(IMHO this question doesn't belong here. It belongs on the stats Stack Exchange.)
The answer is freakingly simple: the probability/likelihood -- which is not exactly the same -- is 0.4, which is pretty low.
If you want to run a small experiment: build/learn a model, classify a few instances and compare the predictions to the ground truth. In addition, average the probability of the most likely label over those instances. You will see that this average matches the fraction of correctly classified instances. If it does not, then either:
your model is wrong
your sample set is too small
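The experiment can be sketched with simulated data in plain Python. The key assumption, stated loudly, is that the "model" here is perfectly calibrated: each true label is drawn from the very probabilities the model reports. The probability tuples themselves are random and made up.

```python
import random

random.seed(0)

def random_probabilities(n_labels=3):
    # Made-up model output: random positive weights normalised to sum to one.
    weights = [random.random() for _ in range(n_labels)]
    total = sum(weights)
    return [w / total for w in weights]

n_samples = 20000
confidence_sum = 0.0
n_correct = 0
for _ in range(n_samples):
    probs = random_probabilities()
    predicted = max(range(len(probs)), key=probs.__getitem__)
    # Assumption: perfect calibration, so the true label follows the reported probabilities.
    true_label = random.choices(range(len(probs)), weights=probs)[0]
    confidence_sum += probs[predicted]
    n_correct += predicted == true_label

mean_confidence = confidence_sum / n_samples   # average probability of the predicted label
accuracy = n_correct / n_samples               # fraction of correct predictions
```

With enough samples the two numbers come out close to each other. On real data, a large gap between them points to a miscalibrated model or to too few samples.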

How to get scores along with Machine Learning results?

I am new to machine learning and I am currently working on a classification problem. I am able to train the model and predict on test data sets. I want to know whether there is some way to get scores along with the predictions. By scores, I mean proximity scores attached to each prediction. For example, in the standard age-salary-buy classification problem (predicting from age and salary whether the customer will buy the product or not), I want a score out of 100 for how likely the customer is to buy the product, in addition to the prediction of whether he will buy it or not.
Currently, I am using LibSVM. Is there an algorithm that provides this data?
What you are looking for is the support of the decision. In other words, many classifiers base their decision about the class of x over the labels Y on:
cl(x) = arg max_{y \in Y} p(y|x)
where p(y|x) is their internal estimation of "x having label y". And such classifiers include:
neural networks (with sigmoid output)
logistic regression
naive bayes
voting ensembles (such as RF)
The output of these methods can be easily converted to your 0-100 scale, since a probability lies on a 0-1 scale.
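As a minimal sketch of that conversion, assuming you already have the class probabilities p(y|x) from one of the classifiers listed: predict_with_score is a hypothetical helper, and the probabilities are made up for the age-salary-buy example.

```python
def predict_with_score(class_probabilities):
    # cl(x) = arg max over labels of p(y|x); the score is that probability on a 0-100 scale.
    label = max(class_probabilities, key=class_probabilities.get)
    return label, round(100 * class_probabilities[label])

# Made-up p(y|x) for one customer.
probabilities = {"buy": 0.83, "no_buy": 0.17}
label, score = predict_with_score(probabilities)
```

For this customer the prediction is "buy" with a score of 83 out of 100.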
Others, such as SVM, use a measure that is related to probability but unbounded. You can get this value (often called the decision function), but you cannot convert it directly to a 0-100 score, as there is no "maximum" value. This is a big drawback, so some modifications have been proposed. In particular, for SVM there is Platt scaling, which fits a logistic regression on top of the SVM outputs so that you get a probability estimate. In libSVM you can set -b 1 to get probability estimates.
From the libsvm website:
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
