accuracy as a function of precision and recall - machine-learning

I know that:
Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
But I want to know if there is any way to compute accuracy given only precision and recall values.

I could not find anything using only precision and recall, but this works fine for me:
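Whatever workaround you use, note that accuracy cannot be recovered from precision and recall alone: they say nothing about the number of true negatives. If you additionally know the total number of samples and the number of actual positives, the full confusion matrix (and hence accuracy) can be reconstructed. A minimal sketch with made-up counts:

# Hypothetical inputs: precision, recall, actual positives, total samples
precision, recall = 0.8, 0.6
n_pos, n_total = 100, 1000

tp = recall * n_pos           # recall = TP / (TP + FN), and TP + FN = n_pos
fp = tp / precision - tp      # precision = TP / (TP + FP)
fn = n_pos - tp
tn = n_total - tp - fp - fn
accuracy = (tp + tn) / n_total
print(accuracy)               # 0.945 for these example numbers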

Related

Scikit-learn precision and recall computed incorrectly

I have an unbelievably stupid problem. Calculating precision and recall with scikit-learn gives me crazy values, totally different from the ones I calculated myself from the confusion matrix.
Here's my code:
I also tried average='weighted' and 'macro', and the separate functions f1_score, precision_score and recall_score. Nothing helped.
I got these results:
First come the y_test values, then y_pred (as you can see, there is only one true positive prediction), then recall and precision calculated from the confusion matrix (a precision of 0.14 is what I expected). At the end there are the precision and recall calculated by the sklearn functions and... I don't understand! Why the difference?!
Does anyone have idea why these results look like this?
Yeah, that was a veeery stupid problem. The solution was changing average='micro' to average='binary'. Then the results are correct.
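For anyone hitting the same mismatch: with average='micro', scikit-learn pools the true/false positives over all labels, which for a binary problem collapses to plain accuracy, while average='binary' scores only the positive class. A small illustration with made-up labels (not the original data):

from sklearn.metrics import precision_score, recall_score

# Toy labels: one true positive, six false positives, three true negatives
y_test = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

# 'binary' scores the positive class only: precision = 1/7 ~= 0.14, recall = 1.0
print(precision_score(y_test, y_pred, average='binary'),
      recall_score(y_test, y_pred, average='binary'))
# 'micro' pools counts over both labels and equals accuracy here: 0.4
print(precision_score(y_test, y_pred, average='micro'),
      recall_score(y_test, y_pred, average='micro'))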

High AUC and 100% recall, but precision and F1 are low

I have an imbalanced dataset with 43323 rows, 9 of which belong to the 'failure' class; the rest belong to the 'normal' class. I trained a classifier that gets 100% recall and 94.89% AUC on the test data (0.75/0.25 split with stratify=y). However, the classifier has 0.18% precision and a 0.37% F1 score. I assumed I could find a better F1 score by changing the threshold, but I failed (I checked thresholds between 0 and 1 with step = 0.01). Also, it seems weird to me, since with an imbalanced dataset it is usually hard to get a high recall. The goal is to get a better F1 score. What can I do as a next step? Thanks!
(To be clear, I used SMOTE to upsample the failure samples in the training dataset.)
In fact, getting 100% recall is trivial: just classify everything as 1.
Is the precision/recall curve any good? Perhaps a more thorough scan could yield a better result:
import numpy as np
from sklearn.metrics import precision_recall_curve

# Probability of the positive class for each test sample
probabilities = model.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, probabilities)
# precision and recall have one more entry than thresholds; drop the final point
f1_scores = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best_f1 = np.max(f1_scores)
best_thresh = thresholds[np.argmax(f1_scores)]
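If needed, class labels at the chosen operating point can then be obtained by thresholding the probabilities directly:

# Apply the selected threshold to get hard predictions for the positive class
y_pred_best = (probabilities >= best_thresh).astype(int)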

Precision and Recall computation for different group sizes

I didn't find an answer for this question anywhere, so I hope someone here could help me and also other people with the same problem.
Suppose that I have 1000 Positive samples and 1500 Negative samples.
Now, suppose that there are 950 True Positives (positive samples correctly classified as positive) and 100 False Positives (negative samples incorrectly classified as positive).
Should I use these raw numbers to compute the Precision, or should I consider the different group sizes?
In other words, should my precision be:
TruePositive / (TruePositive + FalsePositive) = 950 / (950 + 100) = 90.476%
OR should it be:
(TruePositive / 1000) / [(TruePositive / 1000) + (FalsePositive / 1500)] = 0.95 / (0.95 + 0.067) = 93.44%
In the first calculation I took the raw numbers without any consideration of the number of samples in each group, while in the second calculation I used the proportion of each count relative to its group, to remove the bias caused by the groups' different sizes.
Answering the stated question: by definition, precision is computed by the first formula: TP/(TP+FP).
However, that doesn't mean you have to use this formula, i.e. the precision measure. There are many other measures; look at the table on this wiki page and choose the one best suited to your task.
For example, positive likelihood ratio seems to be the most similar to your second formula.
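To make the comparison concrete with the numbers from the question (950 TP out of 1000 positives, 100 FP out of 1500 negatives), here is a small sketch. Note that the second formula equals LR+ / (LR+ + 1), i.e. it is a monotone transform of the positive likelihood ratio:

tp, fn = 950, 50      # 1000 actual positives
fp, tn = 100, 1400    # 1500 actual negatives

precision  = tp / (tp + fp)        # 0.90476 (first formula)
tpr        = tp / (tp + fn)        # 0.95    (recall / sensitivity)
fpr        = fp / (fp + tn)        # 0.0667  (fall-out)
normalized = tpr / (tpr + fpr)     # 0.9344  (second formula)
lr_plus    = tpr / fpr             # 14.25   (positive likelihood ratio)
print(precision, normalized, lr_plus)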

Why do I have big precision errors on GPU?

I am doing a series of calculations on the GPU that require good enough precision, but it seems I am getting much lower precision than when using float on the CPU.
For starters, when I load a value of 0.01 in a float buffer, it gets loaded as 0.009995 in the shader. Why is that? I would think 0.01 is a value in range for float vectors (using the simd library available for Metal).
Then, when doing a simple operation like this, the precision gets visibly worse:
simd::float4 p = simd::float4 { -0.04, -0.07, 0, 1 };
simd::float4 v = myMatrix * p;
v *= 1.0 / v.w;
p in the example is what I expect and use in the CPU test; on the GPU it is calculated as { -0.039978, -0.069946, 0.0, 1.0 }, with one integer subtraction and one float multiplication by the already wrong 0.009995.
What I would expect to get from v is { -0.010627, 0.006991, -0.034100 } (calculated with the simd library on CPU, already worse precision than using doubles, { -0.010613, 0.006982, -0.034056 }, but bearable).
What I get instead is { -0.010483, 0.006405, -0.044067 }. This gets much worse with subsequent operations and the result becomes quickly unusable.
Why is the result so different even though the same precision is used, and why is float data not loaded 1:1? I tried disabling the fast math option for Metal, but it didn't change anything.
Alas, it was not a precision issue: the way I set up the test wasn't correct, so the GPU wasn't using the numbers I actually thought it was using.
You can't have exactly 0.01 in a float value because it has no exact binary representation. That is why 0.009995 was used. I guess SO already has good answers about floating-point representation in binary, so you just need to search.
Here is a good tool for checking what a float number looks like in binary. If you enter 0.01 you see this:
Decimal Representation: 0.01
Binary Representation: 00111100001000111101011100001010
After casting to double precision: 0.009999999776482582
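The same round trip can be reproduced outside the shader; a small sketch of what happens when 0.01 is squeezed into single precision (plain Python, no GPU involved):

import struct

# Pack 0.01 into an IEEE-754 single-precision float and read the raw bits back
bits = struct.unpack('>I', struct.pack('>f', 0.01))[0]
print(f'{bits:032b}')  # 00111100001000111101011100001010

# The value those 32 bits actually encode, widened back to double precision
print(struct.unpack('>f', struct.pack('>f', 0.01))[0])  # 0.009999999776482582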

what is f-measure for each class in weka

When we evaluate a classifier in WEKA, for example a 2-class classifier, it gives us 3 f-measures: f-measure for class 1, for class 2 and the weighted f-measure.
I'm so confused! I thought the f-measure is a balanced measure that summarizes performance across multiple classes, so what do the f-measures for class 1 and class 2 mean?
The f-score (or f-measure) is calculated based on the precision and recall. The calculation is as follows:
Precision = t_p / (t_p + f_p)
Recall = t_p / (t_p + f_n)
F-score = 2 * Precision * Recall / (Precision + Recall)
Where t_p is the number of true positives, f_p the number of false positives and f_n the number of false negatives. Precision is defined as the fraction of elements correctly classified as positive out of all the elements the algorithm classified as positive, whereas recall is the fraction of elements correctly classified as positive out of all the positive elements.
In the multiclass case, each class i has its own precision and recall, where a "true positive" is an element predicted to be in class i that really is in it, and a "true negative" is an element predicted not to be in class i that indeed isn't in it.
Thus, with this new definition of precision and recall, each class can have its own f-score by doing the same calculation as in the binary case. This is what Weka's showing you.
The weighted f-score is a weighted average of the classes' f-scores, weighted by the proportion of elements in each class.
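Outside Weka, the same per-class and weighted scores can be reproduced with scikit-learn, which is handy for checking your own calculation; a small sketch with made-up labels:

from sklearn.metrics import f1_score

# Toy two-class labels, just for illustration
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 1, 0, 1]

print(f1_score(y_true, y_pred, average=None))         # one f-score per class
print(f1_score(y_true, y_pred, average='weighted'))   # weighted by class support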
I am confused too. I used the same equation for the f-score of each class, based on their precision and recall, but the results are different!
Example:
f-score different from Weka's calculation
