how to calculate the model accuracy manually? - machine-learning

I have a a confusion matrix and a classification report as shown in the attached image.
My question is, how this accuracy is calculated?
I have tried many formula to calculate the accuracy by hand but no one give me the right one?
please, I need a help to find how the model is calculate this accuracy?
What is the mathematical equation that he used to reach this accuracy?
enter image description here

You need to divide the total number of correct answers by your total sample size. In this case:
class 0: 918 * 0.9434... = 866 correct predictions
class 1: 2522 * 0.9964... = 2513
class 2: 349 * 0.9198... = 321
(866 + 2513 + 321) / 3789 = 0.9765...

Related

AUC for multiclass classification

Let's assume that we have a classification problem with 3 classes and that we have highly imbalanced data. Let's say in class 1 we have 185 data points, in class 2 199 and in class 3 720.
For calculating the AUC on a multiclass problem there is the macro-average (giving equal weight to the classification of each label) and micro-average method (considering each element of the label indicator matrix as a binary predictio) as written in the scikit-learn tutorial.
For such imbalanced dataset should micro-averaging or macro-averaging of AUC be used?
I'm unsure because when we have a confusion matrix as shown below, I'm getting a micro-averaged AUC of 0.76 and a macro-averaged AUC of 0.55.
Since you have the class with majority number of data points classified with much higher precision, the overall precision computed with micro-average is going to be higher than the same computed with macro-average.
Here, P1 = 12/185 = 0.06486486,
P2 = 11/199 = 0.05527638,
P3 = 670 / 720 = 0.9305556
overall precision with macro-average = (P1 + P2 + P3) / 3 = 0.3502323, which is much less than overall precision with micro-average = (12+11+670)/(185+199+720) = 0.6277174.
Same holds true for AUC.

How to identify the normalized feature

I have been trying to solve a problem stated in an exam of coursera. I am not seeking the solution but I need to get the steps and concepts to resolve this.
Can any one share the concept and steps to help me find the solution.
UPDATE:
I was expecting a down-vote and its not unusual, as its the most easiest thing people can do. I am seeking the direction to solve the problem as I wasn't able to get the idea to solve it after watching the videos on Coursera. I hope someone sensible out there can share a direction and step to achieve the mentioned goal.
Mean Normalization
Mean normalization, also known as 'standardization', is one of the most popular techniques of feature scaling.
Andrew Ng describes it in the 12a slide of lecture 4:
How to resolve the problem
The problem asks you to standardize the first feature in the third example: midterm = 94;
well, we have just to resolve the equation!
Just for clarity, the notation:
μ (mu) = "avg value of x in training set", in other words: the mean of the x1 column.
σ (sigma) = "range (max-min)", literaly σ = max-min (of the x1 column).
So:
μ = ( 89 + 72 + 94 +69 )/4 = 81
σ = ( 94 - 69 ) = 25
x_std = (94 - 81)/25 = 0.52
Result: 0.52
Best regards,
Marco.
The first step of solving this question is to identify what is , from the content of the lecture, it refers to the first feature of the third training case. Which is the unsquared version of the midterm score in the third row of the table.
Secondly, you need to understand the concept of normalization. The reason why we need normalization is that the value of some features among all training examples may much larger than the value of other features, which may make the cost function have pretty bad shape and this will make it harder gradient descent to find the minimum. In order to solve this, we want to make all features have nearly the same scale, and make the range of the feature to be centered at zero.
In this question, we want to scale every feature to a scale of 1, in order to do this, you need to find the max and min value of the feature among all training cases. Then squeeze the range of the feature to 0 and 1. The second step is to find the center value of the feature (average value in this case) and move the center value of the feature to 0.
I think this is pretty much all hints I can give you, you will totally be able to calculate the answer to this question by yourself from this point.

How to calculate the accuracy of classes from a 7x7 confusion matrix?

So I've got the following results from Naïves Bayes classification on my data set:
I am stuck however on understanding how to interpret the data. I am wanting to find and compare the accuracy of each class (a-g).
I know accuracy is found using this formula:
However, lets take the class a. If I take the number of correctly classified instances - 313 - and divide it by the total number of 'a' (4953) from the row a, this gives ~6.32%. Would this be the accuracy?
EDIT: if we use the column instead of the row, we get 313/1199 which gives ~26.1% which seems a more reasonable number.
EDIT 2: I have done a calculation of the accuracy of a in excel which gives me 84% as the accuracy, using the accuracy calculation shown above:
This doesn't seem right, as the overall accuracy of classification successfully is ~24%
No -- all you've calculated is tp/(tp+fn), the total correct identifications of class a, divided by the total of actual a examples. This is recall, not accuracy. You need to include the other two figures.
fp is the rest of the a column; tn is all of the other figures in the non-a rows and columns, the 6x6 sub-matrix. This will reduce all 35K+ trials to a 2x2 matrix with labels a and not a, the 2x2 confusion matrix with which you're already familiar.
Yes, you get to repeat that reduction for each of the seven features. I recommend doing it programmatically.
RESPONSE TO OP UPDATE
Your accuracy is that high: you have a huge quantity of true negatives, not-a samples that were properly classified as not-a.
Perhaps it doesn't feel right because our experience focuses more on the class in question. There are [other statistics that handle that focus.
Recall is tp / (tp+fn) -- of all items actually in class a, what percentage did we properly identify? This is the 6.32% figure.
Precision is tp / (tp + fp) -- of all items identified as class a, what percentage were actually in that class. This is the 26.1% figure you calculated.

Precision and Recall computation for different group sizes

I didn't find an answer for this question anywhere, so I hope someone here could help me and also other people with the same problem.
Suppose that I have 1000 Positive samples and 1500 Negative samples.
Now, suppose that there are 950 True Positives (positive samples correctly classified as positive) and 100 False Positives (negative samples incorrectly classified as positive).
Should I use these raw numbers to compute the Precision, or should I consider the different group sizes?
In other words, should my precision be:
TruePositive / (TruePositive + FalsePositive) = 950 / (950 + 100) = 90.476%
OR should it be:
(TruePositive / 1000) / [(TruePositive / 1000) + (FalsePositive / 1500)] = 0.95 / (0.95 + 0.067) = 93.44%
In the first calculation, I took the raw numbers without any consideration to the amount of samples in each group, while in the second calculation, I used the proportions of each measure to its corresponding group, to remove the bias caused by the groups' different size
Answering the stated question: by definition, precision is computed by the first formula: TP/(TP+FP).
However, it doesn't mean that you have to use this formula, i.e. precision measure. There are many other measures, look at the table on this wiki page and choose the one most suited for your task.
For example, positive likelihood ratio seems to be the most similar to your second formula.

Experiment with OHSUMED dataset using SVM Rank

I am trying to learn RankSVM using OHSUMED dataset and SVM Rank library as explained in following link:
http://research.microsoft.com/en-s/um/beijing/projects/letor/Baselines/RankSVM-Struct.txt
I used same parameters as link suggests for OHSUMED dataset. i.e
OHSUMED/QueryLevelNorm/cv_l1_e0.001/fold1_l1_c0.0002_e0.001.log
OHSUMED/QueryLevelNorm/cv_l1_e0.001/fold2_l1_c0.002_e0.001.log
OHSUMED/QueryLevelNorm/cv_l1_e0.001/fold3_l1_c0.01_e0.001.log
OHSUMED/QueryLevelNorm/cv_l1_e0.001/fold4_l1_c0.02_e0.001.log
OHSUMED/QueryLevelNorm/cv_l1_e0.001/fold5_l1_c0.01_e0.001.log
But when I train my model & run "svm_rank_classify" command I get following result:
Reading model...done.
Reading test examples...done.
Classifying test examples...done
Runtime (without IO) in cpu-seconds: 0.00
Average loss on test set: 0.3864
Zero/one-error on test set: 100.00% (0 correct, 22 incorrect, 22 total)
NOTE: The loss reported above is the fraction of swapped pairs averaged over
all rankings. The zero/one-error is fraction of perfectly correct
rankings!
Total Num Swappedpairs : 31337
Avg Swappedpairs Percent: 38.64
Please suggest If any steps I am missing here?
Thanks.
The zero/one-error is the percentage of rankings (i.e. qid sets) where the model ranked at least one pair incorrectly. Your accuracy over all pairs is actually:
(100 - Avg Swappedpairs Percent) = 61.36%

Resources