I evaluate a segmentation model using a bounding-box technique. Then I
sum the values of TP, FP, TN, and FN for each image. There are 10 images in
total (the rows of the table below). I need to calculate the accuracy of this model.
The equation for accuracy is: accuracy = (TP+TN) / (TP+FP+FN+TN),
where (TP+FP+FN+TN) is the total number. I am confused about what the total is here (actual vs. predicted).
The question is: what is the value of the total number in this case? Why?
imgNo  TP  FP  TN  FN
1       4   0   0   0
2       6   1   1   0
3       2   3   0   0
4       1   1   1   0
5       5   0   0   0
6       3   1   0   0
7       0   3   1   0
8       1   0   0   0
9       3   2   1   0
10      4   1   1   0
I appreciate any help.
TP: True Positive is the number of objects you correctly identified in the image.
FP: False Positives are objects you identified, but that is actually a mistake because there is no such object in the ground truth.
TN: True Negative is when the algorithm doesn't identify any object and the ground truth indeed contains none, i.e. a correct negative identification.
FN: False Negative is when your algorithm failed to identify an object (i.e. the ground truth contains an object in the image, but your algorithm marked it as background). In other words, you missed an object.
It is 0 anyway in your experiments.
So TP+TN is the total of true (correct) cases. Don't include FN there because that is a wrong detection.
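For completeness, here is a small sketch (my own illustration, not part of the original answer) that sums the columns of the table above and plugs the totals into the formula from the question; since FN is 0 for every image, it does not affect the denominator either way.
import numpy as np

# Per-image counts from the table above; columns are TP, FP, TN, FN.
counts = np.array([
    [4, 0, 0, 0],
    [6, 1, 1, 0],
    [2, 3, 0, 0],
    [1, 1, 1, 0],
    [5, 0, 0, 0],
    [3, 1, 0, 0],
    [0, 3, 1, 0],
    [1, 0, 0, 0],
    [3, 2, 1, 0],
    [4, 1, 1, 0],
])

tp, fp, tn, fn = counts.sum(axis=0)      # 29, 12, 5, 0
total = tp + fp + tn + fn                # 46 cases over all 10 images
accuracy = (tp + tn) / total             # 34 / 46, roughly 0.739
print(total, accuracy)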
You can use a heat map to visually analyze the confusion matrix of your logistic regression. roc_curve returns the false positive and true positive rates, and confusion_matrix returns the TP, FP, FN, and TN aggregates.
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import roc_curve, confusion_matrix

# ROC curve: true positive rate against false positive rate
fpr, tpr, thresholds = roc_curve(y_test, y_preds_proba_lr_df)
plt.plot([0, 1], [0, 1], 'k--')  # diagonal reference line (random classifier)
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()

# accuracy of the fitted pipeline on the training data
accuracy = round(pipeline['lr'].score(X_train, y_train) * 100, 2)
print("Model Accuracy={accuracy}".format(accuracy=accuracy))

# confusion matrix shown as a heat map
cm = confusion_matrix(y_test, predictions)
sns.heatmap(cm, annot=True, fmt="g")
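For a two-class problem, the four aggregates can also be unpacked from that matrix directly; a small sketch, assuming the same cm as above and binary labels:
# scikit-learn lays a binary confusion matrix out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = cm.ravel()
print("TP={} FP={} TN={} FN={}".format(tp, fp, tn, fn))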
I have a dataset of factory workstations.
There are two types of errors that occur in the same time interval:
The user selects an error and a time interval (dependent variable, y)
Machines produce errors during production (independent variables, x)
There are 8 unique user-selected error types in total, so I tried to predict those errors using the machine-produced errors (188 types in total) and some other numerical features such as average machine speed, machine volume, etc.
Each row represents a user-selected error in a particular time interval.
For example, in the first row the user selects the time interval
2018-01-03 12:02:00 - 2018-01-03 12:05:37
and m_er_1 (machine error 1) also occurred 12 times in the same interval.
m_er_1_dur (machine error 1 duration) is the total duration of that machine error in seconds.
So I matched those two tables, and the result looks like this:
user_error  m_er_1  m_er_2  m_er_3  ...  m_er_188  avg_m_speed  ...  m_er_1_dur
A               12       0       0             0          150              217
B                0       0       2             0           10                0
A                3       0       0             6           34               37
A                0       0       0             0            5                0
D                0       0       0             0            3                0
E                0       0       0             0         1000                0
In the end I have 1900 rows and 390 columns (376 columns for the 188 machine-error counts plus 188 machine-error durations, plus 14 numerical features), and because of the machine errors it is a sparse dataset with lots of zeros.
There are no outliers and no NaN values. I normalized the data and tried several classification algorithms (SVM, Logistic Regression, MLPC, XGBoost, etc.).
I also tried PCA, but it didn't work well; for 165 components the explained_variance_ratio is around 0.95.
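For reference, a minimal sketch of how such a cumulative explained-variance check is typically done in scikit-learn; the random matrix X here is only a stand-in with the same shape as the dataset, not the real data:
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.poisson(0.05, size=(1900, 390)).astype(float)  # dummy sparse-ish features

X_scaled = StandardScaler().fit_transform(X)
pca = PCA().fit(X_scaled)
cum_var = np.cumsum(pca.explained_variance_ratio_)
# number of components needed to reach 95% cumulative explained variance
n_95 = int(np.searchsorted(cum_var, 0.95)) + 1
print(n_95)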
But the accuracy metrics are very low: for logistic regression the accuracy score is 0.55 and the MCC is around 0.1; recall, F1, and precision are also very low.
Are there some steps that I am missing? What would you suggest for multiclass classification on a sparse dataset?
Thanks in advance
I am testing a dataset with two labels, 'A' and 'B', on a decision tree classifier. I accidentally found out that the model gets different precision results on the same testing data. I want to know why.
Here is what I do: I train the model and test it on
1. the testing set,
2. the data only labelled 'A' in the testing set,
3. and the data only labelled 'B'.
Here is what I got:
for testing dataset
precision recall f1-score support
A 0.94 0.95 0.95 25258
B 0.27 0.22 0.24 1963
for data only labelled 'A' in testing dataset
precision recall f1-score support
A 1.00 0.95 0.98 25258
B 0.00 0.00 0.00 0
for data only labelled 'B' in testing dataset
precision recall f1-score support
A 0.00 0.00 0.00 0
B 1.00 0.22 0.36 1963
The training dataset and model are the same, and the data in tests 2 and 3 are the same as in test 1. Why does the precision for 'A' and 'B' differ so much? What is the real precision of this model? Thank you very much.
You sound confused, and it is not at all clear why you are interested in metrics where you have completely removed one of the two labels from your evaluation set.
Let's explore the issue with some reproducible dummy data:
from sklearn.metrics import classification_report
import numpy as np
y_true = np.array([0, 1, 0, 1, 1, 0, 0])
y_pred = np.array([0, 0, 1, 1, 0, 0, 1])
target_names = ['A', 'B']
print(classification_report(y_true, y_pred, target_names=target_names))
Result:
precision recall f1-score support
A 0.50 0.50 0.50 4
B 0.33 0.33 0.33 3
avg / total 0.43 0.43 0.43 7
Now, let's keep only class A in our y_true:
indA = np.where(y_true==0)
print(indA)
print(y_true[indA])
print(y_pred[indA])
Result:
(array([0, 2, 5, 6], dtype=int64),)
[0 0 0 0]
[0 1 0 1]
Now, here is the definition of precision from the scikit-learn documentation:
The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.
For class A, a true positive (tp) would be a case where the true class is A (0 in our case) and we have indeed predicted A (0); from above, it is apparent that tp=2.
The tricky part is the false positives (fp): they are the cases where we have predicted A (0) while the true label is B (1). But it is apparent here that we cannot have any such cases, since we have (intentionally) removed all the B's from our y_true (why would we want to do such a thing? I don't know; it does not make any sense at all); hence, fp=0 in this (weird) setting. So our precision for class A will be tp / (tp+0) = tp/tp = 1.
Which is the exact same result given by the classification report:
print(classification_report(y_true[indA], y_pred[indA], target_names=target_names))
# result:
precision recall f1-score support
A 1.00 0.50 0.67 4
B 0.00 0.00 0.00 0
avg / total 1.00 0.50 0.67 4
and obviously the case for B is identical.
why the precision is not 1 in case #1 (for both A and B)? The data are the same
No, they are very obviously not the same - the ground truth is altered!
Bottom line: removing classes from your y_true before computing precision etc. does not make any sense at all (i.e. your reported results in cases #2 and #3 are of no practical use whatsoever); but, since for whatever reason you decided to do so, your reported results are exactly as expected.
I am currently trying to revise for my final-year exams and came across this question. I have looked everywhere in my lecture slides for any sort of help and cannot find any. Any help in providing insight into how to solve this question would be appreciated (I am not just asking for the answer, I need to comprehend the topic). Furthermore, do I assume that all inputs are equal to 1? Do I include 7 inputs in the input layer? I'm at a loss as to how to answer.
The question is as follows:
b) Determine, with justification, the simplest type and topology (i.e. number of neurons & layers) of artificial neural network that could learn the data set below.
Click here for picture of the dataset.
If I'm not mistaken, you have two inputs X1, X2, and one target output. For each input, consisting of two numbers X1 and X2, the appropriate output ("target") is given.
As a first step, you could sketch the seven data points: just draw the three 1s and four 0s at the right places on the (X1, X2) plane (roughly the unit square). Maybe you remember something similar from the lecture, possibly near a mention of "XOR".
The edit queue is full, so I am adding the data from the linked image here:
Pattern X1 X2 Target
1 0.01 -0.1 1
2 0.90 0.09 0
3 0.89 -0.05 0
4 1.05 0.95 1
5 -0.01 0.12 0
6 1.05 0.97 1
7 0.98 0.10 0
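A small sketch (my own) that plots the seven patterns from this table, coloured by target, to make the layout visible:
import matplotlib.pyplot as plt

X1 = [0.01, 0.90, 0.89, 1.05, -0.01, 1.05, 0.98]
X2 = [-0.10, 0.09, -0.05, 0.95, 0.12, 0.97, 0.10]
target = [1, 0, 0, 1, 0, 1, 0]

# The 1s sit in the bottom-left and top-right corners, the 0s in between.
plt.scatter(X1, X2, c=target, cmap='coolwarm')
plt.xlabel('X1')
plt.ylabel('X2')
plt.title('Patterns by target')
plt.show()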
It looks like one possible solution is X1 >= 1.0 OR X2 <= -0.1.
Alternatively, if you round each of X1 and X2 to the nearest integer, the table becomes
Pattern X1 X2 Target
1 0 0 1
2 1 0 0
3 1 0 0
4 1 1 1
5 0 0 0
6 1 1 1
7 1 0 0
Then (apart from pattern 5, which after rounding conflicts with pattern 1) the target is the complement of XOR, i.e. NOT(round(X1) XOR round(X2)), which is still an XOR-type problem. In that case you can use a nonlinear activation (e.g. step/round, ReLU, or sigmoid), one hidden layer of 2 neurons, and one output layer of 1 neuron.
See this Stack Overflow post for details on how to solve XOR with a neural net.
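To make the suggested topology concrete, here is a minimal sketch (my own, with hand-picked rather than learned weights) of a 2-2-1 network with a step activation that computes XOR; flipping the signs of the output weights and bias gives the complement (XNOR) that the rounded table above mostly follows:
import numpy as np

def step(z):
    # Heaviside step activation
    return (np.asarray(z) > 0).astype(int)

def xor_net(x1, x2):
    x = np.array([x1, x2], dtype=float)
    # Hidden layer: h[0] computes OR(x1, x2), h[1] computes AND(x1, x2)
    W_h = np.array([[1.0, 1.0],
                    [1.0, 1.0]])
    b_h = np.array([-0.5, -1.5])
    h = step(W_h @ x + b_h)
    # Output neuron: OR minus AND is exactly XOR
    w_o = np.array([1.0, -1.0])
    b_o = -0.5
    return int(step(w_o @ h + b_o))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, '->', xor_net(a, b))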
I have used Weka to train a model on my data set, but I don't know whether the result I got is good.
Can someone give me some ideas?
This is my result:
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances 2823 97.9188 %
Incorrectly Classified Instances 60 2.0812 %
Kappa statistic 0
Mean absolute error 0.0208
Root mean squared error 0.1443
Relative absolute error 50.6234 %
Root relative squared error 101.0568 %
Coverage of cases (0.95 level) 97.9188 %
Mean rel. region size (0.95 level) 50 %
Total Number of Instances 2883
Ignored Class Unknown Instances 119
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.000 0.000 0.000 0.000 0.000 0.000 0.500 0.020 0
1.000 1.000 0.979 1.000 0.989 0.000 0.500 0.940 1
Weighted Avg. 0.979 0.979 0.959 0.979 0.969 0.000 0.500 0.921
=== Confusion Matrix ===
a b <-- classified as
0 60 | a = 0
0 2823 | b = 1
Two classes were used. Every instance tagged as class 1 (b) was classified correctly, so the Correctly Classified Instances rate is 2823/(2823+60) = 97.9188 % and the TP rate for class 1 is 1.000. None of the instances tagged as class 0 (a) was classified correctly, so the Incorrectly Classified Instances rate is 60/(2823+60) = 2.0812 % and the TP rate for class 0 is 0.000.
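In other words, the classifier predicts class 1 for every instance, which is also why the Kappa statistic is 0. A small sketch (reconstructing the label vectors from the confusion matrix above) that checks this with scikit-learn:
from sklearn.metrics import accuracy_score, cohen_kappa_score

y_true = [0] * 60 + [1] * 2823   # 60 class-0 instances, 2823 class-1 instances
y_pred = [1] * 2883              # every instance is predicted as class 1

print(accuracy_score(y_true, y_pred))      # ~0.979188
print(cohen_kappa_score(y_true, y_pred))   # 0.0, no better than the majority guess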
I performed classification on a small data set (65x9) using decision trees (Random Forest and Random Tree). I have four classes, 8 attributes, and 65 instances.
My application is in assistive robotics, so I am extracting parameters from my sensor data that I think are relevant for classifying the user's run while they perform some task. I get the movement data from the sensor package deployed on the wheelchair. I classify certain actions, like turning 180 degrees, and give the run a mark (from 1 to 4). From the sensor package and the software I extracted parameters such as velocity, distance, time, standard deviation of the velocity, etc. that are relevant for classifying the user's run, so my data are all numbers.
When I performed the decision tree classification I got these results:
=== Classifier model (full training set) ===
Random forest of 10 trees, each constructed while considering 4 random features.
Out of bag error: 0.5231
Time taken to build model: 0.01 seconds
=== Evaluation on training set ===
=== Summary ===
Correctly Classified Instances 64 98.4615 %
Incorrectly Classified Instances 1 1.5385 %
Kappa statistic 0.9791
Mean absolute error 0.0715
Root mean squared error 0.1243
Relative absolute error 19.4396 %
Root relative squared error 29.0038 %
Total Number of Instances 65
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure ROC Area Class
1 0 1 1 1 1 c1
1 0 1 1 1 1 c2
0.952 0 1 0.952 0.976 1 c3
1 0.019 0.917 1 0.957 1 c4
Weighted Avg. 0.985 0.003 0.986 0.985 0.985 1
=== Confusion Matrix ===
a b c d <-- classified as
14 0 0 0 | a = c1
0 19 0 0 | b = c2
0 0 20 1 | c = c3
0 0 0 11 | d = c4
This is too good. Am I doing something wrong?