False Positive in Face Recognition - image-processing

I am working on a project on face recognition in images. After running my CNN on these images, I must manually report the number of True Positives, True Negatives, False Positives and False Negatives.
As far as I know, True Positive is the number of faces correctly identified, so if an image has 3 faces and the machine identifies all of them, I have 3 True Positives. True Negative means the machine has not identified faces where there are none, False Positive means the machine identifies faces where there are none, and False Negative means the machine fails to identify faces that are in fact there.
Unfortunately, I don't understand the concept of True Negative very well. How can I work out how many True Negatives I have in a given run?
For example, if a photo has 3 faces and the machine identifies all 3, then I have 3 True Positives, 0 False Negatives and 0 False Positives, but how many True Negatives do I have?

When you talk about an image with 3 faces, your method (or any other face recognition method) probably runs a series of tests on different sections (sub-images) of the original image to find faces. In this case, there are many frames where no face exists and your method correctly found none (these are the True Negatives). You should count these tests.
To do that, you can reduce your problem to one that answers True/False to the question "Does a face exist in this image?". If you have a large image, you can break it into smaller images (or use a sliding window that moves across the image). Your measurements are then:
True Positive: Number of frames where a face existed and your method returned True.
True Negative: Number of frames where no face existed and your method returned False.
False Positive: Number of frames where no face existed but your method returned True.
False Negative: Number of frames where a face existed but your method returned False.
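
The four definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `confusion_counts` and the idea of passing one boolean per frame (ground truth and detector output) are assumptions, not part of any particular library.

```python
def confusion_counts(ground_truth, predictions):
    """Count TP, TN, FP and FN over per-frame boolean labels.

    ground_truth[i] is True if frame i really contains a face.
    predictions[i] is True if the detector reported a face in frame i.
    """
    if len(ground_truth) != len(predictions):
        raise ValueError("both lists must have one entry per frame")

    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for has_face, said_face in zip(ground_truth, predictions):
        if has_face and said_face:
            counts["TP"] += 1      # face present, detector said True
        elif not has_face and not said_face:
            counts["TN"] += 1      # no face, detector said False
        elif not has_face and said_face:
            counts["FP"] += 1      # no face, detector said True
        else:
            counts["FN"] += 1      # face present, detector said False
    return counts


# Hypothetical example: 10 frames, 3 of which contain a face,
# and a detector that is right on every frame.
truth = [True, True, True] + [False] * 7
result = confusion_counts(truth, list(truth))
```

With this setup the number of True Negatives depends on how many frames you cut the image into, which is why it only becomes countable once the problem is phrased per frame.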

Related

Object detection in an image using Haar cascade (OpenCV)

I have around 2000 images, and I have to find the ones that contain the given "L"-shaped object, as shown below.
I have used the images below as positive images,
and the ones below as negatives, which do not contain the L-shaped object.
I created the XML file (classifier) using the opencv_traincascade tool of OpenCV.
But it gives some false detections along with the correct ones. Is there any way to increase accuracy?
Edit: some of the false cases

How to know if the data can be classified in machine learning

In a classification problem, we can adjust the soft margin to make things easier. However, in some cases, especially with real-world data collections, the classification problem can be extremely hard because of noise. In 2D space, the following graph can be obtained:
We can easily conclude that these two classes of data cannot be separated. However, the data usually has many features, so it cannot be plotted in 2D or 3D space. I also tried t-SNE to visualize the classified data. But since t-SNE is trained with a KL divergence objective while the SVM uses its own loss function, I cannot conclude anything from the following picture,
where the green points are true positives, blue ones are false negatives, yellow ones are false positives, and black ones are true negatives.
So my question is: is there a scientific method that can be used to analyse whether a problem can be classified or not?

Meaning of "False Positives Per Window"

In the paper Histograms of Oriented Gradients for Human Detection (Navneet Dalal and Bill Triggs) (see link below), to visualize their results, they use a ROC curve, on which the Y axis is TP and the X axis is FPPW (False Positives Per Window).
What is the meaning of this phrase, FPPW?
I thought about 3 possible options... I don't know - maybe all of them are wrong. Your help will be appreciated:
Maybe it is the rate of incorrectly classified negative samples, which is: (NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_NEGATIVE_SAMPLES)
Or maybe it is the rate of false alarms per true alarms, which is: (NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_TRUE_POSITIVES)
Or maybe it is the rate of false alarms per true windows in the whole image, which is: (NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_TRUE_SAMPLES)
I'll be glad to know whether one of them is the correct one, or if you know any other correct definition.
Link to the paper:
(https://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf)
It appears to be defined as NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_WINDOWS, where the detection window is a 64x128 moving window. Notice in the last paragraph of section 4 it states:
... In a multiscale detector it corresponds to a raw error rate of about 0.8 false positives per 640×480 image tested.
I had the same confusion. The authors state that they are using DET curves. If you look at several examples of DET curves, you see that the x-axis is actually the False Positive Rate. That means FPPW is FALSE_POSITIVE_RATE.
Hence FPPW = NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_NEGATIVE_SAMPLES
They have a window which they move across the image and evaluate if it shows a human or not.
FPPW is a measure of how often they detect something else as a human within their detector window. It describes the quality of their classification in a way that is independent from image sizes or people counts on a particular image.
So basically they count how often their dumb computer says "yes, that's a human" when they show it some rock or ice cream.
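
The answers above give two readings of FPPW: false positives divided by all windows tested, or false positives divided by the negative windows only. A minimal sketch that computes both follows. The function name `fppw` and the per-window boolean lists are assumptions made for illustration and are not taken from the paper. The two values coincide whenever every scored window is a negative one.

```python
def fppw(labels, predictions):
    """Return both readings of 'false positives per window'.

    labels[i] is True if window i really shows a human.
    predictions[i] is True if the detector said 'human' for window i.
    """
    if len(labels) != len(predictions) or not labels:
        raise ValueError("need one label and one prediction per window")

    # A false positive is a window without a human that was flagged anyway.
    false_positives = sum(1 for y, p in zip(labels, predictions) if p and not y)
    negatives = sum(1 for y in labels if not y)

    return {
        # NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_WINDOWS
        "per_window": false_positives / len(labels),
        # NUMBER_OF_FALSE_POSITIVES / NUMBER_OF_NEGATIVE_SAMPLES
        "per_negative_window": (
            false_positives / negatives if negatives else float("nan")
        ),
    }


# Hypothetical example: 10 windows, 2 with a human, 1 false alarm.
labels = [True, True] + [False] * 8
predictions = [True, False, True] + [False] * 7
rates = fppw(labels, predictions)
```

In this made-up example the two readings differ (0.1 versus 0.125) because two of the ten windows are positives. If all windows are negatives, both reduce to the same number.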

Filter false, blinking vectors (from findContours)

I'm using findContours to find points of interest. I have the image processing pretty much narrowed down, but the resulting vector list contains some false positives. These false positives tend to blink once in a video stream, then disappear and show up somewhere else again. They are expected, but I wonder what algorithm I can use to eliminate them from the vector list, presumably using the list from the previous frame.
I'm using OpenCV on Android.
What is the algorithm or function I'm looking for called?
I solved this partially with a bitwise_and against a dilated version of the previous frame.
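
ANDing with a dilated version of the previous frame amounts to keeping only detections that have a neighbour in the previous frame. A rough sketch of that idea on point lists instead of masks is shown here in plain Python. The function name `filter_blinking` and the `radius` parameter are hypothetical. The square neighbourhood mimics dilation with a square kernel.

```python
def filter_blinking(current_points, previous_points, radius=5):
    """Keep points of the current frame that were also seen nearby
    in the previous frame; points that appear only once are dropped.

    Points are (x, y) tuples. A point counts as 'nearby' if it lies
    within `radius` pixels along both axes (a square neighbourhood).
    """
    stable = []
    for x, y in current_points:
        for px, py in previous_points:
            if abs(x - px) <= radius and abs(y - py) <= radius:
                stable.append((x, y))
                break  # one matching neighbour is enough
    return stable


# Hypothetical example: (200, 30) has no neighbour in the previous
# frame, so it is treated as a blinking false positive.
previous = [(10, 10), (50, 50)]
current = [(12, 11), (200, 30), (49, 54)]
kept = filter_blinking(current, previous)
```

A filter like this delays every new genuine detection by one frame, which is the usual price of this kind of temporal consistency check.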

Train cascade classifier

I have some questions about the training of a cascade classifier:
In some of my pictures only half of the object is visible. Should I mark the visible part as the region of interest, use the picture as a negative sample, or discard it completely?
Is the classifier able to detect objects that are only partly visible (using Haar features)?
What should the ratio of negative to positive samples be? I often read that you should use more negative samples. But in this thread, for example, it is mentioned that the ratio should be 2:1 (more positive samples).
My current classifier detects too many false positives. According to this tutorial you can either increase the number of stages or decrease the false alarm rate per stage. But I can't increase the number of stages without increasing the false alarm rate. If I just increase the number of stages, the training stops at some point because the classifier runs out of samples. Is increasing the number of samples the only way to reduce the false positives?
Would be glad if someone could answer one of my questions :)
In the case of a cascade classifier, I would suggest throwing away the "half" objects. Are they positive samples? No, since they don't contain the object entirely. Are they negative samples? No, because they are not something unrelated to our object.
In my experience, I started training with roughly equal numbers of negative and positive images, and I had a similar problem. Increasing the number of samples was the first step. You should probably increase the number of negative samples. Note that you need varied images: 100 similar background images are worth about as much as 5-10 images. In my case the best ratio was positive:negative = 2:1. You still need to experiment, though, as it depends on the classifier you are trying to build. If your object is not too fancy and comes in simple shapes and sizes (like a company logo, a coin, or an orange), you don't need many samples. But if you are trying to build a classifier for a complicated object (like a chair; yes, a chair is a serious object, since it comes in many different shapes and sizes), then you will need a lot of samples.
Hope this helps.
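
The trade-off between the number of stages and the per-stage false alarm rate mentioned in the question can be made concrete with a small calculation. In a cascade, a window is reported only if it passes every stage, so the overall false alarm rate is roughly the per-stage rate raised to the number of stages. The sketch below assumes each stage reaches exactly its target rate, which real training only approximates. The function name is hypothetical.

```python
def overall_false_alarm_rate(per_stage_rate, num_stages):
    """Estimate the false alarm rate of the whole cascade, assuming
    every stage lets through exactly `per_stage_rate` of the negatives
    that reach it."""
    if not 0.0 <= per_stage_rate <= 1.0:
        raise ValueError("per_stage_rate must be between 0 and 1")
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    return per_stage_rate ** num_stages


# Two ways to tighten a cascade: more stages, or a stricter stage.
ten_stages = overall_false_alarm_rate(0.5, 10)      # 0.5 ** 10
twenty_stages = overall_false_alarm_rate(0.5, 20)   # 0.5 ** 20
stricter = overall_false_alarm_rate(0.4, 10)        # 0.4 ** 10
```

Both routes lower the estimate, but each later stage has to be trained on negatives that survived all earlier stages, which is consistent with the training running out of samples as described in the question.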
