How is the LFW dataset used for evaluating a facenet model? - image-processing

I am building a face recognition model using facenet. I can see that in most papers LFW is used for validation. I am trying to understand how, since only about 1600 of its roughly 5400 classes have two or more images. I am looking for answers to the following questions:
1) For validation, do we use only the classes with more than one image and ignore the remaining classes?
2) The link below has files named 'pairs.txt' and 'people.txt'. How exactly are they used?
http://vis-www.cs.umass.edu/lfw/

Prepare a flipped dataset as the query dataset
You can use the original LFW as the reference dataset and a horizontally flipped copy of it as the query dataset.
Check this repo for details: https://github.com/ZhaoJ9014/face.evoLVe.PyTorch/blob/master/util/extract_feature_v1.py
The author also provides extract_feature_v2.py, which adds a centre crop before the flip.
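To make the flipping idea concrete, here is a minimal pure-Python sketch. It is not the code from the linked repo: `toy_embed` is a hypothetical stand-in for a real face model, images are plain nested lists, and summing the embeddings of the original and the mirrored image before L2-normalising is one common way to use the flipped copy, assumed here for illustration.

```python
import math

def hflip(image):
    """Mirror an image left-to-right; the image is a list of pixel rows."""
    return [row[::-1] for row in image]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def embed_with_flip(image, embed_fn):
    """Sum the embeddings of the image and its mirror, then L2-normalise."""
    original = embed_fn(image)
    flipped = embed_fn(hflip(image))
    return l2_normalize([a + b for a, b in zip(original, flipped)])

def cosine_similarity(u, v):
    """Dot product; equals the cosine when both inputs are unit vectors."""
    return sum(a * b for a, b in zip(u, v))

def toy_embed(image):
    """Hypothetical 2-D 'embedding': mean of the left and right image halves."""
    half = len(image[0]) // 2
    left = [p for row in image for p in row[:half]]
    right = [p for row in image for p in row[half:]]
    return [sum(left) / len(left), sum(right) / len(right)]

image = [[1, 2, 3, 4], [5, 6, 7, 8]]
reference = embed_with_flip(image, toy_embed)
query = embed_with_flip(hflip(image), toy_embed)
print(cosine_similarity(reference, query))
```

With a real model, `embed_fn` would wrap the network's forward pass, and a pair of faces would be declared a match when the similarity exceeds a threshold chosen on held-out pairs.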

Related

tfx.components.StatisticsGen displays train and eval in two different figures; is it possible to have them in a single figure as tfdv does?

a superimposed display for train/val splits using StatisticsGen
Hi,
I'm currently using a tfx pipeline inside kubeflow. I am struggling to get StatisticsGen to show a single graph with the train and validation split curves superimposed, which would make it easier to compare the distributions. This is exactly how tfdv.visualize_statistics(lhs_statistics=train_stats, rhs_statistics=eval_stats, lhs_name='train', rhs_name='eval') behaves (see illustration 1), and I would like StatisticsGen to provide a superimposed graph of the splits as well.
Thanks for any reference or help so that I can move forward.
Regards
You can use something like:
# docs-infra: no-execute
# Compare evaluation data with training data
tfdv.visualize_statistics(lhs_statistics=eval_stats, rhs_statistics=train_stats,
                          lhs_name='EVAL_DATASET', rhs_name='TRAIN_DATASET')
This is taken from the TensorFlow Data Validation tutorial.

What is the dataset type in the TensorFlow Object Detection API?

I am trying to do object detection on my own dataset. I started my first machine learning program from the Google TensorFlow Object Detection API; the link is here: eager_few_shot_od_training_tf2_colab.ipynb
In the colab tutorial, the author uses JavaScript to label the images, and the result looks like this:
gt_boxes = [
    np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
    np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
    np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
    np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
    np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)
]
When I run my own program, I use labelImg instead of the JavaScript labelling tool, but the resulting dataset is not compatible.
Now I have two questions. First, what is the dataset type in the colab tutorial: coco, yolo, voc, or something else? Second, how do I convert labelImg data into the format used by the colab tutorial? My goal is to label data with labelImg and then substitute it into the colab tutorial.
The "data type" is just a set of ratios relative to the height and width of the image: each coordinate says where the bounding box starts or ends as a fraction of the image size. Each image is preprocessed before it is fed into the model, so its dimensions change (batch, height, width, channel); expressing the box as ratios keeps it correct even when the image is resized from its original size.
For example, the model expects 640x640 images, so an 800x600 image has to be resized. If the model returned the pixel coordinates [100, 100, 150, 150] for the 640x640 input, they would clearly not mark the same region in the 800x600 original.
To get this format with labelImg, save the annotations as PascalVOC. PascalVOC stores pixel coordinates, so divide them by the image width and height to obtain the ratios.
The typical way to do this is to create TFRecord files and decode them in your training script in order to create datasets. However, you are free to build the TensorFlow dataset in whatever way you like in order to train your model.
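The conversion from a labelImg PascalVOC file to ratio boxes can be sketched as follows. This assumes the colab follows the Object Detection API's usual [ymin, xmin, ymax, xmax] order, and the XML string is a made-up example annotation rather than real data.

```python
import xml.etree.ElementTree as ET

def voc_to_normalized_boxes(xml_text):
    """Return [[ymin, xmin, ymax, xmax], ...] with every value in [0, 1]."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        xmin = float(bb.find("xmin").text)
        ymin = float(bb.find("ymin").text)
        xmax = float(bb.find("xmax").text)
        ymax = float(bb.find("ymax").text)
        # Divide pixel coordinates by the image size to get ratios.
        boxes.append([ymin / height, xmin / width, ymax / height, xmax / width])
    return boxes

# Made-up annotation in the PascalVOC layout that labelImg writes.
EXAMPLE_XML = """<annotation>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>rubber_ducky</name>
    <bndbox><xmin>200</xmin><ymin>150</ymin><xmax>600</xmax><ymax>450</ymax></bndbox>
  </object>
</annotation>"""

print(voc_to_normalized_boxes(EXAMPLE_XML))  # [[0.25, 0.25, 0.75, 0.75]]
```

Each image's list can then be wrapped as np.array(boxes, dtype=np.float32) to match the gt_boxes entries used in the colab.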
Hope this answered your questions.

Image Classification with single class dataset using Transfer Learning [closed]

I only have around 1000 images of computers. I need to train a model that can identify if the image is computer or not-computer. I do not have a dataset for not-computer, as it could be anything.
I guess the best method for this would be to apply transfer learning. I am trying to train on a pre-trained VGG19 model, but I still don't know how to train a model with only computer images and no non-computer images.
I am new to ML overall, so sorry if the question is not to the point.
No way, I'm sorry. You'll need a lot of non-computer images (at least another 1000). You can take them from anywhere; the more they vary, the easier it is for your model to extract the features that characterize a computer.
Imagine a baby that is trained to always say "yes" in front of something: the next time it sees something, it will say "yes" no matter what is in front of it...
The same goes for machine learning models: you need positive examples and negative examples, or your model will reach 100% accuracy by always predicting "yes".
If you want to see it mathematically/geometrically, you can treat each sample (in your case an image) as a point in the feature space: imagine drawing an axis for each attribute you have (x, y, z and so on); an image is then a point in that space.
For simplicity let's consider a 2-dimensional space, which means that each image can be described with 2 attributes (not the case for real images, which usually have many features, but for simplicity imagine feature_1 = number of colors, feature_2 = number of angles). In this example we can simply draw one point per image in a cartesian graph:
The objective of a classifier is to draw the line that best separates the red dots from the blue dots, that is, the positive examples from the negative examples.
If you give the model only positive samples (which is what you were going to do), you'll have infinitely many models with 100% accuracy! You can put the line wherever you want; the only requirement is not to "cut" your dataset.
Given that I suppose you are a beginner, I'll just tell you what to do, not how, because that would take years ;)
1) Collect data - as I told you, including negative examples, at least another 1000 samples
2) Split the data into train/test - a good split could be 2/3 of the samples in the training set and 1/3 in the test set. [REMEMBER] Keep the class distribution consistent, i.e. if you had a 50%-50% split of the classes "Computer"-"Non computer", you should keep that percentage in both the train set and the test set
3) Train a model - have a look at this link for a guided example; it uses the MNIST dataset, a famous image classification dataset, but you should use your own data
4) Test the model on the test set and look at its performance
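Step 2 above can be sketched in plain Python. This is only an illustration: the function name, the 2/3 fraction default and the integer stand-ins for images are assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, train_fraction=2 / 3, seed=0):
    """Split into train/test while keeping each class's share the same."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train += [(s, label) for s in items[:cut]]
        test += [(s, label) for s in items[cut:]]
    return train, test

# Integers stand in for image paths in this toy example.
samples = list(range(60))
labels = ["computer"] * 30 + ["non_computer"] * 30
train, test = stratified_split(samples, labels)
print(len(train), len(test))  # 40 20
```

Because the split is done per class, a 50%-50% dataset stays 50%-50% in both the train set and the test set.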
While it is not impossible to take data belonging to only one class and then use methods to classify whether other data belong to the same class or not, you usually do not end up with very good accuracy that way.
One way to do this is to use something called "autoencoders". The point here is that you use the same image as the input and as the target, and you make sure that the model (usually a neural network) is forced to compress the image in some way, so that it only stores what is important for recreating images of computers. Ideally, this leads to a model which is good at recreating images of computers and bad at everything else, meaning you can check how high the loss is on the output, and if it is higher than some threshold you've decided on, you deem the image to be something else. Again, you're probably not going to get anything close to 90% accuracy doing this, but it is an approach to your problem.
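The thresholding step can be sketched without any deep-learning library. The autoencoder itself is left out; the sketch assumes you already have reconstruction errors for known computer images, and picking the threshold as a quantile of those errors is an illustrative choice, not the only option.

```python
def mean_squared_error(original, reconstruction):
    """Reconstruction loss between two flattened images of equal length."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstruction)) / len(original)

def pick_threshold(train_errors, quantile=0.95):
    """Pick the error at the given quantile position of the known computer images."""
    ordered = sorted(train_errors)
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]

def is_computer(error, threshold):
    """Low reconstruction error means the image resembles the training class."""
    return error <= threshold

# Toy errors standing in for the autoencoder's losses on training images.
threshold = pick_threshold([0.4, 0.1, 0.3, 0.2], quantile=0.5)
print(threshold, is_computer(0.2, threshold), is_computer(0.5, threshold))
```

Lowering the quantile makes the detector stricter (fewer false "computer" answers, more missed computers), so the value should be tuned against whatever error matters most for the application.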
A better approach is to go hunting for models which have been pre-trained on a dataset that includes computers. Take the same dataset and assign all computers to one class (plus your own images; make sure they adhere to the dataset format) and a selection of the other images to the other class. Make sure not to make the classes too unbalanced, otherwise your model will suffer from it. Extend the pre-trained model with a couple of layers (fully connected should probably do fine) and make the pre-trained part of the model non-trainable, so you don't mess up the good weights there when you're practically telling it to ignore everything which is not a computer.
This is probably your best bet, but it is going to require a bit more effort on your side, both to find all the parts you need to make it happen and to understand how to integrate that code into yours.
You can either use transfer learning with a model pretrained on the imagenet dataset, or collect negative images yourself. As mentioned in another answer, imagenet contains a bunch of classes close to computers and electronic devices (such as monitors, CD players, laptops, speakers, etc.), so you can fine-tune the model on your dataset and train it to predict computers (train on around 750 images and test on the remaining 250).
Alternatively, you can manually collect images of objects other than computers, preferably a lot of electronic devices (because they are close to computers) and a bunch of other household things (there is a home objects dataset by Caltech). You should collect about 1000 such images to keep the classes balanced. You can train your own custom model once you have this dataset.
No problem!
Step one: install a deep-learning toolkit of your choice. They all come with nice tutorials these days.
Step two: grab a pre-trained imagenet model. That model already has a few computer classes built into it ("desktop_computer", "laptop", "notebook", and another class for hand-held computers, "hand-held_computer").
Step three: use the model to predict. For this, your images need to be the correct size.
More steps: further fine-tune the model. This is a bit more advanced but will give you some gains.
Something to think about is your goal: accuracy? False positives/negatives? It's always good to know from the start what you need to accomplish.
EDIT: probably the easiest way to get started(if you don't have libraries, gpu, etc) is to go to google colab ( https://colab.research.google.com/notebooks/welcome.ipynb ) and make a notebook in your browser and run the following code.
# some code taken and modified from https://www.learnopencv.com/keras-tutorial-using-pre-trained-imagenet-models/
import keras
import numpy as np
from keras.applications import vgg16
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.imagenet_utils import decode_predictions
import matplotlib.pyplot as plt
from PIL import Image
import requests
from io import BytesIO
%matplotlib inline

vgg_model = vgg16.VGG16(weights='imagenet')

def predict_image(image_url, model):
    response = requests.get(image_url)
    # force 3 channels so that grayscale or RGBA downloads still fit the network
    original = Image.open(BytesIO(response.content)).convert('RGB')
    newsize = (224, 224)
    original = original.resize(newsize)
    # convert the PIL image to a numpy array
    # In PIL - image is in (width, height, channel)
    # In Numpy - image is in (height, width, channel)
    numpy_image = img_to_array(original)
    # Convert the image / images into batch format
    # expand_dims will add an extra dimension to the data at a particular axis
    # We want the input matrix to the network to be of the form (batchsize, height, width, channels)
    # Thus we add the extra dimension to the axis 0.
    image_batch = np.expand_dims(numpy_image, axis=0)
    plt.imshow(np.uint8(image_batch[0]))
    plt.show()
    # prepare the image for the VGG model
    processed_image = vgg16.preprocess_input(image_batch.copy())
    # get the predicted probabilities for each class
    predictions = model.predict(processed_image)
    # convert the probabilities to class labels
    # We will get top 5 predictions which is the default
    label = decode_predictions(predictions)
    print(label[0][0:2])  # just display top 2

urls = ['https://4.imimg.com/data4/CO/YS/MY-29352968/samsung-desktop-computer-500x500.jpg', 'https://cdn.britannica.com/77/170477-050-1C747EE3/Laptop-computer.jpg']
for u in urls:
    predict_image(u, vgg_model)
This should be a good starting point. Oh, and if the top predicted label is not in the set of computer labels (desktop computer, laptop, etc.), then it's NOT a computer!
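That final rule can be written as a small helper. It assumes the input is one image's entry from decode_predictions, i.e. a list of (class_id, class_name, score) tuples sorted by score; the function name and the `top_k` parameter are made up for this sketch, and the class ids in the example are placeholders.

```python
# Class names quoted in the answer above.
COMPUTER_LABELS = {"desktop_computer", "laptop", "notebook", "hand-held_computer"}

def looks_like_computer(decoded, top_k=1, computer_labels=COMPUTER_LABELS):
    """True if any of the top_k predicted class names is a computer class."""
    return any(name in computer_labels for _, name, _ in decoded[:top_k])

# Placeholder predictions in the (class_id, class_name, score) layout.
laptop_like = [("id_a", "laptop", 0.81), ("id_b", "notebook", 0.12)]
cat_like = [("id_c", "tabby", 0.70), ("id_d", "laptop", 0.05)]

print(looks_like_computer(laptop_like))        # True
print(looks_like_computer(cat_like))           # False
print(looks_like_computer(cat_like, top_k=2))  # True
```

Inside predict_image this would be called as looks_like_computer(label[0]). Raising `top_k` trades more detected computers for more false positives.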

How to save feature values of all batch data from pretrained torch networks?

I'm using the fb torch library from GitHub: fb torch resnet
It's my first time using torch and lua, so I'm running into some problems.
My goal is to save the feature vector of a specific layer (the last average pooling of resnet) into a file together with the class of the input image. All input images are from the cifar-10 db.
The file format that I want is as follows:
image1.txt := class index of image and feature vector of image 1 of cifar-10
image2.txt := class index of image and feature vector of image 2 of cifar-10
// and so on through all images of cifar-10
I have seen some sample code in that GitHub repo: extract-features.lua
Because it's my first time with lua, I find it hard to understand this code and to modify it the way I want. Also, I don't want to save my data in the t7 file format.
How can I access only one specific layer of the network in torch via lua? (the last average pooling)
How can I access the values of the layer and the classification result index?
How can I read each image from the cifar-10 db file (t7 batch)?
Sorry for so many questions, but I find torch hard to use because there are so few community threads and posts about it. Please bear with me.
How can I access only one specific layer of the network in torch via lua? (the last average pooling)
To access a layer you just have to load the model and get the layer by its integer position. If you run print(model) you will be able to see at which position the last average pooling is.
model = torch.load(path_to_model):cuda()
avg_pooling_layer = model:get(position_of_the_avg_pooling_layer)
How can I access the values of the layer and the classification result index?
I do not quite understand what you mean by this. If you want to see the output or the weights of a specific layer (following the code above), you need to get these elements from the layer table. Again, to see which elements are available, use print(avg_pooling_layer).
weights = avg_pooling_layer.weight -- weights of the layer (nil for layers without parameters, such as pooling)
output = avg_pooling_layer.output -- output of the layer, filled in by the last forward pass
How can I read each image from the cifar-10 db file (t7 batch)?
To read the images from a t7 file, use the torch function torch.load (used above to load the model).
cifar_10 = torch.load("path_to_cifar-10.t7")
Once loaded, the training and test sets may be available as subtables or through functions. Again, print the table to see which values are the ones you need.
Hope this helps!

Applying Multi-label Transformation in Rapidminer?

I am working on text categorization in RapidMiner and need to implement a problem transformation method that converts a multi-label data set into a single-label one (e.g. Label Power set), but I couldn't find one in RapidMiner. I am sure I am missing something, or maybe RapidMiner provides these methods under another name?
1) I searched and found the "Polynomial By Binomial" operator for RapidMiner, which I think uses Binary Relevance internally for problem transformation, but how can I apply others, e.g. Label Power set or Classifier Chains?
2) Secondly, the SVM (learner) inside the "Polynomial By Binomial" operator is applied K (number of classes) times and the K models are combined into a single model, but that model would still classify a multi-label example as a single-label example. How can I get the multiple labels associated with an example?
3) Do I have to store each model generated inside "Polynomial By Binomial" and then apply each one to the testing data to find out the multiple labels associated with an example?
I am new to RapidMiner, so please excuse my mistakes.
Thanks in advance ...
"Polynomial By Binomial" is not the way you want to go.
This operator performs something like one-vs-all (XvsAll). It lets you solve multiclass problems with a learner that is only capable of binomial classification.
For your problem:
Would it help to transform your table like this?
before:
ID Label
1 A|B|C
2 B|C
to
ID Label
1 A
2 B
3 C
4 B
5 C
The tricky part is how to calculate the performance. But I think once that is clear, a combination of Recall/Remember/Remove Duplicates and Join will do it.
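Outside RapidMiner, the table transformation above and the Label Power set idea from the question can be sketched in a few lines of Python. The function names are made up for the example, and unlike the table above the sketch keeps the original ID on every row so the labels can be regrouped per example afterwards.

```python
from collections import defaultdict

def explode_labels(rows, sep="|"):
    """Turn each (id, 'A|B|C') row into one (id, label) row per label."""
    return [(ex_id, label) for ex_id, labels in rows for label in labels.split(sep)]

def label_powerset(rows, sep="|"):
    """Treat every distinct label combination as one single-label class."""
    return [(ex_id, "+".join(sorted(labels.split(sep)))) for ex_id, labels in rows]

def collect_labels(pairs):
    """Regroup (id, label) rows into the set of labels for each example."""
    grouped = defaultdict(set)
    for ex_id, label in pairs:
        grouped[ex_id].add(label)
    return dict(grouped)

rows = [(1, "A|B|C"), (2, "B|C")]
print(explode_labels(rows))  # [(1, 'A'), (1, 'B'), (1, 'C'), (2, 'B'), (2, 'C')]
print(label_powerset(rows))  # [(1, 'A+B+C'), (2, 'B+C')]
print(collect_labels(explode_labels(rows)))
```

Regrouping by ID is also what makes the performance calculation tractable: the predicted label set of each example can be compared with its true label set.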
