I want to build an image classifier. I gathered images from the web and resized them using the PIL library.
Now I want to convert those images into network input. What operations do I need to perform on these images?
I also converted the images into numpy arrays and stored them in a list named features. What should I do next?
Well, there are a number of decisions to make. One is to partition your images into a training set, a validation set and generally also a test set. I typically use 10% of the images as a validation set and 10% as a test set. Next you need to decide how you want to provide your images to the network. My preference is to use the Keras ImageDataGenerator.flow_from_directory. This requires you to create 3 directories to store the images. I put the test images in a directory called 'test', the validation images in a directory called 'valid' and the training images in a directory called 'train'. Within each of these directories you need to create identically named class directories. For example, if you are trying to classify images of dogs and cats, you would create a 'dogs' subdirectory and a 'cats' subdirectory within the test, train and valid directories. Be sure to name them identically, because the names of the subdirectories determine the names of your classes. Now populate the class directories with your images; these can be in standard formats like jpg.
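If you would rather do the partitioning in code, here is a minimal sketch, assuming your downloaded images start out in a flat per-class layout (the 'all_images' source directory and the 80/10/10 split are assumptions):

import os, random, shutil

# Assumed starting layout: all_images/<class_name>/*.jpg ('all_images' is a hypothetical name)
source_dir = 'all_images'
random.seed(123)
for class_name in os.listdir(source_dir):
    files = os.listdir(os.path.join(source_dir, class_name))
    random.shuffle(files)
    n = len(files)
    splits = {'train': files[:int(.8 * n)],
              'valid': files[int(.8 * n):int(.9 * n)],
              'test':  files[int(.9 * n):]}
    for split_name, split_files in splits.items():
        dest = os.path.join(split_name, class_name)
        os.makedirs(dest, exist_ok=True)
        for f in split_files:
            shutil.copy(os.path.join(source_dir, class_name, f), dest)

With the directories populated, create 3 generators: a train generator, a validation generator and a test generator, as in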
train_gen=ImageDataGenerator(preprocessing_function=pre_process).flow_from_directory('train', target_size=(height, width), batch_size=train_batch_size, seed=rand_seed, class_mode='categorical', color_mode='rgb')
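The validation and test generators follow the same pattern. A sketch (the batch-size variable names are assumptions; shuffle=False on the test generator keeps predictions in file order, and the import also covers train_gen above):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

val_gen=ImageDataGenerator(preprocessing_function=pre_process).flow_from_directory('valid', target_size=(height, width), batch_size=val_batch_size, seed=rand_seed, class_mode='categorical', color_mode='rgb')
test_gen=ImageDataGenerator(preprocessing_function=pre_process).flow_from_directory('test', target_size=(height, width), batch_size=test_batch_size, class_mode='categorical', color_mode='rgb', shuffle=False)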
Documentation for the ImageDataGenerator and flow_from_directory is here. Now you have your images stored and the data generators set up to provide data to your model in batches based on the batch size. So now we can get to actually building a model. You can build your own model, however there are excellent models for image processing available for you to use; this is called transfer learning. I like to use a model called MobileNet. I prefer it because it has a small number of trainable parameters (about 4 million) versus other models which have tens of millions. Keras has this and many other image processing models. Documentation is here. Now you have to modify the final layer of the model to adapt it to your application. MobileNet was trained on the ImageNet data set, which had 1000 classes. You need to remove this last layer and replace it with a dense layer having as many nodes as you have classes, using the softmax activation function. An example for the case of 2 classes is shown below.
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Load MobileNet without its 1000-class ImageNet top; pooling='avg'
# leaves a flat feature vector we can attach our own classifier to.
mobile = tf.keras.applications.mobilenet.MobileNet(include_top=False,
                                                   input_shape=(height, width, 3),
                                                   pooling='avg', weights='imagenet',
                                                   alpha=1, depth_multiplier=1)
x = mobile.output
predictions = Dense(2, activation='softmax')(x)  # 2 nodes = 2 classes
model = Model(inputs=mobile.input, outputs=predictions)
for layer in model.layers:
    layer.trainable = True  # fine-tune all layers
model.compile(Adam(learning_rate=.001), loss='categorical_crossentropy', metrics=['accuracy'])
The last line of code compiles your model using the Adam optimizer with a learning rate of .001. Now we can finally get to training the model. I use model.fit_generator as shown below:
data = model.fit_generator(generator=train_gen, validation_data=val_gen,
                           epochs=epochs, initial_epoch=start_epoch,
                           callbacks=callbacks, verbose=1)
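The callbacks argument above is a list you define yourself before calling fit_generator; a minimal sketch (the checkpoint file name is an assumption) might be:

from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

callbacks = [ModelCheckpoint('best_weights.h5', monitor='val_loss', save_best_only=True),
             EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)]

This saves the best-performing weights and stops training when the validation loss stops improving.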
Documentation for the above is here. The model will train on your training set and validate on the validation set. For each epoch (training cycle) you will get a printout of the training loss, training accuracy, validation loss and validation accuracy so you can monitor how your model is performing. The final step is to run your test set to see how well your model performs on data it was not trained on. To do that use the code below:
results = model.evaluate(test_gen, verbose=0)
print('Model accuracy on Test Set is {0:7.2f} %'.format(results[1] * 100))
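If you also want per-image predictions rather than just the aggregate accuracy, a short sketch using the same test generator (with a recent tf.keras, model.predict accepts the generator directly; this relies on the test generator not shuffling, as set up above):

import numpy as np

preds = model.predict(test_gen, verbose=0)         # one softmax vector per image
predicted_classes = np.argmax(preds, axis=1)       # most probable class index
class_names = list(test_gen.class_indices.keys())  # e.g. ['cats', 'dogs']
print([class_names[i] for i in predicted_classes[:5]])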
That's about it, but of course there are a lot of details to fill in. If you are new to convolutional neural networks and machine learning I would recommend an excellent tutorial on YouTube here. There are about 20 sequential tutorials in the playlist. I used this tutorial as a beginner and found it excellent. It will cover all the topics you need to become skilled at using CNN classifiers. Good luck!
Related
I am using the Tensorflow Object Detection API to train a couple of models (with SSD and Faster RCNN) on a custom dataset. Everything works well, but I want to know how to extract the convolutional and classification model weights, in order to load those weights into an external (for instance Keras) convolutional and fully connected corresponding model. I've read about the meta architectures (SSDMetaArch and FasterRCNNMetaArch) and restoring checkpoints, but I am not sure yet how to do it for my purpose.
The above is because I want to use something like CAM or Grad-CAM to visually check what the model learns for every class in my dataset.
Thank you
I am quite new to nvidia-tlt. Currently, I have trained, pruned and retrained a model with the KITTI dataset, and I am also able to do these steps on any dataset in the required KITTI format. What I want to do is use a model previously trained on KITTI and fine-tune it on my own data. The config file has the options pretrained_model_path, resume_model_path and pruned_model_path, so there is no option for fine-tuning in the config. If I try to use pretrained_model_path, it throws an exception for the shape:
Invalid argument: Incompatible shapes: [6,29484,3] vs. [6,29484,12]
That error is expected.
Technically, the pretrained model that we download from NGC comes without the final layer, which represents the total number of classes and their respective bboxes.
Once you train that model on any dataset, the trained model is frozen with that top layer. Now, if you want to fine-tune the same model with a different number of classes, you will get an error about incompatible shapes.
You need to train the model on the new dataset from the beginning.
If you want to fine-tune the model on a different dataset but with the same classes, then you can use the previously trained model.
This may sound like a naive question, but I am quite new to this. Let's say I use the Google pre-trained word2vec model (https://github.com/dav/word2vec) to train a classification model, and then I save my classification model. Now I load the classification model back into memory for testing new instances. Do I need to load the Google word2vec model again? Or is it only used for training my model?
It depends on how your corpuses and test examples are structured and pre-processed.
You are probably using the pre-trained word-vectors to turn text into numerical features. At first, text examples are vectorized to train the classifier. Later, other (test/production) text examples will be vectorized in the same way, and presented to the classifier to get its judgements.
So you will need to use the same text-to-vectors process for test/production text examples as was used during training. Perhaps you've done that in a separate earlier bulk step, in which case you already have the features in the vector form the classifier uses. But often your classifier pipeline will itself take raw text, and vectorize it – in which case it will need the same pre-trained (word)->(vector) mappings available at test time as were available during training.
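To make that concrete, here is a minimal sketch of a shared text-to-vector step, assuming gensim and simple word-vector averaging (the vector file name and the averaging choice are illustrative assumptions, not something from your setup):

import numpy as np
from gensim.models import KeyedVectors

# Load the pre-trained vectors once; the SAME mapping must be available
# at training time and at test/production time.
wv = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

def vectorize(text):
    # Average the vectors of in-vocabulary words (a common, simple choice).
    words = [w for w in text.lower().split() if w in wv]
    return np.mean([wv[w] for w in words], axis=0) if words else np.zeros(wv.vector_size)

X_train = np.vstack([vectorize(t) for t in train_texts])  # train_texts: your training corpus
X_test = np.vstack([vectorize(t) for t in test_texts])    # identical transform at test time

So yes: if your saved classifier expects these vectors as input, the word2vec model (or the vectors it produced) must be available again at test time.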
Deep Learning has been applied successfully on several large data sets for the classification of a handful of classes (cats, dogs, cars, planes, etc), with performances beating simpler descriptors like Bags of Features over SIFT, color histograms, etc.
Nevertheless, training such a network requires a lot of data per class and a lot of training time. However, very often one doesn't have enough data, or just wants to get an idea of how well a convolutional neural network might do before spending time on designing and training such a device and gathering the training data.
In this particular case, it might be ideal to have a network configured and trained using some benchmark data set used by the state of the art publications, and to simply apply it to some data set that you might have as a feature extractor.
This results in a set of features for each image, which one could feed to a classical classification method like SVMs, logistic regression, neural networks, etc.
In particular, when one does not have enough data to train the CNN, I would expect this to outperform a pipeline where the CNN was trained on just a few samples.
I was looking at the tensorflow tutorials, but they always seem to have a clear training / testing phase. I couldn't find a pickle file (or similar) with a pre-configured CNN feature extractor.
My questions are: do such pre-trained networks exist, and where can I find them? Alternatively: does this approach make sense? Where could I find a CNN+weights?
EDIT
W.r.t. @john's comment I tried using 'DecodeJpeg:0' and 'DecodeJpeg/contents:0' and checked the outputs, which are different (:S)
import cv2, requests, numpy
import tensorflow.python.platform
import tensorflow as tf

# Fetch a test image and decode it with OpenCV.
response = requests.get('https://i.stack.imgur.com/LIW6C.jpg?s=328&g=1')
data = numpy.asarray(bytearray(response.content), dtype=numpy.uint8)
image = cv2.imdecode(data, -1)

# Re-encode the decoded image back into a jpeg byte string.
compression_worked, jpeg_data = cv2.imencode('.jpeg', image)
if not compression_worked:
    raise Exception("Failure when compressing image to jpeg format in opencv library")
jpeg_data = jpeg_data.tostring()

# Load the pre-trained Inception-v3 graph definition.
with open('./deep_learning_models/inception-v3/classify_image_graph_def.pb', 'rb') as graph_file:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(graph_file.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name('pool_3:0')

    # Feed the decoded numpy array directly.
    arr0 = numpy.squeeze(sess.run(
        softmax_tensor,
        {'DecodeJpeg:0': image}
    ))

    # Feed the raw jpeg byte string instead.
    arr1 = numpy.squeeze(sess.run(
        softmax_tensor,
        {'DecodeJpeg/contents:0': jpeg_data}
    ))

    print(numpy.abs(arr0 - arr1).max())
So the max absolute difference is 1.27649, and in general all the elements differ (especially since the average values of arr0 and arr1 themselves lie between 0 and 0.5).
I would also expect that 'DecodeJpeg:0' needs a jpeg string, not a numpy array; why else does the name contain 'Jpeg'? @john: Could you state how sure you are about your comment?
So I guess I'm not sure what is what, as I would expect a trained neural network to be deterministic (though perhaps chaotic).
The TensorFlow team recently released a deep CNN trained on the ImageNet dataset. You can download the script that fetches the data (including the model graph and the trained weights) from here. The associated Image Recognition tutorial has more details about the model.
While the current model isn't specifically packaged to be used in a subsequent training step, you could explore modifying the script to reuse parts of the model and the trained weights in your own network.
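For instance, once you have extracted pool_3 features as in the snippet in your question, the feature-extractor-plus-classical-classifier idea could look like this rough sketch (scikit-learn SVM; the feature and label variable names are hypothetical):

import numpy as np
from sklearn.svm import SVC

# Suppose each image was run through the graph to get its 2048-d pool_3
# vector, collected into lists (hypothetical names).
X_train = np.vstack(train_feature_vectors)  # shape (n_train, 2048)
X_test = np.vstack(test_feature_vectors)    # shape (n_test, 2048)

clf = SVC(kernel='linear')
clf.fit(X_train, train_labels)
print('Test accuracy:', clf.score(X_test, test_labels))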
I am working on a project about the feedforward pathway of the ventral stream, and I have 6 images to be recognized at the InferoTemporal layer.
Can someone please give me example images showing the difference between training images and test images? What should I add to the folder that contains my training images? Should I add another folder that contains a list of test images? If yes, what should these test images be?
Must the training images contain the images to be analysed or recognized, and the test images contain the images in memory? In other words, if we have for example 16 training faces and one or two test faces, should we analyse which face in the training set corresponds to the face in the test set? Is that true?
Note: I don't need code, I am only interested in a brief explanation of the difference between test and training images.
Any help would be very much appreciated.
The only difference between training and test images is the fact that test images are not used for selecting your model's parameters. Each model has some kind of parameters, variables, which it fits to the data; this is called the training process. The training/test set separation ensures that your model (algorithm) can actually do something more than just memorizing images, so you test it on test images, which have not been used during the training phase.
This has already been discussed in detail on SO: what is the difference between train, validation and test set, in neural networks?
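As a small illustration of that separation (using scikit-learn's helper; the 80/20 ratio is arbitrary):

from sklearn.model_selection import train_test_split

# images / labels: your full dataset (arrays of image data and class labels)
train_images, test_images, train_labels, test_labels = train_test_split(
    images, labels, test_size=0.2, random_state=42)
# Fit the model's parameters on train_images only; report performance on test_images.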
In HMAX, you use all the data at the input image layer, and apply Gabor filters, max-pooling and radial basis kernel functions to all of it. Only at the C2 layer do you start to train on a subset of the images (mostly with a linear-kernel SVM). That subset is the training data, and the rest are the test data. In short, the training images are first used to build the SVM, and then the test images are assigned to classes using the majority-voting method.
But this is in fact equivalent to putting the training images in at the image layer first, passing them through all the layers, and then putting the test images in at the image layer to restart the process for recognition. Since both training and test images need scaling, and all the operations at the layers prior to C2 are the same, you can just mix them all together at the beginning.
Although you use all the training and test images at the image layer, you still need to shuffle the data and pick some of them as the training set, and the others as the test set.
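A rough sketch of that final stage, assuming c2_features is the array of C2 responses you computed for all images (everything else is standard scikit-learn; the 70/30 split is arbitrary):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# c2_features: (n_images, n_features) C2 responses; labels: class per image (hypothetical names)
X_train, X_test, y_train, y_test = train_test_split(
    c2_features, labels, test_size=0.3, random_state=0, shuffle=True)

clf = SVC(kernel='linear')  # linear-kernel SVM, as is typical at the C2 stage
clf.fit(X_train, y_train)
print('Recognition accuracy on held-out images:', clf.score(X_test, y_test))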