Feeding image into tensorflow - machine-learning

I am new to using TensorFlow, so I was trying the MNIST tutorials in ML for beginners. The code runs just fine. But what if I want to input an image of my own, which has, say, a handwritten number on it, and see if it predicts what number it might be? How do I feed my own image into the TensorFlow program?

Assuming you're using this file.
If you look at x, the shape is [None, 784]. To feed your own image in, you'll have to store the image as a variable (loading it using PIL or OpenCV or something), flatten it, wrap it in a list, and pass it to the graph in the feed_dict. Assuming y is the tensor that holds the model's output (in the tutorial, y_ is the placeholder for the labels), it looks something like this:
sess.run(y, feed_dict={x: [np.ravel(image_you_loaded_in)]})
It will need to be a 28x28 image in order for this code to work without modification.
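The flattening step can be sketched with NumPy alone. This is a minimal sketch, not the tutorial's own code: the helper name to_mnist_vector and the synthetic stand-in image are assumptions made for illustration, and scaling to [0, 1] assumes the model was trained on inputs in that range, as the MNIST tutorial data is.

```python
import numpy as np

def to_mnist_vector(pixels):
    """Turn a 28x28 grayscale array (values 0-255) into a flat float32
    vector of length 784 with values in [0, 1]."""
    pixels = np.asarray(pixels, dtype=np.float32)
    if pixels.shape != (28, 28):
        raise ValueError("expected a 28x28 image, got %s" % (pixels.shape,))
    return (pixels / 255.0).reshape(784)

# Stand-in for an image loaded from disk. With Pillow installed, one could
# instead use: np.array(Image.open("digit.png").convert("L").resize((28, 28)))
# ("digit.png" is a hypothetical file name).
image_you_loaded_in = np.zeros((28, 28), dtype=np.uint8)
image_you_loaded_in[4:24, 13:15] = 255  # a crude vertical stroke, like a "1"

# Wrap the vector in a list so the batch has shape (1, 784), matching [None, 784].
batch = [to_mnist_vector(image_you_loaded_in)]
print(np.shape(batch))
```

MNIST digits are light strokes on a dark background, so a dark-on-white scan would probably need to be inverted (255 - pixels) before scaling.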

Related

what's dataset type in tensorflow object-detection api?

I am trying to do my own object detection using my own dataset. I started my first machine learning program from the Google TensorFlow Object Detection API; the link is here: eager_few_shot_od_training_tf2_colab.ipynb
In the Colab tutorial, the author uses JavaScript to label the images, and the result looks like this:
gt_boxes = [
np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)
]
When I run my own program, I use labelImg in place of the JavaScript tool, but the dataset is not compatible.
Now I have two questions. First, what is the dataset type in the Colab tutorial: COCO, YOLO, VOC, or something else? Second, how do I convert between labelImg data and the Colab tutorial data? My goal is to label data with labelImg and then substitute it into the Colab tutorial.
The "data type" is just ratio values based on the height and width of the image: the coordinates say where the bounding box starts and ends as a fraction of the image size. Since each image is going to be preprocessed, that is, its dimensions are changed when it is fed into the model (batch, height, width, channel), the bounding box coordinates must be expressed as ratios, because the image might change dimensions from its original size.
For example, the model expects images to be 640x640, so if you provide an image of 800x600 it has to be resized. Now if the model gave back the pixel coordinates [100,100,150,150] for a 640x640 image, clearly they would not mark the same region on the 800x600 image.
However, to get this data format you should use PascalVOC when using labelImg.
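A conversion from a labelImg PascalVOC file to ratio boxes might look like the sketch below. It assumes the tutorial's gt_boxes are ordered [ymin, xmin, ymax, xmax] as fractions of image height and width, which is the convention the Object Detection API generally uses but is worth checking against the notebook. The XML sample, the label name, and the function name voc_to_normalized_boxes are made up for illustration.

```python
import xml.etree.ElementTree as ET
import numpy as np

# Hypothetical labelImg output in PascalVOC format (pixel coordinates).
VOC_XML = """<annotation>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>rubber_ducky</name>
    <bndbox><xmin>200</xmin><ymin>150</ymin><xmax>400</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>"""

def voc_to_normalized_boxes(xml_text):
    """Return an (N, 4) float32 array of [ymin, xmin, ymax, xmax] ratios."""
    root = ET.fromstring(xml_text)
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bb.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # Divide y values by the height and x values by the width.
        boxes.append([ymin / height, xmin / width, ymax / height, xmax / width])
    return np.array(boxes, dtype=np.float32)

gt_box = voc_to_normalized_boxes(VOC_XML)
print(gt_box)
```

Each image then contributes one such array, which is the same shape as the entries of gt_boxes in the question.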
The typical way to do this is to create TFRecord files and decode them in your training script in order to create datasets. However, you are free to choose whatever method you like to build a TensorFlow dataset in order to train your model.
Hope this answered your questions.

Why doesn't model.predict() work well on novel MNIST-like input?

I'm an experienced developer, new to Machine Learning. I'm experimenting with Keras/TensorFlow, starting with the mnist_mlp.py example. I installed Keras and TensorFlow using pip on a Mac.
In order to understand the inner workings better, instead of running the file ('python mnist_mlp.py'), I'm cutting and pasting the file contents into a Python (2.7.12) interactive window.
Everything runs fine and I get the 98.4% test accuracy as noted in the comments of that file.
What I want to do next is to feed it novel input and use model.predict() to see how it performs. I create 28x28 images in GIMP and bring them into my Python session (being careful to convert from 4-channel, 8-bit RGBA images to a linear single-channel floating-point array).
When I feed this into the model, I get what look like strange results to me. Some images are correctly categorized while others are wildly off.
They look like perfectly reasonable numbers to me, and they match the MNIST set examples pretty closely. When I extract the array back out and look at it it looks OK, so it doesn't seem to be a flipping or flopping issue. When I feed MNIST images in the same way, they appear to work correctly.
I'm not sure what's going on here. Is it a case of overfitting? Why is the validation data set the same as the test set?
Test images and python code with instructions can be found here:
https://s3.amazonaws.com/stackoverflow-47799896/StackOverflow_47799896.zip
Thanks.
EDIT: I tried the same test with the convnet example (mnist_cnn.py) and got slightly better results but still similar errors. If anyone wants to try that, they can use the same functions in the readme.py file but make these changes:
import numpy as np
x = np.ndarray((1,28,28,1), dtype='float32')
def l (s):
    with open(s, 'rb') as fd:
        _ = fd.read(1)
        for i in xrange(28):
            for j in xrange(28):
                v = ord(fd.read(1))
                x[0][i][j][0] = v / 255.0
                _ = fd.read(3)
EDIT 2: Interestingly, if I replace the first 19 items in the training data set (out of 60,000) with my images in the MLP case, I get at or near perfect prediction of all my images after training. Does this suggest overfitting?

TensorFlow 1.2.1 and InceptionV3 to classify an image

I'm trying to create an example using the Keras built into the latest version of TensorFlow from Google. This example should be able to classify a classic image of an elephant. The code looks like this:
# Import a few libraries for use later
from PIL import Image as IMG
from tensorflow.contrib.keras.python.keras.preprocessing import image
from tensorflow.contrib.keras.python.keras.applications.inception_v3 import InceptionV3
from tensorflow.contrib.keras.python.keras.applications.inception_v3 import preprocess_input, decode_predictions
# Get a copy of the Inception model
print('Loading Inception V3...\n')
model = InceptionV3(weights='imagenet', include_top=True)
print ('Inception V3 loaded\n')
# Read the elephant JPG
elephant_img = IMG.open('elephant.jpg')
# Convert the elephant to an array
elephant = image.img_to_array(elephant_img)
elephant = preprocess_input(elephant)
elephant_preds = model.predict(elephant)
print ('Predictions: ', decode_predictions(elephant_preds))
Unfortunately I'm getting an error when trying to evaluate the model with model.predict:
ValueError: Error when checking : expected input_1 to have 4 dimensions, but got array with shape (299, 299, 3)
This code is taken from and based on the excellent example coremltools-keras-inception and will be expanded more when it is figured out.
The error occurred because the model always expects a batch of examples, not a single example. This diverges from the common understanding of models as mathematical functions of their inputs. The reasons why the model expects batches are:
Models are computationally designed to work faster on batches in order to speed up training.
There are algorithms which take into account the batch nature of the input (e.g. Batch Normalization or GAN training tricks).
So the four dimensions come from a first dimension, which is the sample / batch dimension, followed by the 3 image dimensions.
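The difference between a single image and a batch can be seen from array shapes alone. The sketch below uses zero-filled NumPy arrays as stand-ins for real images (an assumption made so that it runs without Keras), with the 299x299x3 size taken from the error message.

```python
import numpy as np

# A single image: height x width x channels.
single_image = np.zeros((299, 299, 3), dtype=np.float32)

# Adding a leading axis turns it into a batch containing one image.
batch_of_one = np.expand_dims(single_image, axis=0)

# Several images of the same size can be stacked into one batch.
batch_of_three = np.stack([single_image, single_image, single_image])

print(single_image.shape, batch_of_one.shape, batch_of_three.shape)
```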
Actually I found the answer. Even though the documentation describes the input shape of a single image when the top layer is included, the model's input is still set to take a batch of images. Thus we need to add this (with numpy imported) before the code line for the prediction:
elephant = numpy.expand_dims(elephant, axis=0)
Then the tensor is in the right shape and everything works correctly. I am still uncertain why the documentation states that the input vector should be (3x299x299) or (299x299x3) when it clearly wants 4 dimensions.
Be careful!

Image data augmentation techniques using keras.preprocessing.image.ImageDataGenerator?

I would like to generate augmented data for images by random rotation, shifts, shear and flips.
I have found this Keras function: keras.preprocessing.image.ImageDataGenerator. But I've only seen it being used to directly train networks.
Is there a way to input images and then save the transformed images on HDD, instead of how it currently works in the examples in this link?
Or is there another simple plug-and-use Python package I can use instead of implementing everything with numpy or opencv?
Basically, this is a generator which returns batches of images indefinitely. One could do the following:
def save_images_from_generator(maximal_nb_of_images, generator):
    nb_of_images_processed = 0
    for x, _ in generator:
        nb_of_images_processed += x.shape[0]
        if nb_of_images_processed <= maximal_nb_of_images:
            for image_nb in range(x.shape[0]):
                your_custom_save(x[image_nb])  # your custom function for saving images
        else:
            break
to save images from a Keras image generator.
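The same stop-after-N pattern can be tried without Keras by substituting a stand-in generator. In this sketch, fake_batches is a hypothetical replacement for datagen.flow(...), and the images are collected in a list rather than written to disk. Unlike the loop above, it counts image by image, so it stops at exactly the requested number even when that number is not a multiple of the batch size.

```python
import numpy as np

def fake_batches(batch_size=4):
    """Hypothetical stand-in for datagen.flow(...): yields (images, labels) forever."""
    while True:
        yield np.random.rand(batch_size, 28, 28, 1), np.zeros(batch_size)

def take_images(generator, max_images):
    """Collect exactly max_images images from an endless batch generator."""
    collected = []
    for x, _ in generator:
        for img in x:
            if len(collected) == max_images:
                return collected
            collected.append(img)
    return collected

images = take_images(fake_batches(), max_images=10)
print(len(images))  # 10
```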
You can save the images output by ImageDataGenerator to HDD. One option is to use datagen.flow as follows:
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9, save_to_dir='images', save_prefix='aug', save_format='png'):
    break  # each batch drawn is written to save_to_dir; the generator never ends, so stop once you have enough
A second option is to manually loop over each image, load it, and apply a random transformation. Once you have instantiated your ImageDataGenerator, just call:
img_trans = datagen.random_transform(img)
Then, save the transformed image to HDD using PIL etc.
A third option is to manually loop over each image, load it, and apply a random transformation using a third party program. I recommend imgaug, found here.

How to save feature values of all batch data from pretrained torch networks?

I'm currently using the fb torch library from GitHub (fb torch resnet).
It's my first time using Torch and Lua, so I'm encountering some problems.
My goal is to save the feature vector of a specific layer (the last avg pooling of resnet) into one file together with the class of the input image. All input images are from the cifar-10 db.
The file format that I want to get is like below:
image1.txt := class index of image and feature vector of image 1 of cifar-10
image2.txt := class index of image and feature vector of image 2 of cifar-10
// and so on through all images of cifar-10
I have seen some sample code in that GitHub repository, extract-features.lua.
Because it's my first time with Lua, I find it hard to understand this code and to modify it the way I want. And I don't want my data saved in the t7 file format.
How can I access only one specific layer from a network in Torch via Lua? (the last average pooling)
How can I access the values of the layer and the classification result index?
How can I read all the images from the cifar-10 db file (t7 batch)?
Sorry for so many questions, but I'm finding Torch hard to use because of the poor amount of community threads and posts about Torch. Please bear with me.
How can I access only one specific layer from a network in Torch via Lua? (the last average pooling)
To access a layer, you just have to load the model and get the layer by its integer position. If you do print(model), you will be able to see in which position the last average pooling is.
model = torch.load(path_to_model):cuda()
avg_pooling_layer = model:get(position_of_the_avg_pooling_layer)
How can I access the values of the layer and the classification result index?
I do not quite understand what you mean by this. If you want to see the output or the weights of a specific layer (following the code above), you need to get these elements from the layer table. Again, to see which elements are available, use print(avg_pooling_layer).
weights = avg_pooling_layer.weight -- get the weights of the layer
output = avg_pooling_layer.output -- get the output of the layer
How can I read all the images from the cifar-10 db file (t7 batch)?
To read the images from a t7 file, use the torch function torch.load (used before to load the model).
cifar_10 = torch.load("path_to_cifar-10.t7")
Once loaded, the training and test sets may be stored in subtables or exposed through functions. Again, print the table to see which values are the ones you need.
Hope this helps!
