Remove digit from MNIST, PyTorch - machine-learning

I'm experimenting with rotating the MNIST digits. Because a 9 is more or less a rotated 6, I'd like to remove all occurrences from the dataset.
As per this answer, I tried
dataset = datasets.MNIST(root='./data')
idx = dataset.train_labels!=9
dataset.train_labels = dataset.train_labels[idx]
dataset.train_data = dataset.train_data[idx]
which fails because the properties of the MNIST class are only readable.
I'd really like to not have to manually iterate through the entire dataset, and create a list of tuples that I feed to my dataloaders. Am I out of luck?

You might proceed as follows, namely by replacing train_labels with targets and train_data with data:
from torchvision import datasets
dataset = datasets.MNIST(root='data')
idx = dataset.targets!=9
dataset.targets = dataset.targets[idx]
dataset.data = dataset.data[idx]
Indeed, as you can see at https://pytorch.org/vision/stable/_modules/torchvision/datasets/mnist.html#MNIST, train_labels and train_data have eventually been marked as properties and as such they can't be set to some values, while targets and data have been probably added as public attributes in the meanwhile.
#property
def train_labels(self):
warnings.warn("train_labels has been renamed targets")
return self.targets
#property
def train_data(self):
warnings.warn("train_data has been renamed data")
return self.data

Related

How to use tf.image.resize_with_pad but pad with ones instead of zeros?

According to Tensorflow documentation, the padding is always with zeros instead of ones.
Is there there a way to change the padding to ones?
If not, what is the best alternative for a tensorflow dataset?
Here is my code example:
def resize_with_pad(image, label):
image = tf.image.resize_with_pad(image=image,
target_height=resized_wh,
target_width=resized_wh,
method=ResizeMethod.BILINEAR,
antialias=False)
return image, label
def create_tf_dataset_pipeline(tf_dataset):
tf_dataset = tf_dataset.map(load_image, num_parallel_calls=AUTOTUNE)
tf_dataset = tf_dataset.map(normalize, num_parallel_calls=AUTOTUNE)
tf_dataset = tf_dataset.map(resize_with_pad, num_parallel_calls=AUTOTUNE)
tf_dataset = tf_dataset.batch(batch_size)
tf_dataset = tf_dataset.prefetch(AUTOTUNE)
return tf_dataset
train_data = tf.data.Dataset.from_tensor_slices((x_train_filepaths, y_train_class))
train_data = create_tf_dataset_pipeline(train_data)
I tried resizing and padding the images and saving it in a directory (i.e. frontloading the processing), but that is very inflexible as I need to create a new dataset every time I want to train a model on a different size. It would be much better if I could do it dynamically with tensor flow.

Is it possible to vectorize this calculation in numpy?

Can the following expression of numpy arrays be vectorized for speed-up?
k_lin1x = [2*k_lin[i]*k_lin[i+1]/(k_lin[i]+k_lin[i+1]) for i in range(len(k_lin)-1)]
Is it possible to vectorize this calculation in numpy?
x1 = k_lin
x2 = k_lin
s = len(k_lin)-1
np.roll(x2, -1) #do this do bring the column one position right
result1 = x2[:s]+x1[:s] #your divider. You add everything but the last element
result2 = x2[:s]*x1[:s] #your upper part
# in one line
result = 2*x2[:s]*x1[:s] / (x2[:s]+x1[:s])
You last column wont be added or taken into the calculations and you can do this by simply using np.roll to shift the columns. x2[0] = x1[1], x2[1] = x1[2].
This is barely a demo of how you should approach google numpy roll. Also instead of using s on x2 you can simply drop the last column since it's useless for the calculations.

How to split a model trained in keras?

I trained a model with 4 hidden layers and 2 dense layers and I have saved that model.
Now I want to load that model and want to split into two models, one with one hidden layers and another one with only dense layers.
I have splitted the model with hidden layer in the following way
model = load_model ("model.hdf5")
HL_model = Model(inputs=model.input, outputs=model.layers[7].output)
Here the model is loaded model, in the that 7th layer is my last hidden layer. I tried to split the dense in the like
DL_model = Model(inputs=model.layers[8].input, outputs=model.layers[-1].output)
and I am getting error
TypeError: Input layers to a `Model` must be `InputLayer` objects.
After splitting, the output of the HL_model will the input for the DL_model.
Can anyone help me to create a model with dense layer?
PS :
I have tried below code too
from keras.layers import Input
inputs = Input(shape=(9, 9, 32), tensor=model_1.layers[8].input)
model_3 = Model(inputs=inputs, outputs=model_1.layers[-1].output)
And getting error as
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("conv2d_1_input:0", shape=(?, 144, 144, 3), dtype=float32) at layer "conv2d_1_input". The following previous layers were accessed without issue: []
here (144, 144, 3) in the input image size of the model.
You need to specify a new Input layer first, then stack the remaining layers over it:
DL_input = Input(model.layers[8].input_shape[1:])
DL_model = DL_input
for layer in model.layers[8:]:
DL_model = layer(DL_model)
DL_model = Model(inputs=DL_input, outputs=DL_model)
A little more generic. You can use the following function to split a model
from keras.layers import Input
from keras.models import Model
def get_bottom_top_model(model, layer_name):
layer = model.get_layer(layer_name)
bottom_input = Input(model.input_shape[1:])
bottom_output = bottom_input
top_input = Input(layer.output_shape[1:])
top_output = top_input
bottom = True
for layer in model.layers:
if bottom:
bottom_output = layer(bottom_output)
else:
top_output = layer(top_output)
if layer.name == layer_name:
bottom = False
bottom_model = Model(bottom_input, bottom_output)
top_model = Model(top_input, top_output)
return bottom_model, top_model
bottom_model, top_model = get_bottom_top_model(model, "dense_1")
Layer_name is just the name of the layer that you want to split at.

How to use masking layer to mask input/output in LSTM autoencoders?

I am trying to use LSTM autoencoder to do sequence-to-sequence learning with variable lengths of sequences as inputs, using following code:
inputs = Input(shape=(None, input_dim))
masked_input = Masking(mask_value=0.0, input_shape=(None,input_dim))(inputs)
encoded = LSTM(latent_dim)(masked_input)
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)
where inputs are raw sequence data padded with 0s to the same length (timesteps). Using the code above, the output is also of length timesteps, but when we calculate loss function we only want first Ni elements of the output (where Ni is length of input sequence i, which may be different for different sequences). Does anyone know if there is some good way to do that?
Thanks!
Option 1: you can always train without padding if you accept to train separate batches.
See this answer to a simple way of separating batches of equal length: Keras misinterprets training data shape
In this case, all you have to do is to perform the "repeat" operation in another manner, since you don't have the exact length at training time.
So, instead of RepeatVector, you can use this:
import keras.backend as K
def repeatFunction(x):
#x[0] is (batch,latent_dim)
#x[1] is inputs: (batch,length,features)
latent = K.expand_dims(x[0],axis=1) #shape(batch,1,latent_dim)
inpShapeMaker = K.ones_like(x[1][:,:,:1]) #shape (batch,length,1)
return latent * inpShapeMaker
#instead of RepeatVector:
Lambda(repeatFunction,output_shape=(None,latent_dim))([encoded,inputs])
Option2 (doesn't smell good): use another masking after RepeatVector.
I tried this, and it works, but we don't get 0's at the end, we get the last value repeated until the end. So, you will have to make a weird padding in your target data, repeating the last step until the end.
Example: target [[[1,2],[5,7]]] will have to be [[[1,2],[5,7],[5,7],[5,7]...]]
This may unbalance your data a lot, I think....
def makePadding(x):
#x[0] is encoded already repeated
#x[1] is inputs
#padding = 1 for actual data in inputs, 0 for 0
padding = K.cast( K.not_equal(x[1][:,:,:1],0), dtype=K.floatx())
#assuming you don't have 0 for non-padded data
#padding repeated for latent_dim
padding = K.repeat_elements(padding,rep=latent_dim,axis=-1)
return x[0]*padding
inputs = Input(shape=(timesteps, input_dim))
masked_input = Masking(mask_value=0.0)(inputs)
encoded = LSTM(latent_dim)(masked_input)
decoded = RepeatVector(timesteps)(encoded)
decoded = Lambda(makePadding,output_shape=(timesteps,latent_dim))([decoded,inputs])
decoded = Masking(mask_value=0.0)(decoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)
Option 3 (best): crop the outputs directly from the inputs, this also eliminates the gradients
def cropOutputs(x):
#x[0] is decoded at the end
#x[1] is inputs
#both have the same shape
#padding = 1 for actual data in inputs, 0 for 0
padding = K.cast( K.not_equal(x[1],0), dtype=K.floatx())
#if you have zeros for non-padded data, they will lose their backpropagation
return x[0]*padding
....
....
decoded = LSTM(input_dim, return_sequences=True)(decoded)
decoded = Lambda(cropOutputs,output_shape=(timesteps,input_dim))([decoded,inputs])
For this LSTM Autoencoder architecture, which I assume you understand, the Mask is lost at the RepeatVector due to the LSTM encoder layer having return_sequences=False.
So another option, instead of cropping like above, could also be to create custom bottleneck layer that propagates the mask.

Image classification with iOS 11 mlmodel - convert issue using coremltools and trained .caffemodel

It seems that I have some conversion issue using coremltool and a trained .caffemodel. I was able to train and test caffe dogs model (120 categories, 20k images) and it passed my tests with direct caffe classifications. Unfortunately, after converting to mlmodel it doesn't give me a valid prediction on the same input.
Traning model
The model has been trained using Caffe, GoogleNet, set of 20k images over 120 categories packed into lmdb and about 500k iterations. I've prepared images database and all the rest and put all files together here
Classification with caffe
This classification example by caffe. When I'm trying to run a classification request against the trained caffemodel - it works just great, high probability (80-99%), right results:
Classification with Apple iOS 11 CoreML
Unfortunately, when I'm trying to pack this DTDogs.caffemodel & deploy.txt into .mlmodel consumable by Apple iOS 11 CoreML I have different prediction results. Actually, no errors loading and using the model but I'm unable to get valid classifications, all the predictions are 0-15% confidence and have wrong labels. In order to test it properly I'm using exactly the same images I used for the direct classification with caffe:
I've also tried the pre-trained and pre-packed models with my iOS app from here - they work just fine so it seems to be an issue with packing procedure.
What did I miss?
Here is the example of classification with caffe: no issues, right answers (python):
import numpy as np
import sys
import caffe
import os
import urllib2
import matplotlib.pyplot as plt
%matplotlib inline
test_folder = '/home/<username>/Desktop/CaffeTest/'
test_image_path = "http://cdn.akc.org/content/hero/irish-terrier-Hero.jpg"
# init caffe net
model_def = test_folder + 'deploy.prototxt'
model_weights = test_folder + 'DTDogs.caffemodel'
# caffe.set_mode_gpu()
net = caffe.Net(model_def, model_weights, caffe.TEST)
# prepare transformer
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2,0,1))
transformer.set_raw_scale('data', 255)
transformer.set_channel_swap('data', (2,1,0))
net.blobs['data'].reshape(1, 3, 256, 256)
test_image = urllib2.urlopen(test_image_path)
with open(test_folder + 'testImage.jpg','wb') as output:
output.write(test_image.read())
image = caffe.io.load_image(test_folder + 'testImage.jpg')
transformed_image = transformer.preprocess('data', image)
net.blobs['data'].data[...] = transformed_image
# classify
output = net.forward()
output_prob = output['prob'][0]
output_prob_val = output_prob.max() * 100
output_prob_ind = output_prob.argmax()
labels_file = test_folder + 'labels.txt'
labels = np.loadtxt(labels_file, str, delimiter='\t')
plt.imshow(image)
print 'predicted class is:', output_prob_ind
print 'predicted probabily is:', output_prob_val
print 'output label:', labels[output_prob_ind]
Here is the example of packing the DTDogs.mlmodel model using coremltools. I see that resulted .mlmodel file is twice smaller then the original .caffemodel but it's probably some kind of archiving or compression optimisation by coremltools (python):
import coremltools;
caffe_model = ('DTDogs.caffemodel', 'deploy.prototxt')
labels = 'labels.txt'
coreml_model = coremltools.converters.caffe.convert(caffe_model, class_labels = labels, image_input_names= "data")
coreml_model.short_description = "Dogs Model v1.14"
coreml_model.save('DTDogs.mlmodel')
Here is an example of using the DTDogs.mlmodel in the app. I'm using a regular image picker to pick the same image I used for the .caffe classification test (swift):
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
picker.dismiss(animated: true)
print("Analyzing Imageā€¦")
guard let uiImage = info[UIImagePickerControllerOriginalImage] as? UIImage
else { print("no image from image picker"); return }
guard let ciImage = CIImage(image: uiImage)
else { print("can't create CIImage from UIImage"); return }
imageView.image = uiImage
do {
let model = try VNCoreMLModel(for: DTDogs().model)
let classificationRequest = VNCoreMLRequest(model: model, completionHandler: self.handleClassification)
let orientation = CGImagePropertyOrientation(uiImage.imageOrientation)
let handler = VNImageRequestHandler(ciImage: ciImage, orientation: Int32(orientation.rawValue))
try handler.perform([classificationRequest])
} catch {
print(error)
}
}
Usually what happens in these cases is that the image Core ML is passing into the model is not in the right format.
In the case of Caffe models, you typically need to set is_bgr=True when you call caffe.convert(), and you'll typically have to pass in mean values of RGB that will be subtracted from the input image, and possibly a scaling value as well.
In other words, Core ML needs to do the same stuff that your transformer does in the Python script.
Something like this:
coreml_model = coremltools.converters.caffe.convert(
caffe_model, class_labels = labels, image_input_names= "data",
is_bgr=True, image_scale=255.)
I'm not sure if the image_scale=255. is needed but it's worth a try. :-)

Resources