Saving extracted features in CNN - machine-learning

I've just started learning machine learning algorithms. I would like to train the VGG-16 network on my own dataset, and I am using tflearn.DNN to build the VGG net.
I want to save the output (which is a tensor) of the fully connected layer that extracts 4096 features to a file. I wanted to know how to save these features.
When I ran the following lines:
feed_dict = feed_dict_builder(X, Y, model.inputs, model.targets)
output = model.predictor.evaluate(feed_dict, convnet1)
print(output)
output.save('features.npy')
I got the following exception and error:
Exception in thread Thread-48:
Traceback (most recent call last):
File "/home/anupama/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/anupama/anaconda3/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/home/anupama/anaconda3/lib/python3.6/site-packages/tflearn/data_flow.py", line 187, in fill_feed_dict_queue
data = self.retrieve_data(batch_ids)
File "/home/anupama/anaconda3/lib/python3.6/site-packages/tflearn/data_flow.py", line 222, in retrieve_data
utils.slice_array(self.feed_dict[key], batch_ids)
File "/home/anupama/anaconda3/lib/python3.6/site-packages/tflearn/utils.py", line 180, in slice_array
return [x[start] for x in X]
File "/home/anupama/anaconda3/lib/python3.6/site-packages/tflearn/utils.py", line 180, in <listcomp>
return [x[start] for x in X]
IndexError: index 2 is out of bounds for axis 1 with size 2
[0.0]
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-23-f2d62c020964> in <module>()
4 output = model.predictor.evaluate(feed_dict, convnet1)
5 print(output)
----> 6 output.save('/home/anupama/Internship/feats')
AttributeError: 'list' object has no attribute 'save'

You should save the FC layer of the network as a separate tensor and use DNN.predictor to evaluate it. Sample code:
import numpy as np
import tflearn
from tflearn.utils import feed_dict_builder

# VGG model definition
...
previous_layer = ...
fc_layer1 = tflearn.fully_connected(previous_layer, 4096, activation='relu', name='fc1')
fc_layer2 = tflearn.fully_connected(fc_layer1, 4096, activation='relu', name='fc2')
network = ...

# Training
model = tflearn.DNN(network)
model.fit(x, y)

# Evaluation
feed_dict = feed_dict_builder(x, y, model.inputs, model.targets)
output = model.predictor.evaluate(feed_dict, [fc_layer2])
np.save('features.npy', output)
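If you want to sanity-check the saved file later, a minimal sketch (assuming the 'features.npy' path used above):

import numpy as np

# reload the saved activations; evaluate returns a list, so the array
# may carry an extra leading dimension before (num_samples, 4096)
feats = np.load('features.npy')
print(feats.shape)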

Related

parallelise prediction with `map_partitions`

I have a dataframe of shape (25M, 79) and I'm trying to parallelise an sklearn pipeline prediction on it.
When I run it for just one partition, it works as expected:
n_partitions = 1000
ddf = dd.from_pandas(df_x_selection, npartitions=n_partitions)
grid_searcher.best_estimator_.predict_proba(ddf.get_partition(0))
But if I apply it to every partition, then it fails:
n_partitions = 1000
ddf = dd.from_pandas(df_x_selection, npartitions=n_partitions)
def _f(_df, _pipeline, _predicted_class) -> np.array:
    return _pipeline.predict_proba(_df)[:, _predicted_class]
ddf.map_partitions(_f, grid_searcher.best_estimator_, 1, meta=(None, 'f8')).compute()
The error is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
130 raise ValueError(
--> 131 f"Wrong number of items passed {len(self.values)}, "
132 f"placement implies {len(self.mgr_locs)}"
ValueError: Wrong number of items passed 79, placement implies 100
What am I doing wrong?
Thanks

IndexError when iterating my dataset using Dataloader in PyTorch

I iterated over my dataset using DataLoader in PyTorch 0.2 like this:
dataloader = torch.utils.data.DataLoader(...)
data_iter = iter(dataloader)
data = data_iter.next()
but an IndexError was raised:
Traceback (most recent call last):
File "main.py", line 193, in <module>
data_target = data_target_iter.next()
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 201, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 221, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 40, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 40, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/asr4/zhuminxian/adversarial/code/dataset/data_loader.py", line 33, in __getitem__
return self.X_train[idx], self.y_train[idx]
IndexError: index 4196 is out of bounds for axis 0 with size 4135
I am wondering why the index was out of bounds. Is this a bug in PyTorch?
I tried to run my code again and the same error was raised, but at a different iteration and with a different out-of-bounds index.
My guess is that your data.Dataset.__len__ was not overloaded properly, so that len(dataloader.dataset) in fact returns a number larger than len(self.X_train).
Check your implementation of the underlying dataset in '/home/asr4/zhuminxian/adversarial/code/dataset/data_loader.py'.
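For reference, a minimal sketch of a dataset whose __len__ matches what __getitem__ can actually index (the class name and fields are hypothetical, mirroring the traceback):

import torch.utils.data as data

class MyDataset(data.Dataset):
    def __init__(self, X_train, y_train):
        # the two arrays must agree in length, otherwise the sampler
        # can emit indices that __getitem__ cannot serve
        assert len(X_train) == len(y_train)
        self.X_train = X_train
        self.y_train = y_train

    def __len__(self):
        # the DataLoader draws indices in range(len(self)), so this
        # must be the true sample count
        return len(self.X_train)

    def __getitem__(self, idx):
        return self.X_train[idx], self.y_train[idx]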

Keras & Tensorflow GPU Out of Memory on Large Image Data

I'm building an image classification system with Keras, the TensorFlow GPU backend, and CUDA 9.1, running on Ubuntu 18.04.
I'm using a very large image dataset: 1.2 million images, 15k classes, 335 GB in size.
I can train my network on 90,000 images with no problems. However, when I scale up to the entire dataset of 1.2 million images, I get the error shown below, which I believe has to do with running out of memory.
I'm using a GeForce GTX 1080 with 11 GB of memory, and I have 128 GB of RAM, a 300 GB swap file, and an AMD Threadripper 1950X with 16 cores.
I followed the advice given for similar problems. I'm now using a smaller batch size of 10 or even less, and a smaller dense inner layer of 256, and I'm still getting the same error shown below before the first training epoch begins.
[Update]: I found out that the memory error happens during the VGG16 predict_generator call, even before my network is built or trained. See the code below.
First, the warnings and errors:
2018-05-19 20:24:01.255788: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 5635855360 bytes on host: CUresult(304)
2018-05-19 20:24:01.255850: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 5635855360
Then the exceptions:
2018-05-19 13:56:40.472404: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats:
Limit: 68719476736
InUse: 15548829696
MaxInUse: 15548829696
NumAllocs: 15542
MaxAllocSize: 16777216
2018-05-19 13:56:40.472563: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
Traceback (most recent call last):
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: block5_pool/MaxPool/_159 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_133_block5_pool/MaxPool", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "bottleneck.py", line 37, in <module>
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/engine/training.py", line 2510, in predict_generator
outs = self.predict_on_batch(x)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/engine/training.py", line 1945, in predict_on_batch
outputs = self.predict_function(ins)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2478, in __call__
**self.session_kwargs)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: block5_pool/MaxPool/_159 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_133_block5_pool/MaxPool", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Here is my code:
import numpy as np
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
from keras import applications
from keras.utils.np_utils import to_categorical
import matplotlib.pyplot as plt

# Dimensions of our images.
img_width, img_height = 224, 224

train_data_dir = './train_sample'
epochs = 100
batch_size = 10

# Data preprocessing
# Pixel values rescaling from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1. / 255)

# Retrieve images and their classes for training set.
train_generator_bottleneck = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

num_classes = len(train_generator_bottleneck.class_indices)

model_vgg = applications.VGG16(include_top=False, weights='imagenet')

bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck)
np.save('../models/bottleneck_features_train.npy', bottleneck_features_train)

train_data = np.load('../models/bottleneck_features_train.npy')
train_labels = to_categorical(train_generator_bottleneck.classes, num_classes=num_classes)

model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(num_classes, activation='softmax'))

model_top.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])

# Model saving callback
checkpointer = ModelCheckpoint(filepath='../models/bottleneck_features.h5', monitor='val_acc', verbose=1,
                               save_best_only=True)

# Early stopping
early_stopping = EarlyStopping(monitor='val_acc', verbose=1, patience=5)

history = model_top.fit(
    train_data,
    train_labels,
    verbose=2,
    epochs=epochs,
    batch_size=batch_size,
    callbacks=[checkpointer, early_stopping],
    validation_split=0.3)
I don't believe the problem here is batch_size; as you mention, it is already very low. Since you said it works for 90k images, the issue is more likely that train_data cannot fit in GPU memory, which is needed at the start of each fit epoch. To alleviate this, you will need to fit your model_top with a generator, just as you get your features from predict_generator. One way to do this is to wrap a generator class around train_data, but I would instead just connect the two models (note: I could not test this, but I think it is right):
from keras.models import Model

# input_shape is needed with include_top=False so Flatten knows the spatial dimensions
model_vgg = applications.VGG16(include_top=False, weights='imagenet',
                               input_shape=(img_width, img_height, 3))

model_top = Flatten()(model_vgg.output)
model_top = Dense(256, activation='relu')(model_top)
model_top = Dropout(0.3)(model_top)
model_top = Dense(num_classes, activation='softmax')(model_top)

model = Model(inputs=model_vgg.input, outputs=model_top)

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Model saving callback
checkpointer = ModelCheckpoint(filepath='../models/bottleneck_features.h5', monitor='val_acc', verbose=1,
                               save_best_only=True)

# Early stopping
early_stopping = EarlyStopping(monitor='val_acc', verbose=1, patience=5)

# class_mode='sparse' yields integer labels, matching sparse_categorical_crossentropy
train_generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='sparse')

history = model.fit_generator(
    train_generator,
    verbose=2,
    steps_per_epoch=steps_per_epoch,
    callbacks=[checkpointer, early_stopping],
    ...)
I changed categorical_crossentropy to sparse_categorical_crossentropy so that plain integer class indexes can be passed as labels; otherwise it is the same. You will need to supply steps_per_epoch as the length of the training data divided by the batch size, or just put in any number for a quick test. I also used the Keras functional API to make this clearer.
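For instance, a minimal sketch of that computation, assuming the train_generator defined above:

import math

# one full pass over the training set per epoch
steps_per_epoch = math.ceil(train_generator.samples / batch_size)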
This would also allow the weights of the VGG base to change, which should help you classify better. If that is not what you want for some reason, you can freeze it by iterating over all of the VGG layers and setting trainable to False, as sketched below.
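A minimal sketch of that freezing loop (it must run before model.compile):

# freeze the convolutional base so only the new classification head trains
for layer in model_vgg.layers:
    layer.trainable = False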
Let me know if it works.

Unsupervised loss function in Keras

Is there any way in Keras to specify a loss function which does not need to be passed target data?
I attempted to specify a loss function which omitted the y_true parameter like so:
def custom_loss(y_pred):
But I got the following error:
Traceback (most recent call last):
File "siamese.py", line 234, in <module>
model.compile(loss=custom_loss,optimizer=Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0))
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 911, in compile
sample_weight, mask)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 436, in weighted
score_array = fn(y_true, y_pred)
TypeError: custom_loss() takes exactly 1 argument (2 given)
I then tried to call fit() without specifying any target data:
model.fit(x=[x_train,x_train_warped, affines], batch_size = bs, epochs=1)
But it looks like not passing any target data causes an error:
Traceback (most recent call last):
File "siamese.py", line 264, in <module>
model.fit(x=[x_train,x_train_warped, affines], batch_size = bs, epochs=1)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1435, in fit
batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1322, in _standardize_user_data
in zip(y, sample_weights, class_weights, self._feed_sample_weight_modes)]
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 577, in _standardize_weights
return np.ones((y.shape[0],), dtype=K.floatx())
AttributeError: 'NoneType' object has no attribute 'shape'
I could manually create dummy data in the same shape as my neural net's output, but this seems extremely messy. Is there a simple way to specify an unsupervised loss function in Keras that I am missing?
I think the best solution is to customize the training loop instead of using the model.fit method. A complete walkthrough is published in the TensorFlow tutorials.
Write your loss function as if it had two arguments:
y_true
y_pred
If you don't have y_true, that's fine; you don't need to use it inside the function to compute the loss, but leave a placeholder in your function prototype so Keras won't complain.
def custom_loss(y_true, y_pred):
    # do things with y_pred
    return loss
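Note that fit still expects target data even with a two-argument loss, which is exactly what the second traceback above shows. A minimal sketch of passing dummy targets (the shape here is an assumption; it only needs to be compatible with the model's output):

import numpy as np

# y_true is never read inside custom_loss, so zeros are enough to
# satisfy Keras's input validation
dummy_y = np.zeros((len(x_train), 1))
model.fit(x=[x_train, x_train_warped, affines], y=dummy_y, batch_size=bs, epochs=1)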
Adding custom arguments
You may also need another parameter, such as a margin, inside your loss function. Even then, your custom function should only take in those two arguments; the workaround is to use a lambda function:
def custom_loss(y_pred, margin):
    # do things with y_pred
    return loss
and use it like this:
model.compile(loss=lambda y_true, y_pred: custom_loss(y_pred, margin), ...)
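Equivalently, a sketch using a closure so that margin is bound when the loss is created (the hinge-style loss body here is purely hypothetical):

import keras.backend as K

def make_custom_loss(margin):
    # the returned function has the two-argument signature Keras expects,
    # while margin is captured from the enclosing scope
    def custom_loss(y_true, y_pred):
        return K.mean(K.maximum(0., margin - y_pred))
    return custom_loss

model.compile(loss=make_custom_loss(margin=1.0), optimizer='adam')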

scikit-neuralnetwork mismatch error in dataset size

I'm trying to train an MLP classifier for the XOR problem using sknn.mlp:
import numpy
from sknn.mlp import Classifier, Layer

X = numpy.array([[0,1],[0,0],[1,0]])
print(X.shape)
y = numpy.array([[1],[0],[1]])
print(y.shape)
nn = Classifier(layers=[Layer("Sigmoid",units=2),Layer("Sigmoid",units=1)],n_iter=100)
nn.fit(X,y)
This results in:
No handlers could be found for logger "sknn"
Traceback (most recent call last):
File "xorclassifier.py", line 10, in <module>
nn.fit(X,y)
File "/usr/local/lib/python2.7/site-packages/sknn/mlp.py", line 343, in fit
return super(Classifier, self)._fit(X, yp)
File "/usr/local/lib/python2.7/site-packages/sknn/mlp.py", line 179, in _fit
X, y = self._initialize(X, y)
File "/usr/local/lib/python2.7/site-packages/sknn/mlp.py", line 37, in _initialize
self._create_specs(X, y)
File "/usr/local/lib/python2.7/site-packages/sknn/mlp.py", line 64, in _create_specs
"Mismatch between dataset size and units in output layer."
AssertionError: Mismatch between dataset size and units in output layer.
scikit-neuralnetwork seems to turn your y vector into a binary one-hot matrix of shape (n_samples, n_classes), and n_classes is two in your case. So try
nn=Classifier(layers=[Layer("Sigmoid",units=2),Layer("Sigmoid",units=2)],n_iter=100)
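With two output units, the fit call from the question should run; a quick sanity check (hypothetical, reusing X and y from above):

nn.fit(X, y)
print(nn.predict(X))  # predicted XOR labels for the three training points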
