I'm building an image classification system with Keras, Tensorflow GPU backend and CUDA 9.1, running on Ubuntu 18.04.
I'm using a very large image data set: 1.2 million images in 15k classes, 335 GB in size.
I can train my network on 90,000 images with no problems. However, when I scale up and use the entire data set of 1.2 million images, I get the error shown below, which I believe has to do with running out of memory.
I'm using GeForce GTX 1080 with 11GB memory, and I have 128GB of RAM, 300GB of swap file and AMD Threadripper 1950X with 16 cores.
I followed the advice given for similar problems: I'm now using a batch size of 10 or smaller and a smaller inner dense layer of 256 units, but I still get the same error shown below before the first training epoch begins.
[Update]: I found out that the memory error happens during the VGG16 predict_generator call, before my network is even built or trained. See the code below.
First, warnings and Errors:
2018-05-19 20:24:01.255788: E tensorflow/stream_executor/cuda/cuda_driver.cc:967] failed to alloc 5635855360 bytes on host: CUresult(304)
2018-05-19 20:24:01.255850: W ./tensorflow/core/common_runtime/gpu/pool_allocator.h:195] could not allocate pinned host memory of size: 5635855360
Then exceptions:
2018-05-19 13:56:40.472404: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats:
Limit: 68719476736
InUse: 15548829696
MaxInUse: 15548829696
NumAllocs: 15542
MaxAllocSize: 16777216
2018-05-19 13:56:40.472563: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****************************************************************************************************
Traceback (most recent call last):
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: block5_pool/MaxPool/_159 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_133_block5_pool/MaxPool", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "bottleneck.py", line 37, in <module>
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/engine/training.py", line 2510, in predict_generator
outs = self.predict_on_batch(x)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/engine/training.py", line 1945, in predict_on_batch
outputs = self.predict_function(ins)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2478, in __call__
**self.session_kwargs)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/welshamy/tools/anaconda/3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: block5_pool/MaxPool/_159 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_133_block5_pool/MaxPool", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Here is my code:
import numpy as np
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
from keras import applications
from keras.utils.np_utils import to_categorical
import matplotlib.pyplot as plt
# Dimensions of our images.
img_width, img_height = 224, 224
train_data_dir = './train_sample'
epochs = 100
batch_size = 10
# Data preprocessing
# Pixel values rescaling from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1. / 255)
# Retrieve images and their classes for training set.
train_generator_bottleneck = datagen.flow_from_directory(
train_data_dir,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode=None,
shuffle=False)
num_classes = len(train_generator_bottleneck.class_indices)
model_vgg = applications.VGG16(include_top=False, weights='imagenet')
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck)
np.save('../models/bottleneck_features_train.npy', bottleneck_features_train)
train_data = np.load('../models/bottleneck_features_train.npy')
train_labels = to_categorical(train_generator_bottleneck.classes, num_classes=num_classes)
model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(num_classes, activation='softmax'))
model_top.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Model saving callback
checkpointer = ModelCheckpoint(filepath='../models/bottleneck_features.h5', monitor='val_acc', verbose=1,
save_best_only=True)
# Early stopping
early_stopping = EarlyStopping(monitor='val_acc', verbose=1, patience=5)
history = model_top.fit(
train_data,
train_labels,
verbose=2,
epochs=epochs,
batch_size=batch_size,
callbacks=[checkpointer, early_stopping],
validation_split=0.3)
I don't believe the problem here is batch_size, since, as you mention, it is already very low. Furthermore, because you said that it works for 90k images, the issue is probably that train_data cannot fit in GPU memory (which is needed at the start of each fit epoch). To alleviate this, you will need to fit model_top with a generator, just as you get your features from predict_generator. One way to do this is to wrap a generator class around train_data, but I would instead just connect the two models (note that I could not test this, but I think it is right):
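from keras.models import Model
# A fixed input_shape is needed so that Flatten knows the size of its input.
model_vgg = applications.VGG16(include_top=False, weights='imagenet',
                               input_shape=(img_width, img_height, 3))
# Layers are called on the output tensor of VGG16, not on the model object itself.
model_top = Flatten()(model_vgg.output)
model_top = Dense(256, activation='relu')(model_top)
model_top = Dropout(0.3)(model_top)
model_top = Dense(num_classes, activation='softmax')(model_top)
model = Model(inputs=model_vgg.inputs, outputs=model_top)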
model.compile(loss='sparse_categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
# Model saving callback
checkpointer = ModelCheckpoint(filepath='../models/bottleneck_features.h5', monitor='val_acc', verbose=1,
save_best_only=True)
# Early stopping
early_stopping = EarlyStopping(monitor='val_acc', verbose=1, patience=5)
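# fit_generator takes a generator yielding (images, labels) batches instead of arrays.
# train_generator is a placeholder name for such a generator, e.g. one built with
# flow_from_directory(..., class_mode='sparse'); the batch size is set on the generator.
history = model.fit_generator(
    train_generator,
    verbose=2,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    callbacks=[checkpointer, early_stopping],
    ...)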
I changed categorical_crossentropy to sparse_categorical_crossentropy so that plain class indices can be sent as the labels; otherwise the model is the same. You will need to supply steps_per_epoch as the number of training samples divided by the batch size, or just put in any number to test. I also used the Keras functional API to make this clearer.
This would also allow the weights of the VGG layers to change in order to help you classify better. If this is not what you want for some reason, you can freeze them by iterating over all of the VGG layers and setting trainable to False before compiling the model.
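For the other option mentioned above, wrapping a generator around train_data, a minimal sketch might look like the following. It assumes the bottleneck features are already stored in a .npy file and relies on NumPy's memory-mapped loading so that only one batch is copied into RAM at a time. The function name feature_batches, the demo file name and the tiny demo shapes are made up for illustration.

```python
import os
import tempfile

import numpy as np


def feature_batches(features_path, labels, batch_size):
    """Yield (features, labels) batches forever from a memory-mapped .npy file."""
    # mmap_mode='r' maps the file instead of reading it all into memory.
    features = np.load(features_path, mmap_mode='r')
    num_samples = features.shape[0]
    while True:  # Keras-style generators are expected to loop indefinitely
        for start in range(0, num_samples, batch_size):
            stop = min(start + batch_size, num_samples)
            # np.array copies just this slice out of the mapped file.
            yield np.array(features[start:stop]), labels[start:stop]


# Small stand-in for the real bottleneck features (hypothetical shapes).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'bottleneck_demo.npy')
demo_features = np.arange(25 * 7 * 7 * 4, dtype=np.float32).reshape(25, 7, 7, 4)
np.save(demo_path, demo_features)
demo_labels = np.arange(25) % 5  # integer class indices

gen = feature_batches(demo_path, demo_labels, batch_size=10)
first_x, first_y = next(gen)
steps_per_epoch = int(np.ceil(25 / 10))
print(first_x.shape, first_y.shape, steps_per_epoch)
```

Such a generator could then be passed to fit_generator together with steps_per_epoch. It does not help with producing the feature file in the first place, which is where the error in the question is raised.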
lmk if it works.
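As a rough check of the memory argument, the size of the bottleneck feature array can be estimated by hand. This assumes 224x224 inputs (for which VGG16 without its top outputs a 7x7x512 tensor per image) and float32 storage; the helper name feature_array_gb is made up.

```python
# Estimated size of the VGG16 bottleneck features (include_top=False, 224x224 input).
feature_shape = (7, 7, 512)   # output of block5_pool for one image
bytes_per_value = 4           # float32

values_per_image = feature_shape[0] * feature_shape[1] * feature_shape[2]
bytes_per_image = values_per_image * bytes_per_value


def feature_array_gb(num_images):
    """Size in GB (10**9 bytes) of the stacked features for num_images images."""
    return num_images * bytes_per_image / 1e9


small_gb = feature_array_gb(90000)
full_gb = feature_array_gb(1200000)
print(bytes_per_image, round(small_gb, 2), round(full_gb, 1))
```

Under these assumptions the 90,000-image subset needs about 9 GB, while the full 1.2 million images need about 120 GB, which is close to the 128 GB of RAM and far more than the GPU's memory.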
Related
I am working on a dataset of 300K images for multi-class image classification. So far I have taken a small subset of around 7k images, but the code either raises a memory error or my notebook just dies. The code below converts all images to a single NumPy array at once, which exhausts my memory when the last line is executed. train.csv contains image filenames and one-hot encoded labels.
The code is like this:
data = pd.read_csv('train.csv')
img_width = 400
img_height = 400
img_vectors = []
for i in range(data.shape[0]):
    path = 'Images/' + data['Id'][
    img = image.load_img(path, target_size=(img_width, img_height, 3))
    img = image.img_to_array(img)
    img = img/255.0
    img_vectors.append(img)
img_vectors = np.array(img_vectors)
Error Message:
MemoryError Traceback (most recent call last)
<ipython-input-13-dd2302ae54e1> in <module>
----> 1 img_vectors = np.array(img_vectors)
MemoryError: Unable to allocate array with shape (7344, 400, 400, 3) and data type float32
I guess I need batches of smaller arrays to handle the memory issue, so that I avoid holding one array with all the image data in memory at the same time.
In an earlier project I did image classification (not multi-label) with around 225k images. That code does not convert all the image data to one giant array; instead, it feeds the image data in smaller batches:
#image preparation
# Compare strings with ==; "is" tests object identity and is not reliable here.
if K.image_data_format() == "channels_first":
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
train_datagen = ImageDataGenerator(rescale=1./255, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(train_data_dir, target_size=(img_width, img_height), batch_size=batch_size, class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(validation_data_dir, target_size=(img_width, img_height), batch_size=batch_size, class_mode='categorical')
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
...
model.add(Dense(17))
model.add(BatchNormalization(axis=1, momentum=0.6))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
model.fit_generator(
train_generator,
steps_per_epoch=nb_train_samples // batch_size,
epochs=epochs,
validation_data=validation_generator,
validation_steps=nb_validation_samples // batch_size,
class_weight = class_weight
)
So what I actually need is a way to handle big image datasets for multi-label image classification without running into memory trouble.
Ideally I would work with a CSV file containing image filenames and one-hot encoded labels, combined with batches of arrays for training.
Any help or guesses here would be greatly appreciated.
The easiest way to solve the problem you are facing is to write a custom data generator; here is a tutorial that shows how to do this. The idea is that instead of using flow_from_directory, you create a custom data loader that reads each image from its source path and assigns the corresponding labels to y. In practice, I assume your data is described in a .csv file where each row contains the path to an image and the labels present in that image. Your data generator will then have a method __getitem__(self, index) that reads the image from the path in row number index and returns it along with the target, which is obtained by reading the labels in that row, one-hot encoding them, and summing the resulting vectors.
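A minimal sketch of such a generator is shown below. To stay self-contained it is a plain Python class with an injected image loader; with Keras one would typically subclass keras.utils.Sequence, which uses the same __len__ and __getitem__ methods. The class name CsvBatchGenerator, the helper multi_hot, the fake loader and the file paths are all hypothetical.

```python
import math

import numpy as np


def multi_hot(label_indices, num_classes):
    """One-hot encode each label index and sum the vectors into one target."""
    return np.eye(num_classes, dtype=np.float32)[label_indices].sum(axis=0)


class CsvBatchGenerator:
    """Yields (images, targets) one batch at a time instead of one giant array."""

    def __init__(self, paths, label_lists, num_classes, batch_size, load_image):
        self.paths = list(paths)
        self.label_lists = list(label_lists)
        self.num_classes = num_classes
        self.batch_size = batch_size
        self.load_image = load_image  # callable: path -> array of shape (H, W, 3)

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, index):
        start = index * self.batch_size
        stop = start + self.batch_size
        x = np.stack([self.load_image(p) for p in self.paths[start:stop]])
        y = np.stack([multi_hot(labels, self.num_classes)
                      for labels in self.label_lists[start:stop]])
        return x, y


def fake_load_image(path):
    """Stand-in for real image loading (e.g. load_img + img_to_array)."""
    return np.zeros((4, 4, 3), dtype=np.float32)


# In practice the paths and labels would come from the rows of the .csv file.
paths = ['Images/img_%d.jpg' % i for i in range(5)]
label_lists = [[0], [1, 2], [2], [0, 3], [1]]
gen = CsvBatchGenerator(paths, label_lists, num_classes=4, batch_size=2,
                        load_image=fake_load_image)
x0, y0 = gen[0]
print(len(gen), x0.shape, y0.tolist())
```

Only batch_size images are held in memory at once, so the array size no longer grows with the size of the dataset.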
I am trying to train a Keras CNN model on plant images. I needed to preprocess those images before training because they contain extra information that I don't want the model to learn.
Solution: color-based segmentation with OpenCV; I kept just the green pixels.
def segmented(image):
    foto = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    hsv_foto = cv2.cvtColor(foto, cv2.COLOR_RGB2HSV)
    colormin=(25,50,50)
    colormax=(86,255,255)
    mask = cv2.inRange(hsv_foto, colormin , colormax)
    result = cv2.bitwise_and(foto, foto, mask=mask)
    return result
Original and transformed image
Problem:
The function works fine when visualizing the segmented images, but I am struggling to pass it to a Keras model so that training uses only the transformed images and not the originals from the directory.
My solution: what I am trying now is to include my segmented() function in the Keras ImageDataGenerator:
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
segmented('../input/v2-plant-seedlings-dataset/nonsegmentedv2/'),
target_size=(64,64),
batch_size=32,
class_mode='categorical',
subset='training')
or in training
training=model.fit_generator(
segmented(train_generator),
steps_per_epoch=100,
epochs=20,
validation_data = segmented(validation_generator),
validation_steps = 30,
callbacks=[earlystopper1, checkpointer1]
)
But I get this error that is probably related to image reading and opening
TypeError Traceback (most recent call last)
<ipython-input-47-3fc9e5fcdc32> in <module>
1 training=model.fit_generator(
----> 2 segmented1(train_generator),
3 steps_per_epoch=100,
4 epochs=20,
5 validation_data = validation_generator,
<ipython-input-46-24182f9d357f> in segmented1(np_image)
1 def segmented1(np_image):
2
----> 3 foto = cv2.cvtColor(np_image, cv2.COLOR_BGR2RGB)
4 hsv_foto = cv2.cvtColor(foto, cv2.COLOR_RGB2HSV)
5
TypeError: Expected Ptr<cv::UMat> for argument '%s'
train_datagen = ImageDataGenerator(
preprocessing_function = segmented,
rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
validation_split=0.2)
I didn't try your function, but here is the explanation of the input and output of preprocessing_function from the Keras documentation (https://keras.io/preprocessing/image/):
preprocessing_function: function that will be applied on each input.
The function will run after the image is resized and augmented. The
function should take one argument: one image (Numpy tensor with rank
3), and should output a Numpy tensor with the same shape.
You should probably modify the segmented function a little bit.
This problem never used to occur but since today Tensorflow always tries to allocate a huge amount of memory, even when using very small batch sizes.
I followed this tutorial:
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
"Using the bottleneck features of a pre-trained network: 90% accuracy in a minute"
This is my code:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
img_width, img_height = 150, 150
top_model_weights_path = 'bottleneck_fc_model.h5'
train_data_dir = 'C:\\ImageData\\Augmented\\Train'
validation_data_dir = 'C:\\ImageData\\Augmented\\Validate'
#train_data_dir = 'C:\\Users\\NSA\\flower_photos\\Train'
#validation_data_dir = 'C:\\Users\\NSA\\flower_photos\\Validate'
nb_train_samples = 25
nb_validation_samples = 5
epochs = 10
my_batch_size = 10
def save_bottleneck_features():
    datagen = ImageDataGenerator(rescale=1./255)
    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')
    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=my_batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_train = model.predict_generator(
        generator,
        steps=nb_train_samples // my_batch_size,
        max_queue_size=10,
        workers=1,
        use_multiprocessing=False,
        verbose=1)
    np.save(open('bottleneck_features_train.npy', 'w'),
            bottleneck_features_train)
    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=my_batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // my_batch_size)
    np.save(open('bottleneck_features_validation.npy', 'w'),
            bottleneck_features_validation)
def train_top_model():
    train_data = np.load(open('bottleneck_features_train.npy'))
    train_labels = np.array(
        [0] * (nb_train_samples / 2) + [1] * (nb_train_samples / 2))
    validation_data = np.load(open('bottleneck_features_validation.npy'))
    validation_labels = np.array(
        [0] * (nb_validation_samples / 2) + [1] * (nb_validation_samples / 2))
    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=my_batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)
save_bottleneck_features()
train_top_model()
And this is the error I get:
PS C:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts> cd 'c:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts'; ${env:PYTHONIOENCODING}='UTF-8'; ${env:PYTHONUNBUFFERED}='1'; & 'C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe' 'C:\Users\NSA\.vscode\extensions\ms-python.python-2018.3.1\pythonFiles\PythonTools\visualstudio_py_launcher.py' 'c:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts' '50490' '34806ad9-833a-4524-8cd6-18ca4aa74f14' 'RedirectOutput,RedirectOutput' 'c:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts\first_try_real_transfer_learning_keras_vgg16.py'
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will
be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Bottleneck Features saven
2018-04-09 16:02:08.772206: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-04-09 16:02:09.345010: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1212] Found device 0 with properties:
name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.189
pciBusID: 0000:02:00.0
totalMemory: 2.00GiB freeMemory: 1.66GiB
2018-04-09 16:02:09.356147: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1312] Adding visible gpu devices: 0
2018-04-09 16:02:10.108947: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1429 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:02:00.0, compute capability: 5.0)
Found 109 images belonging to 2 classes.
2018-04-09 16:02:16.979539: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.33GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-04-09 16:02:17.441196: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-04-09 16:02:17.792983: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.14GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2018-04-09 16:02:18.122577: W C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:219] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.17GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2/2 [==============================] - 4s 2s/step
Traceback (most recent call last):
File "c:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts\first_try_real_transfer_learning_keras_vgg16.py", line 94, in <module>
save_bottleneck_features()
File "c:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts\first_try_real_transfer_learning_keras_vgg16.py", line 56, in save_bottleneck_features
bottleneck_features_train)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\numpy\lib\npyio.py", line 511, in save
pickle_kwargs=pickle_kwargs)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\numpy\lib\format.py", line 565, in write_array
version)
File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\numpy\lib\format.py", line 335, in _write_array_header
fp.write(header_prefix)
TypeError: write() argument must be str, not bytes
PS C:\Users\NSA\ownCloud\Documents\Tensorflow\Skripts>
The error occurs specifically when calling model.predict_generator().
At first I thought it was running out of memory because my batch size was too large, but even with a batch size of 1 it requires over 2 GiB of memory. I have installed CUDA 9.0, cuDNN 7.0, TensorFlow 1.6.0 and Keras 2.1.5 with the TensorFlow backend. This used to work without issue, but it suddenly started giving me this error. I'm using an NVIDIA GeForce 940MX.
Your problem has nothing to do with memory or TensorFlow. The traceback shows bytes being written to a file that was opened in text mode.
Instead of opening the file as text:
open('bottleneck_features_train.npy', 'w')
open it as bytes:
open('bottleneck_features_train.npy', 'wb')
This applies to all the calls to open you have.
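As a small illustration, here are two ways of saving and loading that avoid the text-mode problem: opening the files in binary mode ('wb' for writing, 'rb' for reading), or passing the file name directly and letting NumPy open the file. The file names and the small array are made up for the demo.

```python
import os
import tempfile

import numpy as np

features = np.arange(12, dtype=np.float32).reshape(3, 4)
tmp_dir = tempfile.mkdtemp()

# Option 1: explicit file objects, opened in binary mode.
path_a = os.path.join(tmp_dir, 'features_a.npy')
with open(path_a, 'wb') as f:
    np.save(f, features)
with open(path_a, 'rb') as f:
    loaded_a = np.load(f)

# Option 2: pass the path and let NumPy handle opening and closing.
path_b = os.path.join(tmp_dir, 'features_b.npy')
np.save(path_b, features)
loaded_b = np.load(path_b)

print(np.array_equal(loaded_a, features), np.array_equal(loaded_b, features))
```

Separately, on Python 3 the label lines in the question, such as [0] * (nb_train_samples / 2), would likely fail next, because / returns a float there; // gives an integer.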
I am using this code to perform some experiments. I want to use the intermediate representation from the layer just before the final fully connected layer of the CNN.
from __future__ import print_function
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import Conv1D, GlobalMaxPooling1D
from keras.datasets import imdb
# set parameters:
max_features = 5000
maxlen = 400
batch_size = 100
embedding_dims = 50
filters = 250
kernel_size = 3
hidden_dims = 250
epochs = 100
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('Build model...')
model = Sequential()
# we start off with an efficient embedding layer which maps
# our vocab indices into embedding_dims dimensions
model.add(Embedding(max_features,
embedding_dims,
input_length=maxlen))
model.add(Dropout(0.2))
# we add a Convolution1D, which will learn filters
# word group filters of size filter_length:
model.add(Conv1D(filters,
kernel_size,
padding='valid',
activation='relu',
strides=1))
# we use max pooling:
model.add(GlobalMaxPooling1D())
# We add a vanilla hidden layer:
model.add(Dense(hidden_dims))
model.add(Dropout(0.2))
model.add(Activation('relu'))#<======== I need output after this.
# We project onto a single unit output layer, and squash it with a sigmoid:
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam', metrics=['accuracy'])
To get the intermediate representation of the penultimate layer, I used the following code.
CODE1
from keras import backend as K
get_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                              [model.layers[6].output])
# output in test mode = 0
layer_output_test = get_layer_output([x_test, 0])[0]
# output in train mode = 1
layer_output_train = get_layer_output([x_train, 1])[0]
print(layer_output_train)
print(layer_output_train.shape)
CODE2
def get_activations(model, layer, X_batch):
    get_activations = K.function([model.layers[0].input, K.learning_phase()], [model.layers[layer].output,])
    activations = get_activations([X_batch,1])
    return activations
import numpy as np
X_train=np.array(get_activations(model=model,layer=6, X_batch=x_train)[0], dtype=np.float32)
print(X_train)
print(X_train.shape)
Which one is correct, given that the two snippets above print different output? I want to multiply the correct output by weights and optimise it with a custom optimiser.
If you pass 1 for K.learning_phase() (training mode), the Dropout layers are active, so you will get different results on every call. Apart from that, both snippets compute the same thing.
Using a higher level approach, you can do this:
from keras.models import Model
newModel = Model(model.inputs,model.layers[6].output)
Do whatever you want with newModel. You can train it (and affect the original model), and use it to predict values.
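To illustrate why the learning phase matters: the model has Dropout layers before the layer being read, and dropout is random in training mode but does nothing in test mode. The NumPy sketch below mimics that behaviour. It is not Keras's implementation, and the function dropout and its arguments are made up for illustration.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: random in training mode, identity in test mode."""
    if not training:
        return x
    # Keep each value with probability (1 - rate) and rescale the survivors.
    keep = rng.random_sample(x.shape) >= rate
    return x * keep / (1.0 - rate)


x = np.ones((10, 100), dtype=np.float32)
rng = np.random.RandomState(0)

# Test mode (learning phase 0): the same output on every call.
test_a = dropout(x, 0.5, training=False, rng=rng)
test_b = dropout(x, 0.5, training=False, rng=rng)

# Training mode (learning phase 1): a new random mask on every call.
train_a = dropout(x, 0.5, training=True, rng=rng)
train_b = dropout(x, 0.5, training=True, rng=rng)

print(np.array_equal(test_a, test_b), np.array_equal(train_a, train_b))
```

For extracting features to reuse elsewhere, test mode (0) is usually the one that gives reproducible values.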
So I'm trying to practice using LSTMs in Keras, and the 3D input shape (samples, timesteps, features) is confusing me.
I have some stock data. If the next item in the list moves outside a band of 5 (that is, +-2.50), it buys or sells; if it stays within that band, it holds. These are my labels, my Y.
For my features, my X, I have a dataframe of shape [500, 1, 3]: 500 samples, 1 timestep each (the data is in 1-hour increments), and 3 features. But I get this error:
ValueError: Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (500, 3)
How can I fix this code and what am I doing wrong?
import json
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
"""
Sample of JSON file
{"time":"2017-01-02T01:56:14.000Z","usd":8.14},
{"time":"2017-01-02T02:56:14.000Z","usd":8.16},
{"time":"2017-01-02T03:56:15.000Z","usd":8.14},
{"time":"2017-01-02T04:56:16.000Z","usd":8.15}
"""
file = open("E.json", "r", encoding="utf8")
file = json.load(file)
"""
If the price jump to the next item is > +2.50 or < -2.50, then append 'SELL' or 'BUY'
If it is in the range of +-2.50, then append 'HOLD'
These are my classifier labels
"""
data = []
for row in range(len(file['data'])):
    row2 = row + 1
    if row2 == len(file['data']):
        break
    else:
        difference = file['data'][row]['usd'] - file['data'][row2]['usd']
        if difference > 2.50:
            data.append((file['data'][row]['usd'], 'SELL'))
        elif difference < -2.50:
            data.append((file['data'][row]['usd'], 'BUY'))
        else:
            data.append((file['data'][row]['usd'], 'HOLD'))
"""
Add the price, the time step (which is 1) and the features (which is 3)
"""
frame = pd.DataFrame(data)
features = pd.DataFrame()
# train LSTM
for x in range(500):
    series = pd.Series(data=[500, 1, frame.iloc[x][0]])
    features = features.append(series, ignore_index=True)
labels = frame.iloc[16000:16500][1]
# test
#yt = frame.iloc[16500:16512][0]
#xt = pd.get_dummies(frame.iloc[16500:16512][1])
# create LSTM
model = Sequential()
model.add(LSTM(3, input_shape=features.shape, activation='relu', return_sequences=False))
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='relu'))
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
model.fit(x=features.as_matrix(), y=labels.as_matrix())
"""
ERROR
Anaconda3\envs\Final\python.exe C:/Users/Def/PycharmProjects/Ether/Main.py
Using Theano backend.
Traceback (most recent call last):
File "C:/Users/Def/PycharmProjects/Ether/Main.py", line 62, in <module>
model.fit(x=features.as_matrix(), y=labels.as_matrix())
File "\Anaconda3\envs\Final\lib\site-packages\keras\models.py", line 845, in fit
initial_epoch=initial_epoch)
File "\Anaconda3\envs\Final\lib\site-packages\keras\engine\training.py", line 1405, in fit
batch_size=batch_size)
File "\Anaconda3\envs\Final\lib\site-packages\keras\engine\training.py", line 1295, in _standardize_user_data
exception_prefix='model input')
File "\Anaconda3\envs\Final\lib\site-packages\keras\engine\training.py", line 121, in _standardize_input_data
str(array.shape))
ValueError: Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (500, 3)
"""
Thanks.
This is my first post here. I hope it is useful; I will try to do my best.
First, you need to create a 3-dimensional array to work with input_shape in Keras. You can read about this in the Keras documentation or, better, look at the docstring directly:
from keras.models import Sequential
Sequential?
Linear stack of layers.
Arguments
layers: list of layers to add to the model.
# Note
The first layer passed to a Sequential model
should have a defined input shape. What that
means is that it should have received an input_shape
or batch_input_shape argument,
or for some type of layers (recurrent, Dense...)
an input_dim argument.
Example
```python
model = Sequential()
# first layer must have a defined input shape
model.add(Dense(32, input_dim=500))
# afterwards, Keras does automatic shape inference
model.add(Dense(32))
# also possible (equivalent to the above):
model = Sequential()
model.add(Dense(32, input_shape=(500,)))
model.add(Dense(32))
# also possible (equivalent to the above):
model = Sequential()
# here the batch dimension is None,
# which means any batch size will be accepted by the model.
model.add(Dense(32, batch_input_shape=(None, 500)))
model.add(Dense(32))
```
After that, to transform a 2-dimensional array into a 3-dimensional one, check np.newaxis.
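A minimal sketch of that reshaping step, assuming the features are a (500, 3) array as in the error message and that each sample has a single timestep. The array contents and variable names are made up for illustration.

```python
import numpy as np

# Stand-in for features.as_matrix(): 500 samples, 3 features each.
features_2d = np.arange(500 * 3, dtype=np.float32).reshape(500, 3)

# Insert a timestep axis of length 1: (samples, timesteps, features).
features_3d = features_2d[:, np.newaxis, :]

# reshape gives the same result when the target shape is known.
features_3d_alt = features_2d.reshape(500, 1, 3)

# input_shape for the first layer leaves out the samples dimension.
input_shape = features_3d.shape[1:]
print(features_3d.shape, input_shape)
```

With this layout the LSTM layer would be declared with input_shape=(1, 3) rather than input_shape=features.shape, and features_3d would be what is passed to fit.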
Useful commands that help more than you would expect:
- Sequential?
- Sequential??
- print(list(dir(Sequential)))
Best