DenseNet is not running with a large dataset - image-processing

I have the following dataset:
image size: 256 x 256 x 3, batch size = 3
29,924 images
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import DenseNet201

def get_model():
    #base_model = application(weights='imagenet', input_shape=(image_size,image_size,3), include_top=False)
    #base_model.trainable = False
    base_model = DenseNet201(weights='imagenet', input_shape=(image_size, image_size, 3), include_top=False)
    model = models.Sequential()
    model.add(base_model)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dense(1024, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(196, activation='softmax'))
    model.summary()
    #optimizer = optimizers.SGD(lr=1e-4, decay=1e-6, momentum=0.9, nesterov=True)
    optimizer = optimizers.RMSprop(lr=0.0001)
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['acc'])
    return model
When I try to run the model, it just gets stuck for a long time, the memory usage keeps climbing, and finally the process dies.
How do I calculate the memory usage of a specific model, in my case DenseNet-201?
Is there a way to run my model much faster (or to get it running at all)?
Any tips?

How do I calculate the memory usage of a specific model, in my case DenseNet-201?
One way to estimate the size of your model in bytes is to multiply the number of parameters (printed by your call to model.summary()) by 4: parameters are stored as float32, which is 32 bits = 4 bytes each.
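A quick sketch of that calculation using Keras' count_params(); note this covers only the weights themselves, while activations, gradients and optimizer state add more on top:
model = get_model()
n_params = model.count_params()      # total number of parameters
approx_bytes = n_params * 4          # float32 -> 4 bytes per parameter
print(f"~{approx_bytes / 1024 ** 2:.1f} MB just for the weights")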
Is there a way to run my model much faster (or to get it running at all)? Any tips?
I am not sure exactly what you mean by the second question.
In general you should:
Check whether the model compilation step is causing the issue; if so, try a smaller model, since your machine cannot hold DenseNet-201 in memory.
Check whether loading the (large) dataset in one go is causing the issue. If it is, look into the ImageDataGenerator class, specifically its .flow_from_directory method, so that only the current/queued batches are ever held in memory (see the sketch after this list).
If the call to model.fit() is the issue, reduce the batch_size parameter (warning: this will slow down training).
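A minimal sketch of the generator approach, assuming the images are sorted into one sub-directory per class under a hypothetical train_dir:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1.0 / 255)
train_gen = datagen.flow_from_directory(
    'train_dir',                    # hypothetical path: one sub-folder per class
    target_size=(256, 256),
    batch_size=3,
    class_mode='categorical')

model = get_model()
model.fit(train_gen, epochs=10)     # only the current/queued batches sit in memory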

Related

Is there a way to train a neural network (LSTM model) with multiple datasets in order to do time series forecasting?

I want to make predictions with an LSTM model, but the dataset isn't a single file; it is composed of multiple files (for example, 3 Excel files).
I already know that for a time series forecasting problem you have to shape your data as (number of samples, number of time steps, number of features), and this works well when I do it for a single Excel file.
The problem is training with the three Excel files at the same time: in that case the input tensor for the LSTM layer has shape (n_files, n_samples, n_timesteps, n_features), i.e. dim = 4. This doesn't work because LSTM layers only accept input tensors with dim = 3.
My files all contain the same amount of data. It is collected from a device that records one value per minute for the duration of the experiment, and all the experiments have the same duration.
I've tried concatenating the files into one and choosing batch_size as the number of samples in one Excel file (because I can't mix the different experiments), but it doesn't produce a good result (at least not as good as the results from predicting with a single experiment).
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    model = keras.Sequential([
        layers.Masking(mask_value=0.0, input_shape=[1, 1]),
        layers.LSTM(50, return_sequences=True),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.Adam(0.001)
    model.compile(loss='mse',
                  optimizer=optimizer,
                  metrics=['mae', 'RootMeanSquaredError'])
    return model

model_pred = build_model()
model_pred.fit(Xchopped_train, ychopped_train, batch_size=252,
               epochs=500, verbose=1)
Where Xchopped_train and ychopped_train are the concatenated data from the 3 Excel files.
Another thing I've tried is training the model within a loop, switching the Excel file on each iteration:
for i in range(len(Xtrain)):
    model_pred.fit(Xtrain[i], Ytrain[i], epochs=167, verbose=1)
Where Xtrain is (3, 252, 1, 1) and the first index refers to the Excel file.
So far this is my best approach, but it doesn't feel like a good solution, since I don't know what is happening to the network's weights or which loss function is being minimized...
Is there a more efficient way to do this? Thanks!
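For reference, a minimal sketch of the concatenation described above, merging the file axis into the sample axis so the LSTM sees a 3-D tensor (assuming Xtrain has the (3, 252, 1, 1) shape mentioned in the question and Ytrain has a matching leading (3, 252, ...) shape):
import numpy as np

Xtrain = np.asarray(Xtrain)                                      # (3, 252, 1, 1)
Ytrain = np.asarray(Ytrain)
Xflat = Xtrain.reshape(-1, Xtrain.shape[-2], Xtrain.shape[-1])   # (756, 1, 1)
Yflat = Ytrain.reshape(-1, *Ytrain.shape[2:])                    # (756, ...)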

Is there a way to increase the variance of a model's predictions?

I generated about 12,000 random data points (using numpy, in the range 30 to 60) to create an artificial time series covering a bit more than a year.
Now I am trying to fit an LSTM model to these data points and forecast from them.
The LSTM model I applied is below. The data is a single series, so n_features = 1; n_steps_in and n_steps_out are the window lengths for the sequence-generation function for the time series, and I set both to 5. For the activation functions I tried relu for both layers, tanh for both, and tanh for the first with relu for the second (as shown here).
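For context, a sketch of the kind of split_sequences helper referred to here (the actual implementation is not shown in the question); it slides a window over the series and returns (samples, n_steps_in, 1) inputs and (samples, n_steps_out, 1) targets:
import numpy as np

def split_sequences(series, n_steps_in, n_steps_out):
    X, y = [], []
    for i in range(len(series) - n_steps_in - n_steps_out + 1):
        X.append(series[i:i + n_steps_in])
        y.append(series[i + n_steps_in:i + n_steps_in + n_steps_out])
    return (np.asarray(X).reshape(-1, n_steps_in, 1),
            np.asarray(y).reshape(-1, n_steps_out, 1))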
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

X, y = split_sequences(data, n_steps_in, n_steps_out)
n_features = X.shape[2]

model = Sequential()
model.add(LSTM(200, activation='tanh', input_shape=(n_steps_in, n_features)))
model.add(RepeatVector(n_steps_out))
model.add(LSTM(200, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(n_features)))
opt = keras.optimizers.Adam(learning_rate=0.05)
model.compile(optimizer=opt, loss='mse')
model.fit(X, y, epochs=n, batch_size=10, verbose=1,
          workers=4, use_multiprocessing=True, initial_epoch=0)
I also tried smoothing the data points, since they are randomly distributed (within the predefined bounds), and then applied the model to the smoothed data, but I still get similar results.
For example, this image shows both the smoothed training data and the forecast predicted by the model:
import matplotlib.pyplot as plt

plt.plot(Training_data, 'g')
plt.plot(Pred_Forecasts, 'r')
Every time, the model predicts a straight line. That is to be expected, since the data is a set of random numbers and the model converges to a mean value between the upper and lower limits of the data, but is there still any way to generate a somewhat realistic-looking forecast?
P.S. 1 - I have also tried other models such as Prophet, SARIMA and ARIMA.
But I think I need to find a way to increase the variance of the prediction, which I have not been able to do.
P.S. 2 - Sorry for the long question; I am new to deep learning, so I tried to explain as much as possible.

One-hot-encoded labels, multi-hot-encoded output (Keras)

I have a 1D image with 1x2048 pixels as input and 32 classes, for which I have defined a layer of 32 filters with the same size as the image (1x2048), which are L1-regularized.
My training examples are one-hot encoded. However, my goal is to get a multi-hot encoded output when I sum some of these images together and feed the sum to the trained model.
Training goes well and the model classifies each class separately, but if I sum two images and feed the sum to the model, it only outputs a one-hot encoded vector (although I expect a two-hot encoded vector). If I look at the kernels after training, they make sense: most of the weights are zero except the ones that define the class.
I don't understand why I get a one-hot vector output rather than a multi-hot vector.
The reason I don't simply sum the images up front and train on the sums is that the number of possible combinations of images would exceed my memory.
An image of the network I have in mind:
import keras
from keras.models import Sequential
from keras.layers import Conv2D

input_shape = (1, 2048, 1)
model = Sequential()
model.add(Conv2D(32, kernel_size=(1, 2048), strides=(1, 1),
                 activation='sigmoid',
                 input_shape=input_shape,
                 kernel_regularizer=keras.regularizers.l1(0.01),
                 kernel_constraint=keras.constraints.non_neg()))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=optimizer, metrics=['accuracy'])
You are using the wrong loss function.
categorical_crossentropy will always push the network towards exactly one 1-value in your output vector, no matter the input: it tries to classify every instance into one (and only one) of the available classes.
What you want, though, is (potentially) multiple ones in your output. Therefore, you should use binary_crossentropy instead. Also see this post.
On a side note, I would strongly advise you to think about this twice: if the multi-class case does not occur that often, it may result in a lot of false positives, i.e. cases where more than one class is predicted.
On another note, you might want to consider using Conv1D, since your signal is only 1-dimensional.
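A minimal sketch of both suggestions combined (binary_crossentropy with sigmoid outputs, and Conv1D instead of Conv2D); the layer sizes are kept from the question:
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Flatten

model = Sequential()
model.add(Conv1D(32, kernel_size=2048, activation='sigmoid',
                 input_shape=(2048, 1),
                 kernel_regularizer=keras.regularizers.l1(0.01),
                 kernel_constraint=keras.constraints.non_neg()))
model.add(Flatten())                       # -> 32 independent sigmoid scores
model.compile(loss='binary_crossentropy',  # each class is an independent yes/no decision
              optimizer='adam',
              metrics=['accuracy'])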
@Azerila
The thing you are looking for is Mixup augmentation. It can be implemented as follows:
import tensorflow_probability as tfp

tfd = tfp.distributions

def mixup(entry1, entry2):
    # each entry is an (image, label) pair; labels are one-hot vectors
    image1, label1 = entry1
    image2, label2 = entry2
    alpha = [0.2]
    dist = tfd.Beta(alpha, alpha)
    l = dist.sample(1)[0][0]          # mixing coefficient in (0, 1)
    img = l * image1 + (1 - l) * image2
    lab = l * label1 + (1 - l) * label2
    return img, lab
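Applied in a tf.data pipeline, the mixed labels become soft multi-hot vectors rather than strict one-hot vectors, which is what nudges the model towards multi-hot outputs. A usage sketch, assuming a hypothetical dataset ds of (image, one_hot_label) pairs:
import tensorflow as tf

# pair each example with one drawn from a shuffled copy of the dataset, then mix
mixed_ds = tf.data.Dataset.zip((ds, ds.shuffle(1024))).map(mixup)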

Keras - Image Input to LSTM with time_steps

I have a problem which requires an LSTM many-to-one architecture, i.e. it takes in 19 image frames and then produces a single output.
Each image frame has size (128, 128, 3).
I have been trying for days but cannot find the answer: what should the input_shape for the LSTM be?
I believe that since an image frame has size 128*128*3, the number of units in the input layer would be 49152. Currently the code looks like this:
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import SGD

timesteps = 19
data_dim = 128 * 128 * 3

model = Sequential()
model.add(LSTM(data_dim, input_shape=(timesteps, data_dim)))
model.add(Dense(10))

momentum = 0.6
decay = 0.0005
nesterov = True
optimizer = SGD(lr=lr, momentum=momentum, decay=decay, nesterov=nesterov)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
This code gives a Memory Error during compilation.
Is this due to an error in the input_shape or other LSTM parameters, or is it due to my computer's hardware?
It's certainly your architecture. Use fewer LSTM units; data_dim is far too large.
An LSTM layer has roughly 4 * units * (units + input_dim) weights; since your number of units equals your input dimension (data_dim = 49152), that gives more than 4 * 49152 * 49152 = 9,663,676,416 weights (not counting biases).
model.add(LSTM(less_units, input_shape=(timesteps, data_dim)))
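As a sanity check (a sketch; 256 is an arbitrary illustrative unit count), Keras itself will report the parameter count:
from keras.models import Sequential
from keras.layers import LSTM

m = Sequential()
m.add(LSTM(256, input_shape=(19, 49152)))   # far fewer units than data_dim
m.summary()                                  # roughly 50M parameters instead of ~9.7B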
For a sequence of images, a better fit is a TimeDistributed convolutional front-end followed by an LSTM:
model.add(TimeDistributed(Conv2D(output_filters, kernel_size, ...), input_shape=(timesteps, x, y, channels)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(output_dim, ...))
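A more complete sketch of that idea for the shapes in the question (19 frames of 128x128x3, 10 output classes); the filter counts and kernel sizes are illustrative choices, not taken from the answer:
from keras.models import Sequential
from keras.layers import TimeDistributed, Conv2D, MaxPooling2D, GlobalAveragePooling2D, LSTM, Dense

model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu'),
                          input_shape=(19, 128, 128, 3)))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Conv2D(64, (3, 3), activation='relu')))
model.add(TimeDistributed(GlobalAveragePooling2D()))   # 64 features per frame
model.add(LSTM(128))                                   # many-to-one: one vector per sequence
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()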

Tweaking the Loss before the Optimizer Step

I want to add an extra operation before running the AdamOptimizer step on my loss, to help the model deal with repetitions in my data. The relevant code snippet looks something like this:
loss = tf.nn.softmax_cross_entropy_with_logits(logits=predLogits, labels=actLabels)
loss = tf.reshape(loss, [batchsize, -1])
repMask = tf.sqrt(tf.cast(tf.abs(tf.subtract(tf.cast(Y, tf.int64), tf.cast(X, tf.int64))), tf.float32))
lossPost = loss - repMask
train_step = tf.train.AdamOptimizer(LR).minimize(lossPost)
So, in other words, instead of minimizing loss, I want AdamOptimizer to minimize a slightly tweaked version of it, lossPost. I then train the model in the usual way:
_ = sess.run([train_step], feed_dict=feed_dict)
I noticed that this workaround of minimizing lossPost instead of loss has no impact on the accuracy of the model: it produces exactly the same output with or without it. It seems to keep optimizing the original, unmodified loss. Why is this the case?
My original approach was to perform this tweak at the softmax_cross_entropy_with_logits step by using weighted_cross_entropy_with_logits instead, but there is an extra complication there, since there is an extra vocabulary dimension (this is a character-level-style model). So I thought it would be easier to do it afterwards, and as long as it happens before the optimization step it should be doable?
In your model it seems that X and Y are constants (that is, they depend only on the data). In that case repMask is also constant, as it is defined by
repMask = tf.sqrt(tf.cast(tf.abs(tf.subtract(tf.cast(Y, tf.int64), tf.cast(X, tf.int64))), tf.float32))
Hence loss and lossPost differ by a constant value, which has no effect on the minimization process (it is like minimizing x^2 - 1 vs. minimizing x^2 - 5: the minimizing x is the same in both cases).
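If the goal is for the repetition mask to actually influence training, one option (a sketch under the assumption that repMask is meant as a per-example importance weight; the exact weighting scheme is not specified in the question or the answer) is to multiply the per-example loss by a mask-derived factor instead of subtracting a constant:
# a multiplicative per-example factor rescales each example's gradient,
# unlike the additive term above, whose gradient w.r.t. the weights is zero
weightMask = 1.0 + repMask
lossPost = tf.reduce_mean(loss * weightMask)
train_step = tf.train.AdamOptimizer(LR).minimize(lossPost)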
