Keras - Image to Word (OCR) - image-processing

I am trying to build a very simple OCR to start my tests for bigger models. The problem is that I can't figure out what my output data for training should look like.
Code:
def simple_model():
    output = 28
    if K.image_data_format() == 'channels_first':
        input_shape = (1, input_height, input_width)
    else:
        input_shape = (input_height, input_width, 1)
    conv_to_rnn_dims = (input_width // 2, (input_height // 2) * conv_blades)

    model = Sequential()
    model.add(Conv2D(conv_blades, (3, 3), input_shape=input_shape, padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2), name='max2'))
    model.add(Reshape(target_shape=conv_to_rnn_dims, name='reshape'))
    model.add(GRU(64, return_sequences=True, kernel_initializer='he_normal', name='gru1'))
    model.add(TimeDistributed(Dense(output, kernel_initializer='he_normal', name='dense2')))
    model.add(Activation('softmax', name='softmax'))
    model.compile(loss='mse',
                  optimizer='adamax',
                  metrics=["accuracy"])
    return model
img = load_img('exit.png', grayscale=True, target_size=[input_height, input_width])
x = img_to_array(img)
x = x.reshape((1,) + x.shape)
y = np.array(['exit'])

model = simple_model()
model.fit(x, y, batch_size=1,
          epochs=10,
          validation_data=(x, y),
          verbose=1)
print model.predict(y)
Image example (source: exitfest.org)
When I run this code, I get the following error:
ValueError: Error when checking target: expected softmax to have 3 dimensions, but got array with shape (1, 1)
Note 1: I know I can't train my model with only one image and one label; I am aware of that and I have a bunch more images like it, but first I need to get this simple model running before improving it.
Note 2: this is the first time I have worked with image-to-sequence output, so there may be other problems; feel free to change the code if you spot that kind of mistake.

Well, as I haven't received any answer, I will link to the answer I posted on another question.
There I explain how to use the Keras OCR example and answer some other questions.
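For context, here is a minimal sketch of one way the target for this particular model could be shaped so it matches the 3-D softmax output. The character set, padding scheme and input_width value below are assumptions for illustration only; the Keras OCR example linked above instead uses a CTC loss, which avoids aligning characters to timesteps.
import numpy as np

# Assumed values for illustration only
input_width = 128                          # must match the width the model was built with
timesteps = input_width // 2               # sequence length after the (2, 2) max pooling + reshape
alphabet = 'abcdefghijklmnopqrstuvwxyz '   # 27 characters; index 27 is used as a padding class
n_classes = 28                             # matches `output = 28` in simple_model()

def encode_word(word):
    """One-hot encode a word per timestep, filling unused timesteps with the padding class."""
    target = np.zeros((timesteps, n_classes), dtype='float32')
    for t in range(timesteps):
        idx = alphabet.index(word[t]) if t < len(word) else n_classes - 1
        target[t, idx] = 1.0
    return target

# Target for a single training image labelled 'exit' -> shape (1, timesteps, 28)
y = encode_word('exit')[np.newaxis, ...]
With targets of this shape the dimension error goes away, although categorical_crossentropy (or, better, the CTC setup from the linked example) is usually a more appropriate loss than mse for this kind of output.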

Related

Poor predictions on second dataset from trained LSTM model

I've trained an LSTM model with 8 features and 1 output. I have one dataset which I split into two separate files: I train and predict with the first half, and then attempt to predict the second half using the model trained on the first part. My model predicts the train and test sets drawn from the first half pretty well (RMSE of around 5-7), however when I attempt to predict the second half I get very poor predictions (RMSE of around 50-60). How can I get my trained model to predict outside datasets well?
dataset at this link
file = r'/content/drive/MyDrive/only_force_pt1.csv'
df = pd.read_csv(file)
df.head()

X = df.iloc[:, 1:9]
y = df.iloc[:, 9]
print(X.shape)
print(y.shape)

plt.figure(figsize=(20, 6), dpi=100)
plt.plot(y)

WINDOW_LEN = 50

def window_size(size, inputdata, targetdata):
    X = []
    y = []
    i = 0
    while (i + size) <= len(inputdata) - 1:
        X.append(inputdata[i: i + size])
        y.append(targetdata[i + size])
        i += 1
    assert len(X) == len(y)
    return (X, y)

X_series, y_series = window_size(WINDOW_LEN, X, y)
print(len(X))
print(len(X_series))
print(len(y_series))

X_train, X_val, y_train, y_val = train_test_split(np.array(X_series), np.array(y_series), test_size=0.3, shuffle=True)
X_val, X_test, y_val, y_test = train_test_split(np.array(X_val), np.array(y_val), test_size=0.3, shuffle=False)

n_timesteps, n_features, n_outputs = X_train.shape[1], X_train.shape[2], 1
[verbose, epochs, batch_size] = [1, 300, 32]
input_shape = (n_timesteps, n_features)

model = Sequential()
# LSTM
model.add(LSTM(64, input_shape=input_shape, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
#model.add(Dropout(0.2))
model.add(Dense(32, activation='relu', kernel_regularizer=keras.regularizers.l2(0.001)))
model.add(Dense(1, activation='relu'))

earlystopper = EarlyStopping(monitor='val_loss', min_delta=0, patience=30, verbose=1, mode='auto')
model.summary()
model.compile(loss='mse', optimizer=Adam(learning_rate=0.001), metrics=[tf.keras.metrics.RootMeanSquaredError()])
history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, verbose=verbose, validation_data=(X_val, y_val), callbacks=[earlystopper])
Second dataset:
tests = r'/content/drive/MyDrive/only_force_pt2.csv'
df_testing = pd.read_csv(tests)

X_testing = df_testing.iloc[:4038, 1:9]
torque = df_testing.iloc[:4038, 9]
print(X_testing.shape)
print(torque.shape)

plt.figure(figsize=(20, 6), dpi=100)
plt.plot(torque)

X_testing = X_testing.to_numpy()
X_testing_series, y_testing_series = window_size(WINDOW_LEN, X_testing, torque)
X_testing_series = np.array(X_testing_series)
y_testing_series = np.array(y_testing_series)

scores = model.evaluate(X_testing_series, y_testing_series, verbose=1)
X_prediction = model.predict(X_testing_series, batch_size=32)
If your model works fine on training data but performs badly on validation data, then it did not learn the "true" connection between input and output variables but simply memorized the output corresponding to each input. To tackle this you can do multiple things:
Typically you would use 80% of your data to train and 20% to test; this presents more data to the model, which should help it learn more of the true underlying function (an example split is sketched below).
If your model is too complex, it will have neurons that are just used to memorize input-output pairs. Try to reduce the complexity of your model (layers, neurons) so that the remaining capacity really learns instead of memorizing.
Look into training performance in more detail here.
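As a rough illustration of those two suggestions, here is a minimal sketch of an 80/20 split and a slimmer network. The layer sizes are arbitrary examples (not tuned values), and the random arrays just stand in for the X_series / y_series built in the question.
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# Placeholder arrays standing in for the windowed X_series / y_series above
X_series = np.random.rand(4000, 50, 8)
y_series = np.random.rand(4000)

# 80% train / 20% validation instead of 70/30
X_train, X_val, y_train, y_val = train_test_split(
    X_series, y_series, test_size=0.2, shuffle=True)

# A smaller network: fewer units and one Dense layer less
model = Sequential([
    LSTM(32, input_shape=(X_train.shape[1], X_train.shape[2])),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1)                      # linear output for regression
])
model.compile(loss='mse', optimizer='adam',
              metrics=[keras.metrics.RootMeanSquaredError()])
model.fit(X_train, y_train, epochs=50, batch_size=32,
          validation_data=(X_val, y_val))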

Tensorflow 2 XOR implementation

I am a TensorFlow newbie and, to start with, I want to train an XOR model: 4 inputs with 2 values each, learning 4 outputs with 1 value each.
Here is what I am doing in TF 2
model = keras.Sequential([
    keras.layers.Input(shape=(2,)),
    keras.layers.Dense(units=2, activation='relu'),
    keras.layers.Dense(units=1, activation='softmax')
])
model.compile(optimizer='adam',
              loss=tf.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(
    (tf.cast([[0,0],[0,1],[1,0],[1,1]], tf.float32), tf.cast([0,1,1,0], tf.float32)),
    epochs=4,
    steps_per_epoch=1,
    validation_data=(tf.cast([[0.7, 0.7]], tf.float32), tf.cast([0], tf.float32)),
    validation_steps=1
)
The above code gives the error IndexError: list index out of range.
Please help me with this; I also want to understand how to come up with the shapes to give to the model.
You have a problem with how you pass your parameters to the fit function in:
history = model.fit(
    (tf.cast([[0,0],[0,1],[1,0],[1,1]], tf.float32), tf.cast([0,1,1,0], tf.float32)),
    epochs=4,
    steps_per_epoch=1,
    validation_data=(tf.cast([[0.7, 0.7]], tf.float32), tf.cast([0], tf.float32)),
    validation_steps=1)
Try and replace that line with this:
x_train = tf.cast([[0,0],[0,1],[1,0],[1,1]], tf.float32)
y_train = tf.cast([0,1,1,0], tf.float32)
x_test = tf.cast([[0.7, 0.7]], tf.float32)
y_test = tf.cast([0], tf.float32)

history = model.fit(
    x=x_train, y=y_train,
    epochs=4,
    steps_per_epoch=1,
    validation_data=(x_test, y_test),
    validation_steps=1
)
And your issue should be solved.
PS: Just a suggestion: when you are doing binary classification, try to use a sigmoid instead of a softmax, and correspondingly a BinaryCrossentropy loss instead of CategoricalCrossentropy. Good luck!
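A minimal sketch of that suggested setup (sigmoid output plus binary cross-entropy). The hidden layer size and epoch count are illustrative choices, since XOR usually needs a few more hidden units and many more epochs than the original snippet to converge:
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(2,)),
    keras.layers.Dense(units=8, activation='relu'),
    keras.layers.Dense(units=1, activation='sigmoid')   # sigmoid for a single binary output
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=['accuracy'])

x_train = tf.cast([[0, 0], [0, 1], [1, 0], [1, 1]], tf.float32)
y_train = tf.cast([0, 1, 1, 0], tf.float32)
model.fit(x=x_train, y=y_train, epochs=500, verbose=0)
print(model.predict(x_train).round())   # should approach [0, 1, 1, 0]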

Keras accuracy metrics differ from manual computation

I am working on a binary classification problem in Keras. The loss function I use is binary_crossentropy and the metric is metrics=['accuracy']. Since the two classes are imbalanced, I use class_weight='auto' when I fit the training data to the model.
To see the performance, I print out the accuracy by
print GNN.model.test_on_batch([test_sample_1, test_sample_2], test_label)[1]
The output is 0.973. But the result is different when I use the following lines to get the prediction accuracy
predict_label = GNN.model.predict([test_sample_1, test_sample_2])
rounded = predict_label.round(1)
print (rounded == test_label).sum()/float(rounded.shape[0])
which is 0.953.
So I am wondering how metrics=['accuracy'] evaluates the model performance and why the result is different.
For details, I attached the model summary below.
input_size = self.n_feature
encoder_size = 2000
dropout_rate = 0.5
X1 = Input(shape=(input_size, ), name='input_1')
X2 = Input(shape=(input_size, ), name='input_2')
encoder = Sequential()
encoder.add(Dropout(dropout_rate, input_shape=(input_size, )))
encoder.add(Dense(encoder_size, activation='tanh'))
encoded_1 = encoder(X1)
encoded_2 = encoder(X2)
merged = concatenate([encoded_1, encoded_2])
comparer = Sequential()
comparer.add(Dropout(dropout_rate, input_shape=(encoder_size * 2, )))
comparer.add(Dense(500, activation='relu'))
comparer.add(Dropout(dropout_rate))
comparer.add(Dense(200, activation='relu'))
comparer.add(Dropout(dropout_rate))
comparer.add(Dense(1, activation='sigmoid'))
Y = comparer(merged)
model = Model(inputs=[X1, X2], outputs=Y)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
self.model = model
And I train the model with
self.hist = self.model.fit(
    x=[train_sample_1, train_sample_2],
    y=train_label,
    class_weight='auto',
    validation_split=0.1,
    batch_size=batch_size,
    epochs=epochs,
    callbacks=callbacks)
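For reference, a minimal sketch of how binary accuracy is typically computed (thresholding predictions at 0.5), contrasted with rounding to one decimal place as in the manual check above; the example arrays are hypothetical placeholders:
import numpy as np

# Hypothetical sigmoid outputs and true labels
predict_label = np.array([[0.97], [0.83], [0.41], [0.08]])
test_label    = np.array([[1],    [1],    [0],    [0]])

# What Keras' accuracy metric does for a sigmoid output: threshold at 0.5
keras_style = (predict_label > 0.5).astype(int)
print((keras_style == test_label).mean())                      # 1.0

# What .round(1) does: 0.83 -> 0.8, which never equals an integer label
manual = predict_label.round(1)
print((manual == test_label).sum() / float(manual.shape[0]))   # 0.25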

Issue in LSTM Input Dimensions in Keras

I am trying to implement a multi-input LSTM model using keras. The code is as follows:
data_1 -> shape (1150,50)
data_2 -> shape (1150,50)
y_train -> shape (1150,50)
input_1 = Input(shape=data_1.shape)
LSTM_1 = LSTM(100)(input_1)
input_2 = Input(shape=data_2.shape)
LSTM_2 = LSTM(100)(input_2)
concat = Concatenate(axis=-1)
x = concat([LSTM_1, LSTM_2])
dense_layer = Dense(1, activation='sigmoid')(x)
model = keras.models.Model(inputs=[input_1, input_2], outputs=[dense_layer])
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['acc'])
model.fit([data_1, data_2], y_train, epochs=10)
When I run this code, I get a ValueError:
ValueError: Error when checking model input: expected input_1 to have 3 dimensions, but got array with shape (1150, 50)
Does anyone have a solution to this problem?
Use data_1 = np.expand_dims(data_1, axis=2) before you define the model. LSTM expects inputs with dimensions (batch_size, timesteps, features), so in your case, I'm guessing you have 1150 samples, 50 time steps and 1 feature, and you need to add a dimension at the end of your array.
This needs to be done before you define the model; otherwise, when you set input_1 = Input(shape=data_1.shape) you are telling Keras that your input has 1150 timesteps and 50 features, so it will expect inputs of shape (None, 1150, 50) (the None stands for "any dimension will be accepted").
The same holds for input_2.
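A minimal sketch of that reshaping, assuming the 50 columns are time steps of a single feature (the random arrays just stand in for the question's data):
import numpy as np
from keras.layers import Input

# Stand-ins for data_1 / data_2 from the question, shape (1150, 50)
data_1 = np.random.rand(1150, 50)
data_2 = np.random.rand(1150, 50)

# Add a feature axis: each sample becomes (50 timesteps, 1 feature)
data_1 = np.expand_dims(data_1, axis=2)   # -> (1150, 50, 1)
data_2 = np.expand_dims(data_2, axis=2)   # -> (1150, 50, 1)

# Input describes a single sample, without the batch dimension
input_1 = Input(shape=(50, 1))
input_2 = Input(shape=(50, 1))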
Hope this helps

keras model fit_generator ValueError: Error when checking model target: expected cropping2d_4 to have 4 dimensions, but got array with shape (32, 1)

I'm trying to use Keras model.fit_generator() to fit a model; below is my definition of the generator:
from sklearn.utils import shuffle

IMG_PATH_PREFIX = "./data/IMG/"

def generator(samples, batch_size=64):
    num_samples = len(samples)
    while 1:  # Loop forever so the generator never terminates
        shuffle(samples)
        for offset in range(0, num_samples, batch_size):
            batch_samples = samples[offset:offset+batch_size]
            images = []
            angles = []
            for batch_sample in batch_samples:
                name = IMG_PATH_PREFIX + batch_sample[0].split('/')[-1]
                center_image = cv2.imread(name)
                center_angle = float(batch_sample[3])
                images.append(center_image)
                angles.append(center_angle)
            X_train = np.array(images)
            y_train = np.array(angles)
            #X_train = np.expand_dims(X_train, axis=0)
            #y_train = np.expand_dims(y_train, axis=1)
            print("X_train shape: ", X_train.shape, " y_train shape:", y_train.shape)
            #print("X train: ", X_train)
            yield X_train, y_train

train_generator = generator(train_samples, batch_size=32)
validation_generator = generator(validation_samples, batch_size=32)
Here the output shape is:
X_train shape: (32, 160, 320, 3) y_train shape: (32,)
The model fit code is:
model = Sequential()
#cropping layer
model.add(Cropping2D(cropping=((50,20), (1,1)), input_shape=(160,320,3), dim_ordering='tf'))
model.compile(loss = "mse", optimizer="adam")
model.fit_generator(train_generator, samples_per_epoch= len(train_samples), validation_data=validation_generator, nb_val_samples=len(validation_samples), nb_epoch=3)
Then I get the error message:
ValueError: Error when checking model target: expected cropping2d_6 to have 4 dimensions, but got array with shape (32, 1)
Could someone help me figure out what the issue is?
The big question here is: do you know what you are trying to do?
1) If you read the documentation here, the input is a 4D tensor and the output is ALSO a 4D tensor. Your target is a 2D tensor of shape (batch_size, 1). So of course, when Keras tries to compute the error between an output that is 3D (without the batch dimension) and a target that is 1D (without the batch dimension), it cannot make sense of that. Outputs and targets must have the same dimensions.
2) Do you know what Cropping2D actually does? It crops your images, removing values at the beginning and end of the cropped dimensions. In your case you are outputting images of shape (90, 318, 3). This is not a prediction; there are no weights to train in this layer, so there is no reason to fit the "model". Your model just crops images, and no training is needed for that.
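To make the shapes line up, the model needs layers after the crop that reduce each image to a single value, matching the (batch_size, 1) target. A minimal sketch follows; the conv/pool/dense layers are arbitrary illustrative choices, not a recommended architecture:
from keras.models import Sequential
from keras.layers import Cropping2D, Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Cropping2D(cropping=((50, 20), (1, 1)), input_shape=(160, 320, 3)))
model.add(Conv2D(16, (5, 5), activation='relu'))   # illustrative feature extractor
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(1))                                # one predicted angle per image
model.compile(loss="mse", optimizer="adam")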
