Keras - Trouble getting correct class prediction - machine-learning

I have built a CNN using Keras to categorize 2 different categories of images. The problem I am having is that I cannot seem to get a correct prediction after training.
A little background...
The data set is 78,750 examples large (approximately 95% Cat. 1 and 5% Cat. 2), which may be the culprit: I assume overfitting is occurring for Cat. 1 (changing the dataset size is hard for a number of other reasons).
To combat this I have added regularization to each convolutional layer, but to no avail.
My question is this... do I absolutely need to change my category sizes, or is there anything else I can do to combat overfitting of Cat. 1?
Here is the code for the CNN:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.initializers import random_normal
from keras.regularizers import l2

model = Sequential()
model.add(Conv2D(filters=25,
                 kernel_size=(10, 10),
                 strides=(1, 1),
                 activation='relu',
                 input_shape=input_shape,  # input_shape is defined elsewhere in the question
                 padding="VALID",
                 kernel_initializer=random_normal(mean=0, stddev=.1),
                 kernel_regularizer=l2(.001)))
model.add(MaxPooling2D(pool_size=(2, 2),
                       strides=(2, 2)))
model.add(Conv2D(filters=25,
                 kernel_size=(7, 7),
                 strides=(1, 1),
                 activation='relu',
                 padding="VALID",
                 kernel_initializer=random_normal(mean=0, stddev=.1),
                 kernel_regularizer=l2(.001)))
model.add(MaxPooling2D(pool_size=(2, 2),
                       strides=(2, 2)))
model.add(Conv2D(filters=25,
                 kernel_size=(5, 5),
                 strides=(2, 2),
                 activation='relu',
                 padding="VALID",
                 kernel_initializer=random_normal(mean=0, stddev=.1),
                 kernel_regularizer=l2(.001)))
model.add(MaxPooling2D(pool_size=(2, 2),
                       strides=(1, 1)))
model.add(Conv2D(filters=25,
                 kernel_size=(5, 5),
                 strides=(2, 2),
                 activation='relu',
                 padding="VALID",
                 kernel_initializer=random_normal(mean=0, stddev=.1),
                 kernel_regularizer=l2(.001)))
model.add(Flatten())
model.add(Dense(2, activation='relu', kernel_initializer=random_normal(mean=0, stddev=.1), kernel_regularizer=l2(.001)))
model.add(Dense(2, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.sgd(lr=.001, momentum=0.9),
              metrics=['accuracy'])
EDIT 1
Here is the output of running training for 1 epoch...
Epoch 1/2
500/78750 [..............................] - ETA: 664s - loss: 1.3999 - acc: 0.9460
1000/78750 [..............................] - ETA: 652s - loss: 1.3713 - acc: 0.9500
1500/78750 [..............................] - ETA: 648s - loss: 1.3897 - acc: 0.9460
2000/78750 [..............................] - ETA: 648s - loss: 1.3970 - acc: 0.9420
2500/78750 [..............................] - ETA: 646s - loss: 1.3965 - acc: 0.9376
3000/78750 [>.............................] - ETA: 640s - loss: 1.3972 - acc: 0.9373
3500/78750 [>.............................] - ETA: 636s - loss: 1.3886 - acc: 0.9377
4000/78750 [>.............................] - ETA: 628s - loss: 1.3886 - acc: 0.9403
4500/78750 [>.............................] - ETA: 625s - loss: 1.3857 - acc: 0.9400
5000/78750 [>.............................] - ETA: 619s - loss: 1.3813 - acc: 0.9416
5500/78750 [=>............................] - ETA: 612s - loss: 1.3773 - acc: 0.9436
6000/78750 [=>............................] - ETA: 608s - loss: 1.3756 - acc: 0.9447
6500/78750 [=>............................] - ETA: 606s - loss: 1.3735 - acc: 0.9454
7000/78750 [=>............................] - ETA: 602s - loss: 1.3733 - acc: 0.9466
7500/78750 [=>............................] - ETA: 597s - loss: 1.3709 - acc: 0.9481
8000/78750 [==>...........................] - ETA: 594s - loss: 1.3688 - acc: 0.9480
8500/78750 [==>...........................] - ETA: 589s - loss: 1.3672 - acc: 0.9485
9000/78750 [==>...........................] - ETA: 584s - loss: 1.3656 - acc: 0.9491
9500/78750 [==>...........................] - ETA: 580s - loss: 1.3642 - acc: 0.9491
10000/78750 [==>...........................] - ETA: 576s - loss: 1.3629 - acc: 0.9497
10500/78750 [===>..........................] - ETA: 571s - loss: 1.3625 - acc: 0.9494
11000/78750 [===>..........................] - ETA: 567s - loss: 1.3615 - acc: 0.9495
11500/78750 [===>..........................] - ETA: 562s - loss: 1.3604 - acc: 0.9496
12000/78750 [===>..........................] - ETA: 558s - loss: 1.3596 - acc: 0.9496
12500/78750 [===>..........................] - ETA: 554s - loss: 1.3599 - acc: 0.9496
13000/78750 [===>..........................] - ETA: 549s - loss: 1.3591 - acc: 0.9494
13500/78750 [====>.........................] - ETA: 545s - loss: 1.3588 - acc: 0.9496
14000/78750 [====>.........................] - ETA: 541s - loss: 1.3588 - acc: 0.9496
14500/78750 [====>.........................] - ETA: 537s - loss: 1.3581 - acc: 0.9497
15000/78750 [====>.........................] - ETA: 533s - loss: 1.3577 - acc: 0.9497
15500/78750 [====>.........................] - ETA: 529s - loss: 1.3571 - acc: 0.9503
16000/78750 [=====>........................] - ETA: 525s - loss: 1.3568 - acc: 0.9502
16500/78750 [=====>........................] - ETA: 520s - loss: 1.3563 - acc: 0.9498
17000/78750 [=====>........................] - ETA: 515s - loss: 1.3557 - acc: 0.9500
17500/78750 [=====>........................] - ETA: 510s - loss: 1.3552 - acc: 0.9501
18000/78750 [=====>........................] - ETA: 506s - loss: 1.3547 - acc: 0.9504
18500/78750 [======>.......................] - ETA: 502s - loss: 1.3544 - acc: 0.9504
19000/78750 [======>.......................] - ETA: 497s - loss: 1.3540 - acc: 0.9502
19500/78750 [======>.......................] - ETA: 492s - loss: 1.3537 - acc: 0.9502
20000/78750 [======>.......................] - ETA: 488s - loss: 1.3533 - acc: 0.9501
20500/78750 [======>.......................] - ETA: 483s - loss: 1.3529 - acc: 0.9497
21000/78750 [=======>......................] - ETA: 479s - loss: 1.3525 - acc: 0.9496
21500/78750 [=======>......................] - ETA: 475s - loss: 1.3522 - acc: 0.9500
22000/78750 [=======>......................] - ETA: 471s - loss: 1.3518 - acc: 0.9498
22500/78750 [=======>......................] - ETA: 466s - loss: 1.3515 - acc: 0.9497
23000/78750 [=======>......................] - ETA: 462s - loss: 1.3512 - acc: 0.9499
23500/78750 [=======>......................] - ETA: 458s - loss: 1.3509 - acc: 0.9496
24000/78750 [========>.....................] - ETA: 454s - loss: 1.3506 - acc: 0.9495
24500/78750 [========>.....................] - ETA: 450s - loss: 1.3503 - acc: 0.9499
25000/78750 [========>.....................] - ETA: 445s - loss: 1.3501 - acc: 0.9501
25500/78750 [========>.....................] - ETA: 441s - loss: 1.3498 - acc: 0.9500
26000/78750 [========>.....................] - ETA: 437s - loss: 1.3496 - acc: 0.9501
26500/78750 [=========>....................] - ETA: 433s - loss: 1.3494 - acc: 0.9503
27000/78750 [=========>....................] - ETA: 428s - loss: 1.3491 - acc: 0.9501
27500/78750 [=========>....................] - ETA: 424s - loss: 1.3489 - acc: 0.9501
28000/78750 [=========>....................] - ETA: 419s - loss: 1.3487 - acc: 0.9501
28500/78750 [=========>....................] - ETA: 415s - loss: 1.3484 - acc: 0.9503
29000/78750 [==========>...................] - ETA: 411s - loss: 1.3482 - acc: 0.9503
29500/78750 [==========>...................] - ETA: 407s - loss: 1.3480 - acc: 0.9501
30000/78750 [==========>...................] - ETA: 403s - loss: 1.3478 - acc: 0.9503
30500/78750 [==========>...................] - ETA: 399s - loss: 1.3476 - acc: 0.9501
31000/78750 [==========>...................] - ETA: 395s - loss: 1.3474 - acc: 0.9502
31500/78750 [===========>..................] - ETA: 391s - loss: 1.3472 - acc: 0.9501
32000/78750 [===========>..................] - ETA: 387s - loss: 1.3470 - acc: 0.9501
32500/78750 [===========>..................] - ETA: 383s - loss: 1.3468 - acc: 0.9502
33000/78750 [===========>..................] - ETA: 379s - loss: 1.3467 - acc: 0.9501
33500/78750 [===========>..................] - ETA: 375s - loss: 1.3465 - acc: 0.9501
34000/78750 [===========>..................] - ETA: 371s - loss: 1.3464 - acc: 0.9503
34500/78750 [============>.................] - ETA: 367s - loss: 1.3462 - acc: 0.9502
35000/78750 [============>.................] - ETA: 363s - loss: 1.3461 - acc: 0.9503
35500/78750 [============>.................] - ETA: 358s - loss: 1.3459 - acc: 0.9503
36000/78750 [============>.................] - ETA: 354s - loss: 1.3458 - acc: 0.9502
36500/78750 [============>.................] - ETA: 350s - loss: 1.3456 - acc: 0.9504
37000/78750 [=============>................] - ETA: 346s - loss: 1.3455 - acc: 0.9504
37500/78750 [=============>................] - ETA: 341s - loss: 1.3454 - acc: 0.9505
38000/78750 [=============>................] - ETA: 337s - loss: 1.3452 - acc: 0.9506
38500/78750 [=============>................] - ETA: 333s - loss: 1.3451 - acc: 0.9506
39000/78750 [=============>................] - ETA: 329s - loss: 1.3450 - acc: 0.9506
39500/78750 [==============>...............] - ETA: 325s - loss: 1.3449 - acc: 0.9506
40000/78750 [==============>...............] - ETA: 321s - loss: 1.3448 - acc: 0.9508
40500/78750 [==============>...............] - ETA: 317s - loss: 1.3447 - acc: 0.9509
41000/78750 [==============>...............] - ETA: 313s - loss: 1.3445 - acc: 0.9507
41500/78750 [==============>...............] - ETA: 309s - loss: 1.3444 - acc: 0.9506
42000/78750 [===============>..............] - ETA: 304s - loss: 1.3443 - acc: 0.9507
42500/78750 [===============>..............] - ETA: 300s - loss: 1.3442 - acc: 0.9508
43000/78750 [===============>..............] - ETA: 296s - loss: 1.3441 - acc: 0.9508
43500/78750 [===============>..............] - ETA: 292s - loss: 1.3440 - acc: 0.9508
44000/78750 [===============>..............] - ETA: 287s - loss: 1.3439 - acc: 0.9508
44500/78750 [===============>..............] - ETA: 283s - loss: 1.3438 - acc: 0.9509
45000/78750 [================>.............] - ETA: 279s - loss: 1.3438 - acc: 0.9509
45500/78750 [================>.............] - ETA: 275s - loss: 1.3437 - acc: 0.9511
46000/78750 [================>.............] - ETA: 271s - loss: 1.3436 - acc: 0.9510
46500/78750 [================>.............] - ETA: 267s - loss: 1.3435 - acc: 0.9512
47000/78750 [================>.............] - ETA: 263s - loss: 1.3434 - acc: 0.9513
47500/78750 [=================>............] - ETA: 259s - loss: 1.3433 - acc: 0.9512
48000/78750 [=================>............] - ETA: 255s - loss: 1.3432 - acc: 0.9513
48500/78750 [=================>............] - ETA: 250s - loss: 1.3431 - acc: 0.9512
49000/78750 [=================>............] - ETA: 246s - loss: 1.3430 - acc: 0.9511
49500/78750 [=================>............] - ETA: 242s - loss: 1.3429 - acc: 0.9511
50000/78750 [==================>...........] - ETA: 238s - loss: 1.3428 - acc: 0.9513
50500/78750 [==================>...........] - ETA: 233s - loss: 1.3428 - acc: 0.9514
51000/78750 [==================>...........] - ETA: 229s - loss: 1.3427 - acc: 0.9514
51500/78750 [==================>...........] - ETA: 225s - loss: 1.3426 - acc: 0.9514
52000/78750 [==================>...........] - ETA: 221s - loss: 1.3427 - acc: 0.9515
52500/78750 [===================>..........] - ETA: 217s - loss: 1.3426 - acc: 0.9515
53000/78750 [===================>..........] - ETA: 213s - loss: 1.3425 - acc: 0.9515
53500/78750 [===================>..........] - ETA: 209s - loss: 1.3425 - acc: 0.9516
54000/78750 [===================>..........] - ETA: 204s - loss: 1.3424 - acc: 0.9515
54500/78750 [===================>..........] - ETA: 200s - loss: 1.3423 - acc: 0.9513
55000/78750 [===================>..........] - ETA: 196s - loss: 1.3423 - acc: 0.9515
55500/78750 [====================>.........] - ETA: 192s - loss: 1.3422 - acc: 0.9514
56000/78750 [====================>.........] - ETA: 188s - loss: 1.3421 - acc: 0.9513
56500/78750 [====================>.........] - ETA: 184s - loss: 1.3420 - acc: 0.9513
57000/78750 [====================>.........] - ETA: 179s - loss: 1.3420 - acc: 0.9513
57500/78750 [====================>.........] - ETA: 175s - loss: 1.3419 - acc: 0.9513
58000/78750 [=====================>........] - ETA: 171s - loss: 1.3419 - acc: 0.9513
58500/78750 [=====================>........] - ETA: 167s - loss: 1.3418 - acc: 0.9512
59000/78750 [=====================>........] - ETA: 163s - loss: 1.3417 - acc: 0.9510
59500/78750 [=====================>........] - ETA: 159s - loss: 1.3417 - acc: 0.9511
60000/78750 [=====================>........] - ETA: 155s - loss: 1.3416 - acc: 0.9511
60500/78750 [======================>.......] - ETA: 150s - loss: 1.3415 - acc: 0.9512
61000/78750 [======================>.......] - ETA: 146s - loss: 1.3414 - acc: 0.9512
61500/78750 [======================>.......] - ETA: 142s - loss: 1.3414 - acc: 0.9512
62000/78750 [======================>.......] - ETA: 138s - loss: 1.3413 - acc: 0.9512
62500/78750 [======================>.......] - ETA: 134s - loss: 1.3412 - acc: 0.9513
63000/78750 [=======================>......] - ETA: 130s - loss: 1.3412 - acc: 0.9514
63500/78750 [=======================>......] - ETA: 126s - loss: 1.3411 - acc: 0.9514
64000/78750 [=======================>......] - ETA: 121s - loss: 1.3411 - acc: 0.9515
64500/78750 [=======================>......] - ETA: 117s - loss: 1.3411 - acc: 0.9516
65000/78750 [=======================>......] - ETA: 113s - loss: 1.3410 - acc: 0.9516
65500/78750 [=======================>......] - ETA: 109s - loss: 1.3412 - acc: 0.9516
66000/78750 [========================>.....] - ETA: 105s - loss: 1.3411 - acc: 0.9517
66500/78750 [========================>.....] - ETA: 101s - loss: 1.3410 - acc: 0.9516
67000/78750 [========================>.....] - ETA: 97s - loss: 1.3410 - acc: 0.9516
67500/78750 [========================>.....] - ETA: 92s - loss: 1.3409 - acc: 0.9516
68000/78750 [========================>.....] - ETA: 88s - loss: 1.3408 - acc: 0.9515
68500/78750 [=========================>....] - ETA: 84s - loss: 1.3408 - acc: 0.9515
69000/78750 [=========================>....] - ETA: 80s - loss: 1.3407 - acc: 0.9515
69500/78750 [=========================>....] - ETA: 76s - loss: 1.3407 - acc: 0.9515
70000/78750 [=========================>....] - ETA: 72s - loss: 1.3406 - acc: 0.9515
70500/78750 [=========================>....] - ETA: 68s - loss: 1.3405 - acc: 0.9516
71000/78750 [==========================>...] - ETA: 64s - loss: 1.3405 - acc: 0.9516
71500/78750 [==========================>...] - ETA: 59s - loss: 1.3404 - acc: 0.9516
72000/78750 [==========================>...] - ETA: 55s - loss: 1.3404 - acc: 0.9517
72500/78750 [==========================>...] - ETA: 51s - loss: 1.3403 - acc: 0.9518
73000/78750 [==========================>...] - ETA: 47s - loss: 1.3403 - acc: 0.9517
73500/78750 [===========================>..] - ETA: 43s - loss: 1.3402 - acc: 0.9518
74000/78750 [===========================>..] - ETA: 39s - loss: 1.3401 - acc: 0.9517
74500/78750 [===========================>..] - ETA: 35s - loss: 1.3401 - acc: 0.9518
75000/78750 [===========================>..] - ETA: 31s - loss: 1.3400 - acc: 0.9518
75500/78750 [===========================>..] - ETA: 26s - loss: 1.3401 - acc: 0.9519
76000/78750 [===========================>..] - ETA: 22s - loss: 1.3400 - acc: 0.9519
76500/78750 [============================>.] - ETA: 18s - loss: 1.3400 - acc: 0.9519
77000/78750 [============================>.] - ETA: 14s - loss: 1.3399 - acc: 0.9519
77500/78750 [============================>.] - ETA: 10s - loss: 1.3399 - acc: 0.9519
78000/78750 [============================>.] - ETA: 6s - loss: 1.3398 - acc: 0.9518
78500/78750 [============================>.] - ETA: 2s - loss: 1.3398 - acc: 0.9518
78750/78750 [==============================] - 855s - loss: 1.3397 - acc: 0.9518 - val_loss: 1.3321 - val_acc: 0.9523
and here is the model.summary()...
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 72, 72, 25) 2525
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 25) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 30, 30, 25) 30650
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 15, 15, 25) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 6, 6, 25) 15650
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 25) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 1, 1, 25) 15650
_________________________________________________________________
flatten_1 (Flatten) (None, 25) 0
_________________________________________________________________
dense_1 (Dense) (None, 2) 52
_________________________________________________________________
dense_2 (Dense) (None, 2) 6
=================================================================
Total params: 64,533
Trainable params: 64,533
Non-trainable params: 0

Your data set is highly unbalanced, so the model treats the second category as noise and classifies everything as category 1. The simplest way to balance the data set is to oversample examples of the second class, so that the model sees category 2 more often.
This will probably solve the issue with the class output, but such a model will still generalize poorly. To improve generalization you can try data augmentation: random transformations applied to the images.
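A minimal sketch of the oversampling idea, assuming the training images and one-hot labels live in NumPy arrays x_train / y_train (these names and the augmentation parameters are illustrative, not from the question):

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# indices of each category (column 1 of the one-hot labels marks Cat. 2)
idx_major = np.where(y_train[:, 1] == 0)[0]
idx_minor = np.where(y_train[:, 1] == 1)[0]

# oversample the minority class with replacement until both classes are the same size
idx_minor_over = np.random.choice(idx_minor, size=len(idx_major), replace=True)
idx_balanced = np.concatenate([idx_major, idx_minor_over])
np.random.shuffle(idx_balanced)
x_bal, y_bal = x_train[idx_balanced], y_train[idx_balanced]

# light augmentation so the duplicated minority examples are not pixel-identical copies
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)
model.fit_generator(datagen.flow(x_bal, y_bal, batch_size=32),
                    steps_per_epoch=len(x_bal) // 32,
                    epochs=2)

Alternatively, passing class_weight={0: 1.0, 1: 19.0} to model.fit() penalises mistakes on the rare class roughly in proportion to the imbalance, without duplicating any data.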

Try decreasing the learning rate to 1e-5 for 1 or 2 epochs and see if the accuracy goes up. If it doesn't, please post model.summary().
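For reference, with the compile call from the question that would look something like the following (x_train / y_train and the batch size are placeholders):

import keras

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(lr=1e-5, momentum=0.9),  # lowered learning rate
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=500, epochs=2, validation_split=0.1)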

Related

Model loss remains unchanged

I would like to understand what could be responsible for this model loss behaviour. When training a CNN with 6 hidden layers, the loss shoots up from around 1.8 to above 12 after the first epoch and remains constant for the remaining 99 epochs.
724504/724504 [==============================] - 358s 494us/step - loss: 1.8143 - acc: 0.7557 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 2/100
724504/724504 [==============================] - 355s 490us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 3/100
724504/724504 [==============================] - 354s 489us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 4/100
724504/724504 [==============================] - 348s 481us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
Epoch 5/100
724504/724504 [==============================] - 355s 490us/step - loss: 12.0886 - acc: 0.2500 - val_loss: 16.1181 - val_acc: 0.0000e+00
I don't believe this has to do with the dataset I am working with, because I tried a different, publicly available dataset and the performance was exactly the same (in fact, the exact same figures for loss/accuracy).
I also tested this with a shallower network having 2 hidden layers; see the performance below:
724504/724504 [==============================] - 41s 56us/step - loss: 0.4974 - acc: 0.8236 - val_loss: 15.5007 - val_acc: 0.0330
Epoch 2/100
724504/724504 [==============================] - 40s 56us/step - loss: 0.5204 - acc: 0.8408 - val_loss: 15.5543 - val_acc: 0.0330
Epoch 3/100
724504/724504 [==============================] - 41s 56us/step - loss: 0.6646 - acc: 0.8439 - val_loss: 15.3904 - val_acc: 0.0330
Epoch 4/100
724504/724504 [==============================] - 41s 57us/step - loss: 8.8982 - acc: 0.4342 - val_loss: 15.5867 - val_acc: 0.0330
Epoch 5/100
724504/724504 [==============================] - 41s 57us/step - loss: 0.5627 - acc: 0.8444 - val_loss: 15.5449 - val_acc: 0.0330
Can someone point out the probable cause of this behaviour? What parameter/configuration needs to be adjusted?
EDIT
Model creation
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam

model = Sequential()
activ = 'relu'
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ, input_shape=(1, n_points, 4)))
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
#model.add(Dropout(.5))
model.add(Conv2D(64, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(Conv2D(64, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
#model.add(Dropout(.5))
model.add(Conv2D(128, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(Conv2D(128, (1, 3), strides=(1, 1), padding='same', activation=activ))
model.add(MaxPooling2D(pool_size=(1, 2)))
model.add(Dropout(.5))
model.add(Flatten())
A = model.output_shape
model.add(Dense(int(A[1] * 1/4.), activation=activ))
model.add(Dropout(.5))
model.add(Dense(NoClass, activation='softmax'))

optimizer = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_reample, Y_resample, epochs=100, batch_size=64, shuffle=False,
          validation_data=(Test_X, Test_Y))
After changing the learning rate to lr=0.0001, here are the results after 100 epochs.
72090/72090 [==============================] - 29s 397us/step - loss: 0.5040 - acc: 0.8347 - val_loss: 4.3529 - val_acc: 0.2072
Epoch 99/100
72090/72090 [==============================] - 28s 395us/step - loss: 0.4958 - acc: 0.8382 - val_loss: 6.3422 - val_acc: 0.1806
Epoch 100/100
72090/72090 [==============================] - 28s 393us/step - loss: 0.5084 - acc: 0.8342 - val_loss: 4.3781 - val_acc: 0.1925
the optimal epoch size: 97, the value of high accuracy 0.20716827656581954
EDIT 2
Apparently, SMOTE isn't good for oversampling all but the majority class in a multi-class setting; see the train/test plot below:
Can you please try using BatchNormalization as well, placed just after your pooling layers? It is good to include it.
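A sketch of that suggestion applied to the first block of the model above (only the BatchNormalization line is new; the rest mirrors the question's code):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization

model = Sequential()
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation='relu',
                 input_shape=(1, n_points, 4)))  # n_points as defined in the question
model.add(Conv2D(32, (1, 3), strides=(1, 1), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(1, 2)))
model.add(BatchNormalization())  # normalise activations right after the pooling layer
# ...repeat the same pattern for the 64- and 128-filter blocks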

Neural network prediction in Keras returning the same value for different inputs

I am trying to predict an array of values with a Keras neural network, as follows:
import numpy as np
from keras import models, layers
from keras.wrappers.scikit_learn import KerasClassifier

def create_network():
    np.random.seed(0)
    number_of_features = 22
    # start neural network
    network = models.Sequential()
    # Adding three layers
    # Add fully connected layer with ReLU
    network.add(layers.Dense(units=35, activation="relu", input_shape=(number_of_features,)))
    # Add fully connected layer with ReLU
    network.add(layers.Dense(units=35, activation='relu'))
    # Add fully connected layer with no activation function
    network.add(layers.Dense(units=1))
    # Compile neural network
    network.compile(loss='mse',
                    optimizer='RMSprop',
                    metrics=['mse'])
    # Return compiled network
    return network

neural_network = KerasClassifier(build_fn=create_network, epochs=10, batch_size=100, verbose=0)
neural_network.fit(x_train, y_train)
neural_network.predict(x_test)
When using the code to make a prediction I get the following output data:
array([[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031],
[-0.23991031]])
The values in this array should not all be the same, since the input data points are not the same. Why is this happening?
The data is split into train and test like this:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(df3, df2['RETURN NEXT 12 MONTHS'], test_size=0.2)  # 0.2 means 20% of the values are used for testing
Here are some samples of the data:
x_train
RETURN ON INVESTED CAPITAL ... VOLATILITY
226 0.0436 ... 0.3676
309 0.1073 ... 0.3552
306 0.1073 ... 0.3660
238 0.1257 ... 0.4352
254 0.1960 ... 0.4230
308 0.1073 ... 0.3661
327 0.2108 ... 0.2674
325 0.2108 ... 0.2836
...
The dataframe above has 22 columns, and about 100 rows.
The corresponding y training data is:
y_train
226 0.137662
309 1.100000
306 0.725738
238 0.244292
254 -0.557806
308 1.052402
327 -0.035730
...
I have tried different numbers of epochs, batch sizes, and model architectures, but they all give the same output for every input.
Per epoch loss:
Epoch 1/10
10/100 [==>...........................] - ETA: 1s - loss: 1525.8176 - mse: 1525.8176
100/100 [==============================] - 0s 2ms/step - loss: 13771.8389 - mse: 13771.8389
Epoch 2/10
10/100 [==>...........................] - ETA: 0s - loss: 4315.0015 - mse: 4315.0015
30/100 [========>.....................] - ETA: 0s - loss: 23554.2446 - mse: 23554.2441
40/100 [===========>..................] - ETA: 0s - loss: 18089.7297 - mse: 18089.7305
50/100 [==============>...............] - ETA: 0s - loss: 15002.7878 - mse: 15002.7871
100/100 [==============================] - 0s 2ms/step - loss: 10520.1019 - mse: 10520.1025
Epoch 3/10
10/100 [==>...........................] - ETA: 0s - loss: 2722.1135 - mse: 2722.1135
100/100 [==============================] - 0s 167us/step - loss: 8500.4698 - mse: 8500.4697
Epoch 4/10
10/100 [==>...........................] - ETA: 0s - loss: 3192.2231 - mse: 3192.2231
50/100 [==============>...............] - ETA: 0s - loss: 4860.0622 - mse: 4860.0620
90/100 [==========================>...] - ETA: 0s - loss: 7377.6898 - mse: 7377.6904
100/100 [==============================] - 0s 1ms/step - loss: 6911.2499 - mse: 6911.2500
Epoch 5/10
10/100 [==>...........................] - ETA: 0s - loss: 1996.5687 - mse: 1996.5687
60/100 [=================>............] - ETA: 0s - loss: 6902.6661 - mse: 6902.6660
70/100 [====================>.........] - ETA: 0s - loss: 6162.6467 - mse: 6162.6470
90/100 [==========================>...] - ETA: 0s - loss: 6195.4129 - mse: 6195.4131
100/100 [==============================] - 0s 4ms/step - loss: 5773.2919 - mse: 5773.2925
Epoch 6/10
10/100 [==>...........................] - ETA: 0s - loss: 3063.1946 - mse: 3063.1946
80/100 [=======================>......] - ETA: 0s - loss: 5351.2784 - mse: 5351.2793
90/100 [==========================>...] - ETA: 0s - loss: 5100.9203 - mse: 5100.9214
100/100 [==============================] - 0s 2ms/step - loss: 4755.7785 - mse: 4755.7793
Epoch 7/10
10/100 [==>...........................] - ETA: 0s - loss: 3710.8032 - mse: 3710.8032
70/100 [====================>.........] - ETA: 0s - loss: 4607.9606 - mse: 4607.9609
100/100 [==============================] - 0s 943us/step - loss: 3847.0730 - mse: 3847.0732
Epoch 8/10
10/100 [==>...........................] - ETA: 0s - loss: 1742.0632 - mse: 1742.0632
30/100 [========>.....................] - ETA: 0s - loss: 2304.5816 - mse: 2304.5818
100/100 [==============================] - 0s 1ms/step - loss: 3109.5293 - mse: 3109.5293
Epoch 9/10
10/100 [==>...........................] - ETA: 0s - loss: 2027.7537 - mse: 2027.7537
100/100 [==============================] - 0s 574us/step - loss: 2537.4794 - mse: 2537.4795
Epoch 10/10
10/100 [==>...........................] - ETA: 0s - loss: 2966.5125 - mse: 2966.5125
100/100 [==============================] - 0s 177us/step - loss: 2191.3686 - mse: 2191.3687

Validation accuracy stagnates while training accuracy improves

I'm pretty new to deep learning so I'm sorry if I'm missing something obvious.
I am currently training a CNN with a dataset I put together.
When training, the training accuracy behaves pretty normally and improves, reaching >99% accuracy. My validation accuracy starts off at about 75% and fluctuates around 81% ± 1%. After training, the model performs really well on completely new data.
Epoch 1/100
187/187 [==============================] - 103s 550ms/step - loss: 1.1336 - acc: 0.5384 - val_loss: 0.8065 - val_acc: 0.7405
Epoch 2/100
187/187 [==============================] - 97s 519ms/step - loss: 0.8041 - acc: 0.7345 - val_loss: 0.7566 - val_acc: 0.7720
Epoch 3/100
187/187 [==============================] - 97s 519ms/step - loss: 0.7194 - acc: 0.7945 - val_loss: 0.7410 - val_acc: 0.7846
Epoch 4/100
187/187 [==============================] - 97s 517ms/step - loss: 0.6688 - acc: 0.8324 - val_loss: 0.7295 - val_acc: 0.7924
Epoch 5/100
187/187 [==============================] - 97s 518ms/step - loss: 0.6288 - acc: 0.8611 - val_loss: 0.7197 - val_acc: 0.7961
Epoch 6/100
187/187 [==============================] - 96s 515ms/step - loss: 0.5989 - acc: 0.8862 - val_loss: 0.7252 - val_acc: 0.7961
Epoch 7/100
187/187 [==============================] - 96s 514ms/step - loss: 0.5762 - acc: 0.8981 - val_loss: 0.7135 - val_acc: 0.8063
Epoch 8/100
187/187 [==============================] - 97s 518ms/step - loss: 0.5513 - acc: 0.9186 - val_loss: 0.7089 - val_acc: 0.8077
Epoch 9/100
187/187 [==============================] - 96s 513ms/step - loss: 0.5351 - acc: 0.9280 - val_loss: 0.7113 - val_acc: 0.8053
Epoch 10/100
187/187 [==============================] - 96s 514ms/step - loss: 0.5189 - acc: 0.9417 - val_loss: 0.7167 - val_acc: 0.8094
Epoch 11/100
187/187 [==============================] - 96s 515ms/step - loss: 0.5026 - acc: 0.9483 - val_loss: 0.7104 - val_acc: 0.8162
Epoch 12/100
187/187 [==============================] - 96s 516ms/step - loss: 0.4914 - acc: 0.9538 - val_loss: 0.7114 - val_acc: 0.8101
Epoch 13/100
187/187 [==============================] - 96s 515ms/step - loss: 0.4809 - acc: 0.9583 - val_loss: 0.7099 - val_acc: 0.8141
Epoch 14/100
187/187 [==============================] - 96s 512ms/step - loss: 0.4681 - acc: 0.9656 - val_loss: 0.7149 - val_acc: 0.8182
Epoch 15/100
187/187 [==============================] - 96s 515ms/step - loss: 0.4605 - acc: 0.9701 - val_loss: 0.7139 - val_acc: 0.8172
Epoch 16/100
187/187 [==============================] - 96s 514ms/step - loss: 0.4479 - acc: 0.9753 - val_loss: 0.7102 - val_acc: 0.8182
Epoch 17/100
187/187 [==============================] - 96s 513ms/step - loss: 0.4418 - acc: 0.9805 - val_loss: 0.7087 - val_acc: 0.8247
Epoch 18/100
187/187 [==============================] - 96s 512ms/step - loss: 0.4363 - acc: 0.9809 - val_loss: 0.7148 - val_acc: 0.8213
Epoch 19/100
187/187 [==============================] - 96s 516ms/step - loss: 0.4225 - acc: 0.9870 - val_loss: 0.7184 - val_acc: 0.8203
Epoch 20/100
187/187 [==============================] - 96s 513ms/step - loss: 0.4241 - acc: 0.9863 - val_loss: 0.7216 - val_acc: 0.8189
Epoch 21/100
187/187 [==============================] - 96s 513ms/step - loss: 0.4132 - acc: 0.9908 - val_loss: 0.7143 - val_acc: 0.8199
Epoch 22/100
187/187 [==============================] - 96s 515ms/step - loss: 0.4050 - acc: 0.9936 - val_loss: 0.7109 - val_acc: 0.8233
Epoch 23/100
187/187 [==============================] - 96s 515ms/step - loss: 0.4040 - acc: 0.9928 - val_loss: 0.7118 - val_acc: 0.8203
Epoch 24/100
187/187 [==============================] - 96s 511ms/step - loss: 0.3989 - acc: 0.9930 - val_loss: 0.7194 - val_acc: 0.8165
Epoch 25/100
187/187 [==============================] - 97s 517ms/step - loss: 0.3933 - acc: 0.9946 - val_loss: 0.7163 - val_acc: 0.8155
Epoch 26/100
187/187 [==============================] - 97s 516ms/step - loss: 0.3884 - acc: 0.9957 - val_loss: 0.7225 - val_acc: 0.8148
Epoch 27/100
187/187 [==============================] - 95s 510ms/step - loss: 0.3876 - acc: 0.9959 - val_loss: 0.7224 - val_acc: 0.8179
The plot itself looks like overfitting, but I've taken plenty of measures against overfitting and none seem to work. Here is my model:
# transfer learning with ResNet50
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.layers import Flatten, Dense, BatchNormalization, Activation, Dropout
from keras.regularizers import l2

base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# function to finetune model
def build_finetune_model(base_model, dropout, fc_layers, num_classes):
    # make base model untrainable
    for layer in base_model.layers:
        layer.trainable = False
    x = base_model.output
    x = Flatten()(x)
    # add dense layers
    for fc in fc_layers:
        # use regularizer
        x = Dense(fc, use_bias=False, kernel_regularizer=l2(0.003))(x)
        # add batch normalization
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        # add dropout
        x = Dropout(dropout)(x)
    # New softmax layer
    x = Dense(num_classes, use_bias=False)(x)
    x = BatchNormalization()(x)
    predictions = Activation('softmax')(x)
    finetune_model = Model(inputs=base_model.input, outputs=predictions)
    return finetune_model

FC_LAYERS = [1024, 1024]
dropout = 0.5
model = build_finetune_model(base_model, dropout=dropout, fc_layers=FC_LAYERS, num_classes=len(categories))  # categories is defined elsewhere in the question
I'm adjusting for class weights and have set a really low learning rate in hopes of slowing the learning down.
from keras.optimizers import Adam

model.compile(optimizer=Adam(lr=0.000005),
              loss='categorical_crossentropy',
              metrics=['accuracy'],
              weighted_metrics=class_weight)
I'm really confused by the fact that the validation accuracy starts so high (significantly higher than the training accuracy) and barely improves during the entire training process. As mentioned before, it looks like overfitting, but I added dropout, batch normalization and regularizers, and none of it seems to work. Augmenting the data with horizontal flips, random cropping, random brightness and rotation does not change the accuracy significantly either. Turning shuffle off for my training data inside ImageDataGenerator().flow_from_directory() makes the model train at around 25% training accuracy and <50% validation accuracy (Edit: the accuracy seems to be so low because the learning rate was too low in that case).
Again, the model works surprisingly well on new testing data. I'm looking to increase the validation accuracy and want to understand why the neural network is behaving this way.
Your model is overfitting. You may want to use data augmentation when training on images, e.g. use ImageDataGenerator (https://keras.io/preprocessing/image/) to randomly shift, rotate and crop the images.
SGD tries to find the simplest possible way to minimise the loss function on the dataset; given a large enough set of data points it is forced to come up with a generic solution, but whenever possible DNNs tend to "memorise" the inputs, since that is the simplest way to reduce the loss. Dropout and regularisation do help, but at the end of the day what matters are the validation metrics, assuming of course that your validation set is correctly balanced.
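A minimal sketch of the augmentation suggestion with ImageDataGenerator (directory paths, image size and parameter values are placeholders, not taken from the question):

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=15,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   zoom_range=0.1,
                                   horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)  # no augmentation for validation data

train_generator = train_datagen.flow_from_directory('data/train', target_size=(224, 224),
                                                    batch_size=32, class_mode='categorical')
val_generator = val_datagen.flow_from_directory('data/val', target_size=(224, 224),
                                                batch_size=32, class_mode='categorical')

model.fit_generator(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=100,
                    validation_data=val_generator,
                    validation_steps=len(val_generator))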

Keras Image Classification - Prediction accuracy on validation dataset does not match val_acc

I am trying to classify a set of images within two categories: left and right.
I built a CNN using Keras, my classifier seems to work well:
I have 1,939 images used for training (50% left, 50% right)
I have 648 images used for validation (50% left, 50% right)
All images are 115x45, in greyscale
acc is increasing up to 99.53%
val_acc is increasing up to 98.38%
Both loss and val_loss are converging close to 0
Keras verbose looks normal to me:
60/60 [==============================] - 6s 98ms/step - loss: 0.6295 - acc: 0.6393 - val_loss: 0.4877 - val_acc: 0.7641
Epoch 2/32
60/60 [==============================] - 5s 78ms/step - loss: 0.4825 - acc: 0.7734 - val_loss: 0.3403 - val_acc: 0.8799
Epoch 3/32
60/60 [==============================] - 5s 77ms/step - loss: 0.3258 - acc: 0.8663 - val_loss: 0.2314 - val_acc: 0.9042
Epoch 4/32
60/60 [==============================] - 5s 83ms/step - loss: 0.2498 - acc: 0.8942 - val_loss: 0.2329 - val_acc: 0.9042
Epoch 5/32
60/60 [==============================] - 5s 76ms/step - loss: 0.2408 - acc: 0.9002 - val_loss: 0.1426 - val_acc: 0.9432
Epoch 6/32
60/60 [==============================] - 5s 80ms/step - loss: 0.1968 - acc: 0.9260 - val_loss: 0.1484 - val_acc: 0.9367
Epoch 7/32
60/60 [==============================] - 5s 77ms/step - loss: 0.1621 - acc: 0.9319 - val_loss: 0.1141 - val_acc: 0.9578
Epoch 8/32
60/60 [==============================] - 5s 81ms/step - loss: 0.1600 - acc: 0.9361 - val_loss: 0.1229 - val_acc: 0.9513
Epoch 9/32
60/60 [==============================] - 4s 70ms/step - loss: 0.1358 - acc: 0.9462 - val_loss: 0.0884 - val_acc: 0.9692
Epoch 10/32
60/60 [==============================] - 4s 74ms/step - loss: 0.1193 - acc: 0.9542 - val_loss: 0.1232 - val_acc: 0.9529
Epoch 11/32
60/60 [==============================] - 5s 79ms/step - loss: 0.1075 - acc: 0.9595 - val_loss: 0.0865 - val_acc: 0.9724
Epoch 12/32
60/60 [==============================] - 4s 73ms/step - loss: 0.1209 - acc: 0.9531 - val_loss: 0.1067 - val_acc: 0.9497
Epoch 13/32
60/60 [==============================] - 4s 73ms/step - loss: 0.1135 - acc: 0.9609 - val_loss: 0.0860 - val_acc: 0.9838
Epoch 14/32
60/60 [==============================] - 4s 70ms/step - loss: 0.0869 - acc: 0.9682 - val_loss: 0.0907 - val_acc: 0.9675
Epoch 15/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0960 - acc: 0.9637 - val_loss: 0.0996 - val_acc: 0.9643
Epoch 16/32
60/60 [==============================] - 4s 73ms/step - loss: 0.0951 - acc: 0.9625 - val_loss: 0.1223 - val_acc: 0.9481
Epoch 17/32
60/60 [==============================] - 4s 70ms/step - loss: 0.0685 - acc: 0.9729 - val_loss: 0.1220 - val_acc: 0.9513
Epoch 18/32
60/60 [==============================] - 4s 73ms/step - loss: 0.0791 - acc: 0.9715 - val_loss: 0.0959 - val_acc: 0.9692
Epoch 19/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0595 - acc: 0.9802 - val_loss: 0.0648 - val_acc: 0.9773
Epoch 20/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0486 - acc: 0.9844 - val_loss: 0.0691 - val_acc: 0.9838
Epoch 21/32
60/60 [==============================] - 4s 70ms/step - loss: 0.0499 - acc: 0.9812 - val_loss: 0.1166 - val_acc: 0.9627
Epoch 22/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0481 - acc: 0.9844 - val_loss: 0.0875 - val_acc: 0.9734
Epoch 23/32
60/60 [==============================] - 4s 70ms/step - loss: 0.0533 - acc: 0.9814 - val_loss: 0.1094 - val_acc: 0.9724
Epoch 24/32
60/60 [==============================] - 4s 70ms/step - loss: 0.0487 - acc: 0.9812 - val_loss: 0.0722 - val_acc: 0.9740
Epoch 25/32
60/60 [==============================] - 4s 72ms/step - loss: 0.0441 - acc: 0.9828 - val_loss: 0.0992 - val_acc: 0.9773
Epoch 26/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0667 - acc: 0.9726 - val_loss: 0.0964 - val_acc: 0.9643
Epoch 27/32
60/60 [==============================] - 4s 73ms/step - loss: 0.0436 - acc: 0.9835 - val_loss: 0.0771 - val_acc: 0.9708
Epoch 28/32
60/60 [==============================] - 4s 71ms/step - loss: 0.0322 - acc: 0.9896 - val_loss: 0.0872 - val_acc: 0.9756
Epoch 29/32
60/60 [==============================] - 5s 80ms/step - loss: 0.0294 - acc: 0.9943 - val_loss: 0.1414 - val_acc: 0.9578
Epoch 30/32
60/60 [==============================] - 5s 76ms/step - loss: 0.0348 - acc: 0.9870 - val_loss: 0.1102 - val_acc: 0.9659
Epoch 31/32
60/60 [==============================] - 5s 76ms/step - loss: 0.0306 - acc: 0.9922 - val_loss: 0.0794 - val_acc: 0.9659
Epoch 32/32
60/60 [==============================] - 5s 76ms/step - loss: 0.0152 - acc: 0.9953 - val_loss: 0.1051 - val_acc: 0.9724
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 113, 43, 32) 896
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 56, 21, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 54, 19, 32) 9248
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 27, 9, 32) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 7776) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 995456
_________________________________________________________________
dense_2 (Dense) (None, 1) 129
=================================================================
Total params: 1,005,729
Trainable params: 1,005,729
Non-trainable params: 0
So everything looks great, but when I tried to predict the category of 2,000 samples I got very strange results, with an accuracy < 70%.
At first I thought this sample might be biased, so I tried, instead, to predict the images in the validation dataset.
I should have a 98.38% accuracy, and a perfect 50-50 split, but instead, once again I got:
170 images predicted right, instead of 324, with an accuracy of 98.8%
478 images predicted left, instead of 324, with an accuracy of 67.3%
Average accuracy: 75.69% and not 98.38%
I guess something is wrong either in my CNN or my prediction script.
CNN classifier code:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
# Init CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (115, 45, 3), activation = 'relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
import numpy
train_datagen = ImageDataGenerator(rescale = 1./255, shear_range = 0.2, zoom_range = 0.2, horizontal_flip = False)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('./dataset/training_set',
target_size = (115, 45),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('./dataset/test_set',
target_size = (115, 45),
batch_size = 32,
class_mode = 'binary')
classifier.fit_generator(training_set,
steps_per_epoch = 1939/32, # total samples / batch size
epochs = 32,
validation_data = test_set,
validation_steps = 648/32)
# Save the classifier
classifier.evaluate_generator(generator=test_set)
classifier.summary()
classifier.save('./classifier.h5')
Prediction code:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.models import load_model
from keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
from keras.preprocessing import image
from shutil import copyfile
classifier = load_model('./classifier.h5')
folder = './small/'
files = os.listdir(folder)
pleft = 0
pright = 0
for f in files:
test_image = image.load_img(folder+f, target_size = (115, 45))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
#print training_set.class_indices
if result[0][0] == 1:
pright=pright+1
prediction = 'right'
copyfile(folder+'../'+f, '/found_right/'+f)
else:
prediction = 'left'
copyfile(folder+'../'+f, '/found_left/'+f)
pleft=pleft+1
ptot = pleft + pright
print 'Left = '+str(pleft)+' ('+str(pleft / (ptot / 100))+'%)'
print 'Right = '+str(pright)
print 'Total = '+str(ptot)
Output:
Left = 478 (79%)
Right = 170
Total = 648
Your help will be much appreciated.
I resolved this issue by doing two things:
As @Matias Valdenegro suggested, I had to rescale the image values before predicting, so I added test_image /= 255. before calling predict().
As my val_loss was still a bit high, I added an EarlyStopping callback as well as two Dropout() layers before my Dense layers.
My prediction results are now consistent with the ones obtained during training/validation.
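A sketch of those two changes, reusing the names from the code above (the Dropout rate and EarlyStopping settings are illustrative choices, not the exact values used):

import numpy as np
from keras.callbacks import EarlyStopping
from keras.layers import Dropout, Flatten, Dense
from keras.preprocessing import image

# 1) rescale test images the same way as the training generator (rescale = 1./255)
test_image = image.load_img(folder + f, target_size=(115, 45))
test_image = image.img_to_array(test_image)
test_image /= 255.
test_image = np.expand_dims(test_image, axis=0)
result = classifier.predict(test_image)

# 2) in the model definition, add Dropout before the Dense layers
#    (replacing the original Step 3/4), then stop when val_loss stops improving
classifier.add(Flatten())
classifier.add(Dropout(0.5))
classifier.add(Dense(units=128, activation='relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(units=1, activation='sigmoid'))

early_stop = EarlyStopping(monitor='val_loss', patience=3)
classifier.fit_generator(training_set,
                         steps_per_epoch=1939 // 32,
                         epochs=32,
                         validation_data=test_set,
                         validation_steps=648 // 32,
                         callbacks=[early_stop])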

Why does the training time of each epoch differ so heavily?

I am training a CNN model in Keras. I find that the time per epoch is nearly the same for the first 10 epochs, about 140s per epoch. But in the subsequent epochs, the training time increases to about 500s per epoch.
So, what's the problem?
184s - loss: 0.2587 - fscore_cloud: 0.8348 - val_loss: 0.1987 - val_fscore_cloud: 0.8781
Epoch 2/2000
163s - loss: 0.1899 - fscore_cloud: 0.8868 - val_loss: 0.1927 - val_fscore_cloud: 0.8877
Epoch 3/2000
144s - loss: 0.1821 - fscore_cloud: 0.8915 - val_loss: 0.1885 - val_fscore_cloud: 0.8910
Epoch 4/2000
143s - loss: 0.1794 - fscore_cloud: 0.8931 - val_loss: 0.1856 - val_fscore_cloud: 0.8930
Epoch 5/2000
142s - loss: 0.1784 - fscore_cloud: 0.8937 - val_loss: 0.1846 - val_fscore_cloud: 0.8935
Epoch 6/2000
142s - loss: 0.1774 - fscore_cloud: 0.8939 - val_loss: 0.1835 - val_fscore_cloud: 0.8940
Epoch 7/2000
144s - loss: 0.1766 - fscore_cloud: 0.8942 - val_loss: 0.1827 - val_fscore_cloud: 0.8944
Epoch 8/2000
141s - loss: 0.1759 - fscore_cloud: 0.8944 - val_loss: 0.1820 - val_fscore_cloud: 0.8947
Epoch 9/2000
139s - loss: 0.1754 - fscore_cloud: 0.8946 - val_loss: 0.1813 - val_fscore_cloud: 0.8950
Epoch 10/2000
184s - loss: 0.1749 - fscore_cloud: 0.8947 - val_loss: 0.1806 - val_fscore_cloud: 0.8952
Epoch 11/2000
544s - loss: 0.1743 - fscore_cloud: 0.8948 - val_loss: 0.1800 - val_fscore_cloud: 0.8954
Epoch 12/2000
545s - loss: 0.1738 - fscore_cloud: 0.8950 - val_loss: 0.1796 - val_fscore_cloud: 0.8955
Epoch 13/2000
553s - loss: 0.1731 - fscore_cloud: 0.8952 - val_loss: 0.1791 - val_fscore_cloud: 0.8957
Epoch 14/2000
214s - loss: 0.1723 - fscore_cloud: 0.8955 - val_loss: 0.1776 - val_fscore_cloud: 0.8961
Epoch 15/2000
145s - loss: 0.1706 - fscore_cloud: 0.8965 - val_loss: 0.1768 - val_fscore_cloud: 0.8964
Epoch 16/2000
146s - loss: 0.1683 - fscore_cloud: 0.8975 - val_loss: 0.1743 - val_fscore_cloud: 0.8980
Epoch 17/2000
140s - loss: 0.1658 - fscore_cloud: 0.8983 - val_loss: 0.1734 - val_fscore_cloud: 0.8986
Epoch 18/2000
142s - loss: 0.1640 - fscore_cloud: 0.8987 - val_loss: 0.1719 - val_fscore_cloud: 0.8990
Epoch 19/2000
137s - loss: 0.1621 - fscore_cloud: 0.8996 - val_loss: 0.1699 - val_fscore_cloud: 0.9001
Epoch 20/2000
277s - loss: 0.1601 - fscore_cloud: 0.9007 - val_loss: 0.1678 - val_fscore_cloud: 0.9015
Epoch 21/2000
310s - loss: 0.1579 - fscore_cloud: 0.9018 - val_loss: 0.1655 - val_fscore_cloud: 0.9028
Epoch 22/2000
345s - loss: 0.1558 - fscore_cloud: 0.9031 - val_loss: 0.1635 - val_fscore_cloud: 0.9042
Epoch 23/2000
587s - loss: 0.1538 - fscore_cloud: 0.9044 - val_loss: 0.1621 - val_fscore_cloud: 0.9054
Epoch 24/2000
525s - loss: 0.1519 - fscore_cloud: 0.9056 - val_loss: 0.1610 - val_fscore_cloud: 0.9061
Epoch 25/2000
579s - loss: 0.1500 - fscore_cloud: 0.9068 - val_loss: 0.1597 - val_fscore_cloud: 0.9069
Epoch 26/2000
557s - loss: 0.1485 - fscore_cloud: 0.9075 - val_loss: 0.1575 - val_fscore_cloud: 0.9078
Epoch 27/2000
530s - loss: 0.1469 - fscore_cloud: 0.9084 - val_loss: 0.1561 - val_fscore_cloud: 0.9083
Epoch 28/2000
I also ran into this problem. Even when we just run torch.cuda.FloatTensor(a, b).normal_() every epoch, the later epochs take longer than the earlier ones. I guess this phenomenon is caused by memory usage.
