torch7 size mismatch when feeding an image - lua

I'm trying to run an image through a neural network in torch7. However, when I run the code I get the error /home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52
Here is the code (or at least the minimal example where the problem occurs):
require 'torch'
require 'nn'
require 'image'
require 'optim'
require 'cutorch'
require 'cunn'
require 'loadcaffe'
local cmd = torch.CmdLine()
local function main(params)
  cutorch.setDevice(1)
  local loadcaffe_backend = 'nn'
  local cnn = loadcaffe.load('models/VGG_ILSVRC_19_layers-deploy.prototxt', 'models/VGG_ILSVRC_19_layers.caffemodel', loadcaffe_backend):float()
  cnn:cuda()
  targetImage_caffe = image.load('tank.jpg', 3)
  targetImage_caffe = targetImage_caffe:cuda()
  netimage=cnn:forward(targetImage_caffe)
end
local params = cmd:parse(arg)
main(params)
And the full error log
/home/thijser/torch/install/bin/luajit: /home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67:
In 39 module of nn.Sequential:
/home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: size mismatch at /tmp/luarocks_cutorch-scm-1-6477/cutorch/lib/THC/generic/THCTensorMathBlas.cu:52
stack traceback:
[C]: in function 'addmv'
/home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:57: in function </home/thijser/torch/install/share/lua/5.1/nn/Linear.lua:53>
[C]: in function 'xpcall'
/home/thijser/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
temp.lua:24: in function 'main'
temp.lua:37: in main chunk
[C]: in function 'dofile'
...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x5599d0cfa470
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/home/thijser/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/home/thijser/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
temp.lua:24: in function 'main'
temp.lua:37: in main chunk
[C]: in function 'dofile'
...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x5599d0cfa470
The models can be downloaded by
cd models
wget -c https://gist.githubusercontent.com/ksimonyan/3785162f95cd2d5fee77/raw/bb2b4fe0a9bb0669211cf3d0bc949dfdda173e9e/VGG_ILSVRC_19_layers_deploy.prototxt
wget -c --no-check-certificate https://bethgelab.org/media/uploads/deeptextures/vgg_normalised.caffemodel
wget -c http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_19_layers.caffemodel
cd ..
print(cnn) gives an output of
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> (41) -> (42) -> (43) -> (44) -> (45) -> (46) -> output]
(1): nn.SpatialConvolution(3 -> 64, 3x3, 1,1, 1,1)
(2): nn.ReLU
(3): nn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1)
(4): nn.ReLU
(5): nn.SpatialMaxPooling(2x2, 2,2)
(6): nn.SpatialConvolution(64 -> 128, 3x3, 1,1, 1,1)
(7): nn.ReLU
(8): nn.SpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
(9): nn.ReLU
(10): nn.SpatialMaxPooling(2x2, 2,2)
(11): nn.SpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
(12): nn.ReLU
(13): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(14): nn.ReLU
(15): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(16): nn.ReLU
(17): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
(18): nn.ReLU
(19): nn.SpatialMaxPooling(2x2, 2,2)
(20): nn.SpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
(21): nn.ReLU
(22): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(23): nn.ReLU
(24): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(25): nn.ReLU
(26): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(27): nn.ReLU
(28): nn.SpatialMaxPooling(2x2, 2,2)
(29): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(30): nn.ReLU
(31): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(32): nn.ReLU
(33): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(34): nn.ReLU
(35): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
(36): nn.ReLU
(37): nn.SpatialMaxPooling(2x2, 2,2)
(38): nn.View(-1)
(39): nn.Linear(25088 -> 4096)
(40): nn.ReLU
(41): nn.Dropout(0.500000)
(42): nn.Linear(4096 -> 4096)
(43): nn.ReLU
(44): nn.Dropout(0.500000)
(45): nn.Linear(4096 -> 1000)
(46): nn.SoftMax
}
While print(targetImage_caffe:size()) gives me
3
660
1045
[torch.LongStorage of size 3]
Anybody know how to fix this or what I'm doing wrong?

The problem comes from the fact that you are using VGG19, which is designed to be fed 224 x 224 images. Since you are using a 660 x 1045 image (which is unusual, since most convnets use square images), an error occurs at module 39 (you can see it in the stack trace): you are trying to apply a linear module with 25088 input dimensions to a tensor that now has around 327680 values (each pooling layer divides the number of pixels by roughly 4, and you have 512 feature maps).
The solution is therefore to use 224 x 224 images. After the 5 pooling layers you will then have a tensor of dimension (224 / 2^5) x (224 / 2^5) x 512 = 25088.
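The arithmetic above can be checked with a short sketch. It is written in Python rather than Lua purely for illustration, and the helper name `flattened_size` is hypothetical. It assumes that the 3x3 convolutions with padding 1 keep the spatial size unchanged and that each 2x2, stride-2 pooling layer halves it, rounding down.

```python
def flattened_size(height, width, n_pools=5, channels=512):
    """Number of values that reach the first nn.Linear layer (module 39)."""
    for _ in range(n_pools):
        # Each 2x2 max pooling with stride 2 halves both sides, rounding down.
        height //= 2
        width //= 2
    return height * width * channels


print(flattened_size(224, 224))   # 25088, what nn.Linear(25088 -> 4096) expects
print(flattened_size(660, 1045))  # 327680, what the 660 x 1045 image produces
```

In Torch7 itself, one option is probably to resize the image before the forward pass, for example with `image.scale` from the `image` package, so that the network sees a 3 x 224 x 224 tensor.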

Related

How to know the input shape of my model? I am setting it to 32*640 with 3 channels, but it still gives me an error

I have this model:
unary = Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(32, 640, 3)),
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(filters=128, kernel_size=(3, 3), activation='relu'),
    Conv2D(filters=256, kernel_size=(3, 3), activation='relu'),
    Flatten(),
    Dense(1024, activation='relu'),
    Dense(4, activation='softmax')
])
unary.summary()
When I try to use it to predict for further classification, I get this error:
ValueError: Input 0 of layer sequential_15 is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 32, 640, 3, 1)
Full Error Traceback:
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_23/1300371096.py in <module>
----> 1 x_train, y_train = get_crf_training_data()
/tmp/ipykernel_23/2273784861.py in get_crf_training_data()
14 x_train_u, y_train_u = get_unary_data_for_page(annotation_filename, cnn=False)
15 x_train_p, _ = get_pairwise_data_for_page(annotation_filename)
---> 16 unary_potential_list = np.array(get_unary_potentials(x_train_u))
17 pairwise_potential_list = np.array(get_pairwise_potentials(x_train_p))
18
/tmp/ipykernel_23/3666745350.py in get_unary_potentials(x)
2 unary = tf.keras.models.load_model('./unary/')
3 x = np.expand_dims(x,axis = -1)
----> 4 return unary.predict(x)
How to resolve this dimension problem?
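A possible cause, judging only from the traceback: `get_unary_potentials` calls `np.expand_dims(x, axis=-1)` on data that may already have a channel axis, which would turn a batch of shape (N, 32, 640, 3) into (N, 32, 640, 3, 1), the shape reported in the error. The sketch below reproduces that shape change with a dummy NumPy array; the batch size of 2 and the zero-filled contents are made up for illustration.

```python
import numpy as np

# Dummy batch that already has a channel axis, as the model's input_shape expects.
batch = np.zeros((2, 32, 640, 3), dtype=np.float32)

# Adding another trailing axis produces the 5-D shape seen in the error message.
expanded = np.expand_dims(batch, axis=-1)

print(batch.shape)     # (2, 32, 640, 3)
print(expanded.shape)  # (2, 32, 640, 3, 1)
```

If that is what is happening, dropping the `expand_dims` call, or applying it only to inputs that lack a channel axis, should leave the data in the shape the model was built for.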

Max Pooling in VGG16 Before Global Average Pooling (GAP)?

I am currently using VGG16 with Global Average Pooling (GAP) before final classification layer. The VGG16 model used is the one provided by torchvision.
However, I noticed that before the GAP layer, there is a Max Pooling layer. Is this okay or should the Max Pooling layer be removed before the GAP layer? The network architecture can be seen below.
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=1) #GAP Layer
(classifier): Sequential(
(0): Linear(in_features=512, out_features=7, bias=True)
)
)
Thanks in advance.
If you are going to train the classifier, it should be okay. Nonetheless, I wouldn't remove it either way.
It is worth mentioning that the max-pooling is part of the original architecture, as can be seen in Table 1 of the original paper: https://arxiv.org/pdf/1409.1556.pdf.

Incompatible shapes of 1 using auto encoder

I'm trying to use an autoencoder on time series. When I pad the data everything works, but when I use variable-length data I get a small shape mismatch: Incompatible shapes: [1,125,4] vs. [1,126,4]
input_series = Input(shape=(None, 4))
x = Conv1D(4, 2, activation='relu', padding='same')(input_series)
x = MaxPooling1D(1, padding='same')(x)
x = Conv1D(4, 3, activation='relu', padding='same')(x)
x = MaxPooling1D(1, padding='same')(x)
x = Conv1D(4, 3, activation='relu', padding='same')(x)
encoder = MaxPooling1D(1, padding='same', name='encoder')(x)
x = Conv1D(4, 3, activation='relu', padding='same')(encoder)
x = UpSampling1D(1)(x)
x = Conv1D(4, 3, activation='relu', padding='same')(x)
x = UpSampling1D(1)(x)
x = Conv1D(16, 2, activation='relu')(x)
x = UpSampling1D(1)(x)
decoder = Conv1D(4, 2, activation='sigmoid', padding='same')(x)
autoencoder = Model(input_series, decoder)
autoencoder.compile(loss='mse', optimizer='adam')
autoencoder.summary()
Summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_25 (InputLayer) (None, None, 4) 0
_________________________________________________________________
conv1d_169 (Conv1D) (None, None, 4) 36
_________________________________________________________________
max_pooling1d_49 (MaxPooling (None, None, 4) 0
_________________________________________________________________
conv1d_170 (Conv1D) (None, None, 4) 52
_________________________________________________________________
max_pooling1d_50 (MaxPooling (None, None, 4) 0
_________________________________________________________________
conv1d_171 (Conv1D) (None, None, 4) 52
_________________________________________________________________
encoder (MaxPooling1D) (None, None, 4) 0
_________________________________________________________________
conv1d_172 (Conv1D) (None, None, 4) 52
_________________________________________________________________
up_sampling1d_73 (UpSampling (None, None, 4) 0
_________________________________________________________________
conv1d_173 (Conv1D) (None, None, 4) 52
_________________________________________________________________
up_sampling1d_74 (UpSampling (None, None, 4) 0
_________________________________________________________________
conv1d_174 (Conv1D) (None, None, 16) 144
_________________________________________________________________
up_sampling1d_75 (UpSampling (None, None, 16) 0
_________________________________________________________________
conv1d_175 (Conv1D) (None, None, 4) 132
=================================================================
Total params: 520
Trainable params: 520
Non-trainable params: 0
_________________________________________________________________
Error:
Epoch 1/50
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1321 try:
-> 1322 return fn(*args)
1323 except errors.OpError as e:
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1306 return self._call_tf_sessionrun(
-> 1307 options, feed_dict, fetch_list, target_list, run_metadata)
1308
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1408 self._session, options, feed_dict, fetch_list, target_list,
-> 1409 run_metadata)
1410 else:
InvalidArgumentError: Incompatible shapes: [1,125,4] vs. [1,126,4]
[[Node: loss_22/conv1d_175_loss/sub = Sub[T=DT_FLOAT, _class=["loc:#training_18/Adam/gradients/loss_22/conv1d_175_loss/sub_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1d_175/Sigmoid, _arg_conv1d_175_target_0_1/_4489)]]
[[Node: loss_22/mul/_4613 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1245_loss_22/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-101-a6e405699326> in <module>()
6 train_generator(X_train),
7 epochs=50,
----> 8 steps_per_epoch=len(X_train))
9
10
C:\ProgramData\Anaconda3\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name +
90 '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
2228 outs = self.train_on_batch(x, y,
2229 sample_weight=sample_weight,
-> 2230 class_weight=class_weight)
2231
2232 if not isinstance(outs, list):
C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py in train_on_batch(self, x, y, sample_weight, class_weight)
1881 ins = x + y + sample_weights
1882 self._make_train_function()
-> 1883 outputs = self.train_function(ins)
1884 if len(outputs) == 1:
1885 return outputs[0]
C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in __call__(self, inputs)
2480 session = get_session()
2481 updated = session.run(fetches=fetches, feed_dict=feed_dict,
-> 2482 **self.session_kwargs)
2483 return updated[:len(self.outputs)]
2484
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
898 try:
899 result = self._run(None, fetches, feed_dict, options_ptr,
--> 900 run_metadata_ptr)
901 if run_metadata:
902 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1133 if final_fetches or final_targets or (handle and feed_dict_tensor):
1134 results = self._do_run(handle, final_targets, final_fetches,
-> 1135 feed_dict_tensor, options, run_metadata)
1136 else:
1137 results = []
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1314 if handle is None:
1315 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1316 run_metadata)
1317 else:
1318 return self._do_call(_prun_fn, handle, feeds, fetches)
C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1333 except KeyError:
1334 pass
-> 1335 raise type(e)(node_def, op, message)
1336
1337 def _extend_graph(self):
InvalidArgumentError: Incompatible shapes: [1,125,4] vs. [1,126,4]
[[Node: loss_22/conv1d_175_loss/sub = Sub[T=DT_FLOAT, _class=["loc:#training_18/Adam/gradients/loss_22/conv1d_175_loss/sub_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1d_175/Sigmoid, _arg_conv1d_175_target_0_1/_4489)]]
[[Node: loss_22/mul/_4613 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1245_loss_22/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'loss_22/conv1d_175_loss/sub', defined at:
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 478, in start
self.io_loop.start()
File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "C:\ProgramData\Anaconda3\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 233, in dispatch_shell
handler(stream, idents, msg)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-100-ddd3b57d5f0b>", line 22, in <module>
autoencoder.compile(loss='mse', optimizer='adam')
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 830, in compile
sample_weight, mask)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 429, in weighted
score_array = fn(y_true, y_pred)
File "C:\ProgramData\Anaconda3\lib\site-packages\keras\losses.py", line 14, in mean_squared_error
return K.mean(K.square(y_pred - y_true), axis=-1)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 979, in binary_op_wrapper
return func(x, y, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 8582, in sub
"Sub", x=x, y=y, name=name)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
op_def=op_def)
File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Incompatible shapes: [1,125,4] vs. [1,126,4]
[[Node: loss_22/conv1d_175_loss/sub = Sub[T=DT_FLOAT, _class=["loc:#training_18/Adam/gradients/loss_22/conv1d_175_loss/sub_grad/Reshape"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1d_175/Sigmoid, _arg_conv1d_175_target_0_1/_4489)]]
[[Node: loss_22/mul/_4613 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1245_loss_22/mul", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
One of your Conv1D layers is not using padding='same': Conv1D(16, 2, activation='relu') in the decoder. With a kernel size of 2 and the default 'valid' padding, it shortens the sequence by one step, which is why the output has length 125 while the target has length 126.
But there is something very weird there: why would you use MaxPooling1D with pool_size=1? It does nothing.
Now suppose you use pool_size=2: then you'd need to pad the inputs anyway, because you'd need inputs whose length is a multiple of 8 (2³) to end up with the same shape after the upsamplings.
For a variable-length autoencoder, there is an example here: Variable length output in keras
For all practical purposes, LSTM layers treat shapes exactly the same way Conv1D layers do.
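The length bookkeeping behind these points can be sketched in plain Python, without Keras. The helper names (`conv1d_len`, `pool_len_same`, `autoencoder_len`) are hypothetical, and the sketch assumes the usual rules: a 'valid' convolution with kernel size k shortens the sequence by k - 1, pooling with padding='same' rounds the length up, and upsampling multiplies it by the factor.

```python
import math


def conv1d_len(length, kernel, padding="valid"):
    """Output length of a stride-1 Conv1D layer."""
    return length if padding == "same" else length - kernel + 1


def pool_len_same(length, pool):
    """Output length of MaxPooling1D(pool, padding='same')."""
    return math.ceil(length / pool)


def autoencoder_len(length, pool, unpadded_kernel=None):
    """Trace the sequence length through three pool and three upsample stages.

    'same' convolutions are skipped because they keep the length unchanged.
    unpadded_kernel models the single Conv1D without padding='same', which
    sits just before the last upsampling in the question's model.
    """
    for _ in range(3):
        length = pool_len_same(length, pool)
    for stage in range(3):
        if stage == 2 and unpadded_kernel is not None:
            length = conv1d_len(length, unpadded_kernel)
        length *= pool
    return length


print(autoencoder_len(126, pool=1, unpadded_kernel=2))  # 125: the reported mismatch
print(autoencoder_len(126, pool=2))                     # 128: 126 is not a multiple of 8
print(autoencoder_len(128, pool=2))                     # 128: a multiple of 8 round-trips
```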

In PyTorch, restoring the model parameters still gives the same initial loss

I am training a DNN (CRNN) with PyTorch, but something abnormal happened with the loss value.
The program prints avg_loss every 20 batches and saves the model parameters every 100 batches. The initial loss is about 20-30. Some problems occurred in my program, so the training process was interrupted. After loading the parameters from the saved model, I continued training but found that the loss still starts from 20-30. By the way, I have a dataset of about 10 million pictures, and I have trained on about 3 million of them.
I want to figure out where the problem is: the PyTorch mechanism or bugs in my program.
Here are more details:
1. CRNN structure:
CRNN (
(cnn): Sequential (
(conv0): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu0): ReLU (inplace)
(pooling0): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu1): ReLU (inplace)
(pooling1): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
(conv2): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True)
(relu2): ReLU (inplace)
(conv3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu3): ReLU (inplace)
(pooling2): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv4): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(batchnorm4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu4): ReLU (inplace)
(conv5): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(relu5): ReLU (inplace)
(pooling3): MaxPool2d (size=(2, 2), stride=(2, 1), dilation=(1, 1))
(conv6): Conv2d(512, 512, kernel_size=(2, 2), stride=(1, 1))
(batchnorm6): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True)
(relu6): ReLU (inplace)
)
(rnn): Sequential (
(0): BidirectionalLSTM (
(rnn): LSTM(512, 256, bidirectional=True)
(embedding): Linear (512 -> 256)
)
(1): BidirectionalLSTM (
(rnn): LSTM(256, 256, bidirectional=True)
(embedding): Linear (512 -> 5530)
)
)
)
2. model init and parameters loading.
def crnnSource():
    alphabet = keys.alphabet
    converter = util.strLabelConverter(alphabet)
    model = crnn.CRNN(32, 1 ,len(alphabet)+1, 256, 1) #need 1?
    model.apply(weights_init)
    path = './models/crnn_OCR.pkl'
    model.load_state_dict(torch.load(path))
    return model, converter
3. training code
def trainProc(net ,trainset, converter):
    print ("--------------------------------")
    print ("Start to Train.")
    criterion = CTCLoss().cuda()
    loss_avg = util.averager()
    optimizer = optim.RMSprop(net.parameters(), lr = 0.001)
    image = torch.FloatTensor(BATCH_SIZE, 3, 32, 100) #opt.imgH
    text = torch.IntTensor(BATCH_SIZE * 5)
    length = torch.IntTensor(BATCH_SIZE)
    image = image.cuda()
    image = Variable(image)
    text = Variable(text)
    length = Variable(length)
    sav_inv = 0
    for epoch in range(TRAIN_EPOCHS):
        sav_inv = 0
        timer = time.time()
        for i,data in enumerate(trainset, 0):
            img, txt = data
            img = ConvtFileToTensor(img)
            batch_size = img.size(0)
            util.loadData(image, img)
            t, l = converter.encode(txt)
            util.loadData(text,t)
            util.loadData(length,l)
            preds = net(image)
            preds_size = Variable(torch.IntTensor([preds.size(0)] * batch_size))
            cost = criterion(preds, text, preds_size, length) / batch_size
            net.zero_grad()
            cost.backward()
            optimizer.step()
            loss_avg.add(cost)
            #running_loss += loss.data[0]
            if i % 20 == 19:
                time2 = time.time()
                print ("[%d, %5d] loss: %.6f TIME: %.6f" %(epoch+1, i+1, loss_avg.val(),time2 - timer))
                print (cost)
                loss_avg.reset()
                timer = time.time()
            if sav_inv == SAV_INV-1:
                torch.save(net.state_dict(),'./models/crnn_OCR.pkl')
                sav_inv = 0
            else:
                sav_inv += 1
    torch.save(net.state_dict(),'./models/crnn_OCR.pkl')
    print ("Finished Training.")
    return net

Can't fit data to 3d convolutional U-net Keras

I have a problem. I want to build a 3D convolutional U-net, and I'm using Keras for this.
My data are MRI images from the Data Science Bowl 2017 competition. All MRIs were saved in numpy arrays (all pixels are scaled from 0 to 1) with shape:
data_ch.shape
(94, 50, 50, 50, 1)
That is: 94 patients, 50 MRI slices of 50x50 images, 1 channel.
I want to build a 3D convolutional U-net, so the inputs and outputs of this net are the same 3D arrays.
The 3D U-net:
input_img= Input(shape=(data_ch.shape[1], data_ch.shape[2], data_ch.shape[3], data_ch.shape[4]))
x=Conv3D(filters=8, kernel_size=(3, 3, 3), activation='relu', padding='same')(input_img)
x=MaxPooling3D(pool_size=(2, 2, 2), padding='same')(x)
x=Conv3D(filters=8, kernel_size=(3, 3, 3), activation='relu', padding='same')(x)
x=MaxPooling3D(pool_size=(2, 2, 2), padding='same')(x)
x=UpSampling3D(size=(2, 2, 2))(x)
x=Conv3D(filters=8, kernel_size=(3, 3, 3), activation='relu', padding='same')(x) # PADDING IS NOT THE SAME!!!!!
x=UpSampling3D(size=(2, 2, 2))(x)
x=Conv3D(filters=1, kernel_size=(3, 3, 3), activation='sigmoid')(x)
model=Model(input_img, x)
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.summary()
Layer (type) Output Shape Param #
=================================================================
input_5 (InputLayer) (None, 50, 50, 50, 1) 0
_________________________________________________________________
conv3d_27 (Conv3D) (None, 50, 50, 50, 8) 224
_________________________________________________________________
max_pooling3d_12 (MaxPooling (None, 25, 25, 25, 8) 0
_________________________________________________________________
conv3d_28 (Conv3D) (None, 25, 25, 25, 8) 1736
_________________________________________________________________
max_pooling3d_13 (MaxPooling (None, 13, 13, 13, 8) 0
_________________________________________________________________
up_sampling3d_12 (UpSampling (None, 26, 26, 26, 8) 0
_________________________________________________________________
conv3d_29 (Conv3D) (None, 26, 26, 26, 8) 1736
_________________________________________________________________
up_sampling3d_13 (UpSampling (None, 52, 52, 52, 8) 0
_________________________________________________________________
conv3d_30 (Conv3D) (None, 50, 50, 50, 1) 217
=================================================================
Total params: 3,913
Trainable params: 3,913
Non-trainable params: 0
But, when I attempted to fit data to this net:
model.fit(data_ch, data_ch, epochs=1, batch_size=10, shuffle=True, verbose=1)
the program displayed an error:
ValueError Traceback (most recent call last)
C:\Users\Taranov\Anaconda3\lib\site-packages\theano\compile\function_module.py in __call__(self, *args, **kwargs)
883 outputs =\
--> 884 self.fn() if output_subset is None else\
885 self.fn(output_subset=output_subset)
ValueError: CudaNdarray_CopyFromCudaNdarray: need same dimensions for dim 1, destination=13, source=14
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-26-b334d38d9608> in <module>()
----> 1 model.fit(data_ch, data_ch, epochs=1, batch_size=10, shuffle=True, verbose=1)
C:\Users\Taranov\Anaconda3\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
1496 val_f=val_f, val_ins=val_ins, shuffle=shuffle,
1497 callback_metrics=callback_metrics,
-> 1498 initial_epoch=initial_epoch)
1499
1500 def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):
C:\Users\Taranov\Anaconda3\lib\site-packages\keras\engine\training.py in _fit_loop(self, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch)
1150 batch_logs['size'] = len(batch_ids)
1151 callbacks.on_batch_begin(batch_index, batch_logs)
-> 1152 outs = f(ins_batch)
1153 if not isinstance(outs, list):
1154 outs = [outs]
C:\Users\Taranov\Anaconda3\lib\site-packages\keras\backend\theano_backend.py in __call__(self, inputs)
1156 def __call__(self, inputs):
1157 assert isinstance(inputs, (list, tuple))
-> 1158 return self.function(*inputs)
1159
1160
C:\Users\Taranov\Anaconda3\lib\site-packages\theano\compile\function_module.py in __call__(self, *args, **kwargs)
896 node=self.fn.nodes[self.fn.position_of_error],
897 thunk=thunk,
--> 898 storage_map=getattr(self.fn, 'storage_map', None))
899 else:
900 # old-style linkers raise their own exceptions
C:\Users\Taranov\Anaconda3\lib\site-packages\theano\gof\link.py in raise_with_op(node, thunk, exc_info, storage_map)
323 # extra long error message in that case.
324 pass
--> 325 reraise(exc_type, exc_value, exc_trace)
326
327
C:\Users\Taranov\Anaconda3\lib\site-packages\six.py in reraise(tp, value, tb)
683 value = tp()
684 if value.__traceback__ is not tb:
--> 685 raise value.with_traceback(tb)
686 raise value
687
C:\Users\Taranov\Anaconda3\lib\site-packages\theano\compile\function_module.py in __call__(self, *args, **kwargs)
882 try:
883 outputs =\
--> 884 self.fn() if output_subset is None else\
885 self.fn(output_subset=output_subset)
886 except Exception:
ValueError: CudaNdarray_CopyFromCudaNdarray: need same dimensions for dim 1, destination=13, source=14
Apply node that caused the error: GpuAlloc(GpuDimShuffle{0,2,x,3,4,1}.0, Shape_i{0}.0, TensorConstant{13}, TensorConstant{2}, TensorConstant{13}, TensorConstant{13}, TensorConstant{8})
Toposort index: 163
Inputs types: [CudaNdarrayType(float32, (False, False, True, False, False, False)), TensorType(int64, scalar), TensorType(int64, scalar), TensorType(int8, scalar), TensorType(int64, scalar), TensorType(int64, scalar), TensorType(int64, scalar)]
Inputs shapes: [(10, 14, 1, 14, 14, 8), (), (), (), (), (), ()]
Inputs strides: [(21952, 196, 0, 14, 1, 2744), (), (), (), (), (), ()]
Inputs values: ['not shown', array(10, dtype=int64), array(13, dtype=int64), array(2, dtype=int8), array(13, dtype=int64), array(13, dtype=int64), array(8, dtype=int64)]
Outputs clients: [[GpuReshape{5}(GpuAlloc.0, MakeVector{dtype='int64'}.0)]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
I tried to follow recommendations and use theano flags:
import theano
import os
os.environ["THEANO_FLAGS"] = "mode=FAST_RUN,device=gpu,floatX=float32, optimizer='None',exception_verbosity=high"
But it still doesn't work.
Could you help me?
Many thanks!
OK, this sounds weird, but MaxPooling3D seems to have some kind of bug with padding='same'. So I rewrote your code without it and added an initial padding just to make the dimensions compatible:
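import keras.backend as K
from keras.layers import Input, Lambda, Conv3D, MaxPooling3D, UpSampling3D
from keras.models import Model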
inputShape = (data_ch.shape[1], data_ch.shape[2], data_ch.shape[3], data_ch.shape[4])
paddedShape = (data_ch.shape[1]+2, data_ch.shape[2]+2, data_ch.shape[3]+2, data_ch.shape[4])
#initial padding
input_img= Input(shape=inputShape)
x = Lambda(lambda x: K.spatial_3d_padding(x, padding=((1, 1), (1, 1), (1, 1))),
output_shape=paddedShape)(input_img) #Lambda layers require output_shape
#your original code without padding for MaxPooling layers (replace input_img with x)
x=Conv3D(filters=8, kernel_size=3, activation='relu', padding='same')(x)
x=MaxPooling3D(pool_size=2)(x)
x=Conv3D(filters=8, kernel_size=3, activation='relu', padding='same')(x)
x=MaxPooling3D(pool_size=2)(x)
x=UpSampling3D(size=2)(x)
x=Conv3D(filters=8, kernel_size=3, activation='relu', padding='same')(x) # PADDING IS NOT THE SAME!!!!!
x=UpSampling3D(size=2)(x)
x=Conv3D(filters=1, kernel_size=3, activation='sigmoid')(x)
model=Model(input_img, x)
model.compile(optimizer='adadelta', loss='binary_crossentropy')
model.summary()
print(model.predict(data_ch)[1])
model.fit(data_ch,data_ch,epochs=1,verbose=2,batch_size=10)
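To see why the initial padding makes the dimensions line up, the size of one spatial axis can be traced through both versions of the network. This is only a bookkeeping sketch in plain Python: the helper name `trace_sizes` and the layer labels are hypothetical, and convolutions with padding='same' are left out because they do not change the size.

```python
import math


def trace_sizes(size, layers):
    """Return the size of one spatial axis after each listed layer."""
    sizes = [size]
    for kind, arg in layers:
        if kind == "pad":            # symmetric zero padding of `arg` per side
            size += 2 * arg
        elif kind == "pool_same":    # pooling with padding='same' rounds up
            size = math.ceil(size / arg)
        elif kind == "pool_valid":   # pooling without padding rounds down
            size //= arg
        elif kind == "up":           # upsampling repeats each element `arg` times
            size *= arg
        elif kind == "conv_valid":   # unpadded convolution with kernel size `arg`
            size = size - arg + 1
        else:
            raise ValueError("unknown layer kind: %s" % kind)
        sizes.append(size)
    return sizes


question_net = [("pool_same", 2), ("pool_same", 2), ("up", 2), ("up", 2), ("conv_valid", 3)]
answer_net = [("pad", 1), ("pool_valid", 2), ("pool_valid", 2), ("up", 2), ("up", 2), ("conv_valid", 3)]

print(trace_sizes(50, question_net))  # [50, 25, 13, 26, 52, 50]
print(trace_sizes(50, answer_net))    # [50, 52, 26, 13, 26, 52, 50]
```

Both versions end at 50, matching the summary in the question; the padded version simply gets there without asking MaxPooling3D to pad an odd-sized input.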
Try reducing the batch size to something like 2. Your network also seems to need more GPU resources, so consider upgrading the GPU as well.

Resources