How many layers does the Inception-ResNet-v2 model consist of? I have counted 96, but I am not sure. Could you kindly confirm?
https://pic2.zhimg.com/v2-04824ca7ee62de1a91a2989f324b61ec_r.jpg
Also, my training and testing data consist of 600 and 62 images respectively. I am using three models: ResNet-152, Inception-ResNet and DenseNet-161, and they have the following numbers of parameters:
ResNet-152:
Total params: 58,450,754
Trainable params: 58,299,330
Non-trainable params: 151,424
DenseNet-161:
Total params: 26,696,354
Trainable params: 26,476,418
Non-trainable params: 219,936
Inception-ResNet:
Total params: 54,339,810
Trainable params: 54,279,266
Non-trainable params: 60,544
Is the data too scarce for these models? Also, the ResNet model's validation/testing curve is the smoothest, DenseNet's comes next, and the Inception-ResNet model's curve is the bumpiest. Why is that?
Based on Inception-ResNet-v2 as it appears in https://github.com/titu1994/Inception-v4/blob/master/inception_resnet_v2.py
Inception-ResNet-v2 has 467 layers, as follows:
input_1
conv2d_1
conv2d_2
conv2d_3
max_pooling2d_1
conv2d_4
merge_1
conv2d_7
conv2d_8
conv2d_5
conv2d_9
conv2d_6
conv2d_10
merge_2
max_pooling2d_2
conv2d_11
merge_3
batch_normalization_1
activation_1
conv2d_15
conv2d_13
conv2d_16
conv2d_12
conv2d_14
conv2d_17
merge_4
conv2d_18
lambda_1
merge_5
batch_normalization_2
activation_2
conv2d_22
conv2d_20
conv2d_23
conv2d_19
conv2d_21
conv2d_24
merge_6
conv2d_25
lambda_2
merge_7
batch_normalization_3
activation_3
conv2d_29
conv2d_27
conv2d_30
conv2d_26
conv2d_28
conv2d_31
merge_8
conv2d_32
lambda_3
merge_9
batch_normalization_4
activation_4
conv2d_36
conv2d_34
conv2d_37
conv2d_33
conv2d_35
conv2d_38
merge_10
conv2d_39
lambda_4
merge_11
batch_normalization_5
activation_5
conv2d_43
conv2d_41
conv2d_44
conv2d_40
conv2d_42
conv2d_45
merge_12
conv2d_46
lambda_5
merge_13
batch_normalization_6
activation_6
conv2d_50
conv2d_48
conv2d_51
conv2d_47
conv2d_49
conv2d_52
merge_14
conv2d_53
lambda_6
merge_15
batch_normalization_7
activation_7
conv2d_57
conv2d_55
conv2d_58
conv2d_54
conv2d_56
conv2d_59
merge_16
conv2d_60
lambda_7
merge_17
batch_normalization_8
activation_8
conv2d_64
conv2d_62
conv2d_65
conv2d_61
conv2d_63
conv2d_66
merge_18
conv2d_67
lambda_8
merge_19
batch_normalization_9
activation_9
conv2d_71
conv2d_69
conv2d_72
conv2d_68
conv2d_70
conv2d_73
merge_20
conv2d_74
lambda_9
merge_21
batch_normalization_10
activation_10
conv2d_78
conv2d_76
conv2d_79
conv2d_75
conv2d_77
conv2d_80
merge_22
conv2d_81
lambda_10
merge_23
batch_normalization_11
activation_11
conv2d_83
conv2d_84
max_pooling2d_3
conv2d_82
conv2d_85
merge_24
batch_normalization_12
activation_12
conv2d_87
conv2d_88
conv2d_86
conv2d_89
merge_25
conv2d_90
lambda_11
merge_26
batch_normalization_13
activation_13
conv2d_92
conv2d_93
conv2d_91
conv2d_94
merge_27
conv2d_95
lambda_12
merge_28
batch_normalization_14
activation_14
conv2d_97
conv2d_98
conv2d_96
conv2d_99
merge_29
conv2d_100
lambda_13
merge_30
batch_normalization_15
activation_15
conv2d_102
conv2d_103
conv2d_101
conv2d_104
merge_31
conv2d_105
lambda_14
merge_32
batch_normalization_16
activation_16
conv2d_107
conv2d_108
conv2d_106
conv2d_109
merge_33
conv2d_110
lambda_15
merge_34
batch_normalization_17
activation_17
conv2d_112
conv2d_113
conv2d_111
conv2d_114
merge_35
conv2d_115
lambda_16
merge_36
batch_normalization_18
activation_18
conv2d_117
conv2d_118
conv2d_116
conv2d_119
merge_37
conv2d_120
lambda_17
merge_38
batch_normalization_19
activation_19
conv2d_122
conv2d_123
conv2d_121
conv2d_124
merge_39
conv2d_125
lambda_18
merge_40
batch_normalization_20
activation_20
conv2d_127
conv2d_128
conv2d_126
conv2d_129
merge_41
conv2d_130
lambda_19
merge_42
batch_normalization_21
activation_21
conv2d_132
conv2d_133
conv2d_131
conv2d_134
merge_43
conv2d_135
lambda_20
merge_44
batch_normalization_22
activation_22
conv2d_137
conv2d_138
conv2d_136
conv2d_139
merge_45
conv2d_140
lambda_21
merge_46
batch_normalization_23
activation_23
conv2d_142
conv2d_143
conv2d_141
conv2d_144
merge_47
conv2d_145
lambda_22
merge_48
batch_normalization_24
activation_24
conv2d_147
conv2d_148
conv2d_146
conv2d_149
merge_49
conv2d_150
lambda_23
merge_50
batch_normalization_25
activation_25
conv2d_152
conv2d_153
conv2d_151
conv2d_154
merge_51
conv2d_155
lambda_24
merge_52
batch_normalization_26
activation_26
conv2d_157
conv2d_158
conv2d_156
conv2d_159
merge_53
conv2d_160
lambda_25
merge_54
batch_normalization_27
activation_27
conv2d_162
conv2d_163
conv2d_161
conv2d_164
merge_55
conv2d_165
lambda_26
merge_56
batch_normalization_28
activation_28
conv2d_167
conv2d_168
conv2d_166
conv2d_169
merge_57
conv2d_170
lambda_27
merge_58
batch_normalization_29
activation_29
conv2d_172
conv2d_173
conv2d_171
conv2d_174
merge_59
conv2d_175
lambda_28
merge_60
batch_normalization_30
activation_30
conv2d_177
conv2d_178
conv2d_176
conv2d_179
merge_61
conv2d_180
lambda_29
merge_62
batch_normalization_31
activation_31
conv2d_182
conv2d_183
conv2d_181
conv2d_184
merge_63
conv2d_185
lambda_30
merge_64
batch_normalization_32
activation_32
conv2d_192
conv2d_188
conv2d_190
conv2d_193
max_pooling2d_4
conv2d_189
conv2d_191
conv2d_194
merge_65
batch_normalization_33
activation_33
conv2d_196
conv2d_197
conv2d_195
conv2d_198
merge_66
conv2d_199
lambda_31
merge_67
batch_normalization_34
activation_34
conv2d_201
conv2d_202
conv2d_200
conv2d_203
merge_68
conv2d_204
lambda_32
merge_69
batch_normalization_35
activation_35
conv2d_206
conv2d_207
conv2d_205
conv2d_208
merge_70
conv2d_209
lambda_33
merge_71
batch_normalization_36
activation_36
conv2d_211
conv2d_212
conv2d_210
conv2d_213
merge_72
conv2d_214
lambda_34
merge_73
batch_normalization_37
activation_37
conv2d_216
conv2d_217
conv2d_215
conv2d_218
merge_74
conv2d_219
lambda_35
merge_75
batch_normalization_38
activation_38
conv2d_221
conv2d_222
conv2d_220
conv2d_223
merge_76
conv2d_224
lambda_36
merge_77
batch_normalization_39
activation_39
conv2d_226
conv2d_227
conv2d_225
conv2d_228
merge_78
conv2d_229
lambda_37
merge_79
batch_normalization_40
activation_40
conv2d_231
conv2d_232
conv2d_230
conv2d_233
merge_80
conv2d_234
lambda_38
merge_81
batch_normalization_41
activation_41
conv2d_236
conv2d_237
conv2d_235
conv2d_238
merge_82
conv2d_239
lambda_39
merge_83
batch_normalization_42
activation_42
conv2d_241
conv2d_242
conv2d_240
conv2d_243
merge_84
conv2d_244
lambda_40
merge_85
batch_normalization_43
activation_43
average_pooling2d_1
average_pooling2d_2
conv2d_186
dropout_1
conv2d_187
flatten_2
flatten_1
dense_2
dense_1
To view the full description of the layers, you can download the inception_resnet_v2.py file and add these two lines at its end:
res2=create_inception_resnet_v2()
print(res2.summary())
Regarding your second question (next time, I suggest splitting the questions rather than writing them together, by the way) - yes, this data would most probably not be sufficient for training any of these networks. Frankly, it would be insufficient even for the humble VGG unless augmentation is used in a smart way - and even then it would be a close call, in my opinion.
You should consider using the published weights if applicable, or at the very least use them for transfer learning.
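For illustration, here is a minimal transfer-learning sketch using the Inception-ResNet-v2 implementation shipped with keras.applications (a different implementation from the one linked above; the input shape, number of classes and optimizer below are placeholder assumptions):

from keras.applications import InceptionResNetV2
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

num_classes = 10  # hypothetical; replace with the number of classes in your data

# load the published ImageNet weights without the classification head
base = InceptionResNetV2(weights="imagenet", include_top=False, input_shape=(299, 299, 3))
for layer in base.layers:
    layer.trainable = False  # freeze the pretrained feature extractor

# attach a small trainable head for the new task
x = GlobalAveragePooling2D()(base.output)
out = Dense(num_classes, activation="softmax")(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

With only 600 training images, freezing most of the base and relying on augmentation is usually the safer configuration.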
I am currently trying to use the AlphaPose keypoints output.
I have a few questions.
I am wondering how I can use this output to extract features like cadence, stride length and more.
I am also wondering how many frames are generated each minute. The JSON below actually contained more than a hundred JPEGs; this is just a sample.
Here is an example of the output:
[{'image_id': '3.jpg',
'category_id': 1,
'keypoints': [3084.453125,
766.064453125,
0.27073606848716736,
3109.76806640625,
808.2561645507812,
0.21385984122753143,
3084.453125,
766.064453125,
0.24748951196670532,
3109.76806640625,
875.7628173828125,
0.3147721290588379,
3084.453125,
715.4345092773438,
0.20229308307170868,
3067.576416015625,
1019.2144775390625,
0.4297007918357849,
3050.69970703125,
732.3111572265625,
0.32160425186157227,
2881.93310546875,
968.58447265625,
0.32375261187553406,
2848.1796875,
681.68115234375,
0.3226509690284729,
2730.043212890625,
867.324462890625,
0.5148664116859436,
2730.043212890625,
782.941162109375,
0.33549684286117554,
2763.79638671875,
884.201171875,
0.29956817626953125,
2831.30322265625,
766.064453125,
0.22447820007801056,
2308.12646484375,
884.201171875,
0.2733159065246582,
2325.003173828125,
901.0778198242188,
0.2143297791481018,
1717.4429931640625,
917.9544677734375,
0.3410451412200928,
1683.689697265625,
934.8311767578125,
0.3064221441745758],
'score': 1.1431528329849243,
'box': [1685.37744140625,
665.083251953125,
1296.1279296875,
421.359130859375],
'idx': [0.0]},
{'image_id': '6.jpg',
'category_id': 1,
'keypoints': [2716.578125,
708.694580078125,
0.20404888689517975,
2760.772705078125,
693.9630126953125,
0.24653863906860352,
2731.3095703125,
679.2314453125,
0.24123729765415192,
3003.84375,
878.107666015625,
0.1836661398410797,
2981.746337890625,
679.2314453125,
0.16582731902599335,
3003.84375,
966.4970703125,
0.1983313262462616,
2967.014892578125,
708.694580078125,
0.313667356967926,
3003.84375,
1025.42333984375,
0.3715871572494507,
2760.772705078125,
826.5471801757812,
0.2802092432975769,
2760.772705078125,
856.0103149414062,
0.32904109358787537,
2760.772705078125,
826.5471801757812,
0.27431410551071167,
2701.846435546875,
929.6681518554688,
0.16013894975185394,
2628.188720703125,
885.4734497070312,
0.18557937443256378,
2318.82568359375,
914.9365844726562,
0.33809077739715576,
2333.55712890625,
900.2050170898438,
0.24967674911022186,
1714.8311767578125,
929.6681518554688,
0.40348780155181885,
1744.2943115234375,
914.9365844726562,
0.271779328584671],
'score': 0.8994148969650269,
'box': [1760.4990234375, 702.509521484375, 1131.384765625, 410.12255859375],
'idx': [0.0]}]
Here is the reference I used.
https://github.com/MVIG-SJTU/AlphaPose
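For reference, below is a minimal sketch of how this output could be parsed as a first step towards such features. It assumes the keypoints list is a flat [x, y, confidence] triple per joint in COCO-17 order (AlphaPose's default, but worth verifying for your configuration), and the file name is a placeholder:

import json
import numpy as np

# assumed COCO-17 joint order for AlphaPose's default output -- verify for your setup
COCO_JOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

with open("alphapose-results.json") as f:  # placeholder file name
    detections = json.load(f)

for det in detections:
    # flat list [x1, y1, c1, x2, y2, c2, ...] -> (17, 3) array
    kps = np.array(det["keypoints"]).reshape(-1, 3)
    left_ankle = kps[COCO_JOINTS.index("left_ankle"), :2]
    right_ankle = kps[COCO_JOINTS.index("right_ankle"), :2]
    # inter-ankle pixel distance per frame; tracked over frames, its oscillation
    # rate relates to cadence and its peaks to step length (after pixel-to-metre scaling)
    print(det["image_id"], np.linalg.norm(left_ankle - right_ankle))

As for the frame count: AlphaPose by default emits one entry per detected person per frame of the input video, so the rate is set by the video's fps (e.g. roughly 1800 frames per minute at 30 fps).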
If I use optim.SGD(model_conv.fc.parameters(), ...) I'm getting an error:
optimizer got an empty parameter list
This error occurs when model_conv.fc is nn.Hardtanh(...) (and also when I try to use nn.ReLU).
But with nn.Linear it works fine.
What could be the reason?
model_conv.fc = nn.Hardtanh(min_val=0.0, max_val=1.0)  # not OK --> optimizer got an empty parameter list
#model_conv.fc = nn.ReLU()  # also not OK
# num_ftrs = model_conv.fc.in_features
# model_conv.fc = nn.Linear(num_ftrs, 1)  # it works fine
model_conv = model_conv.to(config.device())
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=config.learning_rate, momentum=config.momentum)  # error is here
Hardtanh and ReLU are parameter-free layers, whereas Linear has parameters.
Activation functions are used to add non-linearity to your model and have no parameters, so you should pass an nn.Linear as the fully-connected (FC) layer:
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Sequential( list_of_FC_layers )
e.g.
model_conv.fc = nn.Sequential( nn.Linear(in_features, hidden_neurons),
nn.Linear(hidden_neurons, out_channels) )
or
model_conv.fc = nn.Linear( in_features, out_channels)
out_channels for a binary classification task is 1, and for multi-class classification it is num_classes.
NOTE 1: for multi-class classification, do not add a softmax layer, as softmax is already applied inside CrossEntropyLoss.
NOTE 2: you can use a Sigmoid activation for binary classification.
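Putting that together, a minimal sketch (assuming a torchvision ResNet and placeholder hyperparameters; adapt to your own model_conv and config):

import torch.nn as nn
import torch.optim as optim
from torchvision import models

num_classes = 2  # placeholder

model_conv = models.resnet18(pretrained=True)

# replace the final FC layer with a parametric layer; nn.Linear has weights and a bias,
# so model_conv.fc.parameters() is no longer empty
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, num_classes)

optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)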
What I am trying to do:
I want to connect arbitrary layers from different models to create a new Keras model.
What I found so far:
https://github.com/keras-team/keras/issues/4205: using the Model's call method to change the input of another model. My problems with this approach:
It can only change the input of the Model, not other layers. So if I want to cut off some layers at the beginning of the encoder, that is not possible
I am not a fan of the nested array structure when getting the config file; I would prefer a 1D array
When using model.summary() or plot_model(), the encoder only shows up as "Model". If anything, I would say both models should be wrapped, so the config should show [model_base, model_encoder] and not [base_input, base_conv2D, ..., encoder_model]
To be fair, with this approach: https://github.com/keras-team/keras/issues/3021, the point above is actually possible, but again, it is very inflexible. As soon as I want to cut off some layers at the top or bottom of the base or encoder network, this approach fails
https://github.com/keras-team/keras/issues/3465: Adding new layers to a base model by using any output of the base model. Problems here:
While it is possible to use any layer from the base model, which means I can cut off layers from it, I cannot load the encoder as a Keras model. The top model must always be created anew.
What I have tried:
My approach to connecting any layers from different models:
Clear inbound nodes of input layer
Use the call() method of the output layer with the tensor of the output layer
Clean up the outbound nodes of the output tensor by swapping in the newly created tensor for the previous output tensor
I was really optimistic at first, as summary() and plot_model() gave me exactly what I wanted, so the node graph should be fine, right? But while the approaches in the "What I found so far" section trained fine, my approach ran into an error during training. This is the error message:
File "C:\Anaconda\envs\dlpipe\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 508, in apply_op
(input_name, err))
ValueError: Tried to convert 'x' to a tensor and failed. Error: None values not supported.
It might be important to note that I am using TensorFlow as the backend. I was able to trace the root of this error: it seems there is a problem when the gradients are calculated. Usually there is a gradient calculation for each node, but all the nodes of the base network have "None" when using my approach. So basically it happens in keras/optimizers.py, get_updates(), when the gradients are calculated (grad = self.get_gradients(loss, params)).
Here is the code (without the training), with all three approaches implemented:
# imports needed to run this snippet (Keras 2.x with the TensorFlow backend)
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.models import Model
from keras.utils import plot_model


def create_base():
    in_layer = Input(shape=(32, 32, 3), name="base_input")
    x = Conv2D(32, (3, 3), padding='same', activation="relu", name="base_conv2d_1")(in_layer)
    x = Conv2D(32, (3, 3), padding='same', activation="relu", name="base_conv2d_2")(x)
    x = MaxPooling2D(pool_size=(2, 2), name="base_maxpooling_2d_1")(x)
    x = Dropout(0.25, name="base_dropout")(x)
    x = Conv2D(64, (3, 3), padding='same', activation="relu", name="base_conv2d_3")(x)
    x = Conv2D(64, (3, 3), padding='same', activation="relu", name="base_conv2d_4")(x)
    x = MaxPooling2D(pool_size=(2, 2), name="base_maxpooling2d_2")(x)
    x = Dropout(0.25, name="base_dropout_2")(x)
    return Model(inputs=in_layer, outputs=x, name="base_model")


def create_encoder():
    in_layer = Input(shape=(8, 8, 64))
    x = Flatten(name="encoder_flatten")(in_layer)
    x = Dense(512, activation="relu", name="encoder_dense_1")(x)
    x = Dropout(0.5, name="encoder_dropout_2")(x)
    x = Dense(10, activation="softmax", name="encoder_dense_2")(x)
    return Model(inputs=in_layer, outputs=x, name="encoder_model")


def extend_base(input_model):
    x = Flatten(name="custom_flatten")(input_model.output)
    x = Dense(512, activation="relu", name="custom_dense_1")(x)
    x = Dropout(0.5, name="custom_dropout_2")(x)
    x = Dense(10, activation="softmax", name="custom_dense_2")(x)
    return Model(inputs=input_model.input, outputs=x, name="custom_edit")


def connect_layers(from_tensor, to_layer, clear_inbound_nodes=True):
    try:
        tmp_output = to_layer.output
    except AttributeError:
        raise ValueError("Connecting to shared layers is not supported!")
    if clear_inbound_nodes:
        to_layer.inbound_nodes = []
    else:
        tensor_list = to_layer.inbound_nodes[0].input_tensors
        tensor_list.append(from_tensor)
        from_tensor = tensor_list
        to_layer.inbound_nodes = []
    new_output = to_layer(from_tensor)
    for out_node in to_layer.outbound_nodes:
        for i, in_tensor in enumerate(out_node.input_tensors):
            if in_tensor == tmp_output:
                out_node.input_tensors[i] = new_output


if __name__ == "__main__":
    base = create_base()
    encoder = create_encoder()

    #new_model_1 = Model(inputs=base.input, outputs=encoder(base.output))
    #plot_model(new_model_1, to_file="plots/new_model_1.png")

    new_model_2 = extend_base(base)
    plot_model(new_model_2, to_file="plots/new_model_2.png")
    print(new_model_2.summary())

    base_layer = base.get_layer("base_dropout_2")
    top_layer = encoder.get_layer("encoder_flatten")
    connect_layers(base_layer.output, top_layer)

    new_model_3 = Model(inputs=base.input, outputs=encoder.output)
    plot_model(new_model_3, to_file="plots/new_model_3.png")
    print(new_model_3.summary())
I know this is a lot of text and a lot of code. But I feel like it is needed to explain the issue here.
EDIT: I just tried Theano, and I think the error gives away more information:
theano.gradient.DisconnectedInputError:
Backtrace when that variable is created:
It seems like every layer from the encoder model has some connection with the encoder input layer via TensorVariables.
So this is what I ended up with for the connect_layers() function:
def connect_layers(from_tensor, to_layer, old_tensor=None):
    # if there is any shared layer after the to_layer, it is not supported
    try:
        tmp_output = to_layer.output
    except AttributeError:
        raise ValueError("Connecting to shared layers is not supported!")

    # check if to_layer has multiple input_tensors, and is therefore some sort of merge layer
    if len(to_layer.inbound_nodes[0].input_tensors) > 1:
        tensor_list = to_layer.inbound_nodes[0].input_tensors
        found_tensor = False
        for i, tensor in enumerate(tensor_list):
            # exchange the old tensor with the newly created tensor
            if tensor == old_tensor:
                tensor_list[i] = from_tensor
                found_tensor = True
                break
        if not found_tensor:
            tensor_list.append(from_tensor)
        from_tensor = tensor_list
        to_layer.inbound_nodes = []
    else:
        to_layer.inbound_nodes = []

    new_output = to_layer(from_tensor)

    tmp_out_nodes = to_layer.outbound_nodes[:]
    to_layer.outbound_nodes = []
    # recursively connect all layers after the current to_layer
    for out_node in tmp_out_nodes:
        l = out_node.outbound_layer
        print("Connecting: " + str(to_layer) + " ----> " + str(l))
        connect_layers(new_output, l, tmp_output)
As each tensor has all the information about its root tensor via -> owner.inputs -> owner.inputs -> ..., all tensors following the new_output tensor must be updated.
It was a lot easier to debug this with the Theano backend than with TensorFlow.
I still need to figure out how to deal with shared layers. With the current implementation it is not possible to connect other models that contain a shared layer after the first to_layer.
I have a classification model in TF and can get a list of probabilities for the next class (preds). Now I want to select the highest element (argmax) and display its class label.
This may seem silly, but how can I get the class label that matches a position in the predictions tensor?
feed_dict={g['x']: current_char}
preds, state = sess.run([g['preds'],g['final_state']], feed_dict)
prediction = tf.argmax(preds, 1)
preds gives me a vector of predictions for each class. Surely there must be an easy way to just output the most likely class (label)?
Some info about my model:
x = tf.placeholder(tf.int32, [None, num_steps], name='input_placeholder')
y = tf.placeholder(tf.int32, [None, 1], name='labels_placeholder')

batch_size = tf.shape(x)[0]

x_one_hot = tf.one_hot(x, num_classes)
rnn_inputs = [tf.squeeze(i, squeeze_dims=[1]) for i in
              tf.split(x_one_hot, num_steps, 1)]

tmp = tf.stack(rnn_inputs)
print(tmp.get_shape())
tmp2 = tf.transpose(tmp, perm=[1, 0, 2])
print(tmp2.get_shape())
rnn_inputs = tmp2

with tf.variable_scope('softmax'):
    W = tf.get_variable('W', [state_size, num_classes])
    b = tf.get_variable('b', [num_classes], initializer=tf.constant_initializer(0.0))

rnn_outputs = rnn_outputs[:, num_steps - 1, :]
rnn_outputs = tf.reshape(rnn_outputs, [-1, state_size])
y_reshaped = tf.reshape(y, [-1])
logits = tf.matmul(rnn_outputs, W) + b
predictions = tf.nn.softmax(logits)
A prediction is an array of n values, one per class (label). It represents the model's "confidence" that the input corresponds to each of the classes (labels). You can check which label has the highest confidence value by using:
prediction = np.argmax(preds, 1)
After getting the index of the highest element with argmax, you use that index with your class labels to find the exact class name associated with it.
class_names[prediction]
Please refer to this link for more understanding.
You can use tf.reduce_max() for this. I would refer you to this answer.
Let me know if it works - will edit if it doesn't.
Mind that there are sometimes several ways to load a dataset. For instance, with Fashion-MNIST the tutorial could lead you to use load_data() and then create your own structure to interpret a prediction. However, you can also load these data with tensorflow_datasets.load(...) like here, after installing tensorflow-datasets, which gives you access to some DatasetInfo. So for instance, if your prediction is 9, you can tell it's a boot with:
import tensorflow_datasets as tfds
_, ds_info = tfds.load('fashion_mnist', with_info=True)
print(ds_info.features['label'].names[9])
When you use softmax, the labels you train the model on are either integers 0..n or one-hot encoded values. So if the original labels of your data are, say, string names, you must map them to integers first and keep the mapping as a variable (such as 0 -> "apple", 1 -> "orange", 2 -> "pear" ...).
When using integers (with loss='sparse_categorical_crossentropy'), you get the predictions as an array of probabilities; you just find the array index with the maximum value. You can then use this predicted index to reverse-map to your label:
predictedIndex = np.argmax(predictions)            # e.g. 2
predictedLabel = indexToLabelMap[predictedIndex]   # e.g. "pear"
If you use one-hot encoded labels (with loss='categorical_crossentropy'), the predicted index corresponds with the "hot" index of your label.
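As a toy illustration of the mapping (labels and numbers made up):

import numpy as np

index_to_label = {0: "apple", 1: "orange", 2: "pear"}  # mapping kept from training

predictions = np.array([0.1, 0.2, 0.7])   # model output for one sample
predicted_index = np.argmax(predictions)  # -> 2
print(index_to_label[predicted_index])    # -> "pear"

# with one-hot encoded training labels, the true index is recovered the same way
one_hot_label = np.array([0.0, 0.0, 1.0])
print(index_to_label[np.argmax(one_hot_label)])  # -> "pear"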
Just for reference, I needed this info when I was working with the MNIST dataset used in Google's Machine Learning Crash Course. There is also a good classification tutorial in the TensorFlow docs.
I create a neural network with the initial tensors like this
tensor_dict = {
    'model_conv1_weights': tf.get_variable('model_conv1_weights',
                                           shape=[4, 4, 1, 64],
                                           initializer=tf.truncated_normal_initializer(mean=10.0, stddev=2.0)),
    'model_conv1_biases': tf.get_variable('model_conv1_biases',
                                          shape=[64],
                                          initializer=tf.truncated_normal_initializer(mean=2.0, stddev=1.0)),
    'model_conv2_weights': tf.get_variable('model_conv2_weights',
                                           shape=[4, 4, 64, 32],
                                           initializer=tf.truncated_normal_initializer(mean=10.0, stddev=2.0)),
    'model_conv2_biases': tf.get_variable('model_conv2_biases',
                                          shape=[32],
                                          initializer=tf.truncated_normal_initializer(mean=2.0, stddev=1.0))
}
But when I start to train the model, the initial values of those tensors are very different from what I configured.
Has anyone here run into this issue before?
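One way to narrow this down is to look at the variables right after running the initializer and before any training step; a minimal TF 1.x sketch, assuming the graph-mode setup from the snippet above:

import tensorflow as tf

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for name, var in tensor_dict.items():
        values = sess.run(var)
        # with truncated_normal_initializer(mean=10.0, stddev=2.0) the sample mean
        # should come out close to 10; if it does here but not during training,
        # something later (e.g. a checkpoint restore or an assignment) is overwriting it
        print(name, values.mean(), values.std())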