How can I define a multiply-by-constant layer in Caffe (like MulConstant in Torch)? I need to add a predefined constant multiplier to an existing network.
Caffe fails to parse my attempt to scale everything by 0.85:
layers {
name: "caffe.ConstantMul_0"
type: "Eltwise"
bottom: "caffe.SpatialConvolution_0"
top: "caffe.ConstantMul_0"
eltwise_param {
op: MUL
coeff: 0.85
}
}
This is possible with a "Power" layer, which computes y = (shift + scale * x) ^ power; just set power to 1, shift to 0, and scale to whatever constant you need:
layer {
name: "caffe.ConstantMul_1"
bottom: "caffe.SpatialConvolution_3"
top: "caffe.ConstantMul_1"
type: "Power"
power_param {
power: 1
scale: 0.85
shift: 0
}
}
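Another option, a sketch assuming your Caffe is recent enough to have the "Scale" layer: initialize the scale with a constant filler and freeze it with lr_mult: 0, so the coefficient stays at 0.85 rather than being learned:
layer {
  name: "caffe.ConstantMul_1"
  type: "Scale"
  bottom: "caffe.SpatialConvolution_3"
  top: "caffe.ConstantMul_1"
  param { lr_mult: 0 decay_mult: 0 }  # freeze: keep the coefficient at its initial value
  scale_param {
    num_axes: 0  # a single scalar coefficient applied to the whole blob
    filler { type: "constant" value: 0.85 }
    bias_term: false
  }
}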
The "Eltwise" layer supports three operations: PROD, SUM and MAX (see EltwiseParameter in caffe.proto for details). Your snippet fails to parse because the parameter field is called operation, not op, and there is no MUL value; for an element-wise product, set it to PROD. Two more caveats: coeff is only supported for the SUM operation, and "Eltwise" expects at least two bottom blobs, so the constant has to come in as a second input (see the "DummyData" sketch below the corrected snippet):
layer {
  name: "caffe.ConstantMul_0"
  type: "Eltwise"
  bottom: "caffe.SpatialConvolution_0"
  bottom: "const_scale"  # second input: a blob filled with 0.85 (see below)
  top: "caffe.ConstantMul_0"
  eltwise_param {
    operation: PROD
  }
}
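For the PROD route to actually run, the layer needs that second bottom holding the constant. A minimal sketch using "DummyData" (the blob name "const_scale" and the shape values are placeholders; the shape must match "caffe.SpatialConvolution_0", and this layer must appear before the "Eltwise" layer):
layer {
  name: "const_scale"
  type: "DummyData"
  top: "const_scale"
  dummy_data_param {
    shape { dim: 1 dim: 64 dim: 56 dim: 56 }  # placeholder: must equal the conv output shape
    data_filler { type: "constant" value: 0.85 }
  }
}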
I've got a huge data set (40 GB) in LMDB that I use for training a binary classifier with Caffe.
The data layer in Caffe supplies integer labels.
Are there any ready-made layers that could convert them into floats and add some random jitter, so I could apply the label-smoothing technique described in section 7.5.1?
I have seen examples with HDF5, but they would require regenerating the data set, which I would like to avoid.
You can use a "DummyData" layer to generate the random noise you wish to add to the labels. Once you have the noise, use an "Eltwise" layer to sum the two:
layer {
name: "noise"
type: "DummyData"
top: "noise"
dummy_data_param {
  shape { dim: 10 dim: 1 dim: 1 dim: 1 } # must match the shape of the "label" blob; batch size 10 assumed
  data_filler { type: "uniform" min: -0.1 max: 0.1 } # noise ~ U(-0.1, 0.1)
}
}
layer {
name: "label_noise"
type: "Eltwise"
bottom: "label" # the input integer labels
bottom: "noise"
top: "label_noise"
eltwise_param { operation: SUM }
}
After some struggling, I decided to try the simplest possible task: training a network to classify whether a number is non-negative. And I failed...
I generated the data with the following code, and I'm not sure it is right. I read the data back from the file and it looked correct, though...
#pragma comment(lib, "hdf5")
#pragma comment(lib, "hdf5_cpp")
#include <cstdint>
#include <array>
#include <random>
#include <string>
#include <vector>
using namespace std;
#include <H5Cpp.h>
using namespace H5;
mt19937 rng;
// map a random 32-bit integer to a float uniformly distributed in [i_min, i_max)
float randf(float i_min, float i_max)
{
return rng() * ((i_max - i_min) / 0x100000000) + i_min;
}
#define NAME "pos_neg"
#define TRAIN_SET_SIZE 0x100000
#define TEST_SET_SIZE 0x10000
void make(const string &i_cat, uint32_t i_count)
{
H5File file(NAME "." + i_cat + ".h5", H5F_ACC_TRUNC);
hsize_t dataDim[2] = { i_count, 1 };
hsize_t labelDim = i_count;
FloatType dataType(PredType::NATIVE_FLOAT);
DataSpace dataSpace(2, dataDim);
DataSet dataSet = file.createDataSet("data", dataType, dataSpace);
IntType labelType(PredType::NATIVE_INT);
DataSpace labelSpace(1, &labelDim);
DataSet labelSet = file.createDataSet("label", labelType, labelSpace);
vector<float> data(i_count);
vector<int> labels(i_count);
for (uint32_t i = 0; i < i_count / 2; ++i)
{
labels[i * 2] = 0;
data[i * 2] = randf(0.f, 1.f);
labels[i * 2 + 1] = 1;
data[i * 2 + 1] = randf(-1.f, 0.f);
}
dataSet.write(&data[0], PredType::NATIVE_FLOAT);
labelSet.write(&labels[0], PredType::NATIVE_INT);
}
int main()
{
make("train", TRAIN_SET_SIZE);
make("test", TEST_SET_SIZE);
}
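A side note on the HDF5 sources referenced below: the source of an "HDF5Data" layer is not an HDF5 file itself but a plain text file listing HDF5 file paths, one per line. So pos_neg_train.txt is presumably a one-line list file pointing at the file the generator wrote, e.g.:
pos_neg.train.h5
and pos_neg_test.txt likewise contains pos_neg.test.h5.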
The network looks like this:
name: "PosNegNet"
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "pos_neg_train.txt"
batch_size: 64
}
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TEST
}
hdf5_data_param {
source: "pos_neg_test.txt"
batch_size: 65536
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "data"
top: "fc1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc1"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc1"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
And here is one set of solver parameters I tried:
net: "pos_neg.prototxt"
test_iter: 1
test_interval: 500
base_lr: 0.001
momentum: 0.9
momentum2: 0.999
lr_policy: "fixed"
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "pos_neg"
type: "Adam"
solver_mode: GPU
I ran caffe.exe on Windows, and I always got loss = 0 and accuracy = 0.5.
I know I must have done something wrong, but I don't know where to start looking, other than digging through the source code...
I also found Caffe fairly slow: I got only around 16 iterations per second for float[64] data with 1024 items per batch on a 1080 Ti. Is that normal, or did I do something wrong again?
Set num_output: 2 in your "fc1" layer: when using "SoftmaxWithLoss" and/or "Accuracy" layers, Caffe expects your prediction to be a vector of per-class scores. In your case you have two classes, so this vector should be of length 2 (and not 1 as it currently stands). This also explains the symptoms you saw: a softmax over a single score is always exactly 1, so the loss is -log(1) = 0, and the net always predicts the same class, giving 0.5 accuracy on your balanced data.
Alternatively, you can keep num_output: 1 and switch the loss to a "SigmoidCrossEntropyLoss" layer. However, you will then not be able to use the "Accuracy" layer anymore...
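A minimal sketch of the fix (only the changed layer is shown; the param blocks and the rest of the net stay as in the question):
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  inner_product_param {
    num_output: 2  # one score per class ("non-negative" / "negative") for SoftmaxWithLoss
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}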
I am currently reading the paper 'CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection'. It uses skip connections to fuse conv3-3, conv4-3 and conv5-3 together; the steps are shown below:
Extract the feature maps of the face region (at multiple scales conv3-3, conv4-3, conv5-3) and apply RoI-Pooling to it (i.e. convert to a fixed height and width).
L2-normalize each feature map.
Concatenate the (RoI-pooled and normalized) feature maps of the face (at multiple scales) with each other (creates one tensor).
Apply a 1x1 convolution to the face tensor.
Apply two fully connected layers to the face tensor, creating a vector.
I used Caffe and made a prototxt based on the Faster R-CNN VGG16 model; the following parts were added to the original prototxt:
# roi pooling the conv3-3 layer and L2 normalize it
layer {
name: "roi_pool3"
type: "ROIPooling"
bottom: "conv3_3"
bottom: "rois"
top: "pool3_roi"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.25 # 1/4
}
}
layer {
name:"roi_pool3_l2norm"
type:"L2Norm"
bottom: "pool3_roi"
top:"pool3_roi"
}
-------------
# roi pooling the conv4-3 layer and L2 normalize it
layer {
name: "roi_pool4"
type: "ROIPooling"
bottom: "conv4_3"
bottom: "rois"
top: "pool4_roi"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.125 # 1/8
}
}
layer {
name:"roi_pool4_l2norm"
type:"L2Norm"
bottom: "pool4_roi"
top:"pool4_roi"
}
--------------------------
# roi pooling the conv5-3 layer and L2 normalize it
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv5_3"
bottom: "rois"
top: "pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name:"roi_pool5_l2norm"
type:"L2Norm"
bottom: "pool5"
top:"pool5"
}
# concat roi_pool3, roi_pool4, roi_pool5 and apply 1*1 conv
layer {
name:"roi_concat"
type: "Concat"
concat_param {
axis: 1
}
bottom: "pool5"
bottom: "pool4_roi"
bottom: "pool3_roi"
top:"roi_concat"
}
layer {
name:"roi_concat_1*1_conv"
type:"Convolution"
top:"roi_concat_1*1_conv"
bottom:"roi_concat"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 1
weight_filler{
type:"xavier"
}
bias_filler{
type:"constant"
}
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "roi_concat_1*1_conv"
top: "fc6"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
}
}
During training, I ran into this issue:
F0616 16:43:02.899025 3712 net.cpp:757] Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 1 1 4096 25088 (102760448); target param shape is 4096 10368 (42467328).
To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
I could not figure out what went wrong; I need some help from you, if you can spot the problem or offer an explanation.
Really appreciated!!
The error message you got is quite clear. You are trying to fine-tune the weights of the layers, but for the "fc6" layer you have a problem:
The original net you copied the weights from had an "fc6" layer with an input dimension of 25088 (= 512 x 7 x 7, the flattened pool5 of VGG16). Your "fc6" layer, on the other hand, has an input dimension of 10368 (= 128 x 9 x 9: the 1x1 convolution above it outputs 128 channels, and its pad: 1 grows the 7x7 RoI-pooled maps to 9x9). You cannot use the same W matrix (aka param 0 of this layer) if the input dimension is different.
Now that you know the problem, look at the error message again:
Cannot copy param 0 weights from layer 'fc6'; shape mismatch.
Source param shape is 1 1 4096 25088 (102760448);
target param shape is 4096 10368 (42467328).
Caffe cannot copy the W matrix (param 0) of the "fc6" layer because its shape does not match the shape of the W stored in the .caffemodel you are trying to fine-tune.
What can you do?
Simply read the next line of the error message:
To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
Just rename the layer, and Caffe will learn the weights from scratch (for this layer only).
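A minimal sketch of the rename (fc6_cms is just an illustrative name; the bottom/top names stay the same, so the rest of the net is unaffected):
layer {
  name: "fc6_cms"  # new name: no match in the .caffemodel, so this layer is trained from scratch
  type: "InnerProduct"
  bottom: "roi_concat_1*1_conv"
  top: "fc6"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 4096
  }
}
Note also that if you intended "fc6" to see 7x7 feature maps, you should drop the pad: 1 from the 1x1 convolution; even then its input would be 128 x 7 x 7 = 6272 rather than 25088, so the rename is needed either way.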
I have a 5D blob like 1x8x128x128, and a "Convolution" layer that is able to process it. But when I want to use a "Pooling" layer, it does not work:
Check failed: 4 == bottom[0]->num_axes() (4 vs. 5) Input must have 4
axes, corresponding to (num, channels, height, width)
I think it is just not supported by Caffe yet. Could I just use a convolution layer to do the pooling instead?
If you want to pool only the two spatial dimensions (height and width), you can "Reshape" to 4D ("squashing" the channel and temporal dimensions into a single axis), pool, and then "Reshape" back to 5D:
layer {
name: "pool/reshape4D"
type: "Reshape"
bottom: "in"
top: "pool/reshape4D"
reshape_param { axis: 1 num_axes: 2 shape { dim: -1 } } # merge axes 1 and 2 (channel x temporal) into one
}
layer {
name: "pool"
type: "Pooling"
bottom: "pool/reshape4D"
top: "pool"
# pooling params here...
}
layer {
name: "pool/reshape5D"
type: "Reshape"
bottom: "pool"
top: "pool/reshape5D"
reshape_param { axis: 1 num_axes: 1 shape { dim: -1 dim: <temporal_dim> } } # replace <.> with the actual temporal dimension size.
}
See the definition of ReshapeParameter in caffe.proto for more details.
I'm working with an older branch of Caffe, and I need to modify the prototxt file by slicing the input layer.
I know that in the new syntax it looks like this:
layer {
name: "slice"
type: "Slice"
bottom: "labelAndMask"
## Example of layer with a shape N x 5 x Height x Width
top: "label"
top: "mask"
slice_param {
axis: 1
slice_point: 1
}
}
What would be the equivalent in the old prototxt format? Also, where in the caffe sources could I look this up by myself?
You should look at the bottom of $CAFFE_ROOT/src/caffe/proto/caffe.proto, where you'll find the V1LayerParameter definition.
The old-syntax slice layer looks like this:
layers {
type: SLICE # this is NOT a string, but an enum
name: "slice"
bottom: "labelAndMask"
## Example of layer with a shape N x 5 x Height x Width
top: "label"
top: "mask"
slice_param {
axis: 1
slice_point: 1
}
}