I have volumetric data of size 256x128x256. Because of limited memory, I cannot feed the whole volume directly to Caffe. Instead, I randomly extract n_sample patches of size 50x50x50 from the volume and store them in HDF5. I can already extract random patches from the raw data and its labels with the extract_patch_from_volumetric_data function.
I want to store these patches in an HDF5 file, and the code below does that. Could you please check my implementation? I am unsure about the initialization of the raw_patches and label_patches arrays (lines 2 and 3) and about the order in which the patches are stored (lines 8 to 11). Thank you for reading.
num_sample = 1000
raw_patches = np.zeros([num_sample, 1, 50, 50, 50], dtype=np.float16)
label_patches = np.zeros([num_sample, 1, 50, 50, 50], dtype=np.int8)
for i in range(num_sample):
    raw_patch, label_patch = extract_patch_from_volumetric_data(in_data)
    # raw_patch shape: [50x50x50], label_patch shape: [50x50x50]
    # Store them in an array
    raw_patches[i, 0, :, :, :] = raw_patch
    label_patches[i, 0, :, :, :] = label_patch
raw_patches = raw_patches[0:num_sample, :, :, :, :]
label_patches = label_patches[0:num_sample, :, :, :, :]
# Store in HDF5 and txt path
with h5py.File('./trainMS_%s.h5' % index_file, 'w') as f:
    f['data'] = raw_patches
    f['label'] = label_patches
with open('./trainMS_list.txt', 'a') as f:
    f.write('./trainMS_%s.h5\n' % index_file)
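For reference, the extraction step described above could look roughly like the following sketch. The helper name extract_random_patch and its arguments are hypothetical; the actual extract_patch_from_volumetric_data is not shown in this post, so this is only an assumption about what it does.

```python
import numpy as np

def extract_random_patch(volume, labels, patch_size=50, rng=None):
    """Hypothetical helper: cut one raw/label patch pair at the same random offset."""
    rng = np.random.default_rng() if rng is None else rng
    # Pick a start index per axis so that the patch stays inside the volume.
    starts = [int(rng.integers(0, dim - patch_size + 1)) for dim in volume.shape]
    window = tuple(slice(s, s + patch_size) for s in starts)
    return volume[window], labels[window]

# Placeholder arrays standing in for the real volume and its label map.
volume = np.zeros((256, 128, 256), dtype=np.float32)
labels = np.zeros((256, 128, 256), dtype=np.int8)
raw_patch, label_patch = extract_random_patch(volume, labels)
print(raw_patch.shape, label_patch.shape)
```

Using the same window for both arrays keeps every voxel aligned with its label.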
This is my prototxt:
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "./trainMS_list.txt"
batch_size: 1
}
}
layer {
name: "conv1a"
type: "Convolution"
bottom: "data"
top: "conv1a"
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
}
}
}
Update: script to test the convolution
filename = './trainMS_0.h5'
with h5py.File(filename, 'r') as hf:
    data = hf.get('data')
    npdata = np.array(data)
    print(npdata.shape)  # 100x1x50x50x50
    label = hf.get('label')
    nplabel = np.array(label)
    print(nplabel.shape)  # 100x50x50x50
caffe.set_mode_cpu()
net = caffe.Net('conv.prototxt', caffe.TEST)
im = npdata[20, 0, :, :, :]  # input is 50x50x50
im_input = im[np.newaxis, np.newaxis, :, :]
net.blobs['data'].reshape(*im_input.shape)
net.blobs['data'].data[...] = im_input
# pick first filter output
conv0 = net.blobs['conv'].data[0, 0]
print("pre-surgery output mean {:.2f}".format(conv0.mean()))  # output is 0
# set first filter bias to 1
net.params['conv'][1].data[0] = 1.
net.forward()
conv_out = net.blobs['conv'].data[0, 0]
print(conv_out.shape)  # 50x50x50
I have a huge data set in LMDB (40 GB) that I use to train a binary classifier with Caffe.
The Data layer in Caffe provides integer labels.
Is there a ready-made layer that could convert them to floats and add some random jitter, so that I could apply the label smoothing technique described in section 7.5.1 here?
I have seen examples with HDF5, but they require regenerating the data set, which I would like to avoid.
You can use a DummyData layer to generate the random noise you wish to add to the labels. Once you have the noise, use an Eltwise layer to sum the two:
layer {
name: "noise"
type: "DummyData"
top: "noise"
dummy_data_param {
shape { dim: 10 dim: 1 dim: 1 dim: 1 } # assuming batch size = 10
data_filler { type: "uniform" min: -0.1 max: 0.1 } # noise ~U(-0.1, 0.1)
}
}
layer {
name: "label_noise"
type: "Eltwise"
bottom: "label" # the input integer labels
bottom: "noise"
top: "label_noise"
eltwise_param { operation: SUM }
}
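To make the effect of these two layers concrete, here is a small NumPy sketch of the same arithmetic, assuming a batch of 10 binary labels. The array names are illustrative only and nothing here calls Caffe. As far as I know, Eltwise expects all of its bottoms to have the same shape, so the shape given to DummyData should match the shape of the label blob in your version of Caffe.

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of 10 integer labels, as the Data layer would provide them.
labels = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0], dtype=np.float32)

# What the DummyData layer with a uniform filler produces: noise ~ U(-0.1, 0.1).
noise = rng.uniform(-0.1, 0.1, size=labels.shape).astype(np.float32)

# What the Eltwise SUM layer produces: the jittered float labels.
label_noise = labels + noise
print(label_noise)
```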
I am currently reading the paper 'CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection'. It uses skip connections to fuse conv3-3, conv4-3 and conv5-3 together. The steps are as follows:
Extract the feature maps of the face region (at multiple scales conv3-3, conv4-3, conv5-3) and apply RoI-Pooling to it (i.e. convert to a fixed height and width).
L2-normalize each feature map.
Concatenate the (RoI-pooled and normalized) feature maps of the face (at multiple scales) with each other (creates one tensor).
Apply a 1x1 convolution to the face tensor.
Apply two fully connected layers to the face tensor, creating a vector.
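The shape bookkeeping of the steps above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the channel counts are toy values, the L2 normalization is assumed to run along the channel axis at each spatial position, and the 1x1 convolution is written as a plain matrix product over channels without bias.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Scale x to unit L2 norm along the given axis."""
    norm = np.sqrt(np.sum(x * x, axis=axis, keepdims=True))
    return x / (norm + eps)

rng = np.random.default_rng(0)
num_rois = 2
# Toy stand-ins for the RoI-pooled maps of conv3-3, conv4-3 and conv5-3 (N x C x 7 x 7).
pool3 = rng.standard_normal((num_rois, 4, 7, 7))
pool4 = rng.standard_normal((num_rois, 8, 7, 7))
pool5 = rng.standard_normal((num_rois, 8, 7, 7))

# Normalize each map, then concatenate along the channel axis.
fused = np.concatenate([l2_normalize(p) for p in (pool3, pool4, pool5)], axis=1)

# A 1x1 convolution mixes channels only: weights of shape (out_channels, in_channels).
w = rng.standard_normal((6, fused.shape[1]))
reduced = np.einsum('oc,nchw->nohw', w, fused)

# Flatten per RoI; this is the vector the first fully connected layer would see.
fc_in = reduced.reshape(num_rois, -1)
print(fused.shape, reduced.shape, fc_in.shape)
```

A 1x1 convolution without padding keeps the 7x7 spatial size, so the flattened length is out_channels x 7 x 7.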
I used Caffe and wrote a prototxt based on the Faster R-CNN VGG16 model. The following parts were added to the original prototxt:
# roi pooling the conv3-3 layer and L2 normalize it
layer {
name: "roi_pool3"
type: "ROIPooling"
bottom: "conv3_3"
bottom: "rois"
top: "pool3_roi"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.25 # 1/4
}
}
layer {
name:"roi_pool3_l2norm"
type:"L2Norm"
bottom: "pool3_roi"
top:"pool3_roi"
}
-------------
# roi pooling the conv4-3 layer and L2 normalize it
layer {
name: "roi_pool4"
type: "ROIPooling"
bottom: "conv4_3"
bottom: "rois"
top: "pool4_roi"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.125 # 1/8
}
}
layer {
name:"roi_pool4_l2norm"
type:"L2Norm"
bottom: "pool4_roi"
top:"pool4_roi"
}
--------------------------
# roi pooling the conv5-3 layer and L2 normalize it
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv5_3"
bottom: "rois"
top: "pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name:"roi_pool5_l2norm"
type:"L2Norm"
bottom: "pool5"
top:"pool5"
}
# concat roi_pool3, roi_pool4, roi_pool5 and apply 1*1 conv
layer {
name:"roi_concat"
type: "Concat"
concat_param {
axis: 1
}
bottom: "pool5"
bottom: "pool4_roi"
bottom: "pool3_roi"
top:"roi_concat"
}
layer {
name:"roi_concat_1*1_conv"
type:"Convolution"
top:"roi_concat_1*1_conv"
bottom:"roi_concat"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 1
weight_filler{
type:"xavier"
}
bias_filler{
type:"constant"
}
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "roi_concat_1*1_conv"
top: "fc6"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
}
}
During training, I ran into the following error:
F0616 16:43:02.899025 3712 net.cpp:757] Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 1 1 4096 25088 (102760448); target param shape is 4096 10368 (42467328).
To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
I could not find out what is going wrong. Any help spotting the problem or explaining it would be really appreciated!
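The error message you got is quite clear. You are trying to fine-tune the weights of the layers, but for the "fc6" layer you have a problem:
The original net you copied the weights from had an "fc6" layer with an input dimension of 25088 (the source param shape). Your "fc6" layer, on the other hand, has an input dimension of 10368 (the target param shape): the 1x1 convolution in front of it outputs 128 channels, and its pad: 1 grows the 7x7 RoI-pooled map to 9x9, so 128 x 9 x 9 = 10368. You cannot use the same W matrix (aka param 0 of this layer) if the input dimension is different.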
Now that you know the problem, look at the error message again:
Cannot copy param 0 weights from layer 'fc6'; shape mismatch.
Source param shape is 1 1 4096 25088 (102760448);
target param shape is 4096 10368 (42467328).
Caffe cannot copy the W matrix (param 0) of the "fc6" layer, because its shape does not match the shape of the W stored in the .caffemodel you are trying to fine-tune.
What can you do?
Simply read the next line of the error message:
To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
Just rename the layer, and caffe will learn the weights from scratch (only for this layer).
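The two numbers in the error message can be reproduced with a few lines of arithmetic. The sketch below assumes the standard convolution output-size formula and that the saved VGG16 "fc6" takes a 512 x 7 x 7 input; the helper name conv_output_size is made up for this example.

```python
def conv_output_size(in_size, kernel, pad=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (in_size + 2 * pad - kernel) // stride + 1

pooled = 7  # pooled_h / pooled_w of the ROIPooling layers

# The 1x1 convolution with pad: 1 turns the 7x7 map into a 9x9 map.
side = conv_output_size(pooled, kernel=1, pad=1)

target_inputs = 128 * side * side      # input size of "fc6" in the new prototxt
source_inputs = 512 * pooled * pooled  # input size of "fc6" in the saved model (assumed 512 x 7 x 7)

print(side, target_inputs, source_inputs)
print(4096 * target_inputs, 4096 * source_inputs)
```

With pad: 0 on the 1x1 convolution the map would stay 7x7, but the input dimension would still differ from the saved one because of the 128 channels, so the layer has to be renamed either way.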
I designed a net similar to FCN. The input data is 1*224*224 and the input label is 1*224*224, but I get this error:
F0502 07:57:30.032742 18127 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50176 vs. 1) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
Here is the input layer:
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
image_data_param {
source: "/home/zhaimo/fcn-master/mo/train.txt"
batch_size: 1
shuffle: true
}
}
The softmax loss layer:
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "upscore1"
bottom: "label"
top: "loss"
loss_param {
ignore_label: 255
normalize: false
}
}
The train.txt file:
/home/zhaimo/fcn-master/data/vessel/train/original/01.png /home/zhaimo/SegNet/data/vessel/train/label/01.png
/home/zhaimo/fcn-master/data/vessel/train/original/02.png /home/zhaimo/SegNet/data/vessel/train/label/02.png
/home/zhaimo/fcn-master/data/vessel/train/original/03.png /home/zhaimo/SegNet/data/vessel/train/label/03.png
/home/zhaimo/fcn-master/data/vessel/train/original/04.png /home/zhaimo/SegNet/data/vessel/train/label/04.png
The first file name on each line is the input image and the second one is its label image.
===========================update=======================================
I tried to use two ImageData layers as input:
layer {
name: "data"
type: "ImageData"
top: "data"
image_data_param {
source: "/home/zhaimo/fcn-master/mo/train_o.txt"
batch_size: 1
shuffle: false
}
}
layer {
name: "label"
type: "ImageData"
top: "label"
image_data_param {
source: "/home/zhaimo/fcn-master/mo/train_l.txt"
batch_size: 1
shuffle: false
}
}
but I get another error:
I0502 08:34:46.429774 19100 layer_factory.hpp:77] Creating layer data
I0502 08:34:46.429808 19100 net.cpp:100] Creating Layer data
I0502 08:34:46.429816 19100 net.cpp:408] data -> data
F0502 08:34:46.429834 19100 layer.hpp:389] Check failed: ExactNumTopBlobs() == top.size() (2 vs. 1) ImageData Layer produces 2 top blob(s) as output.
*** Check failure stack trace: ***
Aborted (core dumped)
train_o.txt:
/home/zhaimo/fcn-master/data/vessel/train/original/01.png
/home/zhaimo/fcn-master/data/vessel/train/original/02.png
/home/zhaimo/fcn-master/data/vessel/train/original/03.png
/home/zhaimo/fcn-master/data/vessel/train/original/04.png
/home/zhaimo/fcn-master/data/vessel/train/original/05.png
train_l.txt:
/home/zhaimo/SegNet/data/vessel/train/label/01.png
/home/zhaimo/SegNet/data/vessel/train/label/02.png
/home/zhaimo/SegNet/data/vessel/train/label/03.png
/home/zhaimo/SegNet/data/vessel/train/label/04.png
/home/zhaimo/SegNet/data/vessel/train/label/05.png
===============================Update2===================================
If I use two ImageData layers, how should I modify deploy.prototxt?
Here is the file I wrote:
layer {
name: "data"
type: "ImageData"
top: "data"
top: "tmp0"
input_param { shape: { dim: 1 dim: 1 dim: 224 dim: 224 } }
}
and the forward.py file:
import numpy as np
import matplotlib.pyplot as plt  # used by the plotting calls at the end of the script
from PIL import Image
caffe_root = '/home/zhaimo/'
import sys
sys.path.insert(0, caffe_root + 'caffe-master/python')
import caffe
# load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
im = Image.open('/home/zhaimo/fcn-master/data/vessel/test/13.png')
in_ = np.array(im, dtype=np.float32)
#in_ = in_[:,:,::-1]
#in_ -= np.array((104.00698793,116.66876762,122.67891434))
#in_ = in_.transpose((2,0,1))
# load net
net = caffe.Net('/home/zhaimo/fcn-master/mo/deploy.prototxt', '/home/zhaimo/fcn-master/mo/snapshot/train/_iter_200000.caffemodel', caffe.TEST)
# shape for input (data blob is N x C x H x W), set data
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
# run net and take argmax for prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)
plt.imshow(out)  # draw the predicted label map before saving the figure
plt.axis('off')
plt.savefig('/home/zhaimo/fcn-master/mo/result/13.png')
But I get the error:
F0504 08:16:46.423981 3383 layer.hpp:389] Check failed: ExactNumTopBlobs() == top.size() (2 vs. 1) ImageData Layer produces 2 top blob(s) as output.
How should I modify the forward.py file, please?
Your problem is with the number of top blobs of the data layers: an ImageData layer always produces two tops. For two ImageData layers use this:
layer {
name: "data"
type: "ImageData"
top: "data"
top: "tmp"
image_data_param {
source: "/home/zhaimo/fcn-master/mo/train_o.txt"
batch_size: 1
shuffle: false
}
}
layer {
name: "label"
type: "ImageData"
top: "label"
top: "tmp1"
image_data_param {
# you probably also need
# is_color: false
source: "/home/zhaimo/fcn-master/mo/train_l.txt"
batch_size: 1
shuffle: false
}
}
In the text files, just set every label to 0, that is, append " 0" after each image path. You are not going to use tmp/tmp1, so the value doesn't matter.
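A small sketch of how such list lines could be generated, assuming the file names follow the 01.png to 05.png pattern from the question. The helper name with_dummy_labels is made up for this example, and writing the lines to train_o.txt and train_l.txt is left out.

```python
def with_dummy_labels(paths, label=0):
    """Append a dummy integer label to every image path, one 'path label' entry per line."""
    return ["%s %d" % (path, label) for path in paths]

originals = ["/home/zhaimo/fcn-master/data/vessel/train/original/%02d.png" % i
             for i in range(1, 6)]
label_images = ["/home/zhaimo/SegNet/data/vessel/train/label/%02d.png" % i
                for i in range(1, 6)]

train_o_lines = with_dummy_labels(originals)
train_l_lines = with_dummy_labels(label_images)
print("\n".join(train_o_lines))
```

Both files must list the images in the same order, and shuffle must stay false in both layers so that each input stays paired with its label image.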
I am trying to migrate a Caffe network and model (weights) to TensorFlow.
The original first layer, defined at the end of this post, is a stride-1 convolution on a 1x128x128 gray image with kernel size 5x5 and 96 output channels.
I converted the weights from the caffemodel file to numpy arrays with this procedure:
net = caffe.Net(model, caffe.TEST)
net.copy_from(weights)
weights = net.params[name][0].data
bias = net.params[name][1].data
if "fc" in name:
    weights = weights.transpose()  # 2D
elif "conv" in name:
    weights = weights.transpose(2, 3, 1, 0)
The Caffe weights have shape (96, 1, 5, 5) and the biases have shape (96,). After the transpose, the new arrays, with weights shape (5, 5, 1, 96) and biases shape (96,), are used to initialize the TensorFlow filter.
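The effect of the transpose can be checked in isolation with NumPy. The sketch below uses a dummy array in place of the real Caffe weights; it only illustrates that transpose(2, 3, 1, 0) maps the Caffe layout (out, in, height, width) to the TensorFlow layout (height, width, in, out) without changing any kernel.

```python
import numpy as np

# Dummy weights in Caffe layout: (num_output, channels, kernel_h, kernel_w).
caffe_w = np.arange(96 * 1 * 5 * 5, dtype=np.float32).reshape(96, 1, 5, 5)

# TensorFlow layout: (kernel_h, kernel_w, in_channels, out_channels).
tf_w = caffe_w.transpose(2, 3, 1, 0)
print(tf_w.shape)

# The 5x5 kernel of output filter 3, input channel 0 is the same in both layouts.
same_kernel = np.array_equal(tf_w[:, :, 0, 3], caffe_w[3, 0])
print(same_kernel)
```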
The TensorFlow code is as follows:
gray = tf.reduce_mean(images, axis=3, keep_dims=True)
self.gray = gray
conv1 = self._conv_layer(gray, name='conv1')
def _conv_layer(self, input_, output_dim=96,
                k_h=3, k_w=3, d_h=1, d_w=1, stddev=0.02,
                name="conv2d"):
    # Note: currently kernel size and input/output channel numbers are decided by the loaded filter weights.
    # Only the strides are decided by the calling parameters.
    with tf.variable_scope(name) as scope:
        filt = self.get_conv_filter(name)
        conv = tf.nn.conv2d(input_, filt, strides=[1, d_h, d_w, 1], padding='SAME')
        conv_biases = self.get_bias(name)
        return tf.nn.bias_add(conv, conv_biases)
def get_conv_filter(self, name):
    init = tf.constant_initializer(value=weights,
                                   dtype=tf.float32)
    shape = weights.shape
    var = tf.get_variable(name="filter", initializer=init, shape=shape)
    return var
I checked the input data of the Caffe net and TensorFlow's tensor gray: they contain the same numbers in the same 2D layout. Their shapes are (1,1,128,128) and (10, 128, 128, 1), since TensorFlow uses a batch size of 10.
I also checked the kernel through Caffe's print(net.blobs['conv1'].data[0,0,...]) and the numpy array used to initialize the TensorFlow variable with print(weights[:,:,:,0]).
A screenshot of the kernel's first layer is shown below:
The bias is -0.65039569 and the upper left corner of the image is:
0.30989584 0.30989584 0.29427084 0.21354167 0.16145833
0.30989584 0.30989584 0.29427084 0.21354167 0.16145833
0.28645834 0.28645834 0.27083334 0.19010417 0.09114584
However, the upper left corners of conv1's first feature map differ between the two. (Please ignore the irrelevant 256.)
Only the leftmost column is consistent. I manually calculated and checked the results: the first and second values from Caffe (-0.71238005 -0.74042225) are correct according to the definition of convolution, while the second value from TensorFlow (-0.71238005 -0.31195271) is incorrect.
Taking the padding into account, the first value comes from a 3x3 block of the image and the second should come from a 3x4 block.
Since TensorFlow has the correct first value, computed from the 3x3 block in the image corner, I assume the kernel layout, the image layout and the 'SAME' padding are correct. I thought a stride problem caused the incorrect second value, but the stride must be one, otherwise the size of TensorFlow's conv1 feature map would not be (10, 128, 128, 96).
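The manual check described above can be written as a small NumPy helper. This is a sketch on toy data, not the real image or kernel: it assumes zero padding, stride 1, an odd kernel size and no kernel flip (to my knowledge both Caffe and tf.nn.conv2d compute cross-correlation), and the bias would be added to the result afterwards.

```python
import numpy as np

def same_corr_value(image, kernel, row, col):
    """Zero-padded cross-correlation output at (row, col) for stride 1 and an odd kernel."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode='constant')
    return float(np.sum(padded[row:row + kh, col:col + kw] * kernel))

image = np.arange(36, dtype=np.float64).reshape(6, 6)  # toy image
kernel = np.ones((5, 5))                               # toy 5x5 kernel

first = same_corr_value(image, kernel, 0, 0)   # only image[0:3, 0:3] overlaps the kernel
second = same_corr_value(image, kernel, 0, 1)  # image[0:3, 0:4] overlaps the kernel
print(first, second)
```

With an all-ones kernel the two outputs are simply the sums of the 3x3 and 3x4 corner blocks, which makes the padding behaviour easy to verify.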
Caffe's convolution layer def:
input_param {
shape: {
dim: 10
dim: 1
dim: 128
dim: 128
}
}
transform_param {
crop_size: 128
mirror: false
}
}
layer{
name: "conv1"
type: "Convolution"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 5
stride: 1
pad: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
bottom: "data"
top: "conv1"
}
UPDATE:
Another contrived experiment (see the code below) shows that the TensorFlow implementation is able to compute the correct second value. However, the error remains in the situation above. What caused the error in the converted version?
input = np.random.rand(100,100)
input = input.reshape([1,100,100,1])
k = np.random.rand(5,5)
k = k.reshape([5,5,1,1])
input_tf = tf.constant(input,dtype=tf.float32)
init = tf.constant_initializer(value=k,
dtype=tf.float32)
filter = tf.get_variable(name="filter", initializer=init, shape=k.shape)
conv = tf.nn.conv2d(input_tf, filter, strides=[1,1,1,1], padding='SAME')
Is there any way to use only the G and B channels for training Caffe with an "ImageData" input layer?
You can add a convolution layer on top of your input that will select G and B:
layer {
name: "select_B_G"
type: "Convolution"
bottom: "data"
top: "select_B_G"
convolution_param { kernel_size: 1 num_output: 2 bias_term: false }
param { lr_mult: 0 } # do not learn parameters for this layer
}
You'll need to do some net surgery prior to training to set the weights of this layer (a blob of shape 2x3x1x1) to
net.params['select_B_G'][0].data[...] = np.array([[1,0,0],[0,1,0]], dtype='f4').reshape(2, 3, 1, 1)
Note: sometimes images loaded to caffe are going through channel-swap transformation, i.e., RGB -> BGR, therefore you need to be careful what channels you pick.
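To see why these weights select two channels, here is a NumPy sketch of what a 1x1 convolution without bias computes. It does not call Caffe; the input is a random toy image, and the channel order is assumed to be BGR, so channels 0 and 1 are B and G.

```python
import numpy as np

# Weights in Caffe layout (num_output, channels, kernel_h, kernel_w) = (2, 3, 1, 1).
w = np.array([[1, 0, 0], [0, 1, 0]], dtype='f4').reshape(2, 3, 1, 1)

# Toy input blob: one image, 3 channels (assumed B, G, R), 4x4 pixels.
bgr = np.random.default_rng(0).random((1, 3, 4, 4)).astype('f4')

# A 1x1 convolution is a weighted sum over input channels at every pixel.
out = np.einsum('oc,nchw->nohw', w[:, :, 0, 0], bgr)
print(out.shape)
```

Output channel 0 is a copy of input channel 0 and output channel 1 is a copy of input channel 1; to pick other channels, move the ones in the weight matrix accordingly.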
I wrote a simple Python layer to do this. Note that I didn't test this code.
import caffe
class ExtractGBChannelLayer(caffe.Layer):
    def setup(self, bottom, top):
        pass
    def reshape(self, bottom, top):
        bottom_shape = bottom[0].data.shape
        top_shape = [bottom_shape[0], 2, bottom_shape[2], bottom_shape[3]]  # because we only want G and B channels
        top[0].reshape(*top_shape)
    def forward(self, bottom, top):
        # copy G and B channels to top, note Caffe's BGR order!
        top[0].data[:, 0, ...] = bottom[0].data[:, 1, ...]
        top[0].data[:, 1, ...] = bottom[0].data[:, 0, ...]
    def backward(self, top, propagate_down, bottom):
        pass
You can save this file as MyPythonLayer.py.
In your prototxt you can insert a layer after the ImageData layer like this:
layer {
name: "GB"
type: "Python"
bottom: "data"
top: "GB"
python_param {
module: "MyPythonLayer"
layer: "ExtractGBChannelLayer"
}
}
Hope this works fine.
This is the Matlab code I used and it works.
caffe.reset_all(); % reset caffe
caffe.set_mode_gpu();
gpu_id = 0; % we will use the first gpu in this demo
caffe.set_device(gpu_id);
net_model = ['net_images.prototxt'];
net = caffe.Net(net_model, 'train')
a = zeros(1,1,3,2);
a(1,1,:,:) = [[1,0,0];[0,1,0]]'; % caffe uses BGR color channel order
net.layers('select_B_G').params(1).set_data(a);
solver = caffe.Solver(solverFN);
solver.solve();
net.save(fullfile(model_dir, 'my_net.caffemodel'));