Video classification using HDF5 in CAFFE? - machine-learning

I am using hdf5 layer for video classification (C3D). This is my code to generate hdf5 file
import h5py
import numpy as np
import skvideo.datasets
import skvideo.io
videodata = skvideo.io.vread('./v_ApplyEyeMakeup_g01_c01.avi')
videodata=videodata.transpose(3,0,1,2) # To chanelxdepthxhxw
videodata=videodata[None,:,:,:]
with h5py.File('./data.h5','w') as f:
f['data'] = videodata
f['label'] = 1
Now, the data.h5 is saved in the file video.list. I perform the classification based on the prototxt
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
hdf5_data_param {
source: "./video.list"
batch_size: 1
shuffle: true
}
}
layer {
name: "conv1a"
type: "Convolution"
bottom: "data"
top: "conv1a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: -0.1
}
axis: 1
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "conv1a"
top: "fc8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 101
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8"
bottom: "label"
top: "loss"
}
However, I got the error as
I0918 22:29:37.163431 32197 hdf5.cpp:35] Datatype class: H5T_INTEGER
F0918 22:29:37.164500 32197 blob.hpp:122] Check failed: axis_index < num_axes() (1 vs. 1) axis 1 out of range for 1-D Blob with shape 6 (6)
Update: When I change the code as f['label'] = 1, I also got the error F0918 23:04:39.884270 2138 hdf5.cpp:21] Check failed: ndims >= min_dim (0 vs. 1)
How should I fix it? I guess the hdf5 generating part has some error in label field. Thanks all

Please read carefully the answer you linked:
Your label should be an integer and not a 1-hot vector.
It seems like your data is of type integer. I suppose you would like to convert it to np.float32. And while you are at it, consider subtracting the mean.
Since your HDF5 file has only one sample, you cannot have label as a scalar ("0 dim array"). You need to make label as np.ones((1,1), dtype=np.float32).
Use h5ls ./data.h5 to verify that label is indeed an array and not a scalar.

Related

Caffe MLP example

I'm trying to learn caffe by making an xor example.
I'm following this link from the caffe website, but they are doing a CNN.
I'm trying to follow along the tutorial and I am stuck when it comes to compiling the model.
I made a prototxt file describes the model architecture, I am trying to make a two layered xor network. My code is below:
name: "xor_test"
layer {
name: "data"
type: "Data"
transform_param {
scale: 1
}
data_param {
source: "0 0 0
1 0 1
0 1 1
1 1 0"
backend: LMDB
batch_size: 1
}
top: "data"
top: "data"
}
layer {
name: "ip1"
type: "InnerProduct"
param { lr_mult: 1 }
param { lr_mult: 2 }
inner_product_param {
num_output: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "data"
top: "ip1"
}
layer {
name: "tanh1"
type: "Tanh"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
param { lr_mult: 1 }
param { lr_mult: 2 }
inner_product_param {
num_output: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "ip1"
top: "ip2"
}
layer {
name: "tanh2"
type: "Tanh"
bottom: "ip2"
top: "ip2"
}
I don't know if this model is correct, I can't find other examples to reference.
After this, the tutorial says to create a solver prototxt file which referenced the previously created file.
net: "test.prototxt"
test_iter: 2
test_interval: 5
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 5
# The maximum number of iterations
max_iter: 10
# solver mode: CPU or GPU
solver_mode: CPU
I'm not sure how to train or test the model since my inputs are not images.
Your input layer is incorrect. Since you are not using images as inputs, but rather simple binary vectors, you might consider using HDF5Data layer for input.
There is a good example here on how to construct and use this input data layer.

What is the default value of bias_filler type "constant" in Caffe's convolution layer?

In the example layer below, what is the default value of bias_filler type "constant" in Caffe's convolution layer?
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 12
kernel_size: 3
stride: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
I can't find the answer in the documentation at http://caffe.berkeleyvision.org/tutorial/layers/convolution.html.
You can find a lot of documentation/information in caffe.proto file. This file describe the syntax of caffe's prototxt/binaryproto.
Looking for the proto definition of constant filler (line 46):
optional float value = 2 [default = 0]; // the value in constant filler
You can see the default value is 0.

Caffe Copy PreTrained Weights of AlexNet to custom network that has Two AlexNets

I am trying to build a network that contains two image inputs. Each image will go through a network simultaneous with late fusion that will merge and give one outputs. I use a diagram below to show what I need (ps: sorry my english is not so good)
My network defined in caffe prototxt model definition file which contains exact AlexNet defined twice upto the Pool 5. For first network the layers has same name as in AlexNet while for the second net I added a "_1" suffix to each layer name.
My question is how can i copy the pretained weight?
For Eg: my convolution layer 1 of each network is as follows. Note that for conv1 the pretrained weights can easily be copied since the layer name is same as one in pretrained model. However of conv1_1 the same is different so I am afraid I cannot copy the pretrained weights? Or is there a way to do this even is the layer names are different?
layer {
name: "conv1"
type: "Convolution"
bottom: "data1"
top: "conv1"
param {
lr_mult: 0
decay_mult: 1
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data2"
top: "conv1_1"
param {
lr_mult: 0
decay_mult: 1
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
I presume that you are trying to finetune the whole network after initialization, otherwise you could simply use features extracted from AlexNet and start your training at the FC layer. For finetuning, you will need to copy the weights on the first network (one with the same names), and have second network share the weights with the first one. Take a look at this thread. Or rather this reply from Evan Shelmar.
I did something similar, here you can see the Siamese Network with Identical AlexNet. Identical AlexNet for Siamese Network. Here is prototxt file

Caffe : train network accuracy = 1 constant ! Accuracy issue

Right now, I am train network with with 2 class data... but accuracy is constant 1 after first iteration !
Input data is grayscale images. both class images are randomly selected when HDF5Data creation.
Why is that happened ? What's wrong or where is mistake !
network.prototxt :
name: "brainMRI"
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include: {
phase: TRAIN
}
hdf5_data_param {
source: "/home/shivangpatel/caffe/brainMRI1/train_file_location.txt"
batch_size: 10
}
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include: {
phase: TEST
}
hdf5_data_param {
source: "/home/shivangpatel/caffe/brainMRI1/test_file_location.txt"
batch_size: 10
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "softmax"
type: "Softmax"
bottom: "ip2"
top: "smip2"
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "smip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
Output :
I0217 17:41:07.912580 2913 net.cpp:270] This network produces output loss
I0217 17:41:07.912607 2913 net.cpp:283] Network initialization done.
I0217 17:41:07.912739 2913 solver.cpp:60] Solver scaffolding done.
I0217 17:41:07.912789 2913 caffe.cpp:212] Starting Optimization
I0217 17:41:07.912813 2913 solver.cpp:288] Solving brainMRI
I0217 17:41:07.912832 2913 solver.cpp:289] Learning Rate Policy: inv
I0217 17:41:07.920737 2913 solver.cpp:341] Iteration 0, Testing net (#0)
I0217 17:41:08.235076 2913 solver.cpp:409] Test net output #0: accuracy = 0.98
I0217 17:41:08.235194 2913 solver.cpp:409] Test net output #1: loss = 0.0560832 (* 1 = 0.0560832 loss)
I0217 17:41:35.831647 2913 solver.cpp:341] Iteration 100, Testing net (#0)
I0217 17:41:36.140849 2913 solver.cpp:409] Test net output #0: accuracy = 1
I0217 17:41:36.140949 2913 solver.cpp:409] Test net output #1: loss = 0.00757247 (* 1 = 0.00757247 loss)
I0217 17:42:05.465395 2913 solver.cpp:341] Iteration 200, Testing net (#0)
I0217 17:42:05.775877 2913 solver.cpp:409] Test net output #0: accuracy = 1
I0217 17:42:05.776000 2913 solver.cpp:409] Test net output #1: loss = 0.0144996 (* 1 = 0.0144996 loss)
.............
.............
Summarizing some information from the comments:
- You run test at intervals of test_interval:100 iterations.
- Each test interval goes over test_iter:5 * batch_size:10 = 50 samples.
- Your train and test sets seems to be very nit: all the negative samples (label=0) are grouped together before all the positive samples.
Consider your SGD iterative solver, you feed it batches of batch_size:10 during training. Your training set has 14,746 negative samples (that is 1474 batches) before any positive sample. So, for the first 1474 iterations your solver only "sees" negative examples and no positive ones.
What do you expect this solver will learn?
The problem
Your solver only sees negative examples, thus learns that no matter what the input is it should output "0". Your test set is also ordered in the same fashion, so testing only 50 samples at each test_interval, you only test on the negative examples in the test set resulting with a perfect accuracy of 1.
But as you noted, your net actually learned nothing.
Solution
I suppose you already guess what the solution should be by now. You need to shuffle your training set, and test your net on your entire test set.

Accuracy issue in caffe

I have a network which has 4 Boolean outputs. It is not a classification problem and each of them are meaningful. I expect to get a zero or one for each of them. Right now I have used the Euclidean loss function.
There are 1000000 samples. In the input file, each of them have 144 features, so there the size of the input is 1000000*144.
I have used batch size of 50, because otherwise the processing time is too much.
The output file is of the size 1000000*4, i.e. there are four output per each input.
When I am using the accuracy layer, it complains about the dimension of output. It needs just one Boolean output, not four. I think it is because it considers the problem as a classification problem.
I have two questions.
First, considering the error of the accuracy layer, is the Euclidean loss function suitable for this task? And How I can get the accuracy for my network?
Second,I gonna get the exact value of the predicted output for each of the four variable. I mean I need the exact predicted values for each test record. Now, I just have the loss value for each batch.
Please guide me to solve those issues.
Thanks,
Afshin
The train network is:
{ state {
phase: TRAIN
}
layer {
name: "abbas"
type: "HDF5Data"
top: "data"
top: "label"
hdf5_data_param {
source: "/home/afo214/Research/hdf5/simulation/Train-1000-11- 1/Train-Sc-B-1000-11-1.txt"
batch_size: 50
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "data"
top: "ip1"
inner_product_param {
num_output: 350
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig1"
bottom: "ip1"
top: "sig1"
type: "Sigmoid"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "sig1"
top: "ip2"
inner_product_param {
num_output: 150
weight_filler {
type: "xavier"
}
}
}
The test network is also:
state {
phase: TEST
}
layer {
name: "abbas"
type: "HDF5Data"
top: "data"
top: "label"
hdf5_data_param {
source: "/home/afo214/Research/hdf5/simulation/Train-1000-11- 1/Train-Sc-B-1000-11-1.txt"
batch_size: 50
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "data"
top: "ip1"
inner_product_param {
num_output: 350
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig1"
bottom: "ip1"
top: "sig1"
type: "Sigmoid"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "sig1"
top: "ip2"
inner_product_param {
num_output: 150
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig2"
bottom: "ip2"
top: "sig2"
type: "Sigmoid"
}
layer {
name: "ip4"
type: "InnerProduct"
bottom: "sig2"
top: "ip4"
inner_product_param {
num_output: 4
weight_filler {
type: "xavier"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip4"
bottom: "label"
top: "accuracy"
}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "ip4"
bottom: "label"
top: "loss"
}
And I get this error:
accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50 vs. 200) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
Without using the accuracy layer caffe gives me the loss value.
Should "EuclideanLoss" be used for predicting binary outputs?
If you are trying to predict discrete binary labels then "EuclideanLoss" is not a very good choice. This loss is better suited for regression tasks where you wish to predict continuous values (e.g., estimating coordinated of bounding boxes etc.).
For predicting discrete labels, "SoftmaxWithLoss" or "InfogainLoss" are better suited. Usually, "SoftmaxWithLoss" is used.
For predicting binary outputs you may also consider "SigmoidCrossEntropyLoss".
Why is there an error in the "Accuracy" layer?
In caffe, "Accuracy" layers expects two inputs ("bottom"s): one is a prediction vector and the other is the ground truth expected discrete label.
In your case, you need to provide, for each binary output a vector of length 2 with the predicted probabilities of 0 and 1, and a single binary label:
layer {
name: "acc01"
type: "Accuracy"
bottom: "predict01"
bottom: "label01"
top: "acc01"
}
In this example you measure the accuracy for a single binary output. The input "predict01" is a two-vector for each example in the batch (for batch_size: 50 the shape of this blob should be 50-by-2).
What can you do?
You are trying to predict 4 different outputs in a single net, therefore, you need 4 different loss and accuracy layers.
First, you need to split ("Slice") the ground truth labels into 4 scalars (instead of a single binary 4-vector):
layer {
name: "label_split"
bottom: "label" # name of input 4-vector
top: "label01"
top: "label02"
top: "label03"
top: "label04"
type: "Slice"
slice_param {
axis: 1
slice_point: 1
slice_point: 2
slice_point: 3
}
}
Now you have to have a prediction, loss and accuracy layer for each of the binary labels
layer {
name: "predict01"
type: "InnerProduct"
bottom: "sig2"
top: "predict01"
inner_product_param {
num_outout: 2 # because you need to predict 2 probabilities one for False, one for True
...
}
layer {
name: "loss01"
type: "SoftmaxWithLoss"
bottom: "predict01"
bottom: "label01"
top: "loss01"
}
layer {
name: "acc01"
type: "Accuracy"
bottom: "predict01"
bottom: "label01"
top: "acc01"
}
Now you need to replicate these three layer for each of the four binary labels you wish to predict.

Resources