I am trying to train LeNet on my own dataset. I generated an HDF5 file from my long 1D vector data set and created an HDF5 data layer as follows. I named the top blobs the same as I did when I generated the HDF5:
name: "Test_net"
layer {
name: "data"
type: "HDF5Data"
top: "Inputdata"
top: "label"
hdf5_data_param {
source:"~/*_hdf5_train.txt"
batch_size: 32
}
include{phase: TRAIN}
}
layer {
name: "data2"
type: "HDF5Data"
top: "Inputdata"
top: "label"
hdf5_data_param {
source:"~/*_hdf5_test.txt"
batch_size: 32
}
include{phase: TEST}
}
layer {
name: "conv1"
type: "convolution"
bottom: "data"
top: "conv1"
param {lr_mult:1}
param {lr_mult:2}
convolution_param{
num_output: 20
kernel_h: 1
kernel_w: 5
stride_h: 1
stride_w: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "xavier"
}
}
}
layer {
name: "pool1"
type: "pooling"
bottom: "conv1"
top: "pool1"
pooling_param{
pool: MAX
kernel_h: 1
kernel_w: 2
stride_h: 1
stride_w: 2
}
}
# more layers here...
layer{
name: "loss"
type: "SigmoidCrossEntropyLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
But when I tried to train, I got the following error from insert_splits.cpp:
insert_splits.cpp:29] Unknown bottom blob 'data' (layer 'conv1', bottom index 0)
*** Check failure stack trace: ***
# 0x7f19d7e735cd google::LogMessage::Fail()
# 0x7f19d7e75433 google::LogMessage::SendToLog()
# 0x7f19d7e7315b google::LogMessage::Flush()
# 0x7f19d7e75e1e google::LogMessageFatal::~LogMessageFatal()
# 0x7f19d82684dc caffe::InsertSplits()
# 0x7f19d8230d5e caffe::Net<>::Init()
# 0x7f19d8233f21 caffe::Net<>::Net()
# 0x7f19d829c68a caffe::Solver<>::InitTrainNet()
# 0x7f19d829d9f7 caffe::Solver<>::Init()
# 0x7f19d829dd9a caffe::Solver<>::Solver()
# 0x7f19d8211683 caffe::Creator_SGDSolver<>()
# 0x40a6c9 train()
# 0x4071c0 main
# 0x7f19d6dc8830 __libc_start_main
# 0x4079e9 _start
# (nil) (unknown)
Aborted (core dumped)
What did I do wrong?
Cheers,
Your data layer outputs two "blobs": "label" and "Inputdata". Your "conv1" layer expects as input a "blob" named "data". Caffe does not know that you meant "Inputdata" and "data" to be the same blob...
Now, since you already saved the hdf5 files with the "Inputdata" name, you cannot change this name in the "HDF5Data" layer; what you can do is change "data" to "Inputdata" in the "bottom" of the "conv1" layer.
PS,
Your loss layer requires two "bottom"s, "ip2" and "label"; do not forget to feed both of them.
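One quick way to double-check the names actually stored in your HDF5 file is to list its datasets with h5py; a minimal sketch (the filename here is hypothetical):

import h5py

# The dataset names must match the "top" names of the "HDF5Data" layer.
with h5py.File('my_hdf5_train.h5', 'r') as f:
    print(list(f.keys()))  # e.g. ['Inputdata', 'label']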
I'm trying to learn Caffe by making an XOR example.
I'm following this link from the caffe website, but they are doing a CNN.
I'm trying to follow along with the tutorial, and I am stuck when it comes to compiling the model.
I made a prototxt file that describes the model architecture; I am trying to make a two-layered XOR network. My code is below:
name: "xor_test"
layer {
name: "data"
type: "Data"
transform_param {
scale: 1
}
data_param {
source: "0 0 0
1 0 1
0 1 1
1 1 0"
backend: LMDB
batch_size: 1
}
top: "data"
top: "data"
}
layer {
name: "ip1"
type: "InnerProduct"
param { lr_mult: 1 }
param { lr_mult: 2 }
inner_product_param {
num_output: 3
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "data"
top: "ip1"
}
layer {
name: "tanh1"
type: "Tanh"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
param { lr_mult: 1 }
param { lr_mult: 2 }
inner_product_param {
num_output: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
bottom: "ip1"
top: "ip2"
}
layer {
name: "tanh2"
type: "Tanh"
bottom: "ip2"
top: "ip2"
}
I don't know if this model is correct; I can't find other examples to reference.
After this, the tutorial says to create a solver prototxt file which references the previously created file:
net: "test.prototxt"
test_iter: 2
test_interval: 5
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 5
# The maximum number of iterations
max_iter: 10
# solver mode: CPU or GPU
solver_mode: CPU
I'm not sure how to train or test the model since my inputs are not images.
Your input layer is incorrect. Since you are not using images as inputs, but rather simple binary vectors, you might consider using an "HDF5Data" layer for input.
There is a good example here on how to construct and use this input data layer.
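For example, here is a minimal h5py sketch for the XOR data (the filenames are hypothetical; the dataset names must match the "top"s of the "HDF5Data" layer):

import h5py
import numpy as np

# XOR truth table: 4 samples, 2 features, 1 binary label each.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y = np.array([0, 1, 1, 0], dtype=np.float32)

with h5py.File('xor.h5', 'w') as f:
    f.create_dataset('data', data=X)   # matches top: "data"
    f.create_dataset('label', data=y)  # matches top: "label"

# "HDF5Data" expects its source to be a text file listing .h5 paths,
# not the .h5 file itself.
with open('xor_train.txt', 'w') as f:
    f.write('xor.h5\n')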
I'm trying to reproduce the following thesis with Caffe:
Deep EXpectation
The last layer has 100 outputs, each output implying the probability of a predicted age. The final predicted age is calculated as the expectation over these outputs: predicted_age = sum_{i=0}^{99} i * p_i, where p_i is the softmax probability assigned to age i.
So I want to build the loss using EUCLIDEAN_LOSS between the label and this predicted value.
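In numpy terms, what I want to compute is (a sketch, with prob standing for the softmax output):

import numpy as np

prob = np.random.rand(1, 100)                        # hypothetical softmax output
prob /= prob.sum(axis=1, keepdims=True)              # rows sum to 1
predicted_age = (np.arange(100) * prob).sum(axis=1)  # E[age] = sum_i i * p_i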
I show my prototxt for the last output layer and the loss layer:
layer {
bottom: "pool5"
top: "fc100"
name: "fc100"
type: "InnerProduct"
inner_product_param {
num_output: 100
}
}
layer {
bottom: "fc100"
top: "prob"
name: "prob"
type: "Softmax"
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc100"
bottom: "label"
top: "loss"
loss_weight: 1
}
Just for now, I am trying this with SoftmaxWithLoss. However, this loss is more appropriate for classification than for regression. How can I design the loss layer in this case?
Thanks in advance.
TL;DR
I've been through a similar task once, and from my experience there was little difference (in terms of output accuracy) between training with discrete labels and regressing a single continuous value.
There are several ways you can approach this problem:
1. Regressing a single output
Since you only need to predict a single scalar value, you should train your net to do just so:
layer {
bottom: "pool5"
top: "fc1"
name: "fc1"
type: "InnerProduct"
inner_product_param {
num_output: 1 # predict single output
}
}
You need to make sure the predicted value is in range [0..99]:
layer {
bottom: "fc1"
top: "pred01" # map to [0..1] range
type: "Sigmoid"
name: "pred01"
}
layer {
bottom: "pred01"
top: "pred_age"
type: "Scale"
name: "pred_age"
param { lr_mult: 0 } # do not learn this scale - it is fixed
scale_param {
bias_term: false
filler { type: "constant" value: 99 }
}
}
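Numerically, these two layers compose to pred_age = 99 * sigmoid(fc1). A quick numpy sanity check (not part of the net; the values are hypothetical):

import numpy as np

fc1 = np.array([-2.0, 0.0, 3.5])        # hypothetical "fc1" activations
pred_age = 99.0 / (1.0 + np.exp(-fc1))  # Sigmoid to [0..1], then Scale by 99
print(pred_age)                         # all values fall in [0..99]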
Once you have the prediction in pred_age you can add a loss layer
layer {
bottom: "pred_age"
bottom: "true_age"
top: "loss"
type: "EuclideanLoss"
name: "loss"
}
Though, I would advise using "SmoothL1" in this case, as it is more robust.
2. Regressing the expectation of the discrete prediction
You can implement your prediction formula in Caffe. You need a fixed vector of values [0..99] for that. There are many ways to do that, none of them very straightforward. Here's one way, using net surgery:
First, define the net
layer {
bottom: "prob"
top: "pred_age"
name: "pred_age"
type: "Convolution"
param { lr_mult: 0 } # fixed layer.
convolution_param {
num_output: 1
bias_term: false
}
}
layer {
bottom: "pred_age"
bottom: "true_age"
top: "loss"
type: "EuclideanLoss" # same comment about type of loss as before
name: "loss"
}
You cannot use this net yet; first you need to set the kernel of the pred_age layer to 0..99.
In Python, load the net, set the kernel, and save the initialized weights:
import caffe
import numpy as np

net = caffe.Net('path/to/train_val.prototxt', caffe.TRAIN)
li = list(net._layer_names).index('pred_age')  # get layer index
# set the kernel to the fixed vector 0..99, reshaped to the kernel's shape
net.layers[li].blobs[0].data[...] = np.arange(100, dtype=np.float32).reshape(net.layers[li].blobs[0].data.shape)
net.save('/path/to/init_weights.caffemodel')  # save the weights
Now you can train your net, but MAKE SURE you are starting your training from the weights saved in '/path/to/init_weights.caffemodel'.
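One way to do that from Python is through the solver interface; a minimal sketch, assuming a hypothetical solver path:

import caffe

solver = caffe.get_solver('path/to/solver.prototxt')      # hypothetical path
solver.net.copy_from('/path/to/init_weights.caffemodel')  # load surgically set weights
solver.solve()

The caffe train command-line tool accepts a -weights argument that serves the same purpose.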
Right now I am training a network with 2-class data... but the accuracy is constant at 1 after the first iteration!
The input data is grayscale images. Images from both classes were randomly selected when creating the HDF5 data.
Why is this happening? What's wrong, or where is my mistake?
network.prototxt :
name: "brainMRI"
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include: {
phase: TRAIN
}
hdf5_data_param {
source: "/home/shivangpatel/caffe/brainMRI1/train_file_location.txt"
batch_size: 10
}
}
layer {
name: "data"
type: "HDF5Data"
top: "data"
top: "label"
include: {
phase: TEST
}
hdf5_data_param {
source: "/home/shivangpatel/caffe/brainMRI1/test_file_location.txt"
batch_size: 10
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 50
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "ip1"
top: "ip1"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 2
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "softmax"
type: "Softmax"
bottom: "ip2"
top: "smip2"
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "smip2"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
Output:
I0217 17:41:07.912580 2913 net.cpp:270] This network produces output loss
I0217 17:41:07.912607 2913 net.cpp:283] Network initialization done.
I0217 17:41:07.912739 2913 solver.cpp:60] Solver scaffolding done.
I0217 17:41:07.912789 2913 caffe.cpp:212] Starting Optimization
I0217 17:41:07.912813 2913 solver.cpp:288] Solving brainMRI
I0217 17:41:07.912832 2913 solver.cpp:289] Learning Rate Policy: inv
I0217 17:41:07.920737 2913 solver.cpp:341] Iteration 0, Testing net (#0)
I0217 17:41:08.235076 2913 solver.cpp:409] Test net output #0: accuracy = 0.98
I0217 17:41:08.235194 2913 solver.cpp:409] Test net output #1: loss = 0.0560832 (* 1 = 0.0560832 loss)
I0217 17:41:35.831647 2913 solver.cpp:341] Iteration 100, Testing net (#0)
I0217 17:41:36.140849 2913 solver.cpp:409] Test net output #0: accuracy = 1
I0217 17:41:36.140949 2913 solver.cpp:409] Test net output #1: loss = 0.00757247 (* 1 = 0.00757247 loss)
I0217 17:42:05.465395 2913 solver.cpp:341] Iteration 200, Testing net (#0)
I0217 17:42:05.775877 2913 solver.cpp:409] Test net output #0: accuracy = 1
I0217 17:42:05.776000 2913 solver.cpp:409] Test net output #1: loss = 0.0144996 (* 1 = 0.0144996 loss)
.............
.............
Summarizing some information from the comments:
- You run test at intervals of test_interval:100 iterations.
- Each test interval goes over test_iter:5 * batch_size:10 = 50 samples.
- Your train and test sets seem to be very "neat": all the negative samples (label=0) are grouped together before all the positive samples.
Consider your iterative SGD solver: you feed it batches of batch_size:10 during training. Your training set has 14,746 negative samples (about 1,474 batches) before any positive sample. So, for the first 1474 iterations your solver only "sees" negative examples and no positive ones.
What do you expect this solver will learn?
The problem
Your solver only sees negative examples, thus it learns that no matter what the input is, it should output "0". Your test set is ordered in the same fashion, so testing only 50 samples at each test_interval means you only test on the negative examples in the test set, resulting in a perfect accuracy of 1.
But as you noted, your net actually learned nothing.
Solution
I suppose you have already guessed what the solution should be by now. You need to shuffle your training set, and test your net on your entire test set.
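For example, here is a minimal h5py sketch of shuffling before writing the training file (the array shapes are hypothetical stand-ins for your data). Also make sure test_iter * batch_size is large enough to cover your entire test set.

import h5py
import numpy as np

# Hypothetical stand-ins for your grayscale images and binary labels.
data = np.random.rand(100, 1, 32, 32).astype(np.float32)
labels = np.concatenate([np.zeros(50), np.ones(50)]).astype(np.float32)

# Apply the same random permutation to samples and labels.
perm = np.random.permutation(len(labels))
data, labels = data[perm], labels[perm]

with h5py.File('train_shuffled.h5', 'w') as f:
    f.create_dataset('data', data=data)
    f.create_dataset('label', data=labels)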
I have a network which has 4 Boolean outputs. It is not a classification problem, and each of them is meaningful. I expect to get a zero or one for each of them. Right now I am using the Euclidean loss function.
There are 1,000,000 samples. In the input file, each of them has 144 features, so the size of the input is 1000000x144.
I have used a batch size of 50, because otherwise the processing time is too long.
The output file is of size 1000000x4, i.e. there are four outputs per input.
When I use the accuracy layer, it complains about the dimension of the output. It expects just one Boolean output, not four. I think that is because it considers the problem a classification problem.
I have two questions.
First, considering the error from the accuracy layer, is the Euclidean loss function suitable for this task? And how can I get the accuracy for my network?
Second, I want to get the exact value of the predicted output for each of the four variables; that is, I need the exact predicted values for each test record. Right now, I just have the loss value for each batch.
Please guide me to solve those issues.
Thanks,
Afshin
The train network is:
state {
phase: TRAIN
}
layer {
name: "abbas"
type: "HDF5Data"
top: "data"
top: "label"
hdf5_data_param {
source: "/home/afo214/Research/hdf5/simulation/Train-1000-11- 1/Train-Sc-B-1000-11-1.txt"
batch_size: 50
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "data"
top: "ip1"
inner_product_param {
num_output: 350
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig1"
bottom: "ip1"
top: "sig1"
type: "Sigmoid"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "sig1"
top: "ip2"
inner_product_param {
num_output: 150
weight_filler {
type: "xavier"
}
}
}
The test network is also:
state {
phase: TEST
}
layer {
name: "abbas"
type: "HDF5Data"
top: "data"
top: "label"
hdf5_data_param {
source: "/home/afo214/Research/hdf5/simulation/Train-1000-11- 1/Train-Sc-B-1000-11-1.txt"
batch_size: 50
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "data"
top: "ip1"
inner_product_param {
num_output: 350
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig1"
bottom: "ip1"
top: "sig1"
type: "Sigmoid"
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "sig1"
top: "ip2"
inner_product_param {
num_output: 150
weight_filler {
type: "xavier"
}
}
}
layer {
name: "sig2"
bottom: "ip2"
top: "sig2"
type: "Sigmoid"
}
layer {
name: "ip4"
type: "InnerProduct"
bottom: "sig2"
top: "ip4"
inner_product_param {
num_output: 4
weight_filler {
type: "xavier"
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "ip4"
bottom: "label"
top: "accuracy"
}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "ip4"
bottom: "label"
top: "loss"
}
And I get this error:
accuracy_layer.cpp:34] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (50 vs. 200) Number of labels must match number of predictions; e.g., if label axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
Without using the accuracy layer caffe gives me the loss value.
Should "EuclideanLoss" be used for predicting binary outputs?
If you are trying to predict discrete binary labels, then "EuclideanLoss" is not a very good choice. This loss is better suited for regression tasks where you wish to predict continuous values (e.g., estimating coordinates of bounding boxes, etc.).
For predicting discrete labels, "SoftmaxWithLoss" or "InfogainLoss" are better suited. Usually, "SoftmaxWithLoss" is used.
For predicting binary outputs you may also consider "SigmoidCrossEntropyLoss".
Why is there an error in the "Accuracy" layer?
In caffe, "Accuracy" layers expects two inputs ("bottom"s): one is a prediction vector and the other is the ground truth expected discrete label.
In your case, you need to provide, for each binary output, a vector of length 2 with the predicted probabilities of 0 and 1, plus a single binary label:
layer {
name: "acc01"
type: "Accuracy"
bottom: "predict01"
bottom: "label01"
top: "acc01"
}
In this example you measure the accuracy for a single binary output. The input "predict01" is a two-vector for each example in the batch (for batch_size: 50 the shape of this blob should be 50-by-2).
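For intuition, this is essentially what "Accuracy" computes for one binary output; a numpy sketch with hypothetical values:

import numpy as np

predict01 = np.random.randn(50, 2)                  # hypothetical (batch, 2) scores
label01 = np.random.randint(0, 2, size=50)          # hypothetical 0/1 labels
acc = (predict01.argmax(axis=1) == label01).mean()  # fraction of correct argmax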
What can you do?
You are trying to predict 4 different outputs in a single net; therefore, you need 4 different loss and accuracy layers.
First, you need to split ("Slice") the ground truth labels into 4 scalars (instead of a single binary 4-vector):
layer {
name: "label_split"
bottom: "label" # name of input 4-vector
top: "label01"
top: "label02"
top: "label03"
top: "label04"
type: "Slice"
slice_param {
axis: 1
slice_point: 1
slice_point: 2
slice_point: 3
}
}
Now you need a prediction, a loss, and an accuracy layer for each of the binary labels:
layer {
name: "predict01"
type: "InnerProduct"
bottom: "sig2"
top: "predict01"
inner_product_param {
num_output: 2 # because you need to predict 2 probabilities, one for False, one for True
...
}
}
layer {
name: "loss01"
type: "SoftmaxWithLoss"
bottom: "predict01"
bottom: "label01"
top: "loss01"
}
layer {
name: "acc01"
type: "Accuracy"
bottom: "predict01"
bottom: "label01"
top: "acc01"
}
Now you need to replicate these three layers for each of the four binary labels you wish to predict.
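As for getting the exact predicted values for each test record: you can read them from the output blobs with pycaffe. A minimal sketch, assuming hypothetical prototxt/weights paths and the blob names used above:

import caffe

# The "HDF5Data" layer in the TEST net feeds one batch per forward pass.
net = caffe.Net('train_val.prototxt', 'weights.caffemodel', caffe.TEST)
net.forward()                           # process one batch
probs01 = net.blobs['predict01'].data   # shape (50, 2): class scores per record
pred01 = probs01.argmax(axis=1)         # hard 0/1 decision for the first output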
I am extracting 30 facial keypoints (x, y) from an input image, as in the Kaggle facial keypoints competition.
How do I set up Caffe to run a regression and produce a 30-dimensional output?
Input: 96x96 image
Output: 30 values (30 dimensions).
How do I set up Caffe accordingly? I am using EUCLIDEAN_LOSS (sum of squares) to get the regressed output. Here is a simple logistic regressor model using Caffe, but it is not working. It looks like the accuracy layer cannot handle multi-label output.
I0120 17:51:27.039113 4113 net.cpp:394] accuracy <- label_fkp_1_split_1
I0120 17:51:27.039135 4113 net.cpp:356] accuracy -> accuracy
I0120 17:51:27.039158 4113 net.cpp:96] Setting up accuracy
F0120 17:51:27.039201 4113 accuracy_layer.cpp:26] Check failed: bottom[1]->channels() == 1 (30 vs. 1)
*** Check failure stack trace: ***
# 0x7f7c2711bdaa (unknown)
# 0x7f7c2711bce4 (unknown)
# 0x7f7c2711b6e6 (unknown)
Here is the layer file:
name: "LogReg"
layers {
name: "fkp"
top: "data"
top: "label"
type: HDF5_DATA
hdf5_data_param {
source: "train.txt"
batch_size: 100
}
include: { phase: TRAIN }
}
layers {
name: "fkp"
type: HDF5_DATA
top: "data"
top: "label"
hdf5_data_param {
source: "test.txt"
batch_size: 100
}
include: { phase: TEST }
}
layers {
name: "ip"
type: INNER_PRODUCT
bottom: "data"
top: "ip"
inner_product_param {
num_output: 30
}
}
layers {
name: "loss"
type: EUCLIDEAN_LOSS
bottom: "ip"
bottom: "label"
top: "loss"
}
layers {
name: "accuracy"
type: ACCURACY
bottom: "ip"
bottom: "label"
top: "accuracy"
include: { phase: TEST }
}
I found it :)
I replaced the softmax loss layer with EUCLIDEAN_LOSS and changed the number of outputs. It worked.
layers {
name: "loss"
type: EUCLIDEAN_LOSS
bottom: "ip1"
bottom: "label"
top: "loss"
}
HINGE_LOSS is another option.
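For reference, Caffe's Euclidean loss computes 1/(2N) * sum ||pred - label||^2 over the batch; a numpy sketch with hypothetical values:

import numpy as np

pred = np.random.rand(100, 30)   # hypothetical batch of 30-dim keypoint predictions
label = np.random.rand(100, 30)  # hypothetical ground-truth keypoints
loss = np.sum((pred - label) ** 2) / (2 * pred.shape[0])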