I am using caffe with the HDF5 layer. It will read my hdf5list.txt as
/home/data/file1.h5
/home/data/file2.h5
/home/data/file3.h5
Each file*.h5 contains 10,000 images, so I have about 30,000 images in total. In each iteration I use a batch size of 10, as in this setting:
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "./hdf5list.txt"
    batch_size: 10
    shuffle: true
  }
  include {
    phase: TRAIN
  }
}
When running Caffe, its output looks like:
Iterations 10, loss=100
Iterations 20, loss=90
...
My question is: how do I compute the number of epochs, with respect to the loss? That is, I want to plot a graph whose x-axis is the number of epochs and whose y-axis is the loss.
Related link: Epoch vs iteration when training neural networks
If you want to do this for just the current problem, it is super easy. Note that
Epoch_index = floor((iteration_index * batch_size) / (# data_samples))
Now, in solver.cpp, find the line where Caffe prints Iterations ..., loss = .... Just compute the epoch index using the above formula and print that too. You are done. Do not forget to recompile Caffe.
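If you only need the plot, you can also avoid touching the C++ code and post-process the training log instead. A minimal sketch, assuming batch_size 10, roughly 30,000 samples, and that the captured log is called caffe_train.log with lines like "Iteration 100, loss = 0.123" (all of these are placeholders you would adapt):

import re
import matplotlib.pyplot as plt

BATCH_SIZE = 10       # batch_size from hdf5_data_param
NUM_SAMPLES = 30000   # total number of images across the three .h5 files

iters, losses = [], []
# Assumed log format: lines such as "Iteration 100, loss = 0.123";
# adjust the regular expression to whatever your Caffe build actually prints.
pattern = re.compile(r'Iteration (\d+), loss = ([-+0-9.eE]+)')
with open('caffe_train.log') as f:   # placeholder name of your captured log
    for line in f:
        m = pattern.search(line)
        if m:
            iters.append(int(m.group(1)))
            losses.append(float(m.group(2)))

# Epoch_index = floor((iteration_index * batch_size) / (# data_samples));
# here the fractional value is kept so the curve is smooth within an epoch.
epochs = [it * BATCH_SIZE / float(NUM_SAMPLES) for it in iters]

plt.plot(epochs, losses)
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()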
If you want to modify Caffe so that it always shows the epoch index, then you will first need to compute the total data size from all your HDF5 files. Glancing at the Caffe HDF5 layer code, I think you can get the number of data samples of one file from hdf_blobs_[0]->shape(0). You should add this up over all HDF5 files and use that number in solver.cpp.
The variable hdf_blobs_ is defined in layers/hdf5_data_layer.cpp. I believe it is populated by the functions in util/hdf5.cpp. I think this is how the flow goes:
In layers/hdf5_data_layer.cpp, the hdf5 filenames are read from the text file.
Then a function LoadHDF5FileData attempts to load the hdf5 data into blobs.
Inside LoadHDF5FileData, the blob variable hdf_blobs_ is declared, and it is populated by the functions in util/hdf5.cpp.
Inside util/hdf5.cpp, the function hdf5_load_nd_dataset first calls hdf5_load_nd_dataset_helper, which reshapes the blobs accordingly. I think this is where you will get the dimensions of your data for one hdf5 file. Iterating over multiple hdf5 files is done in the void HDF5DataLayer<Dtype>::Next() function in layers/hdf5_data_layer.cpp. So here you need to add up the data dimensions obtained earlier.
Finally, you need to figure out how to pass this information back up to solver.cpp.
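Alternatively, if all you need is the total number of data samples (rather than having Caffe report it), you can add up the first dimension of each HDF5 file yourself. A small sketch using h5py, assuming the images are stored under the dataset key "data" as in the prototxt above:

import h5py

total_samples = 0
with open('hdf5list.txt') as f:
    for path in f:
        path = path.strip()
        if not path:
            continue
        with h5py.File(path, 'r') as h5:
            # assumes the images live under the dataset name "data",
            # matching the top: "data" blob of the HDF5Data layer
            total_samples += h5['data'].shape[0]

print('total data samples:', total_samples)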
AFAIK, we have two ways to obtain the validation loss.
(1) online during training process by setting the solver as follows:
train_net: "train.prototxt"
test_net: "test.prototxt"
test_iter: 200
test_interval: 100
(2) offline, based on the weights saved in the .caffemodel file. In this question, I consider the second way due to limited GPU resources. First, I saved the network weights to .caffemodel files every 100 iterations using snapshot: 100. Based on these .caffemodel files, I want to calculate the validation loss using
../build/tools/caffe test -model ./test.prototxt -weights $snapshot -iterations 10 -gpu 0
where $snapshot is the filename of a .caffemodel, for example snap_network_100.caffemodel.
And the data layer of my test prototxt is
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "./list.txt"
    batch_size: 8
    shuffle: true
  }
}
The first and the second ways give different validation losses. I found that with the first way the validation loss is independent of the batch size, i.e. the validation loss is the same for different batch sizes. With the second way, the validation loss changes with the batch size, although the losses are very close together across different iterations.
My question is: which way is the correct one to compute the validation loss?
You compute the validation loss over a different number of iterations:
test_iter: 200
in your 'solver.prototxt', vs. -iterations 10 when running from the command line. This means you are averaging the loss over a different number of validation samples.
Since you are using far fewer samples when validating from the command line, you are much more sensitive to batch_size.
Make sure you are using exactly the same settings and verify that the validation loss is indeed the same.
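To double-check, you can also compute the validation loss yourself in pycaffe over a fixed number of batches, using the same test.prototxt and snapshot. This is only a sketch: it assumes the loss output blob of your test net is named "loss" (adjust to whatever your prototxt actually calls it).

import caffe

caffe.set_mode_gpu()
net = caffe.Net('test.prototxt',                 # the same TEST prototxt as above
                'snap_network_100.caffemodel',   # the snapshot you want to evaluate
                caffe.TEST)

num_batches = 200   # match test_iter so both methods average over the same samples
losses = []
for _ in range(num_batches):
    out = net.forward()                 # the HDF5Data layer feeds the next batch itself
    losses.append(float(out['loss']))   # assumes the loss blob is named "loss"

print('mean validation loss over %d batches: %f'
      % (num_batches, sum(losses) / len(losses)))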
I am training a network with batch optimization over my training set, and I would like to get a loss vector containing the loss of each of my training examples.
More specifically I am using images (of size 3x64x64) in a batch of size 64. Therefore my input is a tensor of size 64x3x64x64.
During training when I write
output = net:forward(input)
loss = criterion:forward(output, target)
loss is a number, but I would like to get a tensor (of size 64) with one entry per image in my batch, corresponding to the loss value of that particular image.
Is there a way to do that without looping on the first dimension of my input tensor?
The forward method calls another method, updateOutput, which can be overridden.
For example, in the case of MSECriterion(), you can change the method by commenting out the call to the THNN library and writing yourself how you want the criterion to behave, i.e., do a normal element-wise subtraction, then square (again element-wise) and divide by the total number of data points (again element-wise); then return the output as a tensor.
You will also need to recompile the nn package once you have changed this, by navigating to the nn folder and running luarocks make rocks/[the scm file in the folder].
I used a pre-trained GoogLeNet and then fine-tuned it on my dataset for a binary classification problem. The validation dataset seems to give a "loss3/top1" accuracy of 98.5%, but when I evaluate the performance on my evaluation dataset it gives me 50% accuracy. Whatever changes I made in train_val.prototxt, I made the same changes in deploy.prototxt, and I am not sure what changes I should make in these lines:
name: "GoogleNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 224 dim: 224 } }
}
Any suggestions???
You do not need to change anything further in your deploy.prototxt*; what needs to change is the way you feed the data to the net. You must transform your evaluation images in the same way you transformed your training/validation images.
See, for example, how classifier.py puts the input images through a properly initialized caffe.io.Transformer class.
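As a rough sketch, the preprocessing could look like the snippet below; the caffemodel name, image path and mean values here are placeholders and must be replaced with whatever you actually used during training.

import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'finetuned_googlenet.caffemodel', caffe.TEST)

# Reproduce the same preprocessing that was applied to the training images
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))                    # HxWxC -> CxHxW
transformer.set_mean('data', np.array([104.0, 117.0, 123.0]))   # placeholder BGR mean
transformer.set_raw_scale('data', 255)                          # [0,1] -> [0,255]
transformer.set_channel_swap('data', (2, 1, 0))                 # RGB -> BGR

img = caffe.io.load_image('example.jpg')   # placeholder evaluation image
net.blobs['data'].data[0, ...] = transformer.preprocess('data', img)
out = net.forward()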
The "Input" layer you have in the prototxt is merely a declaration for caffe to allocate memory according to an input blob of shape 10-by-3-by-224-by-224.
* of course, you must verify that train_val.prototxt and deploy.prototxt are exactly the same (apart from the input layer(s) and loss layer(s)): that includes making sure layer names are identical as caffe uses layer names to assign weights from 'caffemodel' file to the actual parameters it loads. Mismatching names will cause caffe to use random weights for some of the layers.
This is really weird. I'm implementing this model:
Except that I read data from a text file using an ImageData layer with batch_size: 1. There are only two labels and the text file is organized as usual:
/home/.../pathToFile 0
...
/home/.../pathToFile 1
Still, Caffe only trains and tests label 0!
I run caffe using the regular tool.
./build/tools/caffe train --solver=solver.prototxt
When I open the net in pycaffe I get this message for the first time ever:
WARNING: Logging before InitGoogleLogging() is written to STDERR
and the size of the
net.blobs['label'].data
is now 1, when it should be 2!
Not only that, but the label seems to be a float rather than an integer.
In: net.blobs['label'].data
Out: array([ 0.], dtype=float32)
I know that this has worked before; I just can't get my head around what I'm doing wrong or where to begin troubleshooting.
The output shape of your network depends on the input batch_size: if you define batch_size: 1 then your net processes a single example at a time, and thus it only reads a single label. If you change batch_size to 2, Caffe will read two samples and consequently the shape of label will become 2.
One exception to this "shape rule" is the loss output: the loss defines a scalar function with respect to which gradients are computed. Thus, the loss output will always be a scalar regardless of the input shape.
Regarding the data type of label: Caffe stores all variables in "Blobs" of type float32.
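A quick way to see both points in pycaffe (the prototxt name here is a placeholder, and the loss blob is assumed to be named "loss"):

import caffe

net = caffe.Net('train_val.prototxt', caffe.TRAIN)   # placeholder prototxt name

print(net.blobs['label'].data.shape)   # first dimension equals batch_size
print(net.blobs['label'].data.dtype)   # float32 -- every Blob is stored as float32
print(net.blobs['loss'].data.size)     # 1 -- the loss is a single scalar value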
How can I use caffe convnet to detect facial expressions?
I have an image dataset, Cohn-Kanade, and I want to train a Caffe convnet with it. Caffe has a documentation site, but it does not explain how to train on my own data, only how to use pre-trained models.
Can someone teach me how to do it?
Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:
caffe/build/tools/convert_imageset:
convert_imageset is one of the command line tools you get from building caffe.
Usage is along the lines of:
specifying a list of image/label pairs in a text file, one row per pair.
specifying where the images are located.
choosing a backend db (which format); the default is lmdb, which should be fine.
You need to write up a text file where each line starts with the filename of the image followed by a scalar label (e.g. 0, 1, 2,...)
Construct your lmdb in python using Caffe's Datum class:
This requires building caffe's python interface. Here you write some python code that:
iterates through a list of images.
loads each image into a numpy array.
constructs a Caffe Datum object.
assigns the image data to the Datum object.
sets the Datum's label member, e.g. to the AU class from your CK dataset, if that is what you want your network to classify.
writes the Datum object to the db and moves on to the next image.
Here is a code snippet for converting images to an lmdb, taken from a blog post by Gustav Larsson. In his example he constructs an lmdb of image/label pairs for image classification; a minimal sketch along the same lines is given below.
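A minimal sketch of that flow, assuming you already have a list of (image path, label) pairs and that reading the images with caffe.io is acceptable for your data (the paths and db name are placeholders):

import lmdb
import numpy as np
import caffe
from caffe.proto import caffe_pb2

# placeholder list of (image_path, label) pairs, e.g. parsed from your text file
samples = [('/path/to/img_0001.png', 0), ('/path/to/img_0002.png', 3)]

env = lmdb.open('ck_train_lmdb', map_size=1 << 32)   # generous upper bound on db size
with env.begin(write=True) as txn:
    for idx, (path, label) in enumerate(samples):
        # load the image as HxWxC uint8, then reorder to CxHxW as Caffe expects
        img = (caffe.io.load_image(path) * 255).astype(np.uint8)
        img = img.transpose(2, 0, 1)

        datum = caffe_pb2.Datum()
        datum.channels, datum.height, datum.width = img.shape
        datum.data = img.tobytes()
        datum.label = int(label)   # e.g. the AU/expression class

        txn.put('{:08d}'.format(idx).encode('ascii'), datum.SerializeToString())
env.close()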
Loading the lmdb into your network:
This is done exactly like in the LeNet example. See the Data layer at the beginning of the network prototxt that describes the LeNet model:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
The source field is where you point caffe to the location of the lmdb you just created.
Something more related to performance, and not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed-size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values. You can do this through mean subtraction. A simple way of doing this is to set transform_param: mean_value to the mean grayscale intensity of your CK+ dataset.
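For instance, a small sketch to compute that mean over your CK+ images (the file list name is a placeholder, and grayscale loading via caffe.io is an assumption):

import numpy as np
import caffe

# placeholder: a text file listing the CK+ image paths (optionally followed by labels)
with open('ck_image_list.txt') as f:
    paths = [line.split()[0] for line in f if line.strip()]

means = []
for path in paths:
    img = caffe.io.load_image(path, color=False)   # HxWx1 array with values in [0, 1]
    means.append(img.mean() * 255)                 # back to the 0-255 intensity range

print('mean grayscale intensity: %.2f' % np.mean(means))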