NVIDIA DIGITS - Multiple Input layers - machine-learning

I have a question regarding the NVIDIA DIGITS framework.
I have been using Caffe without DIGITS and have used HDF5 layers so far. There I could use multiple "top" blobs (data_0, data_1, data_2) as inputs (see code below), so I could feed the net with more than one input image. But in DIGITS only the LMDB input layer works.
So is it possible to create an LMDB input layer with multiple input images?
layer {
  name: "data"
  type: "HDF5Data"
  top: "data_0"
  top: "data_1"
  top: "data_2"
  top: "label"
  hdf5_data_param {
    source: "train.txt"
    batch_size: 64
    shuffle: true
  }
}

Sorry, that isn't supported in DIGITS.
Since DIGITS manages your datasets for you, it also sets up the data layers in your network for you. This way you don't need to copy+paste the LMDB paths into your network when you want to run a previous network on a new dataset or when you move the location of your jobs on disk. It's a decision which sacrifices flexibility for the sake of making the common case easy.
For classification, one LMDB should have two tops: "data" and "label". For other dataset types, there should be one LMDB with a single "data" top, and another LMDB with a single "label" top. If you need a more complicated data layer setup, then you'll need to either use Caffe directly or make some changes to the DIGITS source code.
DIGITS's HDF5 support is not great because Caffe's HDF5 support is not great.

Related

forward network in CNN

I am new to deep learning. I found there are two prototxt files when I used Caffe, one is "deploy" and another is "train_val".
I know that "train_val" is used to train the model, but some people say the "deploy" file is for testing on images.
So my question is: does "deploy" only run the forward() pass of the network, so the test image data goes through the forward pass once to get the score?
As you already noted there are some fundamental differences between 'train_val.prototxt' and 'deploy.prototxt'.
One key difference is that 'deploy.prototxt' usually lacks any loss layer.
When a net has no loss function defined, there is no meaning to backward propagation: what gradients would you propagate? Gradients of what function?
That said, a net object in Caffe has the backward() method implemented for all phases. It is simply meaningless to call it when you test the net with no loss function (prediction only).
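To make this concrete, here is a minimal forward-only inference sketch with pycaffe; the file names are placeholders, not anything prescribed by Caffe:
import numpy as np
import caffe

# Load the network in TEST phase with trained weights
# ('deploy.prototxt' and 'weights.caffemodel' are placeholder names).
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Fill the input blob with a (preprocessed) image whose shape matches the
# input declared in deploy.prototxt, then run a single forward pass.
net.blobs['data'].data[...] = np.random.rand(*net.blobs['data'].data.shape)
out = net.forward()

# 'prob' is the conventional name of the softmax output in classification nets.
print(out['prob'].argmax(axis=1))
No backward() call is needed (or meaningful) here, since the deploy net has no loss to differentiate.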
Ideally that is how it should work, but the files are just network definitions. You can use one single file for both training and testing. You have to specify in which phase you want certain blobs to be available, meaning you can define two input data layers, one used during training and another used during testing, and specify the corresponding phase like this:
name: "MyModel"
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: false
crop_size: 227
mean_file: "data/train_mean.binaryproto" # location of the training data mean
}
data_param {
source: "data/train_lmdb" # location of the training samples
batch_size: 128 # how many samples are grouped into one mini-batch
backend: LMDB
}
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  data_param {
    source: "data/val_lmdb" # placeholder: location of the test/validation samples
    batch_size: 128
    backend: LMDB
  }
}
During training, the first layer will be used and the second will be ignored. During the test phase, the first layer will be ignored and the second will be used as input for testing.
Another point is that during testing we want the accuracy of our predictions, since we don't need to update our weights anymore:
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
If the include directive is not given, the layer is included in all phases.
Although you can also include the accuracy layer during training to see how the outputs are evolving (i.e. to measure the accuracy improvement after a given number of iterations), it is mostly needed for prediction.
In your solver you can set test_interval to specify after how many training iterations a test pass is carried out (you validate your model every test_interval iterations), and test_iter to control how many test batches are run each time.
train_val and deploy files separate those two phases into two different files. All specifications in train_val are related to the training phase, and deploy to testing. I am not sure where the train_val combination came from, but I suppose it is because you can validate your model periodically and then continue training from there.
As you don't need the loss during testing but rather the probabilities, you can use Softmax as the output function instead of SoftmaxWithLoss in deploy, or you can have both defined.
The caffe test command performs the forward operation but does not do the backward() (back-propagation) operation. I hope this helps.

How to compute epoch from iteration number using HDF5 layer?

I am using caffe with the HDF5 layer. It will read my hdf5list.txt as
/home/data/file1.h5
/home/data/file2.h5
/home/data/file3.h5
In each file*.h5, I have 10,000 images, so I have about 30,000 images in total. In each iteration I use a batch size of 10, as in this setting:
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "./hdf5list.txt"
    batch_size: 10
    shuffle: true
  }
  include {
    phase: TRAIN
  }
}
Using Caffe, its output looks like
Iterations 10, loss=100
Iterations 20, loss=90
...
My question is: how do I compute the number of epochs with respect to the loss? That is, I want to plot a graph whose x-axis is the number of epochs and whose y-axis is the loss.
Related link: Epoch vs iteration when training neural networks
If you want to do this for just the current problem, it is super easy. Note that
Epoch_index = floor((iteration_index * batch_size) / (# data_samples))
Now, in solver.cpp, find the line where Caffe prints Iterations ..., loss = .... Just compute epoch index using the above formula and print that too. You are done. Do not forget to recompile Caffe.
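As a quick sanity check of that formula with the numbers from the question (batch size 10, roughly 30,000 samples), a small Python sketch:
# batch_size and num_samples as described in the question (assumed values)
batch_size = 10
num_samples = 30000

def epoch_index(iteration):
    # floor((iteration * batch_size) / num_samples)
    return (iteration * batch_size) // num_samples

print(epoch_index(10))    # 0 -> still inside the first epoch
print(epoch_index(3000))  # 1 -> one full pass over the data completed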
If you want to modify Caffe so that it always shows the epoch index, then you will first need to compute the data size from all your HDF5 files. Glancing at the Caffe HDF5 layer code, I think you can get the number of data samples from hdf_blobs_[0]->shape(0). You should add this up over all HDF5 files and use that number in solver.cpp.
The variable hdf_blobs_ is defined in layers/hdf5_data_layer.cpp. I believe it is populated by the functions in util/hdf5.cpp. I think this is how the flow goes:
In layers/hdf5_data_layer.cpp, the HDF5 file names are read from the text file.
Then the function LoadHDF5FileData loads the HDF5 data into blobs.
Inside LoadHDF5FileData, the blob variable hdf_blobs_ is declared, and it is populated by the code in util/hdf5.cpp.
Inside util/hdf5.cpp, the function hdf5_load_nd_dataset first calls hdf5_load_nd_dataset_helper, which reshapes the blobs accordingly. I think this is where you get the dimensions of your data for one HDF5 file. Iterating over multiple HDF5 files is done in the HDF5DataLayer<Dtype>::Next() function in layers/hdf5_data_layer.cpp, so this is where you would add up the data dimensions received earlier.
Finally, you need to figure out how to pass that total back up to solver.cpp.
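Alternatively, if you only need the total sample count and don't mind computing it outside of Caffe, you can sum the first dimension of every file listed in hdf5list.txt with h5py; a rough sketch, assuming the dataset inside each file is named 'data':
import h5py

total = 0
with open('hdf5list.txt') as f:
    for line in f:
        path = line.strip()
        if not path:
            continue
        with h5py.File(path, 'r') as h5:
            # assumes the main dataset is stored under the key 'data'
            total += h5['data'].shape[0]

print('total samples:', total)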

How to use data different than images with Caffe?

I have seen lots of examples showing how to insert image data for model training in Caffe.
I am trying to train a model using data that is not images. I can reshape it as a matrix or a vector (for every example), but I don't understand how to make my Caffe network read it.
I know that Caffe can work with lmdb/hdf5 databases, and I can additionally use a Python data layer.
I guess a Python data layer will be my best choice. Can someone provide an example of how to create some kind of array in Python and use it as training data for a Caffe model?
You do not need to create a Python layer for simple vector inputs. The HDF5 layer is probably the easiest to work with. Just create HDF5 files with your favorite tool (refer to this for creating HDF5 using MATLAB, or this for creating it using Python).
Both examples are fairly easy to follow. The MATLAB example gives you a more advanced version of HDF5 file creation, with batching and so on, but at its heart you just need to call
store2hdf5(filename, data, labels) %others are optional
Similarly, the Python example also goes over a complete example, all of which you may or may not need. At its core, creating an HDF5 file is simply:
import h5py
with h5py.File('filename.h5', 'w') as f:
    f['data'] = your_data
    f['label'] = your_labels
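Two small caveats worth keeping in mind (based on how Caffe's HDF5 layer loads data): the datasets should be floating point, and the layer's source is a plain text file listing the .h5 files, not the .h5 file itself. A minimal sketch:
import h5py
import numpy as np

with h5py.File('filename.h5', 'w') as f:
    # cast to float32: Caffe's HDF5 layer expects floating-point datasets
    f['data'] = your_data.astype(np.float32)
    f['label'] = your_labels.astype(np.float32)

# the HDF5 data layer's "source" points to a text file listing .h5 files
with open('hdf5_file_list.txt', 'w') as f:
    f.write('filename.h5\n')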
You can then use the file thus created in an HDF5 data layer as follows. You just need to create a text file with the list of HDF5 files you want to use.
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "path_to_text_file_containing_list_of_HDF5_Files.txt"
    batch_size: 128
    shuffle: true
  }
}

Modifying Deploy.prototxt in GoogLeNet

I used a pre-trained GoogLeNet and then fine-tuned it on my dataset for a binary classification problem. The validation dataset gives a "loss3/top-1" accuracy of 98.5%, but when I evaluate the performance on my evaluation dataset it gives 50% accuracy. Whatever changes I made in train_val.prototxt, I made the same changes in deploy.prototxt, and I am not sure what changes I should make in these lines:
name: "GoogleNet"
layer {
name: "data"
type: "input"
top: "data"
input_param { shape: { dim:10 dim:3 dim:224 dim:224 } }
}
Any suggestions???
You do not need to change anything further in your deploy.prototxt*; what needs to change is the way you feed the data to the net. You must transform your evaluation images in the same way you transformed your training/validation images.
See, for example, how classifier.py puts the input images through a properly initialized caffe.io.Transformer class.
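For illustration, a rough pycaffe sketch of the kind of preprocessing classifier.py sets up; the mean file, image path, and channel conventions here are assumptions and must match whatever was used during training:
import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Configure preprocessing to match training: float HxWxC image in, CxHxW blob out.
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))             # HxWxC -> CxHxW
transformer.set_mean('data', np.load('mean.npy').mean(1).mean(1))  # per-channel mean (assumed file)
transformer.set_raw_scale('data', 255)                   # caffe.io.load_image returns values in [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))          # RGB -> BGR

img = caffe.io.load_image('example.jpg')                 # placeholder image path
net.blobs['data'].data[0, ...] = transformer.preprocess('data', img)
out = net.forward()
If any of these steps differ from your training preprocessing, accuracy can easily collapse to chance level, which matches the 50% you are seeing.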
The "Input" layer you have in the prototxt is merely a declaration for caffe to allocate memory according to an input blob of shape 10-by-3-by-224-by-224.
* of course, you must verify that train_val.prototxt and deploy.prototxt are exactly the same (apart from the input layer(s) and loss layer(s)): that includes making sure layer names are identical as caffe uses layer names to assign weights from 'caffemodel' file to the actual parameters it loads. Mismatching names will cause caffe to use random weights for some of the layers.

How to use caffe convnet library to detect facial expressions?

How can I use caffe convnet to detect facial expressions?
I have an image dataset, Cohn-Kanade, and I want to train a Caffe convnet with this dataset. Caffe has a documentation site, but it does not explain how to train on my own data, only how to use pre-trained models.
Can someone teach me how to do it?
Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:
caffe/build/tools/convert_imageset:
convert_imageset is one of the command line tools you get from building caffe.
Usage is along the lines of:
specifying a list of images and label pairs in a text file. 1 row per pair.
specifying where the images are located.
Choosing a backend db (which format). Default is lmdb which should be fine.
You need to write up a text file where each line contains the filename of an image followed by a scalar integer label (e.g. 0, 1, 2, ...), for example:
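(The file names and labels below are purely hypothetical placeholders.)
img_0001.png 0
img_0002.png 3
img_0003.png 1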
Construct your lmdb in python using Caffe's Datum class:
This requires building caffe's python interface. Here you write some python code that:
iterates through a list of images
loads the images into a numpy array.
Constructs a caffe Datum object
Assigns the image data to the Datum object.
The Datum class has a member called label; you can set it to the AU class from your CK dataset, if that is what you want your network to classify.
Writes the Datum object to the db and moves on to the next image.
Here's a code snippet of converting images to an lmdb from a blog post by Gustav Larsson. In his example he constructs an lmdb of images and label pairs for image classification.
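Along the same lines, here is a rough sketch of that loop; the image list, label values, and the 'train_lmdb' output path are all placeholders:
import lmdb
import numpy as np
import caffe
from PIL import Image

# (image_path, label) pairs; build this list from your own annotations
samples = [('img_0001.png', 0), ('img_0002.png', 2)]

env = lmdb.open('train_lmdb', map_size=1 << 30)  # 1 GB map size, adjust as needed
with env.begin(write=True) as txn:
    for idx, (path, label) in enumerate(samples):
        img = np.array(Image.open(path).convert('L'))    # grayscale, shape HxW
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels, datum.height, datum.width = 1, img.shape[0], img.shape[1]
        datum.data = img.tobytes()                        # raw uint8 bytes, CHW order
        datum.label = int(label)
        txn.put('{:08d}'.format(idx).encode('ascii'), datum.SerializeToString())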
Loading the lmdb into your network:
This is done exactly like in the LeNet example, via the Data layer at the beginning of the network prototxt that describes the LeNet model:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
The source field is where you point caffe to the location of the lmdb you just created.
Something more related to performance, and not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed-size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values. You can do this through mean subtraction; in transform_param that is usually expressed with a mean_file, or with a mean_value set to the average grayscale intensity of your CK+ dataset.
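For instance, a rough way to compute that mean grayscale value in Python (the image paths are placeholders):
import numpy as np
from PIL import Image

paths = ['img_0001.png', 'img_0002.png']  # placeholder list of CK+ image files
# average grayscale intensity over all pixels of all images
mean_value = np.mean([np.array(Image.open(p).convert('L'), dtype=np.float64).mean()
                      for p in paths])
print('mean grayscale value:', mean_value)  # use this as mean_value in transform_param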
