I have seen lots of examples showing how to insert image data for model training in Caffe.
I am trying to train a model using data that is not images. I can reshape it as a matrix or a vector (for every example), but I don't understand how to make my Caffe network read it.
I know that Caffe can work with lmdb/hdf5 databases, and I can additionally use a Python data layer.
I guess a Python data layer will be my best choice. Can someone provide an example of how to create some kind of array in Python and use it as training data for a Caffe model?
You do not need to create a Python layer for simple vector inputs. An HDF5 data layer is probably the easiest to work with. Just create HDF5 files with your favorite tool (refer to this for creating HDF5 files using MATLAB, or this for creating them using Python).
Both examples are fairly easy to follow. The MATLAB example gives you a more advanced version of HDF5 file creation, such as creating batches, but at its heart you just need to call
store2hdf5(filename, data, labels) %others are optional
Similarly, the Python example walks through a complete workflow, not all of which you may need. At its core, creating an HDF5 file is simply:
import h5py
with h5py.File('filename.h5', 'w') as f:
    f['data'] = your_data
    f['label'] = your_labels
You can then use the resulting file in an HDF5 data layer as follows. You just need to create a text file listing the HDF5 files you want to use.
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "path_to_text_file_containing_list_of_HDF5_Files.txt"
    batch_size: 128
    shuffle: true
  }
}
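The text file that `source` points at is just one HDF5 path per line. A minimal sketch of generating it is shown here; the function name `write_hdf5_list` and the file names are hypothetical and only serve as an illustration.

```python
import os
import tempfile


def write_hdf5_list(hdf5_paths, list_path):
    """Write one HDF5 file path per line, the format `source` points at."""
    with open(list_path, "w") as f:
        for path in hdf5_paths:
            f.write(path + "\n")


# Hypothetical file names; replace them with the HDF5 files you created.
example_dir = tempfile.mkdtemp()
list_file = os.path.join(example_dir, "train_h5_list.txt")
write_hdf5_list(["/data/train_0.h5", "/data/train_1.h5"], list_file)

# Read the list back to check what the layer would see.
with open(list_file) as f:
    listed = f.read().splitlines()
print(listed)
```

The path of the list file, not of an individual .h5 file, is what goes into the `source` field.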
I want to create a classifier for an image dataset in which each image belongs to multiple classes, so the target values are k-hot vectors. I created a text file in which each line contains the path of an image file, a space, and a k-hot vector, but when I run the scripts that create the lmdb files, they raise errors saying the files cannot be opened or found. The same process with the same data and a single number as the class label works fine, so I think the .txt file cannot be parsed correctly when the labels are vectors.
Any suggestions?
Thank you
Caffe's "Data" layer and the convert_imageset script were written with a very specific use case in mind: image classification. Therefore, the basic element that Caffe stores in (and fetches from) LMDB is a Datum, which has room for only a single integer label.
You can see a lengthier discussion on this subject here.
This does not mean Caffe cannot handle other types of inputs or tasks.
You can use an "HDF5Data" layer instead. With HDF5 inputs, Caffe places almost no restrictions on the input shape and size.
See, e.g., this answer and this one for more details on how to actually make it work.
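As a rough illustration of the HDF5 route for vector labels, the sketch below parses listing lines of the kind described in the question into an N-by-K label array with numpy. The helper `parse_multilabel_line` and the example lines are hypothetical, and the parsing assumes image paths contain no spaces.

```python
import numpy as np


def parse_multilabel_line(line):
    """Split 'path l0 l1 ... lK-1' into the path and a float32 label vector."""
    parts = line.split()
    vector = np.array([float(v) for v in parts[1:]], dtype=np.float32)
    return parts[0], vector


# Hypothetical listing: image path followed by a 4-class k-hot vector.
lines = [
    "images/a.png 1 0 0 1",
    "images/b.png 0 1 0 0",
    "images/c.png 0 1 1 1",
]

paths = []
label_rows = []
for line in lines:
    path, vec = parse_multilabel_line(line)
    paths.append(path)
    label_rows.append(vec)

labels = np.stack(label_rows)  # shape (N, K): one k-hot row per image
print(labels.shape)
```

An array like `labels` could then be written as the label dataset of an HDF5 file, alongside a data array with the same number of rows.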
I have a question regarding the NVIDIA DIGITS framework.
So far I have been using Caffe without DIGITS, with HDF5 layers. There I could use multiple "top" inputs (data_0, data_1, data_2; see the code below), so I could feed the net more than one input image. But in DIGITS only the lmdb input layer works.
So is it possible to create an lmdb input layer with multiple input images?
layer {
  name: "data"
  type: "HDF5Data"
  top: "data_0"
  top: "data_1"
  top: "data_2"
  top: "label"
  hdf5_data_param {
    source: "train.txt"
    batch_size: 64
    shuffle: true
  }
}
Sorry, that isn't supported in DIGITS.
Since DIGITS manages your datasets for you, it also sets up the data layers in your network for you. This way you don't need to copy+paste the LMDB paths into your network when you want to run a previous network on a new dataset or when you move the location of your jobs on disk. It's a decision which sacrifices flexibility for the sake of making the common case easy.
For classification, one LMDB should have two tops: "data" and "label". For other dataset types, there should be one LMDB with a single "data" top, and another LMDB with a single "label" top. If you need a more complicated data layer setup, then you'll need to either use Caffe directly or make some changes to the DIGITS source code.
DIGITS's HDF5 support is not great because Caffe's HDF5 support is not great.
I'm preparing to train in Caffe using data in an HDF5 file. This file also contains the per-pixel mean image of the training set. In 'train_val.prototxt', the 'transform_param' section of the input data layer can take a mean_file to normalize the data, usually in binaryproto format; for example, from the ImageNet Caffe tutorial:
transform_param {
  mirror: true
  crop_size: 227
  mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
}
For per-channel normalization, one can use mean_value instead of mean_file.
But is there any way to use the mean image data directly from my database (here HDF5) file?
I have extracted the mean from the HDF5 file to a numpy file, but I am not sure whether that can be used in the prototxt or converted. I can't find any information about this in the Caffe documentation.
AFAIK, the "HDF5Data" layer does not support transformations. You should subtract the mean values yourself when you store the data in HDF5 files.
If you want to save a numpy array in a binaryproto format, you can see this answer for more details.
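Subtracting the mean yourself before storing the data could look roughly like this. It is a minimal numpy sketch with randomly generated stand-in data; the array shapes and variable names are assumptions, not taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in training set: 8 single-channel 4x4 images, laid out N x C x H x W.
data = rng.uniform(0.0, 255.0, size=(8, 1, 4, 4)).astype(np.float32)

# Per-pixel mean image over the training set, shape (1, C, H, W).
mean_image = data.mean(axis=0, keepdims=True)
centered = data - mean_image

# `centered` is what would be written to the HDF5 data dataset;
# keep `mean_image` so the same mean can be subtracted at test time.
print(centered.shape)
```

The same `mean_image` has to be subtracted from any validation or test data, since the layer will not do it for you.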
I took a pre-trained GoogLeNet and fine-tuned it on my dataset for a binary classification problem. On the validation dataset, "loss3/top1" reaches 98.5%, but when I evaluate the performance on my evaluation dataset I get 50% accuracy. Whatever changes I made in train_val.prototxt, I made the same changes in deploy.prototxt, and I am not sure what I should change in these lines.
name: "GoogleNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 224 dim: 224 } }
}
Any suggestions???
You do not need to change anything further in your deploy.prototxt*; what needs to change is the way you feed the data to the net. You must transform your evaluation images in the same way you transformed your training/validation images.
See, for example, how classifier.py puts the input images through a properly initialized caffe.io.Transformer class.
The "Input" layer you have in the prototxt is merely a declaration for caffe to allocate memory according to an input blob of shape 10-by-3-by-224-by-224.
* of course, you must verify that train_val.prototxt and deploy.prototxt are exactly the same (apart from the input layer(s) and loss layer(s)): that includes making sure layer names are identical as caffe uses layer names to assign weights from 'caffemodel' file to the actual parameters it loads. Mismatching names will cause caffe to use random weights for some of the layers.
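To make "transform in the same way" concrete, here is a numpy-only sketch of the kind of steps a caffe.io.Transformer is commonly configured to perform (transpose, channel swap, raw scale, mean subtraction). The function `preprocess` and the mean values are hypothetical; which steps and values actually apply depends on how the training data was prepared.

```python
import numpy as np


def preprocess(image_hwc_rgb, mean_bgr, raw_scale=255.0):
    """Apply identical steps to every image, at training and evaluation time.

    image_hwc_rgb: float array in [0, 1], shape (H, W, 3), RGB order.
    mean_bgr: per-channel mean in BGR order, on the 0-255 scale.
    """
    chw = image_hwc_rgb.transpose(2, 0, 1)  # H x W x C -> C x H x W
    bgr = chw[::-1, :, :]                   # RGB -> BGR
    scaled = bgr * raw_scale                # [0, 1] -> [0, 255]
    return (scaled - mean_bgr.reshape(3, 1, 1)).astype(np.float32)


# Hypothetical mean values; use the ones your training data was normalized with.
mean_bgr = np.array([104.0, 117.0, 123.0], dtype=np.float32)
# Stand-in for a loaded, already resized 224x224 image.
image = np.full((224, 224, 3), 0.5, dtype=np.float32)

blob = preprocess(image, mean_bgr)
batch = blob[np.newaxis, ...]  # 1 x 3 x 224 x 224, matching the input blob layout
print(batch.shape)
```

If any of these steps differs between training and evaluation (for example a missing mean subtraction or swapped channels), accuracy can drop to chance even though the weights are fine.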
How can I use caffe convnet to detect facial expressions?
I have an image dataset, Cohn Kanade, and I want to train a Caffe convnet with it. Caffe has a documentation site, but it does not explain how to train on my own data, only with pre-trained data.
Can someone teach me how to do it?
Caffe supports multiple formats for the input data (HDF5/lmdb/leveldb). It's just a matter of picking the one you feel most comfortable with. Here are a couple of options:
caffe/build/tools/convert_imageset:
convert_imageset is one of the command line tools you get from building caffe.
Usage is along the lines of:
specifying a list of images and label pairs in a text file. 1 row per pair.
specifying where the images are located.
Choosing a backend db (which format). Default is lmdb which should be fine.
You need to write up a text file where each line starts with the filename of the image followed by a scalar label (e.g. 0, 1, 2,...)
Construct your lmdb in python using Caffe's Datum class:
This requires building caffe's python interface. Here you write some python code that:
iterates through a list of images
loads the images into a numpy array.
Constructs a caffe Datum object
Assigns the image data to the Datum object.
The Datum class has a member called label; you can set it to the AU class from your CK dataset, if that is what you want your network to classify.
Writes the Datum object to the db and moves on to the next image.
Here's a code snippet of converting images to an lmdb from a blog post by Gustav Larsson. In his example he constructs an lmdb of images and label pairs for image classification.
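The steps listed for the Datum option can be sketched roughly as follows. This is not the snippet from the blog post; it is a minimal illustration that assumes the `lmdb` Python package and Caffe's Python interface are installed, and the helper names `make_key` and `write_lmdb`, the database path, and the stand-in data are all hypothetical.

```python
import numpy as np


def make_key(index):
    """Zero-padded keys keep LMDB's sorted order equal to insertion order."""
    return "{:08d}".format(index)


def write_lmdb(db_path, images, labels):
    """Write uint8 images of shape (N, C, H, W) and integer labels to an LMDB.

    Requires the `lmdb` package and Caffe's Python interface; they are
    imported here so the rest of the sketch loads without them.
    """
    import lmdb
    import caffe

    # map_size is the maximum database size in bytes; this is a generous guess.
    map_size = max(images.nbytes * 10, 10 * 1024 * 1024)
    env = lmdb.open(db_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for i in range(images.shape[0]):
            datum = caffe.proto.caffe_pb2.Datum()
            datum.channels = images.shape[1]
            datum.height = images.shape[2]
            datum.width = images.shape[3]
            datum.data = images[i].tobytes()
            datum.label = int(labels[i])  # a single integer class per image
            txn.put(make_key(i).encode("ascii"), datum.SerializeToString())
    env.close()


# Stand-in data: 4 grayscale 8x8 images with hypothetical class labels.
images = np.zeros((4, 1, 8, 8), dtype=np.uint8)
labels = np.array([0, 2, 1, 2])
keys = [make_key(i) for i in range(len(labels))]
# write_lmdb("ck_train_lmdb", images, labels)  # run once caffe and lmdb are installed
```

The call at the end is left commented out so the sketch can be read and loaded without a Caffe build; the directory it would create is what the Data layer's source field should point to.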
Loading the lmdb into your network:
This is done exactly like in the LeNet example. This Data layer sits at the beginning of the network prototxt that describes the LeNet model:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
The source field is where you point caffe to the location of the lmdb you just created.
Something more related to performance, and not critical to getting this to work, is specifying how to normalize the input features. This is done through the transform_param field. CK+ has fixed-size images, so there is no need for resizing. One thing you do need, though, is to normalize the grayscale values. You can do this through mean subtraction. A simple way of doing this is to set mean_value in transform_param to the mean of the grayscale intensities in your CK+ dataset (scale, by contrast, multiplies each value rather than subtracting from it).
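Computing the mean grayscale intensity for the mean subtraction could look like this. It is a small numpy sketch with random stand-in images; loading the real CK+ images into the `images` array is left out, and the shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the grayscale training images: N x H x W, values 0-255.
images = rng.integers(0, 256, size=(10, 48, 48), dtype=np.uint8)

# Accumulate in float64 so averaging over many uint8 pixels stays precise.
mean_gray = float(images.mean(dtype=np.float64))
print("mean gray intensity: {:.4f}".format(mean_gray))
```

The resulting number is computed on the training set only and then reused unchanged for validation and test data.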