Slicing operation in TensorFlow

I am passing a batch of images to my neural network. Let's say the shape of a batch is (4, 224, 224, 3). Now I want to apply a slicing operation to my batch so that I get two tensors, each of shape (2, 224, 224, 3). How can I do this using tf.slice() or something like that?

I think you want tf.split instead. E.g., in your case:
tf.split(my_tensor, 2)
By default tf.split splits along axis 0, the batch dimension, so this returns two tensors of shape (2, 224, 224, 3).
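For completeness, a minimal sketch of both routes (the tensor name is made up); tf.slice can produce the same halves by giving explicit begin indices and sizes:

import tensorflow as tf

# A batch of 4 images of shape 224x224x3 (random values stand in for real data).
batch = tf.random.normal([4, 224, 224, 3])

# tf.split divides along axis 0 (the batch dimension) by default.
first, second = tf.split(batch, num_or_size_splits=2)  # each (2, 224, 224, 3)

# The same halves with tf.slice: begin at sample 0 / sample 2, take 2 samples.
first = tf.slice(batch, begin=[0, 0, 0, 0], size=[2, 224, 224, 3])
second = tf.slice(batch, begin=[2, 0, 0, 0], size=[2, 224, 224, 3])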

Related

PyTorch: Batch size and individual datum in nn.Module

In pytorch nn.Module, the model created seems to be agnostic of the batch size. That is, if an individual datum is 128 dimensions, and we are training in batches of 64, the model should have an input of 128, not 128 x 64.
The first step of my nn.Sequential is a Flatten. When I apply the model to a single datum (no batch), I need to make sure the Flatten has a start_dim=0. But this is incorrect when applying it to a batch. This seems to be the opposite of the interface described above: you need to tailor your model to whether or not you are using batches.
So:
Does a nn.Module need to be aware of batching?
If yes: How do you apply the model to a single sample, without a batch?
If not: How do you apply Flatten, when you might send a batch, or you might send a single sample?
An equivalent question might be: How do I build a PyTorch model to train with batches, but still apply it to individual datum at production time?
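One common pattern, sketched below under assumed shapes (8 x 16 = 128 values per datum): keep nn.Flatten at its default start_dim=1 so the model always expects a leading batch dimension, and add a singleton batch dimension with unsqueeze(0) when applying it to a single sample.

import torch
import torch.nn as nn

# nn.Flatten defaults to start_dim=1: it flattens everything except the
# batch dimension, so the model is written once, for batched input.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(128, 10),
)

batch = torch.randn(64, 8, 16)             # a batch of 64 data
print(model(batch).shape)                  # torch.Size([64, 10])

# For a single datum at production time, add a batch dimension of size 1
# instead of changing the model:
single = torch.randn(8, 16)
print(model(single.unsqueeze(0)).shape)    # torch.Size([1, 10])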

How to apply CNN for multi-channel pixel data based weights to each channel?

I have an image with 8 channels. I have a conventional algorithm where weights are applied to each of these channels to get an output of '0' or '1'. This works fine with several samples and complex scenarios. I would like to implement the same in machine learning using a CNN.
I am new to ML and started looking at tutorials, which seem to deal exclusively with image-processing problems: handwriting recognition, feature extraction, etc.
http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/neural_networks.html
I have set up Keras with Theano as the backend. Basic Keras samples work without problems.
What steps do I need to follow to achieve the same result using a CNN? I do not understand the use of filters, kernels, and strides in my use case. How do I provide training data to Keras if the pixel channel values and outputs are in the form below?
Pixel#1: f(C1, C2, ..., C8) = 1
Pixel#2: f(C1, C2, ..., C8) = 1
Pixel#3: f(C1, C2, ..., C8) = 0
...
Pixel#N: f(C1, C2, ..., C8) = 1
I think you should treat this the same way CNNs are used for semantic segmentation. For an example look at
https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf
You can use the same architecture they are using, but for the first layer, instead of using filters for 3 channels, use filters for 8 channels.
For the loss function you can use the same loss function, or something more specific to binary outputs.
There are several implementations for Keras with a TensorFlow backend:
https://github.com/JihongJu/keras-fcn
https://github.com/aurora95/Keras-FCN
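To illustrate the 8-channel first layer, a minimal Keras sketch (the filter count and image size are placeholders, and it assumes the channels-last data format):

from keras.models import Sequential
from keras.layers import Conv2D

# The only change from an RGB network is the number of declared input
# channels: 8 instead of 3.
model = Sequential()
model.add(Conv2D(64, (3, 3), padding='same', activation='relu',
                 input_shape=(224, 224, 8)))
# ... the rest of the FCN-style architecture goes here ...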
Since the input is in the form of channel values in sequence, I would suggest you use Conv1D. Here you take each pixel's channel values as the input and predict an output per pixel. Try something like this (filter counts and kernel sizes are placeholders to tune):

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(32, 3, strides=1, padding='valid', activation='relu',
                 input_shape=(8, 1)))   # 8 channel values per pixel
model.add(MaxPooling1D(pool_size=2))
# ... add as many layers as you want ...
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))

Use binary_crossentropy as the loss function.

How to apply mean/average pooling over the batch size to get a single output for the whole batch in Keras?

E.g., an input with dimensions [10, 1, 224, 224] should be reduced to [1, 1, 224, 224], where [samples, channels, rows, columns] is the convention for the dimensions.
Then your problem is badly formulated: consider using [10, 1, 224, 224] as the input_shape of a single sample and making batches of such tensors, then use AveragePooling3D (see the Keras documentation).
You won't be able to operate across the batch dimension with the usual layers, except perhaps by writing your own custom layer (see the Keras documentation on custom layers).
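A minimal sketch of that reformulation, assuming the channels-last data format (so the trailing 224 acts as the channel axis):

from keras.models import Sequential
from keras.layers import AveragePooling3D

# Each [10, 1, 224, 224] tensor is fed as ONE sample, so the model input
# is (batch, 10, 1, 224, 224); pooling with size (10, 1, 1) averages the
# 10 "samples" down to 1.
model = Sequential()
model.add(AveragePooling3D(pool_size=(10, 1, 1),
                           input_shape=(10, 1, 224, 224)))
# Per-sample output shape: (1, 1, 224, 224)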

Feeding image into tensorflow

I am new to using TensorFlow, so I was trying the MNIST tutorial in ML for Beginners. The code runs just fine, but what if I want to input an image of my own, which has, say, a handwritten number on it, and see if it predicts what number it might be? How do I feed my own image into the TensorFlow program?
Assuming you're using this file.
If you look at x, the shape is [None, 784]. To feed your own image in, you'll have to load the image as an array (using PIL or OpenCV or something), flatten it, wrap it in a list, and pass it to the graph in the feed_dict, fetching the prediction tensor y (not the label placeholder y_):
sess.run(y, feed_dict={x: [image_you_loaded_in.flatten()]})
It will need to be a 28x28 image in order for this code to work without modification.
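A fuller sketch of the loading step (the file name is hypothetical; sess, x, and y come from the tutorial script):

import numpy as np
from PIL import Image

# Load as grayscale, resize to 28x28, and scale pixel values to [0, 1].
image = Image.open('my_digit.png').convert('L').resize((28, 28))
pixels = np.asarray(image, dtype=np.float32) / 255.0
# MNIST digits are white on a black background; invert if yours is the
# opposite: pixels = 1.0 - pixels
prediction = sess.run(y, feed_dict={x: [pixels.flatten()]})
print(np.argmax(prediction))   # the predicted digit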

Net surgery: How to reshape a convolution layer of a caffemodel file in caffe?

I'm trying to reshape a convolution layer of a caffemodel (this is a follow-up to a previous question). Although there is a tutorial on how to do net surgery, it only shows how to copy weight parameters from one caffemodel to another of the same size.
Instead I need to add a new channel (all 0) to my convolution filter such that it changes its size from currently (64x3x3x3) to (64x4x3x3).
Say the convolution layer is called 'conv1'. This is what I tried so far:
# Load the original network and extract the fully connected layers' parameters.
net = caffe.Net('../models/train.prototxt',
                '../models/train.caffemodel',
                caffe.TRAIN)
Now I can perform this:
net.blobs['conv1'].reshape(64,4,3,3);
net.save('myNewTrainModel.caffemodel');
But the saved model seems not to have changed. I've read that the actual weights of the convolution are stored in net.params['conv1'][0].data rather than in net.blobs, but I can't figure out how to reshape the net.params object. Does anyone have an idea?
As you well noted, net.blobs does not store the learned parameters/weights, but rather stores the result of applying the filters/activations on the net's input. The learned weights are stored in net.params. (see this for more details).
AFAIK, you cannot directly reshape net.params and add a channel.
What you can do, is have two nets deploy_trained_net_with_3ch.prototxt and deploy_empty_net_with_4ch.prototxt. The two files can be almost identical apart from the input shape definition and the first layer's name.
Then you can load both nets to python and copy the relevant part:
net3ch = caffe.Net('deploy_trained_net_with_3ch.prototxt', 'train.caffemodel', caffe.TEST)
net4ch = caffe.Net('deploy_empty_net_with_4ch.prototxt', 'train.caffemodel', caffe.TEST)
Since all layer names are identical apart from the first one, net4ch.params will be filled with the weights of train.caffemodel for those layers. For the first layer, you can now manually copy the relevant part:
net4ch.params['conv1_4ch'][0].data[:,:3,:,:] = net3ch.params['conv1'][0].data[...]
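The question asks for the added channel to be all zeros, so set it explicitly rather than relying on the prototxt filler, and copy the bias as well if 'conv1' has one (the layer name 'conv1_4ch' follows the convention above):

net4ch.params['conv1_4ch'][0].data[:, 3, :, :] = 0   # zero out the new 4th channel
net4ch.params['conv1_4ch'][1].data[...] = net3ch.params['conv1'][1].data   # bias, if present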
and finally:
net4ch.save('myNewTrainModel.caffemodel')
