Making a neural net draw an image (aka Google's inceptionism) using nolearn/lasagne - machine-learning

Probably lots of people have already seen this article by Google Research:
http://googleresearch.blogspot.ru/2015/06/inceptionism-going-deeper-into-neural.html
It describes how the Google team made neural networks actually draw pictures, like an artificial artist :)
I wanted to do something similar, just to see how it works and maybe use it in the future to better understand what makes my network fail. The question is: how can I achieve this with nolearn/lasagne (or maybe pybrain - that would also work, but I prefer nolearn)?
To be more specific, the folks at Google trained an ANN with some architecture to classify images (for example, to classify which fish is in a photo). Fine, suppose I have an ANN constructed in nolearn with some architecture and I have trained it to some degree. But... what do I do next? I don't get it from their article. It doesn't seem that they just visualize the weights of some specific layers. It seems to me (maybe I am wrong) that they do one of two things:
1) Feed some existing image or pure random noise to the trained network and visualize the activations of one of the neuron layers. But this doesn't look entirely right, since if they used a convolutional neural network the dimensionality of the layers might be lower than the dimensionality of the original image.
2) Or they feed random noise to the trained ANN, get its intermediate output from one of the middle layers and feed it back into the network - to get some kind of loop and inspect what the network's layers think might be out there in the random noise. But again, I might be wrong due to the same dimensionality issue as in #1.
So... any thoughts on that? How could we do something similar to what Google did in the original article using nolearn or pybrain?

From their ipython notebook on github:
Making the "dream" images is very simple. Essentially it is just a
gradient ascent process that tries to maximize the L2 norm of
activations of a particular DNN layer. Here are a few simple tricks
that we found useful for getting good images:
offset image by a random jitter
normalize the magnitude of gradient ascent steps
apply ascent across multiple scales (octaves)

It is done using a convolutional neural network, and you are correct that the dimensions of the activations will be smaller than those of the original image, but this isn't a problem.
You change the image with iterations of forward/backward propagation, just as you would normally train a network. On the forward pass, you only need to go as far as the particular layer you want to work with. Then on the backward pass, you propagate back to the inputs of the network instead of to the weights.
So instead of finding the gradients of a loss function with respect to the weights, you are finding the gradients of the L2 norm of a certain set of activations with respect to the inputs.
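As a rough, untested sketch of that idea with Theano/Lasagne (my own illustration: the tiny network, layer choice, image size and step size are all made up, and the jitter/octave tricks from the notebook quote above are left out for brevity), you build the network as usual, take the (squared) L2 norm of one layer's activations as the objective, and ask Theano for the gradient with respect to the input variable rather than the weights:

```python
import numpy as np
import theano
import theano.tensor as T
import lasagne

input_var = T.tensor4('inputs')                      # (batch, channels, H, W)

# Toy network so the snippet is self-contained; in practice this layer would
# come from your already-trained nolearn/lasagne model (with learned weights).
l_in = lasagne.layers.InputLayer((None, 3, 64, 64), input_var=input_var)
layer = lasagne.layers.Conv2DLayer(l_in, num_filters=8, filter_size=3)

# Objective: (squared) L2 norm of this layer's activations.
activations = lasagne.layers.get_output(layer, deterministic=True)
objective = T.sum(activations ** 2)

# Gradient of the objective w.r.t. the *input image*, not the weights.
grad = T.grad(objective, wrt=input_var)
step_fn = theano.function([input_var], [objective, grad])

# Start from random noise (or a real photo) and climb the gradient.
img = np.random.uniform(-1, 1, (1, 3, 64, 64)).astype(theano.config.floatX)
for _ in range(100):
    obj, g = step_fn(img)
    g /= (np.abs(g).mean() + 1e-8)                   # normalize gradient magnitude
    img += 0.1 * g                                   # gradient *ascent* step
```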

Related

Pattern Recognition using Machine learning

I have many evolution curves (over time) of a system, as images.
These evolution curves are plotted when the system behaves in a normal way ('ok').
I want to train a model that learns the shapes of the curves (or parts of the shapes) when the system behaves normally, so that it will be able to classify new curves as normal (or abnormal).
Any ideas on which model to use, or how to proceed?
Thank you
You can perform PCA and then classify. Also look into functional data analysis.
Here is a nice getting-started guide for PCA.
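As a rough illustration of that suggestion (the array shapes and labels below are made up), one could flatten each curve image into a vector, reduce it with PCA and fit a simple classifier with scikit-learn:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Pretend data: 100 curve images of 64x64 pixels, flattened, with 0/1 labels
# (0 = normal, 1 = abnormal). Replace with your real curves and labels.
X = np.random.rand(100, 64 * 64)
y = np.random.randint(0, 2, size=100)

# PCA compresses each flattened curve before a simple classifier is fitted.
model = make_pipeline(PCA(n_components=20), LogisticRegression())
model.fit(X, y)
print(model.predict(X[:5]))
```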
You can start by labeling (annotating) the images. The labels can be Normal/Not normal as 0/1, or as many classes as you want to divide the data into.
Since it's a chart, the orientation is important; a wrong orientation can destroy the meaning of the image.
So make sure your pipeline always orients the chart the same way when reading it.
Now that the labeling is done, you need to train on these images for correct classification:
Augment the data if needed
Find an image classification model
Use its pretrained weights
Feed your images and annotations in the required format
Train the model
Check the output error or classification errors
Create an evaluation matrix, such as a confusion matrix, for classification (see the sketch below)
If the model is right and the training is done properly you will get good accuracy.
Otherwise repeat the steps.
This is just an overview, but with it you can start working towards your goal.
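For the evaluation step, here is a tiny scikit-learn sketch (the labels and predictions are made-up placeholders) showing how a confusion matrix summarizes the classification errors:

```python
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]    # ground-truth labels (Normal=0, Not normal=1)
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]    # hypothetical model predictions

# Rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred))
print("accuracy:", accuracy_score(y_true, y_pred))
```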

Back Propagation in Convolutional Neural Networks and how to update filters

I'm learning about convolutional neural networks and right now I'm confused about how to implement them.
I know about regular neural networks and concepts like gradient descent and backpropagation, and I can understand intuitively how CNNs work.
My question is about backpropagation in CNNs. How does it happen? The last fully connected layers are a regular neural network and there is no problem with that. But how can I update the filters in the convolution layers? How can I backpropagate the error from the fully connected layers to these filters? My problem is updating the filters!
Are filters just simple matrices, or do they have structures like regular NNs, where connections between layers provide that capability? I read about sparse connectivity and shared weights but I can't relate them to CNNs. I'm really confused about implementing CNNs and I can't find any tutorials that talk about these concepts. I can't read papers because I'm new to these things and my math is not good.
I don't want to use TensorFlow or similar tools; I'm learning the main concepts and using pure Python.
First off, I can recommend this introduction to CNNs. Maybe you can grasp the idea of it better with this.
To answer some of your questions in short:
Let's say you want to use a CNN for image classification. The picture consists of NxM pixels and has 3 channels (RGB). To apply a convolutional layer to it, you use a filter. Filters are matrices of (usually, but not necessarily) square shape (e.g. PxP) with a number of channels equal to the number of channels of the representation they are applied to. Therefore, the filters of the first conv layer also have 3 channels. Channels are the depth of the filter, so to speak.
When applying a filter to a picture, you do something called discrete convolution. You take your filter (which is usually smaller than your image) and slide it over the picture step by step, calculating the convolution at each position. This is basically an element-wise multiplication of the filter with the image patch, followed by a sum. Then you apply an activation function to it and maybe even a pooling layer. Important to note is that the filter stays the same for all convolutions performed in this layer, so you only have PxP (times the number of channels) parameters per filter. You tweak the filter so that it fits the training data as well as possible; that's why its parameters are called shared weights. When applying GD, you simply apply it to those filter weights, as in the sketch below.
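To make the filter update concrete, here is a small pure-NumPy sketch (my own illustration, with made-up shapes and an MSE loss, not code from this answer): the gradient of the loss with respect to the PxP filter is obtained by weighting each input patch with the corresponding output gradient, and the filter is then updated exactly like any other weight.

```python
import numpy as np

def conv2d(x, w):
    """Valid 2D convolution (really cross-correlation, as in most CNN libraries)."""
    H, W = x.shape
    P, _ = w.shape
    out = np.zeros((H - P + 1, W - P + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+P, j:j+P] * w)   # element-wise multiply + sum
    return out

def conv2d_filter_grad(x, dout, P):
    """Gradient of the loss w.r.t. the PxP filter, given dL/d(output) = dout."""
    dw = np.zeros((P, P))
    for i in range(dout.shape[0]):
        for j in range(dout.shape[1]):
            dw += dout[i, j] * x[i:i+P, j:j+P]        # same patches, weighted by dout
    return dw

# Toy example: one gradient-descent step on the shared filter weights.
x = np.random.randn(8, 8)          # input "image"
w = np.random.randn(3, 3)          # the filter (shared weights)
target = np.zeros((6, 6))          # pretend target output

out = conv2d(x, w)
dout = 2 * (out - target)          # gradient of a sum-of-squares loss w.r.t. the output
dw = conv2d_filter_grad(x, dout, 3)
w -= 0.01 * dw                     # update the filter, just like any other weight
```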
Also, you can find a nice demo for the convolutions here.
Implementing these things yourself is certainly possible, but for starting out you could try TensorFlow for experimenting. At least that's the way I learn new concepts :)

Can identity NNs do edge detection for grayscale images?

Not CNNs, regular NNs. Also, I'm actually interested in making an AI-based edge detector. I've read some papers, but none of them got me started. Can anyone share some getting-started tips for making edge detectors with AI? CNNs work as classifiers, not image filters. So how can I do it?
A neural network trained with backpropagation is one of the popular techniques mainly used for classification. During backpropagation a convolution-like matrix is learned, and that learned knowledge is what actually extracts the edges from the gray-level image.
But I have another question: what kind of learning are you opting for to train your NN, supervised or unsupervised?
Supervised - train the network on a given set of examples labelled as edges.
Unsupervised - create an input layer with 5 inputs, subtract the central pixel from each of the four neighbouring pixels, and apply thresholding at the output layer (see the sketch after this answer).
You can even go for a HYBRID NEURO-FUZZY APPROACH:
Sobel and Laplacian operators are applied to the given input image.
Fuzzy rules are applied to the outputs we get from these operators.
In the neural network, the input layer consists of the gradient directions and the hidden layer consists of the fuzzy data. Both are used to train the network.
Hope it helps
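Below is a minimal NumPy sketch (my own untested illustration, not code from the answer above) of the unsupervised idea: take the centre pixel and its four neighbours, sum the absolute differences to the centre, and threshold the result to decide edge / no edge. The threshold value is arbitrary.

```python
import numpy as np

def simple_edge_map(img, threshold=30):
    """img: 2D grayscale array; returns a binary edge map for the interior pixels."""
    c = img[1:-1, 1:-1].astype(float)            # centre pixel
    up, down = img[:-2, 1:-1], img[2:, 1:-1]     # the four neighbours
    left, right = img[1:-1, :-2], img[1:-1, 2:]
    diff = (np.abs(up - c) + np.abs(down - c) +
            np.abs(left - c) + np.abs(right - c))
    return (diff > threshold).astype(np.uint8)   # thresholding at the "output layer"
```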

CNN Regression on Grid - Limitation of Convolutional Neural Networks?

I'm working on a (high energy physics related) problem using CNNs.
For understanding the problem, let's consider these examples here.
The left-hand side is the input to the CNN, the right-hand side the desired output. So the network is supposed to cluster the input. The actual algorithm behind this clustering (i.e. how we got the desired output for training) is really complex and we want the CNN to learn this.
I've tried different CNN architectures, for example one similar to the U-net architecture (https://arxiv.org/abs/1505.04597) but also various concatenations of convolutional layers, etc.
The outputs are always really similar (for all architectures).
Here you can see some CNN predictions.
In principle the network is performing quite well, but as you can see, in most cases the CNN output consists of several filled pixels that are directly next to each other, which will never (!) happen in the true cases.
I've been using mean squared error as the loss function in all of the networks.
Do you have any suggestions on how one could avoid this problem and improve the network's performance?
Or is this a general limitation to CNNs and in practice it is not possible to solve such a problem using CNNs?
Thank you very much!
My suggestion would be to split up the work. First use a U-Shaped NN to find the activations in a binary segmentation task (like in your paper) and then regress on the found activations to find their final values. In my experience this works way better than doing regression on large images, because the MSE will result in blurry outputs, as you have observed.
The CNN does not know that you wanted a sharp result. As mentioned by Thomas, MSE tends to give you blurry results; that is the nature of that loss function. A blurry result does not introduce a large loss under MSE.
An easy modification would be to use the L1 loss (absolute difference instead of squared error). It has a constant gradient, unlike MSE, whose gradient decreases as the error gets smaller.
If you really want a sharp result, it would be easier to add a manual post-processing step: non-maximum suppression (NMS). In practice, a 3x3 box-max filter might do (see the sketch below).
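As a rough illustration of the NMS suggestion (my own sketch; the minimum-value cutoff is arbitrary), one could keep a pixel only if it equals the maximum of its 3x3 neighbourhood, which collapses a blob of adjacent high outputs into a single peak:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def nms_3x3(pred, min_value=0.1):
    """pred: 2D array of CNN outputs; returns pred with non-peak pixels zeroed."""
    local_max = maximum_filter(pred, size=3)          # 3x3 box-max filter
    peaks = (pred == local_max) & (pred > min_value)  # keep only local maxima
    return np.where(peaks, pred, 0.0)
```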

Neural Network Picture Classification

I would like to implement picture classification using a neural network. I want to know how to select the features from the picture and how many hidden units or layers to go with.
For now I have the idea of resizing the image to something like 50x50 or smaller, so that the number of features is small and all inputs have a constant size. The features would be the RGB values of each of the pixels. Is that fine, or is there some better way?
I also decided to go with 1 hidden layer with half as many units as there are inputs. I can change the number to get better results. Or would I require more layers?
There are numerous image data sets that are successfully learned by neural networks, like
MNIST (here you will find many links to papers)
NORB
and CIFAR-10/100.
Note that you need many training examples. Usually one hidden layer is sufficient, but it can be hard to determine the "right" number of neurons. Sometimes the number of hidden neurons should even be greater than the number of inputs. When you use 2 or more hidden layers you will usually need fewer hidden nodes and the training will be faster. But when you have too many hidden layers it can be difficult to train the weights of the first layer.
A kind of neural network that is designed especially for images is the convolutional neural network. They usually work much better than multilayer perceptrons and are much faster.
A 50x50 image gives 2500 pixels, i.e. 7500 features with RGB values. Your neural network may memorize these, but it will most probably perform poorly on other images.
Therefore this type of problem is more about image processing and feature extraction. Your features will change according to your requirements. See this similar question about image processing and neural networks.
A single-layer network (with no hidden layer) is only suitable for linear problems; are you sure your problem is linear? Otherwise you will need a multi-layer neural network.
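As a hedged illustration of the setup discussed in this question (all shapes and data below are made up), here is a minimal scikit-learn sketch: 50x50 RGB images flattened to 7500 features and fed to a network with one hidden layer.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Pretend data: 200 images of 50x50x3, flattened to 7500 features each.
X = np.random.rand(200, 50 * 50 * 3)
y = np.random.randint(0, 2, size=200)            # two classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# One hidden layer; "half the inputs" would be ~3750 units, kept much smaller
# here (and max_iter kept low) just so the toy example runs quickly.
clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=50)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```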
