Train TensorFlow to modify images - machine-learning

I am interested in the possibility of training a TensorFlow model to modify images, but I'm not quite sure where to get started. Almost all of the examples/tutorials dealing with images are for image classification, but I think I am looking for something a little different.
Image classification training data typically includes the images plus a corresponding set of classification labels, but I am thinking of a case of an image plus a "to-be" version of the image as the "label". Is this possible? Is it really just a classification problem in disguise?
Any help on where to get started would be appreciated. Also, the solution does not have to use TensorFlow, so any suggestions on alternate machine learning libraries would also be appreciated.
For example, let's say we want to train TensorFlow to draw circles around objects in a picture.
Example Inbound Image: (image not shown; source: pbrd.co)
Label/Expected Output: (image not shown; source: pbrd.co)
How could I accomplish that?

I can second that, it's really hard to find information about image modification with TensorFlow :( But have a look here: https://affinelayer.com/pix2pix/
From my understanding, you do use a GAN, but instead of feeding the generator random data during training, you feed it a sample input image.
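A minimal sketch of that setup, assuming TensorFlow/Keras (the layer sizes are placeholders and this is a toy encoder-decoder rather than the full pix2pix U-Net/PatchGAN): the generator takes the input photo instead of a noise vector, and the discriminator looks at (input, candidate output) pairs.

```python
# Sketch only: pix2pix-style image-to-image GAN wiring (toy sizes, training loop omitted).
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(h=256, w=256, c=3):
    inp = layers.Input(shape=(h, w, c))               # the sample input image, not random noise
    x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(128, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(128, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2D(c, 1, activation="tanh")(x)    # generated "to-be" image
    return tf.keras.Model(inp, out)

def build_discriminator(h=256, w=256, c=3):
    # The input image and the candidate output are concatenated along the channel axis,
    # so the discriminator judges whether the pair looks like a real (input, target) pair.
    pair = layers.Input(shape=(h, w, 2 * c))
    x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(pair)
    x = layers.Conv2D(128, 4, strides=2, padding="same", activation="relu")(x)
    out = layers.Dense(1)(layers.Flatten()(x))         # real/fake logit
    return tf.keras.Model(pair, out)

generator = build_generator()
discriminator = build_discriminator()
```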

Two popular ways (the ones that I know about) to make models generate/edit images are:
Deep Convolutional Generative Adversarial Networks
Back-propagation through a pre-trained image classification model (in a similar manner to Deep Dream), but starting from the final layer: feed back the label you want and apply gradient descent to the image only, keeping the weights fixed. This is explained in more detail in the CS231n course (this lecture).
But I don't think either of them fits the circle-around-the-"3" example that you gave. I think object detection and instance segmentation would be more helpful: detect the object you are looking for, extract its boundary via segmentation, and post-process it into the circle you want (or any other shape).
Reference for the images: Intro to Deep Learning for Computer Vision
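For the second approach, a rough sketch of the loop in TensorFlow/Keras (VGG16, the target class index and the step count are placeholders, and input preprocessing is omitted): take gradient-ascent steps on the image so the desired class score goes up while the weights stay frozen.

```python
# Sketch only: nudge an image toward a target class by gradient ascent on the image itself.
# VGG16 is just an example backbone; preprocessing and regularization are omitted.
import tensorflow as tf

model = tf.keras.applications.VGG16(weights="imagenet")
target_class = 1                                           # hypothetical label index to maximize
image = tf.Variable(tf.random.uniform((1, 224, 224, 3)))  # or start from a real photo

optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)
for step in range(100):
    with tf.GradientTape() as tape:
        scores = model(image, training=False)
        loss = -scores[0, target_class]                    # maximize the class score
    grads = tape.gradient(loss, image)                     # gradient w.r.t. the image, not weights
    optimizer.apply_gradients([(grads, image)])
    image.assign(tf.clip_by_value(image, 0.0, 1.0))        # keep pixel values in a valid range
```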

Related

My Image Captioning Model giving me same caption on All images

I am doing a project related to the captioning of Medical Images. I am using the Code from this link.
I am using the Indiana chest X-ray dataset and using the Findings field as captions for training. Training completed with a loss of 0.75, but the final model gives the same caption for every image I check (some people in the comments of that link report the same issue).
Can you please suggest changes to any part of the code, or anything else, so that it gives a proper caption for every image I check?
Thanks in advance.
Looking at the dataset, most of the data appears to be quite similar (black-and-white chest X-ray images) - please correct me if I am wrong. So what seems to be happening is that the CNN is learning features common to most of the images, and the network is just not deep/advanced enough to pick out the distinguishing patterns. Based on the tutorial you are following, I don't think the VGG-16 or VGG-19 network is learning the distinguishing patterns in these images.
The image captioning model will only be as good as the underlying CNN. If you have a class label field in your data (like the indication/impression field provided here), you can test this hypothesis by training the network to predict the class of each image; poor performance would confirm it. If you have the class label, try experimenting with a few CNNs and use the one that achieves the best classification accuracy as the feature extractor.
If you do not have a class label, I would suggest trying out some deeper CNN architectures like Inception or ResNet and see if the performance improves. Hope this was helpful!
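A rough sketch of that check with TensorFlow/Keras, assuming the images are arranged as one folder per class (the directory name, image size and epoch count are placeholders; per-backbone input preprocessing is skipped for brevity):

```python
# Sketch: compare a few pre-trained backbones as classifiers on the class labels.
# Assumes a layout like chest_xrays/<class_name>/*.png; all names here are placeholders.
import tensorflow as tf

data_dir, img_size = "chest_xrays/", (224, 224)
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="training", seed=42,
    image_size=img_size, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="validation", seed=42,
    image_size=img_size, batch_size=32)
num_classes = len(train_ds.class_names)

backbones = {
    "vgg16": tf.keras.applications.VGG16,
    "resnet50": tf.keras.applications.ResNet50,
    "inception_v3": tf.keras.applications.InceptionV3,
}
for name, ctor in backbones.items():
    base = ctor(weights="imagenet", include_top=False,
                input_shape=img_size + (3,), pooling="avg")
    base.trainable = False                              # use the backbone as a feature extractor
    model = tf.keras.Sequential([base, tf.keras.layers.Dense(num_classes, activation="softmax")])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    hist = model.fit(train_ds, validation_data=val_ds, epochs=3, verbose=0)
    print(name, "validation accuracy:", hist.history["val_accuracy"][-1])
```

Whichever backbone classifies best is the one worth plugging in as the captioning model's feature extractor.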
Make sure you have an equal number of images in each class. If you have 1,000 images in the “pneumonia” category and only 5 in the “broken rib” category, your model will pick the label “pneumonia” almost every time.
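If collecting more images is not possible, class weights are one common workaround; a minimal sketch with scikit-learn and Keras (the label array is made up for illustration):

```python
# Sketch: weight the loss so rare classes count more, given an integer label array y_train.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.array([0] * 1000 + [1] * 5)   # e.g. 1,000 "pneumonia" vs 5 "broken rib" labels
weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y_train), y=y_train)
class_weight = dict(enumerate(weights))
print(class_weight)                         # roughly {0: 0.50, 1: 100.5}

# Then pass it to training, e.g. model.fit(x_train, y_train, class_weight=class_weight, ...)
```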

Pattern Recognition using Machine learning

I have many evolution curves (over time) of a system, saved as images.
These evolution curves are plotted when the system behaves in a normal way ('ok').
I want to train a model that learns the shapes of the curves (or parts of the shapes) when the system behaves normally, so it will be able to classify new curves as normal or abnormal.
Any ideas on which model to use, or how to proceed?
Thank you
You can perform PCA and then classify. Also look into functional data analysis.
Here is a nice getting-started guide for PCA.
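A minimal sketch of the PCA-then-classify route with scikit-learn, assuming the curve images have been loaded as equally sized grayscale arrays (the random data below only stands in for them):

```python
# Sketch: flatten the curve images, reduce with PCA, then classify normal vs. abnormal.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X = np.random.rand(200, 64 * 64)        # placeholder for flattened grayscale curve images
y = np.random.randint(0, 2, 200)        # placeholder labels: 0 = normal, 1 = abnormal

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(PCA(n_components=20), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# If only normal curves are available, swap LogisticRegression for a one-class model
# such as sklearn.svm.OneClassSVM and train on the normal curves alone.
```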
You can start by labeling (annotating) the images. The labels can be Normal/Not Normal (0/1), or as many classes as you want to divide the data into.
Since these are charts, orientation matters; a wrong orientation can destroy the meaning of the image.
So make sure your loading code always orients the chart the same way.
Once the labeling is done, you need to train a classifier on these images:
Augment the data if needed.
Find an image classification model.
Use its pre-trained weights.
Feed your images and annotations in the required format.
Train the model.
Check the classification errors.
Build an evaluation such as a confusion matrix for classification.
If the model is right and training is done properly, you will get good accuracy.
Otherwise, repeat the steps.
This is just an overview, but with it you can start working towards your goal.
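A rough end-to-end sketch of those steps with TensorFlow/Keras (directory layout, image size, backbone and epoch count are all assumptions): mild augmentation, a frozen pre-trained backbone, training, and a confusion matrix at the end.

```python
# Sketch of the steps above: augment, reuse pre-trained weights, train, evaluate.
# Assumes images arranged as curves/<class_name>/*.png; names and sizes are placeholders.
import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix

data_dir, img_size = "curves/", (160, 160)
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="training", seed=1,
    image_size=img_size, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="validation", seed=1,
    image_size=img_size, batch_size=32, shuffle=False)

# Keep augmentation mild: flips or rotations could change what a chart means.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomTranslation(0.05, 0.05),
    tf.keras.layers.RandomZoom(0.1),
])

base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=img_size + (3,), pooling="avg")
base.trainable = False                                    # "use the trained weights"

model = tf.keras.Sequential([
    augment,
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),    # MobileNetV2 expects inputs in [-1, 1]
    base,
    tf.keras.layers.Dense(len(train_ds.class_names), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)

# Confusion matrix on the (unshuffled) validation set.
y_true = np.concatenate([labels.numpy() for _, labels in val_ds])
y_pred = np.argmax(model.predict(val_ds), axis=1)
print(confusion_matrix(y_true, y_pred))
```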

Shape Detection - TensorFlow

I'm trying to train a model to detect the basic shapes like Circle, Square, Rectangle, etc. using Tensorflow. What would be the best input data set? To load the shapes directly or to find the edge of the image using OpenCV and load only the edge image.
We can detect shapes using OpenCV too, so what would be the added advantage of using machine learning?
Sample images given for training the model.
I would recommend starting with this guide for doing classification, not object detection:
https://kiosk-dot-codelabs-site.appspot.com/codelabs/tensorflow-for-poets/#0
Classification gives one tag for the whole picture (99% square, 1% circle). Object detection classifies and localizes several objects within the picture (x_min=3, y_min=8, x_max=20, y_max=30, 99% square). Your case looks more like a classification problem.
You don't need the full Docker installation as in the guide.
If you have Python 3.6 on your system, you can just do:
pip install tensorflow
And then jump to "4. Retrieving the images"
I had to try it out myself, so I downloaded the first 100 pictures of squares and circles from Google with the add-on "fatkun batch download image" from Chrome Web Store.
On my first 10 tests I got accuracies between 92.0% and 99.58%. If your examples are more uniform than a lot of different pictures from Google, you will probably get better results.
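The codelab above uses its own retraining script; as a rough modern equivalent, a small from-scratch CNN in Keras is usually enough for shapes this simple (the directory layout and sizes below are assumptions):

```python
# Sketch: a small CNN for square vs. circle classification.
# Assumes images sorted into shapes/square/ and shapes/circle/; path and sizes are placeholders.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "shapes/", image_size=(64, 64), batch_size=16)

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(len(train_ds.class_names), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```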
You may want to check out object detection in TensorFlow.
https://github.com/tensorflow/models/tree/master/research/object_detection
There is a pre-trained model here
http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
One potential advantage of using neural nets for the detection is that they can reduce the CPU cycles needed for the computation. This is useful on mobile devices.
For example, the Hough transform (https://en.wikipedia.org/wiki/Hough_transform) is expensive to compute, but if a convolutional neural net is used instead, more possibilities open up for real-time image processing.
To actually train a new model, see https://www.tensorflow.org/tutorials/deep_cnn
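For reference, the classical Hough-transform route being compared against looks roughly like this in OpenCV (all parameter values are guesses that need tuning per image set):

```python
# Sketch: classical circle detection via the Hough transform, for comparison with a learned model.
# The file name and all parameter values are placeholders.
import cv2

img = cv2.imread("shapes.png", cv2.IMREAD_GRAYSCALE)
img = cv2.medianBlur(img, 5)                              # blur to suppress noise

circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
                           param1=100, param2=30, minRadius=5, maxRadius=100)
if circles is not None:
    for x, y, r in circles[0]:
        print(f"circle at ({x:.0f}, {y:.0f}) with radius {r:.0f}")
```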

Tissue based image segmentation

I'm working on a project to segment tissue, and so far so good. But now I want to separate the damaged tissue from the healthy tissue. Here is an example image: as you can see, the healthy tissue is smooth and the damaged tissue is not. My idea was to detect the edges and segment from there, but that gives bad results.
I'm open to any suggestions.
Use a convolutional neural network, for example one of those prebuilt in the Caffe package. Label the different kinds of areas in as many images as you have, then use many (thousands of) small (32x32) patches from those to train the network. This will produce much better results than any kind of handcrafted algorithm.
A very simple approach that can be used as an intermediate test could be the following:
Blur the image to reduce the noise. This is an important step; OpenCV provides a built-in method for it.
Find contours using the OpenCV method findContours().
If the perimeter of a contour is greater than a set threshold (you will have to choose the value), consider it smooth tissue; otherwise discard it.
This is a really simple approach, and a program for it can be written very quickly.
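A sketch of those steps in OpenCV (file name, blur kernel and perimeter threshold are placeholders; findContours() needs a binary image, so an Otsu threshold is added in between):

```python
# Sketch of the blur -> contours -> perimeter-threshold idea described above.
import cv2

img = cv2.imread("tissue.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(img, (7, 7), 0)                         # step 1: reduce noise

# findContours expects a binary image, so threshold first (Otsu is an assumption here).
_, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # step 2

PERIMETER_THRESHOLD = 200.0                                        # step 3: tune for your images
smooth = [c for c in contours if cv2.arcLength(c, closed=True) > PERIMETER_THRESHOLD]
print(f"kept {len(smooth)} of {len(contours)} contours as smooth tissue")
```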

Making neural net to draw an image (aka Google's inceptionism) using nolearn\lasagne

Probably lots of people already saw this article by Google research:
http://googleresearch.blogspot.ru/2015/06/inceptionism-going-deeper-into-neural.html
It describes how the Google team made neural networks actually draw pictures, like an artificial artist :)
I wanted to do something similar just to see how it works and maybe use it in future to better understand what makes my network to fail. The question is - how to achieve it with nolearn\lasagne (or maybe pybrain - it will also work but I prefer nolearn).
To be more specific, the people at Google trained an ANN with some architecture to classify images (for example, to classify which fish is in a photo). Fine, suppose I have an ANN constructed in nolearn with some architecture and I have trained it to some degree. But... what to do next? I don't get it from their article. It doesn't seem that they just visualize the weights of some specific layers. It seems to me (maybe I am wrong) like they do one of two things:
1) Feed some existing image, or pure random noise, to the trained network and visualize the activations of one of the neuron layers. But this doesn't look entirely right, since with a convolutional neural network the dimensionality of the layers might be lower than the dimensionality of the original image.
2) Or they feed random noise to the trained ANN, get its intermediate output from one of the middle layers, and feed it back into the network - to get some kind of loop and inspect what the network's layers think might be out there in the random noise. But again, I might be wrong due to the same dimensionality issue as in #1.
So... Any thoughts on that? How we could do the similar stuff as Google did in original article using nolearn or pybrain?
From their ipython notebook on github:
Making the "dream" images is very simple. Essentially it is just a
gradient ascent process that tries to maximize the L2 norm of
activations of a particular DNN layer. Here are a few simple tricks
that we found useful for getting good images:
offset image by a random jitter
normalize the magnitude of gradient ascent steps
apply ascent across multiple scales (octaves)
It is done using a convolutional neural network, and you are correct that the dimensions of the activations will be smaller than those of the original image, but this isn't a problem.
You change the image with iterations of forward/backward propagation just how you would normally train a network. On the forward pass, you only need to go until you reach the particular layer you want to work with. Then on the backward pass, you are propagating back to the inputs of the network instead of the weights.
So instead of finding the gradient of a loss function with respect to the weights, you are finding the gradient of the L2 norm of a certain set of activations with respect to the input image.
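The question asks about nolearn\lasagne, but the mechanics are framework-independent; a rough sketch of the same loop in TensorFlow/Keras (backbone, layer name, step size and iteration count are placeholders), maximizing the L2 norm of one layer's activations by ascending on the image:

```python
# Sketch only: deep-dream-style gradient ascent on the image, maximizing one layer's activations.
# InceptionV3 and the "mixed3" layer are just examples; jitter and octaves are omitted.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False)
feature_model = tf.keras.Model(base.input, base.get_layer("mixed3").output)

image = tf.Variable(tf.random.uniform((1, 224, 224, 3)))   # or start from a real photo

for step in range(50):
    with tf.GradientTape() as tape:
        activations = feature_model(image, training=False)
        loss = tf.reduce_mean(tf.square(activations))        # L2 norm of the layer's activations
    grads = tape.gradient(loss, image)                       # gradient w.r.t. the image itself
    grads /= tf.math.reduce_std(grads) + 1e-8                # normalize the gradient magnitude
    image.assign_add(0.01 * grads)                           # gradient *ascent* step on the image
    image.assign(tf.clip_by_value(image, 0.0, 1.0))
```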
