The link here describes a method for image classification using affinity propagation. I'm confused as to how they got the feature vectors, i.e., what the data structure of the images is (e.g., arrays)?
Additionally, how would I accomplish this given that I can't use Places365 as it's custom data (audio spectrograms)?
Finally, how would I plot the images as they've done in the diagram?
The images are passed through a neural network. The activations of a chosen layer of the network for an image form that image's feature vector. See https://keras.io/applications/ for examples.
Spectrograms can be treated like images.
Sometimes, even when the domain is very different, the features from a neural network can still capture useful information that helps with clustering/classification tasks.
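As a rough sketch of what this can look like in practice (not the linked article's actual code): each spectrogram is a 2-D array, it is copied into three channels so that an image network accepts it, and the pooled activations of the last convolutional block become one 1-D feature vector per image. The helper names `spectrograms_to_batch` and `extract_features` are made up for illustration, and MobileNetV2 with ImageNet weights is just one assumed choice from the Keras applications page.

```python
import numpy as np


def spectrograms_to_batch(spectrograms):
    """Stack equally sized 2-D spectrograms into an (n, H, W, 3) float32 batch.

    Each spectrogram is rescaled to the 0..255 range and copied into three
    channels, because the pretrained image networks expect RGB-like input.
    """
    batch = []
    for spec in spectrograms:
        spec = np.asarray(spec, dtype=np.float32)
        lo, hi = spec.min(), spec.max()
        scaled = (spec - lo) / (hi - lo + 1e-8) * 255.0
        batch.append(np.repeat(scaled[..., np.newaxis], 3, axis=-1))
    return np.stack(batch)


def extract_features(batch):
    """Return one feature vector per image, shape (n, 1280) for MobileNetV2.

    Assumption: MobileNetV2 is used as the feature extractor and each
    spectrogram is at least 32x32. TensorFlow is imported lazily so the
    preprocessing above can be used without it.
    """
    from tensorflow.keras.applications import MobileNetV2
    from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

    model = MobileNetV2(
        weights="imagenet",       # pretrained weights, downloaded on first use
        include_top=False,        # drop the ImageNet classifier head
        pooling="avg",            # average-pool the last feature maps to a vector
        input_shape=batch.shape[1:],
    )
    return model.predict(preprocess_input(batch.copy()))
```

Calling `extract_features(spectrograms_to_batch(my_spectrograms))` should give a 2-D array with one row per spectrogram, which is the kind of input a clustering routine such as scikit-learn's `AffinityPropagation` takes.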
I am trying to train an image classifier using the CNN GoogleNet Inception. I have some labeled images (approx. 1000 per category) and many more unlabeled images. So far I have used only the labeled images and got good accuracy. I am just not sure whether it is possible to make use of the unlabeled images somehow.
The only information I have about them is that each directory always contains a few images (1-10), and the images in one directory belong to the same class.
Thank You
Have a look at Keras ImageDataGenerator. Its flow_from_directory method is a convenience that reads images from subdirectories that correspond to classes.
Even if you don't use Keras for training, you could do a dummy run to generate labels for your unlabeled images and then use these in your neural network architecture.
You can also look into pseudo-labelling for images for which you don't have any information regarding the content.
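One way pseudo-labelling could use the directory structure from the question (this combination is my assumption, not something ImageDataGenerator does for you): predict class probabilities for the unlabeled images with the model trained on the labeled ones, average them per directory, and keep a label only when the averaged prediction is confident. The function name and the 0.9 threshold are hypothetical.

```python
import numpy as np


def pseudo_label_directories(probs, dir_ids, threshold=0.9):
    """Assign one pseudo-label per directory from per-image class probabilities.

    probs:   (n_images, n_classes) predicted probabilities for unlabeled images
    dir_ids: length n_images, the directory each image came from
    Returns a dict {directory: class_index} containing only the directories
    whose averaged top probability reaches `threshold`.
    """
    probs = np.asarray(probs, dtype=float)
    dir_ids = np.asarray(dir_ids)
    labels = {}
    for d in np.unique(dir_ids):
        # All images in a directory share a class, so average their predictions.
        mean_probs = probs[dir_ids == d].mean(axis=0)
        best = int(mean_probs.argmax())
        if mean_probs[best] >= threshold:
            labels[d.item()] = best
    return labels
```

The confidently labelled directories can then be added to the training set and the model retrained; directories below the threshold stay unlabeled.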
I am thinking about a toy project that would use a neural network for object recognition. Some of my objects are quite similar when viewed from one specific angle but easily distinguishable when viewed from a different angle. Thus my question:
What are methods to feed multiple images of the same object into a network? Or which network architectures exist that can take advantage of multiple images taken at different angles?
I have a good understanding of machine learning techniques but only basic understanding of neural networks. So what I am looking for here is both names of methods, techniques and other jargon that would be relevant for a google search as well as links to specific papers or articles that could be of interest.
The most common approaches for multidimensional data use either multidimensional convolutions (https://keras.io/layers/convolutional/#conv3d), recurrent networks (http://www.deeplearningbook.org/contents/rnn.html) or multiple inputs, which is kinda similar to multidimensional convolutions.
Recurrent networks handle sequences of data, and a stack of images can be seen as a sequence. In contrast, multidimensional convolutions mostly exploit nearby data, so it is important that the same spatial location is highly correlated across your image stack. If this is not the case, you might want to consider using multiple inputs into your neural network.
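To make the three options a bit more concrete, here is a small NumPy sketch of how the same set of views would be arranged for each of them. It only shows the data layout for a single object (no batch axis and no network), and the function names are made up for illustration.

```python
import numpy as np


def as_volume(views):
    """Stack views along a new leading axis: (n_views, H, W, C).

    With a batch axis in front, this is the channels-last layout a 3-D
    convolution expects, with the view index acting as the depth dimension.
    """
    return np.stack(views, axis=0)


def as_sequence(views):
    """Flatten each view to a vector: (n_views, H * W * C).

    With a batch axis in front, this is the (timesteps, features) layout a
    recurrent layer expects. In practice each view would more likely be
    turned into CNN features first rather than fed as raw pixels.
    """
    return np.stack([np.asarray(v).reshape(-1) for v in views], axis=0)


def as_separate_inputs(views):
    """Keep views apart: a list of (H, W, C) arrays, one per network input."""
    return [np.asarray(v) for v in views]
```

The volume layout only makes sense when the views are aligned, so that neighbouring positions along the view axis are related; with unrelated angles, the separate-inputs layout avoids that assumption.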
Is it possible to feed image features, say SIFT features, to a convolutional neural network model in Tensorflow? I am trying a tensorflow implementation of this project in which a grayscale image is coloured. Will image features be a better choice than feeding the images as is to the model?
PS. I am a novice to machine learning and am not familiar with creating neural network models.
You can feed a TensorFlow neural net almost anything.
If you have extra features for each pixel, then instead of using one channel (intensity) you would use multiple channels.
If you have extra features that describe the whole image, you can make a separate input and merge the features at some upper layer.
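A minimal NumPy sketch of the two cases, showing only how the arrays are combined (no actual network). The function names are hypothetical, and the per-pixel features are assumed to be already arranged as an (H, W, K) map.

```python
import numpy as np


def add_feature_channels(gray, per_pixel_features):
    """Per-pixel features become extra channels next to the intensity.

    gray: (H, W) intensities; per_pixel_features: (H, W, K).
    Returns an (H, W, 1 + K) array to use as the network input.
    """
    gray = np.asarray(gray, dtype=np.float32)
    feats = np.asarray(per_pixel_features, dtype=np.float32)
    return np.concatenate([gray[..., np.newaxis], feats], axis=-1)


def merge_global_features(conv_activations, global_features):
    """Whole-image features are merged at an upper layer.

    conv_activations: activations of some convolutional layer, e.g. (h, w, c);
    global_features: 1-D vector describing the whole image.
    Returns the flattened activations concatenated with the global vector,
    which would then go into the following dense layers.
    """
    flat = np.asarray(conv_activations, dtype=np.float32).reshape(-1)
    extra = np.asarray(global_features, dtype=np.float32)
    return np.concatenate([flat, extra])
```

In a framework, the second case corresponds to a model with two inputs whose branches are joined by a concatenation layer before the final dense layers.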
As for better performance, you should try both approaches.
The general intuition is that extra features help if you don't have many samples, and their effect diminishes if you have many samples and the network can learn features by itself.
Also, one more point: if you are a novice, I strongly recommend using a higher-level framework like keras.io (which is a layer over TensorFlow) instead of plain TensorFlow.
Given two layers of a neural network that have a 2D representation, i.e. fields of activation, I'd like to connect each neuron of the lower layer to the nearby neurons of the upper layer, say within a certain radius. Is this possible with TensorFlow?
This is similar to a convolution, but the weight kernels should not be tied. I'm trying to avoid connecting both layers fully first and masking out most of the parameters, in order to keep the number of parameters low.
I don't see a simple way to do this with existing TensorFlow ops efficiently, but there might be some tricks with sparse things. However, ops for efficient locally connected, non-convolutional neural net layers would be very useful, so you might want to file a feature request as a GitHub issue.
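For reference, this is what the forward pass of such a layer computes, written as a plain NumPy sketch. It is not a TensorFlow op and is not efficient; it uses a square k x k window instead of a radius and a single channel, and the function name is made up. The point is that every output neuron has its own kernel, so the weights have shape (H_out, W_out, k, k) instead of the single (k, k) kernel a convolution shares across positions.

```python
import numpy as np


def locally_connected_2d(x, weights):
    """Locally connected layer: like a 'valid' convolution with untied weights.

    x:       (H, W) activations of the lower layer
    weights: (H_out, W_out, k, k), one separate k x k kernel per output neuron,
             where H = H_out + k - 1 and W = W_out + k - 1
    Returns the (H_out, W_out) pre-activations of the upper layer.
    """
    h_out, w_out, k, k2 = weights.shape
    assert k == k2, "kernels must be square in this sketch"
    assert x.shape == (h_out + k - 1, w_out + k - 1), "input size must match"
    out = np.empty((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            # Each output neuron sees only its local patch and its own kernel.
            out[i, j] = np.sum(x[i:i + k, j:j + k] * weights[i, j])
    return out
```

The parameter count is H_out * W_out * k * k, compared with H * W * H_out * W_out for a fully connected layer that is masked afterwards.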
I am very new to machine learning. I just wanted to ask what possible ways there are to improve a method (Naive Bayes, for example) to get better results when classifying images as text or non-text, instead of just inputting some number x of images and telling the system which have text and which do not?
Thanks in advance
The state of the art for such problems is deep neural networks with several convolutional layers. See this article for an example of image classification using deep convolutional nets. Your problem (just determining whether an image has text or not) is much easier than the general image classification problem the authors consider, so you'd probably get away with using a much simpler network architecture.
Nowadays you don't need to implement these things yourself; there are efficient, GPU-accelerated implementations freely available, for instance Caffe, Torch7, Keras...