How does image augmentation work in Keras? - machine-learning

I have been reading "Fit generator and data augmentation in keras", but there are still some things I am not sure about regarding image augmentation in Keras.
(1) In datagen.flow() we also set a batch_size. I know batch_size is needed for mini-batch training, so are these two batch_size values the same? That is, if we specify batch_size in the flow() generator, are we assuming mini-batch training with that same batch_size?
(2) Assume the training set has 10,000 images. I guess the only difference between model.fit_generator() and model.fit() in each epoch is that the former uses 10,000 randomly transformed images rather than the original 10,000. In later epochs we use another 10,000 images that differ from those used in the first epoch, because all the images are randomly generated. Is that right?
In other words, we always use new images in each epoch, unlike the ordinary case, where the same set of images is used in every epoch.
I am new to this area. Please help!

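Question 1: Yes. The batch_size passed to flow() is the mini-batch size used for training, because model.fit_generator() trains on each batch the generator yields.
Question 2: Yes. With data augmentation in model.fit_generator(), every epoch sees newly transformed versions of the same 10,000 original images.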
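Both answers can be illustrated with a minimal sketch in plain Python. It is a stand-in for the idea behind datagen.flow(), not the Keras implementation: the names random_flip, flow and collect_epoch are hypothetical, the only augmentation is a random horizontal flip, and the images are tiny nested lists.

```python
import random

def random_flip(image, rng):
    """Mirror the image left-to-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

def flow(images, labels, batch_size, rng):
    """Endless generator: reshuffle on every pass, augment each image, yield batches."""
    indices = list(range(len(images)))
    while True:
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            yield ([random_flip(images[i], rng) for i in batch],
                   [labels[i] for i in batch])

def collect_epoch(generator, steps_per_epoch):
    """Pull one epoch's worth of batches, as a training loop would."""
    return [next(generator) for _ in range(steps_per_epoch)]

rng = random.Random(0)
# 8 tiny 2x3 "images" with distinct pixel values so that flips are visible.
images = [[[6 * i + 3 * r + c for c in range(3)] for r in range(2)] for i in range(8)]
labels = list(range(8))
batch_size = 4
steps_per_epoch = len(images) // batch_size  # 2 batches cover the set once

gen = flow(images, labels, batch_size, rng)
epoch_1 = collect_epoch(gen, steps_per_epoch)
epoch_2 = collect_epoch(gen, steps_per_epoch)
```

Each batch handed to the training loop has exactly batch_size images, so the generator's batch size is the mini-batch size. Every epoch covers each original image once, but the transformation applied to it is drawn again each time.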

Related

Pytorch Feature Scaling within the Model or within the Dataloader

I am trying to apply simple feature scaling in PyTorch. For example, I have an image and want to scale certain pixel values down by a factor of 10. I have two options:
Divide those features by 10.0 directly in the dataset's __getitem__ function, before the dataloader batches them;
Pass the original features into the model's forward function and scale down the corresponding features before they reach the trainable layers.
I have run several experiments and observed that after the first epoch the validation losses of the two start to diverge slightly, and after a couple of hundred epochs the two trained models differ considerably. Any suggestions?
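As a rough illustration of the two options, here is a minimal sketch in plain Python rather than PyTorch. The one-layer linear "model", the feature values and the helper names are all assumptions made for the example. It only shows that, for identical weights, both placements of the scaling compute the same output.

```python
def scale_features(x, scaled_idx, factor=10.0):
    """Divide the selected features by `factor`, leaving the rest unchanged."""
    return [v / factor if i in scaled_idx else v for i, v in enumerate(x)]

def linear(x, weights, bias):
    """Stand-in for the trainable layers: a single dot product plus bias."""
    return sum(w * v for w, v in zip(weights, x)) + bias

def forward_with_scaling(x, weights, bias, scaled_idx):
    """Option 2: the model scales its raw input before the trainable layer."""
    return linear(scale_features(x, scaled_idx), weights, bias)

raw = [120.0, 0.5, 200.0, 0.25]   # hypothetical raw features
scaled_idx = {0, 2}               # positions of the features to scale down
weights = [0.1, -0.2, 0.3, 0.4]
bias = 0.05

# Option 1: scale in the data pipeline (what __getitem__ would return).
out_loader = linear(scale_features(raw, scaled_idx), weights, bias)
# Option 2: pass raw features and scale inside the forward pass.
out_model = forward_with_scaling(raw, weights, bias, scaled_idx)
```

Since the two forward computations agree, it may be worth checking whether the two experiments also share the same weight initialization, shuffling order and random seeds before attributing the divergence to where the scaling happens.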

Reducing pixels in large data set (sklearn)

I'm currently working on a classification project, but I'm unsure how to start.
Goal
Accurately classifying pictures of size 80*80 (so 6400 pixels) into the correct class (binary).
Setting
5260 training samples, 600 test samples
Question
As there are more pixels than samples, it seems logical to me to drop most of the pixels and look only at the important ones before I even start choosing a classification method (SVM, KNN, etc.).
Say the training data consists of X_train (predictors) and Y_train (outcomes). So far, I've looked at the SelectKBest() method from sklearn for feature selection. What would be the best way to use this method, and how do I know which k to choose?
I may also be completely on the wrong track here, so please correct me or suggest another approach if possible.
You are proposing to reduce the dimensionality of your feature space. That is a form of regularization that reduces overfitting. You haven't mentioned that overfitting is an issue, so I would test for that first. Here are some things I would try:
Use transfer learning. Take a network pretrained for image recognition tasks and fine-tune it on your dataset. Search for transfer learning and you'll find many resources.
Train a convolutional neural network on your dataset. CNNs are the go-to method for machine learning on images. Check for overfitting.
If you want to reduce the dimensionality of your dataset, resize the images. Going from 80x80 to 40x40 reduces the number of pixels by 4x. As long as your task doesn't depend on fine details of the image, classification performance should hold up.
There are other things you may want to consider but I would need to know more about your problem and its requirements.
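The resizing suggestion can be sketched as follows. This is a minimal plain-Python example that assumes 2x2 average pooling as the resizing method (the answer does not name one) and uses a synthetic 80x80 image; downsample_2x is a hypothetical helper.

```python
def downsample_2x(image):
    """Halve height and width by averaging each non-overlapping 2x2 block."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]

# Synthetic 80x80 grayscale image; pixel (r, c) holds the value r * 80 + c.
image = [[float(r * 80 + c) for c in range(80)] for r in range(80)]
small = downsample_2x(image)

pixels_before = len(image) * len(image[0])   # 6400
pixels_after = len(small) * len(small[0])    # 1600
```

In practice an image library's resize function would do the same job. The point is only that the feature count drops from 6400 to 1600 before any classifier is involved.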

How can I train a naivebayes classifier incrementally?

Using Accord.NET I've created a NaiveBayes classifier. It will classify a pixel based on 6 or so sets of image processing results. My images are 5MP, so a training set of 50 images creates a very large set of training data.
A 6-int array per pixel * 5 million pixels * 50 images.
Instead of trying to store all that data in memory, is there a way to incrementally train the NaiveBayes classifier? Calling Learn() multiple times overwrites the old data each time rather than adding to it.
Right now it is not possible to train a Naive Bayes model incrementally using Accord.NET.
However, since all Naive Bayes does is fit some distributions to your data, and since your data has very few dimensions, you could try learning your model on a subsample of your data rather than on all of it at once.
When loading images to build your training set, you can randomly discard x% of the pixels in each image. You can also plot the classifier accuracy for different values of x to find the best balance between memory and accuracy for your model (hint: for such a small model and this much training data, I expect it won't make much of a difference even if you drop 50% of your data).
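The subsampling idea can be sketched as below. The sketch is in Python rather than C#/Accord.NET, and everything in it is an assumption made for illustration: fake_loader stands in for real image loading, and each pixel is represented as a list of 6 integers plus a binary label.

```python
import random

def subsample_pixels(features, labels, keep_fraction, rng):
    """Keep each pixel's (feature vector, label) pair with probability keep_fraction."""
    kept_x, kept_y = [], []
    for x, y in zip(features, labels):
        if rng.random() < keep_fraction:
            kept_x.append(x)
            kept_y.append(y)
    return kept_x, kept_y

def build_training_set(image_loader, image_count, keep_fraction, seed=0):
    """Load one image at a time so that only the kept pixels stay in memory."""
    rng = random.Random(seed)
    train_x, train_y = [], []
    for index in range(image_count):
        features, labels = image_loader(index)
        kept_x, kept_y = subsample_pixels(features, labels, keep_fraction, rng)
        train_x.extend(kept_x)
        train_y.extend(kept_y)
    return train_x, train_y

def fake_loader(index, pixels=1000):
    """Hypothetical loader: 6 integer features per pixel and a binary label."""
    rng = random.Random(index)
    features = [[rng.randrange(256) for _ in range(6)] for _ in range(pixels)]
    labels = [rng.randrange(2) for _ in range(pixels)]
    return features, labels

# Keep roughly half of the pixels from 5 small fake images.
train_x, train_y = build_training_set(fake_loader, image_count=5, keep_fraction=0.5)
```

Running build_training_set for several values of keep_fraction and recording the classifier's accuracy for each gives the memory-versus-accuracy curve the answer describes.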

How to fit a classifier with high accuracy on the training set with few features?

My inputs are (r,c) in the range (0, 1], the coordinates of a pixel in an image, and the target is its color, which is only 1 or 2.
I have about 6,400 pixels.
My attempt at fitting X=(r,c) and y=color failed: the accuracy won't go above 70%.
Here's the image:
The first is the actual image; the second is the image I train on, which has only 2 colors. The last is the image that the neural network generated, with about 500 weights trained for 50 iterations. The input layer has size 2, there is one hidden layer of size 100, and the output layer has size 2. (For binary classification like this I may need only one output unit, but I am preparing for multi-class classification.)
The classifier failed to fit the training set. Why is that? I tried generating high-order polynomial terms of those 2 features, but it doesn't help. I tried using a Gaussian kernel with 20-100 random landmarks on the picture to add more features and got similar output. I tried logistic regression, which doesn't help either.
Please help me increase the accuracy.
Here's the input: input.txt (you can load it into Octave; the variables are coordinate (the r,c features) and idx (the color)).
You can try plotting it first to make sure that you understand the input, then try training on it and tell me if you get a better result.
Your problem is hard to model. You are trying to fit a function from R^2 to R that has a lot of complexity: lots of "spikes" and lots of disconnected regions (pixels that are completely separated from the rest). This is not an easy problem, and not a useful one. To overfit your network in such a setting you will need plenty of hidden units. So what are the options?
General things that are missing from the question and are important:
Your output variable should be in {0, 1} if you are fitting your network with a cross-entropy cost (log-likelihood), which you should use for classification.
50 iterations (if you mean mini-batch iterations) is orders of magnitude too small, unless you mean 50 epochs (passes over the whole training set).
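The point about the output variable can be made concrete with a small sketch. It assumes the color labels are stored as 1 and 2, as in the question, and uses hypothetical helper names. It is plain Python, not the asker's Octave code.

```python
import math

def encode_labels(colors):
    """Map the color labels {1, 2} to the {0, 1} targets cross-entropy expects."""
    return [c - 1 for c in colors]

def binary_cross_entropy(targets, probabilities, eps=1e-12):
    """Mean negative log-likelihood of {0, 1} targets under predicted P(y = 1)."""
    total = 0.0
    for t, p in zip(targets, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(targets)

idx = [1, 2, 2, 1]             # colors as stored in the input file
targets = encode_labels(idx)   # [0, 1, 1, 0]

# A model that always outputs 0.5 scores log(2); better predictions score lower.
uninformed = binary_cross_entropy(targets, [0.5, 0.5, 0.5, 0.5])
confident = binary_cross_entropy(targets, [0.1, 0.9, 0.9, 0.1])
```

Feeding the raw labels 1 and 2 into this loss would not give a valid log-likelihood, which is why the remapping comes first.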
Actual things that will probably need to be done (at least one of the below):
I assume that you are using ReLU activations (or tanh; it is hard to say from the output). You can instead use RBF activations and increase the number of hidden neurons to ~5000.
If you do not want to go with RBFs, then you will need 1-2 additional hidden layers to fit a function of this complexity. Try an architecture of type 100-100-100 instead.
If the above fails, increase the number of hidden units. That's all you need: enough capacity.
In general, neural networks are not designed for low-dimensional datasets. Learning a pixel-position-to-color mapping is a nice example from the web, but it is completely artificial and seems to actually harm people's intuitions.

Caffe predicts same class regardless of image

I modified the MNIST example and when I train it with my 3 image classes it returns an accuracy of 91%. However, when I modify the C++ example with a deploy prototxt file and labels file, and try to test it on some images it returns a prediction of the second class (1 circle) with a probability of 1.0 no matter what image I give it - even if it's images that were used in the training set. I've tried a dozen images and it consistently just predicts the one class.
To clarify things, in the C++ example I modified I did scale the image to be predicted just like the images were scaled in the training stage:
img.convertTo(img, CV_32FC1);
img = img * 0.00390625;
If that was the right thing to do, then it makes me wonder if I've done something wrong with the output layers that calculate probability in my deploy_arch.prototxt file.
I think you have forgotten to scale the input image during classification time, as can be seen in line 11 of the train_test.prototxt file. You should probably multiply by that factor somewhere in your C++ code, or alternatively use a Caffe layer to scale the input (look into ELTWISE or POWER layers for this).
EDIT:
After a conversation in the comments, it turned out that the image mean was mistakenly being subtracted in the classification.cpp file whereas it was not being subtracted in the original training/testing pipeline.
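The effect of such a preprocessing mismatch can be sketched in a few lines of Python. This is not Caffe code: the pixel values and the mean image are made up, and the order "subtract the mean, then scale" in the buggy path is an assumption.

```python
SCALE = 0.00390625  # 1/256, the factor used in the question

def preprocess_train(pixels):
    """Training-time preprocessing described in the question: scale only."""
    return [p * SCALE for p in pixels]

def preprocess_mismatched(pixels, mean):
    """Inference path with the bug: subtract a mean image, then scale."""
    return [(p - m) * SCALE for p, m in zip(pixels, mean)]

pixels = [0.0, 64.0, 128.0, 255.0]
mean = [100.0, 100.0, 100.0, 100.0]  # made-up mean image for illustration

seen_in_training = preprocess_train(pixels)
seen_with_bug = preprocess_mismatched(pixels, mean)
max_gap = max(abs(a - b) for a, b in zip(seen_in_training, seen_with_bug))
```

With the mean subtracted only at prediction time, the network receives values shifted out of the [0, 1) range it was trained on, including negative inputs. That is consistent with it collapsing to a single class. Sharing one preprocessing function between training and prediction avoids this kind of drift.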
Are your training classes balanced?
If not, the network may get stuck predicting the single majority class.
To find the issue, I suggest printing the predictions made during training and comparing them with the predictions of the forward example on the same training images from a different class.
