So far I have trained my neural network on the MNIST data set (from this tutorial). Now I want to test it by feeding my own images into it.
I've processed the image using OpenCV by resizing it to 28x28 pixels, converting it to grayscale, and applying adaptive thresholding. Where do I proceed from here?
An 'image' is a 28x28 array of values from 0-1... so not really an image. Just greyscaling your original image will not make it fit for input. You have to go through the following steps.
Load your image into your programming language as its 784 RGB pixel values
For each pixel, take the average of r, g and b, then divide this value by 255. You now have the greyscale value of that pixel, a value between 0 and 1.
Replace the rgb values with the greyscale values
You will now have an image which looks like this (see the right array):
So you must do everything through your programming language. If you just greyscale an image with a photo editor, the pixels will still be stored as r, g, b values.
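For instance, a minimal sketch of these steps in Python with numpy (the file name is a placeholder, and PIL is only used to read the raw pixel values):

    import numpy as np
    from PIL import Image

    # Load the 28x28 image and force it to RGB (placeholder file name)
    rgb = np.array(Image.open("my_digit.png").convert("RGB"), dtype=np.float32)  # (28, 28, 3)

    # Average the r, g and b channels and divide by 255 to get values in [0, 1]
    grey = rgb.mean(axis=2) / 255.0  # (28, 28)

    # Flatten to the 784-value vector most MNIST tutorials expect
    x = grey.reshape(1, 784)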
You can use libraries like PIL or skimage to load the data into numpy arrays in Python; they also support many image operations such as grayscaling and scaling.
After you have processed the image and read the data into numpy array you can then feed this to your network.
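For example, a rough sketch with skimage (assuming an RGB input file and a trained model with a predict method that takes 784-value rows; adapt to whatever shape your network expects):

    import numpy as np
    from skimage import io, color, transform

    img = io.imread("my_digit.png")             # placeholder path, assumed RGB
    gray = color.rgb2gray(img)                  # floats in [0, 1]
    small = transform.resize(gray, (28, 28))    # match the MNIST input size

    x = small.reshape(1, 784).astype(np.float32)
    # prediction = model.predict(x)             # 'model' is your trained network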
I'm building a CNN model with TensorFlow Keras, and the dataset available is in black and white.
I'm using the ImageDataGenerator available from the keras.preprocessing.image API to convert images to arrays. By default it converts every image to a 3-channel input. So will my model be able to predict on real-world (colored) images if the images it was trained on are black and white rather than color?
Also, ImageDataGenerator has a parameter named "color_mode" which can take the value "grayscale" and gives us a 2D array to use in the model. If I go with this approach, do I need to convert real-world images to grayscale as well?
The color space of the images you train on should be the same as the color space of the images your application will process.
If luminance is the most important information, e.g. for OCR, then training on grayscale images should produce a more efficient model. But if you need to recognize things that can appear in different colors, it may be worthwhile to use a color input.
If the color is not important and you train on 3-channel images, e.g. RGB, you will have to provide examples in enough colors to avoid overfitting to color. E.g., if you want to distinguish a car from a tree, you may end up with a model that maps any green object to a tree and everything else to cars.
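To illustrate the color_mode option mentioned in the question, here is a rough sketch with Keras' ImageDataGenerator (the directory path, image size and class_mode are placeholders); if you train with color_mode="grayscale" you also have to load real-world images as grayscale at prediction time:

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255)

    # color_mode="grayscale" yields batches of shape (batch, H, W, 1);
    # the default "rgb" yields (batch, H, W, 3) even for black-and-white files.
    train_gen = datagen.flow_from_directory(
        "data/train",                # placeholder directory
        target_size=(128, 128),      # placeholder size
        color_mode="grayscale",
        class_mode="categorical",
        batch_size=32,
    )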
I have an image ~(5000*5000*8). I converted this big image into small images (e.g. 400 images with 256*256*8 dim (x, y, channel)) and imported those into a numpy array. Now I have an array (400, 256, 256, 8) for one of my classes and another array (285, 256, 256, 8) for my second class, and I saved these arrays to npy files. I want to classify these images pixel by pixel; I have a label matrix with 2 classes. Now I want to classify this image with a customised UNET method, and I use the Deep Cognition and Peltarion websites to configure my network and data, so I need a method to help me classify my image pixel-wise. Please help me.
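A hedged sketch of how the two saved npy arrays could be stacked into inputs and pixel-wise labels for a U-Net-style model (all file names and the mask arrays are hypothetical; your actual label matrix layout may differ):

    import numpy as np

    # Placeholder file names for the two patch arrays described above
    class0 = np.load("class0_patches.npy")   # (400, 256, 256, 8)
    class1 = np.load("class1_patches.npy")   # (285, 256, 256, 8)
    X = np.concatenate([class0, class1], axis=0)   # (685, 256, 256, 8)

    # For pixel-wise classification each patch needs a label mask of shape
    # (256, 256, 1); here we assume the label matrix was cut into patches
    # the same way as the image (hypothetical file names).
    y0 = np.load("class0_masks.npy")         # (400, 256, 256, 1), values 0/1
    y1 = np.load("class1_masks.npy")         # (285, 256, 256, 1), values 0/1
    y = np.concatenate([y0, y1], axis=0)

    # X and y can then be fed to a U-Net whose last layer is a 1x1 convolution
    # with a sigmoid activation, trained with binary cross-entropy.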
I am trying to train a CNN model for face gender and age detection. My training set contains facial images, both coloured and grayscale. How do I normalize this dataset? Or how do I handle a dataset with a mixture of grayscale and coloured images?
Keep in mind the network will just attempt to learn the relationship between your labels (gender/age) and your training data, in the way they are presented to the network.
The optimal choice depends on whether you expect the model to work on gray-scale or colored images in the future.
If you want to predict on gray-scale images only
You should train on grayscale images only!
You can use many approaches to convert the colored images to black and white:
simple average of the 3 RGB channels
more sophisticated transforms using cylindrical color spaces such as HSV or HSL, where you can use one of the channels as your gray value. Normally, the V channel corresponds better to human perception than the average of RGB
https://en.wikipedia.org/wiki/HSL_and_HSV
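Both approaches might look like this with OpenCV and numpy (a sketch; the file path is a placeholder and OpenCV loads images as BGR):

    import cv2
    import numpy as np

    bgr = cv2.imread("face.jpg")                      # placeholder path

    # 1) Simple average of the three channels
    gray_avg = bgr.mean(axis=2).astype(np.uint8)

    # 2) V channel of the HSV representation
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    gray_v = hsv[:, :, 2]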
If you need to predict on colored images
Obviously, there is no easy way to reconstruct the colors from a grayscale image. Then you must also use color images during training.
If your model accepts an MxNx3 image as input, then it will also accept the grayscale ones, given that you replicate the grayscale information across the 3 RGB channels (a sketch of this replication appears below).
You should carefully evaluate the number of examples you have, and compare it to the usual training set sizes required by the model you want to use.
If you have enough color images, just do not use the grayscale cases at all.
If you don't have enough examples, make sure you have balanced training and test sets for the gray/colored cases, otherwise your net will learn to classify gray-scale vs. colored images separately.
Alternatively, you could consider using masking, and replace the missing color channels with a masking value.
A further alternative you could consider:
- use a pre-trained CNN for feature extraction, e.g. the VGG variants widely available online, and then fine-tune the last layers
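The channel replication mentioned above might look like this (a sketch assuming grayscale images stored as 2D numpy arrays):

    import numpy as np

    def to_three_channels(gray):
        # Replicate a (H, W) grayscale image into a (H, W, 3) pseudo-RGB image
        return np.repeat(gray[:, :, np.newaxis], 3, axis=2)

    # e.g. gray.shape == (224, 224)  ->  to_three_channels(gray).shape == (224, 224, 3)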
To me it feels that age and gender estimation would not be affected much by the presence/absence of color, and reducing the problem to gray-scale images only might help convergence, since there will be far fewer parameters to estimate.
You should probably rather consider normalizing your images in terms of pose, orientation, ...
To train a network you have to ensure the same shape (including the number of channels) for all training images, so convert them all to grayscale. To normalize, you can subtract the mean of the training set from each image. Do the same with the validation and testing images.
For a detailed procedure, go through the article below:
https://becominghuman.ai/image-data-pre-processing-for-neural-networks-498289068258
I have a question regarding the preprocessing step "Image mean subtraction".
I use the UCSD Dataset for my training.
One popular preprocessing step is the mean subtraction. Now I wonder if I am doing it right.
What I am doing is the following:
I have 200 gray-scale training images
I put all images in a list and compute the mean with numpy:
np.mean(ImageList, axis=0)
This returns me a mean image
Now I subtract the mean image from all Train images
When I now visualize my preprocessed training images, they are mostly black and also contain negative values.
Is this correct? Or is my understanding of subtracting the mean image incorrect?
Here is one of my training images:
And this is the "mean image":
It seems like you are doing it right.
As for the negative values: they are to be expected. Your original images had intensity values in the range [0..1]; once you subtract the mean (which should be around ~0.5), you will have values roughly in the range [-0.5..0.5].
Please note that you should save the "mean image" for test time as well: when you predict using the trained net, you need to subtract the same mean image from the test image.
Update:
In your case (static camera) the subtracted mean removes the "common" background. This setting seems to be in your favor, as it focuses the net on the temporal changes in the frame. This method will work well for you as long as you test on the same kind of images (i.e., frames from the same static camera).
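A hedged sketch of the whole procedure, including saving the mean image for test time (file names are placeholders, images assumed to be in [0, 1]):

    import numpy as np

    train_images = np.load("train_images.npy")    # placeholder; shape (200, H, W)

    # Compute the mean image from the training set only, and save it
    mean_image = np.mean(train_images, axis=0)
    np.save("mean_image.npy", mean_image)

    # Subtract it from the training images (values are now roughly in [-0.5, 0.5])
    train_centered = train_images - mean_image

    # At test time, load the SAME mean image and subtract it from each test frame
    # test_centered = test_image - np.load("mean_image.npy")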
I am looking for parabolas in some radar data. I am using the OpenCV Haar cascade classifier. My positive images are 20x20 PNGs where all of the pixels are black, except for those that trace a parabolic shape, with one parabola per positive image.
My question is this: will these positives train a classifier to look for black boxes with parabolas in them, or will they train a classifier to look for parabolic shapes?
Should I add a layer of medium value noise to my positive images, or should they be unrealistically crisp and high contrast?
Here is an example of the original data.
Here is an example of my data after I have performed simple edge detection using GIMP. The parabolic shapes are highlighted in the white boxes.
Here is one of my positive images.
I figured out a way to detect parabolas, initially using the MatchTemplate method from OpenCV. At first I was using the Python cv, and later cv2, libraries, but I had to make sure that my input images were 8-bit unsigned integer arrays. I eventually obtained a similar effect with less fuss using scipy.signal.correlate2d(image, template, mode='same'). The mode='same' argument makes the output the same size as image. When I was done, I performed thresholding using the numpy.where() function, followed by opening and closing with the scipy.ndimage module to eliminate salt-and-pepper noise.
Here's the output, before thresholding.
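For reference, a rough sketch of the pipeline described above (the file names and the threshold are assumptions; correlate2d can be slow on large images):

    import numpy as np
    from scipy.signal import correlate2d
    from scipy import ndimage

    # 'image' is the edge-detected radar frame, 'template' a small parabola patch;
    # both are assumed here to be 2D float arrays loaded from placeholder files.
    image = np.load("radar_edges.npy").astype(np.float64)
    template = np.load("parabola_template.npy").astype(np.float64)

    # Cross-correlate; mode='same' keeps the output the same size as 'image'
    response = correlate2d(image, template, mode="same")

    # Threshold the response (the cutoff is an assumption; tune it for your data)
    mask = np.where(response > 0.8 * response.max(), 1, 0)

    # Morphological opening then closing to remove salt-and-pepper noise
    mask = ndimage.binary_opening(mask)
    mask = ndimage.binary_closing(mask)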