I'm building a classifier for the MIT Indoor Scenes dataset (https://web.mit.edu/torralba/www/indoor.html). Downloading the data gives me an 'Images' folder with 67 subfolders (one per class) and a different number of images within each subfolder. So far I have:
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchvision.datasets import ImageFolder
from torch.utils.data import Dataset, DataLoader
alb_transform = A.Compose([
    A.Resize(256, 256),
    A.RandomCrop(width=224, height=224),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
    ToTensorV2()
])
dataset = ImageFolder('Images', transform=alb_transform)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=0)
However, when I do
images, labels = next(iter(dataloader))
I get the error
KeyError: 'You have to pass data to augmentations as named arguments, for example: aug(image=image)'
The solutions I have looked at either used PyTorch transforms or wrote their own Dataset class. Is there a way to keep using ImageFolder together with Albumentations? Why am I getting that error?
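For context, the error arises because ImageFolder calls its transform positionally, transform(img), with a PIL image, whereas an Albumentations Compose must be called with the named argument image= on a numpy array. A minimal sketch of a wrapper bridging the two (the name alb_wrapper is illustrative):
import numpy as np

def alb_wrapper(img):
    # ImageFolder hands the transform a PIL image positionally;
    # Albumentations wants a numpy array via the `image` keyword
    return alb_transform(image=np.array(img))['image']

dataset = ImageFolder('Images', transform=alb_wrapper)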
Pytorch's torchvision package provides pre-trained neural networks for image classification. I've been using the following code to classify an image using Alexnet (note: some of this code is from this webpage):
from PIL import Image
import torch
from torchvision import transforms
from torchvision import models
# function to transform image
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225])])
# image
img = Image.open('/path/to/image.jpg')
img = transform(img)
img = torch.unsqueeze(img, 0)
# alexnet
alexnet = models.alexnet(pretrained=True)
alexnet.eval()
out = alexnet(img)
percents = torch.nn.functional.softmax(out, dim=1)[0] * 100
top5_vals, top5_inds = percents.topk(5)
There are 1,000 total classes, and the top5_inds variable gives me the indices of the top 5 classes. But how do I get the associated labels (e.g. snail, basketball, banana)? I can't seem to find any sort of list as part of Pytorch's documentation or the alexnet variable.
Torchvision models are pretrained on the ImageNet dataset. Due to its comprehensiveness and size, ImageNet is the most commonly used dataset for pretraining & transfer learning. As you noted, it has 1000 classes. The complete class list can be searched, or you can refer to this listing on GitHub: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a
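As a sketch of wiring that list up (the URL below is the imagenet_classes.txt file used in the PyTorch tutorials; any mirrored listing works the same way):
import urllib.request

# fetch the 1,000 ImageNet class names, one per line,
# index-aligned with the model's output logits
url = 'https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt'
classes = urllib.request.urlopen(url).read().decode().splitlines()

# top5_vals / top5_inds come from the question's code above
for val, idx in zip(top5_vals, top5_inds):
    print(f'{classes[idx]}: {val.item():.2f}%')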
Let's say I have this Python code:
from imblearn.pipeline import Pipeline
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import RandomOverSampler
from sklearn.decomposition import PCA
from sklearn import neighbors
selector = VarianceThreshold()
scaler = StandardScaler()
ros = RandomOverSampler()
pca = PCA()
clf = neighbors.KNeighborsClassifier(n_jobs=-1)
pipe = Pipeline(steps=[('selector', selector), ('scaler', scaler), ('sampler', ros), ('pca', pca), ('kNN', clf)])
pipe.fit(X_train, y_train)
preds = pipe.predict(X_test)
What this does is import four transformers and an estimator from scikit-learn. Then it fits these on the data, and lastly it predicts. If I understand correctly, the fit method applies the four transformers to the data and the predict method makes the final estimation (in our case using kNN). My question is this: for the scaler as well as PCA, the alterations that are done to the training data must also be applied to the test data. But in fit's parameters we don't pass the test set, and as a result the test data won't be altered. How does that make sense? Is there something I am missing?
The transformers only learn their parameters (scaling statistics, PCA components, and so on) from the training data during fit. When you later call pipe.predict(X_test), the pipeline does pass the test data through every transformer, but it calls transform rather than fit_transform, so the test set is altered using the parameters learned on the training set. (The RandomOverSampler step is a sampler, not a transformer, and is applied only during fit.) You cannot have test data that is completely different from the training data and expect good predictions, hence the same fitted scaler and PCA models are also used on the test dataset. If you instead fit a scaler on a smaller test dataset, the results might come out completely different from what the model was originally trained on.
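A rough sketch of what the pipeline does under the hood (the X_res/y_res/X_t names are illustrative):
# fit: each step learns its parameters on the training data
X_res = selector.fit_transform(X_train)
X_res = scaler.fit_transform(X_res)
X_res, y_res = ros.fit_resample(X_res, y_train)  # sampler runs at fit time only
X_res = pca.fit_transform(X_res)
clf.fit(X_res, y_res)

# predict: transform() only, reusing the fitted parameters;
# the sampler step is skipped entirely
X_t = pca.transform(scaler.transform(selector.transform(X_test)))
preds = clf.predict(X_t)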
I'm trying to do some experiments on the Omniglot dataset, and I saw that Pytorch implemented it. I've run the command
from torchvision.datasets import Omniglot
but I have no idea how to actually load the dataset. Is there a way to open it equivalent to how we open MNIST? Something like the following:
train_dataset = dsets.MNIST(root='./data',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)
The final goal is to be able to open training and test set separately and run experiments on it.
You can apply the exact same transformations, as Omniglot contains images and labels just like MNIST. For example:
import torchvision
dataset = torchvision.datasets.Omniglot(
    root="./data", download=True, transform=torchvision.transforms.ToTensor()
)
image, label = dataset[0]
print(type(image)) # torch.Tensor
print(type(label)) # int
Instead of train and test, the Omniglot dataset uses background and evaluation terminology:
background_set = torchvision.datasets.Omniglot(root='./data', background=True, download=True,
                                               transform=torchvision.transforms.ToTensor())
# background=False gives the evaluation split
evaluation_set = torchvision.datasets.Omniglot(root='./data', background=False, download=True,
                                               transform=torchvision.transforms.ToTensor())
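And since the goal is to run experiments on both splits, a short usage sketch wrapping them in DataLoaders (the batch size is illustrative):
from torch.utils.data import DataLoader

background_loader = DataLoader(background_set, batch_size=64, shuffle=True)
evaluation_loader = DataLoader(evaluation_set, batch_size=64, shuffle=False)
images, labels = next(iter(background_loader))
print(images.shape)  # torch.Size([64, 1, 105, 105]); Omniglot images are 105x105 grayscale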
Currently I am working on the Flower Color Images dataset from Kaggle, which has only 210 images. With this set of images I am getting an accuracy of only 11% on the validation set.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2
#from tqdm import tqdm
import os
import warnings
warnings.filterwarnings('ignore')
flower_img = r'C:\Users\asus\Downloads\flower_images\flower_images'
data = pd.read_csv(r'C:\Users\asus\Downloads\flower_images\flower_labels.csv')
image_name = [img.split('.')[-2] for img in os.listdir(flower_img)]  # file names without extensions
label_array = np.array(data['label'])
label_unique = np.unique(label_array)
names = ['phlox', 'rose', 'calendula', 'iris', 'leucanthemum maximum',
         'bellflower', 'viola', 'rudbeckia laciniata', 'peony', 'aquilegia']
Flower_names = {}
for i in range(10):
    Flower_names[i] = names[i]
print(Flower_names)
Flower_names.get(8)
# map a row's numeric label to its flower name
x = data['label'][2]
Flower_names.get(x)
# read every image, resize it, and collect the arrays into a new column;
# building the column in one assignment avoids writing into a column
# that does not exist yet
images = []
for img in os.listdir(flower_img):
    path = os.path.join(flower_img, img)
    img = cv2.imread(path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV reads BGR; convert so plt.imshow shows true colors
    img = cv2.resize(img, (128, 128))
    images.append(img)
data['file'] = images
data['file'][0].shape
plt.imshow(data['file'][0])
plt.show()
import keras
from keras.models import Sequential
from keras.layers import Dense,Conv2D,Activation,MaxPool2D,Dropout,Flatten
model = Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(128, 128, 3)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=3, activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.25))  # dropout must come before the output layer; after the final softmax it only corrupts the predictions
model.add(Dense(10, activation='softmax'))
from keras.optimizers import Adam
model.compile(loss='categorical_crossentropy',optimizer=Adam(lr=0.002),metrics=['accuracy'])
model.summary()
x = np.array([i for i in data['file']]).reshape(-1,128,128,3)
y = np.array([i for i in data['label']])
from keras.utils import to_categorical
y = to_categorical(y)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y)
model.fit(x_train,y_train,validation_data=(x_test,y_test),epochs=10)
model.evaluate(x_test,y_test)
model.evaluate(x_train,y_train)
How can I increase accuracy using only this dataset? Also, how can I predict classes for any input image?
Link to the Flower Color Images dataset: https://www.kaggle.com/olgabelitskaya/flower-color-images
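For reference, a minimal sketch of single-image prediction with a model like the one above ('some_flower.jpg' is a placeholder path; the same 128x128 preprocessing is assumed):
# classify one new image with the trained model
img = cv2.imread('some_flower.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (128, 128))
probs = model.predict(img.reshape(1, 128, 128, 3))
print(Flower_names.get(int(np.argmax(probs))))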
Your dataset size is very small. Convolutional neural networks are optimal when trained using very large data sets. You really want to have thousands of images (or more!) in your data set.
You can try to enhance your current data set by using various image processing techniques to increase its size. These techniques take the original images and skew them, rotate them, and make other modifications to bolster the size of the training data. Such augmentation can be helpful, but increasing the natural size of the data set is preferred.
If you cannot increase the size of the dataset, you should examine why you need to use a CNN. There are other algorithms that may give better results when trained with a smaller data set. Take a look at Support Vector Machines or k-NN.
If you must use a CNN, Transfer Learning is a good solution. You can use the features from a trained model and apply them to your problem. I have had great success with this approach.
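A minimal transfer-learning sketch along those lines, assuming the 128x128x3 inputs and 10 classes from the question (the choice of VGG16 and the head sizes are illustrative):
from keras.applications import VGG16
from keras.models import Model
from keras.layers import Dense, Flatten

# pretrained convolutional base; freeze it and train only a small new head
base = VGG16(include_top=False, weights='imagenet', input_shape=(128, 128, 3))
for layer in base.layers:
    layer.trainable = False

x = Flatten()(base.output)
x = Dense(128, activation='relu')(x)
out = Dense(10, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=10)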
The things you can do:
Progressive resizing
Image augmentation
Transfer learning
To be honest, there are many more techniques that could be used to get more out of the data; the ones above are just the major examples that come to mind. You can dig deeper with a dedicated search on the topic.
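For image augmentation specifically, a hedged sketch using Keras's ImageDataGenerator on the x_train/y_train arrays from the question (the augmentation ranges are illustrative):
from keras.preprocessing.image import ImageDataGenerator

# perturb the training images on the fly to stretch 210 images further
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)
model.fit_generator(datagen.flow(x_train, y_train, batch_size=32),
                    steps_per_epoch=len(x_train) // 32,
                    epochs=30,
                    validation_data=(x_test, y_test))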
I'm attempting to train a 3-layer dense neural network using Keras with a GPU-enabled TensorFlow backend.
The dataset I have is 4 million 20x40px images that I placed in directories with the name of the category they belong to.
Because of the large amount of data I can't just load it all into RAM and feed it to my model, so I thought using Keras's ImageDataGenerator, specifically the method flow_from_directory(), would do the trick. This yields tuples of (x, y), where x is the numpy array of a batch of images and y are the corresponding labels.
I expected the model to know to use the numpy array as its input, so I set up my input shape as (None, 20, 40, 3), where None is the batch size, 20 and 40 are the size of the image, and 3 is the number of channels. This does not work, however: when I try to train my model I keep getting the error:
ValueError: Error when checking target: expected dense_3 to have 4 dimensions, but got array with shape (1024, 2)
I know the cause is that it is getting the tuple from flow_from_directory, and I guess I could change the input shape to match. However, I fear that this would render my model useless, as I will be using images to make predictions, not a pre-categorized tuple. So my question is: how can I get flow_from_directory to feed the image to my model and only use the tuple to validate its training? Am I misunderstanding something here?
For reference, here is my code:
from keras.models import Model
from keras.layers import *
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard
# Prepare the Image Data Generator.
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    '/path/to/train_data/',
    target_size=(20, 40),
    batch_size=1024,
)
test_generator = test_datagen.flow_from_directory(
    '/path/to/test_data/',
    target_size=(20, 40),
    batch_size=1024,
)
# Define input tensor.
input_t = Input(shape=(20,40,3))
# Now create the layers and pass the input tensor to it.
hidden_1 = Dense(units=32, activation='relu')(input_t)
hidden_2 = Dense(units=16)(hidden_1)
prediction = Dense(units=1)(hidden_2)
# Now put it all together and create the model.
model = Model(inputs=input_t, outputs=prediction)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Prepare Tensorboard callback and start training.
tensorboard = TensorBoard(log_dir='./graph', histogram_freq=0, write_graph=True, write_images=True)
print(test_generator)
model.fit_generator(
    train_generator,
    steps_per_epoch=2000,
    epochs=100,
    validation_data=test_generator,
    validation_steps=800,
    callbacks=[tensorboard]
)
# Save trained model.
model.save('trained_model.h5')
Your input shape is wrong for Dense layers.
Dense layers expect inputs of shape (None, length).
You'll either need to reshape your inputs so that each image becomes a vector:
imageBatch = imageBatch.reshape((imageBatch.shape[0], 20*40*3))
Or use convolutional layers, which expect exactly that type of input shape, (None, nRows, nCols, nChannels).
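A third option, since the data comes from flow_from_directory, is to keep the 4-D image batches and flatten them inside the model. A minimal sketch, assuming the 20x40x3 images and the 2 classes implied by the (1024, 2) target shape:
from keras.models import Model
from keras.layers import Input, Flatten, Dense

input_t = Input(shape=(20, 40, 3))
x = Flatten()(input_t)                                        # (None, 20*40*3) vectors
hidden_1 = Dense(units=32, activation='relu')(x)
hidden_2 = Dense(units=16, activation='relu')(hidden_1)
prediction = Dense(units=2, activation='softmax')(hidden_2)   # one unit per class

model = Model(inputs=input_t, outputs=prediction)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])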