How to load Omniglot on Pytorch - machine-learning

I'm trying to do some experiments on the Omniglot dataset, and I saw that Pytorch implemented it. I've run the command
from torchvision.datasets import Omniglot
but I have no idea how to actually load the dataset. Is there a way to open it the same way we open MNIST? Something like the following:
train_dataset = dsets.MNIST(root='./data',
The final goal is to be able to open training and test set separately and run experiments on it.

You can apply exactly the same transformations, since Omniglot contains images and labels just like MNIST. For example:
import torchvision
dataset = torchvision.datasets.Omniglot(
    root="./data", download=True, transform=torchvision.transforms.ToTensor()
)
image, label = dataset[0]
print(type(image)) # torch.Tensor
print(type(label)) # int

Instead of train and test, the Omniglot dataset uses the terms background and evaluation.
background_set = datasets.Omniglot(root='./data', background=True, download=True,
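A minimal sketch of loading both splits, assuming torchvision is installed and that you treat the background set like a training set and the evaluation set like a test set. The helper name load_omniglot_splits is made up for illustration; root, background, download and transform are the arguments torchvision documents for Omniglot.

```python
def load_omniglot_splits(root="./data", download=True):
    """Return (background_set, evaluation_set) for Omniglot."""
    # Imported lazily so the helper can be defined without torchvision present.
    from torchvision import datasets, transforms

    to_tensor = transforms.ToTensor()
    # background=True selects the "background" split (used like a training set).
    background_set = datasets.Omniglot(
        root=root, background=True, download=download, transform=to_tensor
    )
    # background=False selects the "evaluation" split (used like a test set).
    evaluation_set = datasets.Omniglot(
        root=root, background=False, download=download, transform=to_tensor
    )
    return background_set, evaluation_set
```

Either returned dataset can then be wrapped in a torch.utils.data.DataLoader, the same way you would for MNIST.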


Explanation Needed for Autokeras's AutoModel and GraphAutoModel

I understand what AutoKeras ImageClassifier does:
clf = ImageClassifier(verbose=True, augment=False)
clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
y = clf.evaluate(x_test, y_test)
But I am unable to understand what the AutoModel class does, or how it differs from ImageClassifier.
The documentation for the inputs and outputs arguments says:
inputs: A list of or a HyperNode instance. The input node(s) of the AutoModel.
outputs: A list of or a HyperHead instance. The output head(s) of the AutoModel.
What is a HyperNode instance?
Similarly, what is the GraphAutoModel class?
The documentation reads:
A HyperModel defined by a graph of HyperBlocks. GraphAutoModel is a subclass of HyperModel. Besides the HyperModel properties, it also has a tuner to tune the HyperModel. The user can use it in a similar way to a Keras model since it also has fit() and predict() methods.
What are HyperBlocks?
If ImageClassifier automatically does hyperparameter tuning, what is the use of GraphAutoModel?
Links to any documents or resources for a better understanding of AutoModel and GraphAutoModel would be appreciated.
Having worked with autokeras recently, I can share my little knowledge.
Task API
When doing a classical task such as image classification/regression, text classification/regression, ..., you can use the simplest APIs provided by autokeras called Task API: ImageClassifier, ImageRegressor, TextClassifier, TextRegressor, ... In this case you have one input (image or text or tabular data, ...) and one output (classification, regression).
However, when you have a task that requires, for example, a multi-input/multi-output architecture, you cannot use the Task API directly, and this is where AutoModel comes into play with the I/O API. You can check the example provided in the documentation, which has two inputs (image and structured data) and two outputs (classification and regression).
GraphAutoModel works like the Keras functional API. It assembles different blocks (convolutions, LSTM, GRU, ...) into a model, then searches for the best hyperparameters for the architecture you provided. Suppose, for instance, that I want to do a binary classification task using time series as input data.
First, let's generate a toy dataset:
import numpy as np
import autokeras as ak
x = np.random.randn(100, 7, 3)
y = np.random.choice([0, 1], size=100, p=[0.5, 0.5])
Here x holds 100 time-series samples; each sample is a sequence of length 7 with 3 features per step. The corresponding target variable y is binary (0, 1).
Using GraphAutomodel, I can specify the architecture I want, using what is called HyperBlocks. There are many blocks: Conv, RNN, Dense, ... check the full list here.
In my case I want to use RNN blocks to create a model because I have time series data :
input_layer = ak.Input()
rnn_layer = ak.RNNBlock(layer_type="lstm")(input_layer)
dense_layer = ak.DenseBlock()(rnn_layer)
output_layer = ak.ClassificationHead(num_classes=2)(dense_layer)
automodel = ak.GraphAutoModel(input_layer, output_layer, max_trials=2, seed=123)
automodel.fit(x, y, validation_split=0.2, epochs=2, batch_size=32)
(If you are not familiar with this style of defining a model, you should check the Keras functional API documentation.)
So in this example I have more flexibility in creating the skeleton of the architecture I would like to use: an LSTM block followed by a Dense block, followed by a classification head. However, I didn't specify any hyperparameters (number of LSTM layers, number of dense layers, size of the LSTM layers, size of the dense layers, activation functions, dropout, batchnorm, ...). AutoKeras will do the hyperparameter tuning automatically, based on the architecture (skeleton) I provided.

how to get more accuracy on CNN with less number of images

Currently I am working on the flower classification dataset from Kaggle, which has only 210 images. With this set of images I am getting an accuracy of only 11% on the validation set.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2
#from tqdm import tqdm
import os
import warnings
flower_img = r'C:\Users\asus\Downloads\flower_images\flower_images'
data = pd.read_csv(r'C:\Users\asus\Downloads\flower_images\flower_labels.csv')
img = os.listdir(flower_img)[1]
image_name = [img.split('.')[-2] for img in os.listdir(flower_img)]
label_array = np.array(data['label'])
label_unique = np.unique(label_array)
names = [' phlox','rose','calendula','iris','leucanthemum maximum','bellflower','viola','rudbeckia laciniata','peony','aquilegia']
Flower_names = {}
for i in range(10):
    Flower_names[i] = names[i]
x = data['label'][2]
for img in os.listdir(flower_img):
    path = os.path.join(flower_img,img)
    #img = cv2.imread(path,cv2.IMREAD_GRAYSCALE)
    img = cv2.imread(path)
    img = cv2.resize(img,(128,128))
    data['file'][i] = np.array(img)
import keras
from keras.models import Sequential
from keras.layers import Dense,Conv2D,Activation,MaxPool2D,Dropout,Flatten
model = Sequential()
from keras.optimizers import Adam
x = np.array([i for i in data['file']]).reshape(-1,128,128,3)
y = np.array([i for i in data['label']])
from keras.utils import to_categorical
y = to_categorical(y)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y)
model.fit(x_train,y_train,validation_data=(x_test,y_test),epochs=10)
How can I increase the accuracy using only this dataset? Also, how can I predict the class for any input image?
Link of Flower color images dataset :
Your dataset is very small. Convolutional neural networks work best when trained on very large data sets. You really want to have thousands of images (or more!) in your data set.
You can try to enlarge your current data set by using various image processing techniques. These techniques take the original images and skew them, rotate them, and apply other modifications to bolster the size of the training data. They can be helpful, but increasing the natural size of the data set is preferred.
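As a rough illustration of that kind of augmentation, here is a minimal NumPy sketch that produces flipped and rotated copies of one image. The helper name augment_image and the random stand-in image are made up for the example; every variant keeps the label of the original image. Rotating by multiples of 90 degrees only preserves the shape here because the image is square.

```python
import numpy as np


def augment_image(image):
    """Return simple flipped and rotated variants of one H x W x C image."""
    variants = [image]
    variants.append(np.fliplr(image))  # mirror left-right
    variants.append(np.flipud(image))  # mirror top-bottom
    for k in (1, 2, 3):
        # rotate by k * 90 degrees in the height/width plane
        variants.append(np.rot90(image, k=k, axes=(0, 1)))
    return variants


# Random stand-in for one 128x128 RGB flower image (not real data).
image = np.random.randint(0, 256, size=(128, 128, 3), dtype=np.uint8)
augmented = augment_image(image)
print(len(augmented), augmented[1].shape)
```

In practice a library generator that applies such transformations batch by batch during training is usually more convenient than materializing every copy up front.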
If you cannot increase the size of the dataset, you should examine why you need to use a CNN. There are other algorithms that may give better results when trained with a smaller data set. Take a look at Support Vector Machines or k-NN.
If you must use a CNN, Transfer Learning is a good solution. You can use the features from a trained model and apply them to your problem. I have had great success with this approach.
The things you can do:
Progressive resizing
Image augmentation
Transfer learning
To be honest, there are many more techniques that can be used to get more out of the data you have. These are just the major ones that came to mind right away; some dedicated research on the topic will turn up more.

TensorFlow 1.2.1 and InceptionV3 to classify an image

I'm trying to create an example using the Keras built into the latest version of TensorFlow from Google. This example should be able to classify a classic image of an elephant. The code looks like this:
# Import a few libraries for use later
from PIL import Image as IMG
from tensorflow.contrib.keras.python.keras.preprocessing import image
from tensorflow.contrib.keras.python.keras.applications.inception_v3 import InceptionV3
from tensorflow.contrib.keras.python.keras.applications.inception_v3 import preprocess_input, decode_predictions
# Get a copy of the Inception model
print('Loading Inception V3...\n')
model = InceptionV3(weights='imagenet', include_top=True)
print ('Inception V3 loaded\n')
# Read the elephant JPG
elephant_img = IMG.open('elephant.jpg')
# Convert the elephant to an array
elephant = image.img_to_array(elephant_img)
elephant = preprocess_input(elephant)
elephant_preds = model.predict(elephant)
print ('Predictions: ', decode_predictions(elephant_preds))
Unfortunately I'm getting an error when trying to evaluate the model with model.predict:
ValueError: Error when checking : expected input_1 to have 4 dimensions, but got array with shape (299, 299, 3)
This code is taken from and based on the excellent example coremltools-keras-inception and will be expanded more when it is figured out.
The reason this error occurred is that the model always expects a batch of examples, not a single example. This diverges from the common understanding of models as mathematical functions of their inputs. The reasons why the model expects batches are:
Models are computationally designed to work faster on batches in order to speed up training.
There are algorithms that take into account the batch nature of the input (e.g. Batch Normalization or GAN training tricks).
So the four dimensions come from a first dimension, which is the sample / batch dimension, followed by the three image dimensions.
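To make the shapes concrete, here is a small NumPy sketch. The zero-filled array is only a stand-in for one preprocessed 299x299 RGB image; it shows two equivalent ways of adding the leading batch axis.

```python
import numpy as np

# Stand-in for one preprocessed 299x299 RGB image (not a real photo).
single_image = np.zeros((299, 299, 3), dtype=np.float32)

# Add a leading batch axis so the array looks like a batch of one image.
batch = np.expand_dims(single_image, axis=0)

# Indexing with np.newaxis does the same thing.
batch_alt = single_image[np.newaxis, ...]

print(single_image.shape)  # (299, 299, 3)
print(batch.shape)         # (1, 299, 299, 3)
```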
Actually, I found the answer. Although the documentation gives the input shape for the case where the top layer is included, the model is still set up to take a batch of images. Thus we need to add this before the line that runs the prediction:
elephant = numpy.expand_dims(elephant, axis=0)
Then the tensor is in the right shape and everything works correctly. I am still uncertain why the documentation states that the input vector should be (3x299x299) or (299x299x3) when it clearly wants 4 dimensions.
Be careful!

Resizing images in Keras ImageDataGenerator flow methods

The Keras ImageDataGenerator class provides the two flow methods flow(X, y) and flow_from_directory(directory).
Why is the parameter
target_size: tuple of integers, default: (256, 256). The dimensions to which all images found will be resized
only provided by flow_from_directory(directory)? And what is the most concise way to add resizing of images to the preprocessing pipeline when using flow(X, y)?
flow_from_directory(directory) generates augmented images from a directory containing an arbitrary collection of images, so it needs the target_size parameter to give all images the same shape.
flow(X, y), on the other hand, augments images that are already stored in X, which is just a NumPy array and can easily be preprocessed/resized before being passed to flow, so there is no need for a target_size parameter. As for resizing, I prefer scipy.misc.imresize over PIL.Image resize or cv2.resize, as it can operate on numpy image data.
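import numpy as np
import scipy.misc
# note: scipy.misc.imresize was deprecated in SciPy 1.0 and removed in SciPy 1.3
new_shape = (28, 28, 3)
X_train_new = np.empty(shape=(X_train.shape[0],) + new_shape)
for idx in range(X_train.shape[0]):
    X_train_new[idx] = scipy.misc.imresize(X_train[idx], new_shape)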
For a large training dataset, performing transformations such as resizing on the entire training data is very memory consuming. As Keras does in ImageDataGenerator, it's better to do it batch by batch. As far as I know, there are 2 ways to achieve this other than operating on the whole dataset:
You can use a Lambda layer to create a layer and then feed the original training data to it. The output is the resized data you need.
Here is the sample code if you use TensorFlow as the backend of Keras:
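import tensorflow as tf
original_dim = (32, 32, 3)
target_size = (64, 64)
inputs = tf.keras.layers.Input(shape=original_dim)
# resize each batch inside the model, so the full dataset is never resized up front
x = tf.keras.layers.Lambda(lambda image: tf.image.resize(image, target_size))(inputs)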
As @Retardust mentioned, maybe you can customize your own ImageDataGenerator as well as the preprocessing_function.
For anyone else who wants to do this: the .flow method of ImageDataGenerator does not have a target_size parameter, and we cannot resize an image using the preprocessing_function parameter because, as the documentation states, "The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape."
So in order to use .flow, you have to pass it images that are already resized; otherwise, use a custom generator that resizes them on the fly.
Here's a sample of custom generator in keras (can also be made using python generator or any other method)
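# assumes `import keras` and `import numpy as np`; add whatever other arguments you need
class Custom_Generator(keras.utils.Sequence):
    def __init__(self, datapath, train_labels, batch_size):
        self.datapath = datapath
        self.train_labels = train_labels
        self.batch_size = batch_size
    def __len__(self):
        # number of batches per epoch
        return int(np.ceil(len(self.train_labels) / self.batch_size))
    def load_and_preprocess_function(self, label_names):
        # load the data for the batch using the label names with whatever library,
        # resize it, and return it as one array
        raise NotImplementedError
    def __getitem__(self, idx):
        start = idx * self.batch_size
        batch_y = self.train_labels[start:start + self.batch_size]
        batch_x = self.load_and_preprocess_function(batch_y)
        return batch_x, batch_y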
Since scipy.misc.imresize, used in the code above, is now deprecated, you can resize with skimage.transform.resize instead:
X_data_resized = numpy.asarray([skimage.transform.resize(image, new_shape) for image in X_data])
There is also a newer method, flow_from_dataframe(), which accepts a Pandas dataframe with file paths and y data as columns, and it also lets you specify the target size. This is useful when your image data is not organized directory-wise!

Scikit learn - fit_transform on the test set

I am struggling to use Random Forest in Python with Scikit learn. My problem is that I use it for text classification (in 3 classes - positive/negative/neutral) and the features that I extract are mainly words/unigrams, so I need to convert these to numerical features. I found a way to do it with DictVectorizer's fit_transform:
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report
from sklearn.feature_extraction import DictVectorizer
from sklearn.ensemble import RandomForestClassifier
vec = DictVectorizer(sparse=False)
rf = RandomForestClassifier(n_estimators = 100)
trainFeatures1 = vec.fit_transform(trainFeatures)
# Fit the training data to the training output and create the decision trees
rf = rf.fit(trainFeatures1, LabelEncoder().fit_transform(trainLabels))
testFeatures1 = vec.fit_transform(testFeatures)
# Take the same decision trees and run on the test data
Output = rf.score(testFeatures1.toarray(), LabelEncoder().fit_transform(testLabels))
print "accuracy: " + str(Output)
My problem is that the fit_transform method is working on the train dataset, which contains around 8000 instances, but when I try to convert my test set to numerical features too, which is around 80000 instances, I get a memory error saying that:
testFeatures1 = vec.fit_transform(testFeatures)
File "C:\Python27\lib\site-packages\sklearn\feature_extraction\", line 143, in fit_transform
return self.transform(X)
File "C:\Python27\lib\site-packages\sklearn\feature_extraction\", line 251, in transform
Xa = np.zeros((len(X), len(vocab)), dtype=dtype)
What could possibly cause this and is there any workaround? Many thanks!
You are not supposed to do fit_transform on your test data, but only transform. Otherwise, you will get different vectorization than the one used during training.
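To see why refitting on the test set is a problem, here is a toy sketch in plain Python. ToyDictVectorizer is a made-up stand-in that only mimics the fit/transform idea; it is not scikit-learn's DictVectorizer. Refitting on the test data produces a different set of columns from the one the model was trained on.

```python
class ToyDictVectorizer:
    """Tiny stand-in for a dict vectorizer; not scikit-learn's implementation."""

    def fit(self, rows):
        # Learn one column per feature name seen in the fitted data.
        self.feature_names_ = sorted({name for row in rows for name in row})
        return self

    def transform(self, rows):
        # Map each dict onto the learned columns; unseen names are dropped.
        return [[row.get(name, 0) for name in self.feature_names_] for row in rows]

    def fit_transform(self, rows):
        return self.fit(rows).transform(rows)


train = [{"good": 1, "movie": 1}, {"bad": 1, "movie": 1}]
test = [{"good": 1, "plot": 1}]

vec = ToyDictVectorizer()
train_matrix = vec.fit_transform(train)
train_columns = list(vec.feature_names_)  # columns learned from the training data
test_ok = vec.transform(test)             # reuses the training columns

refit = ToyDictVectorizer()
test_wrong = refit.fit_transform(test)    # learns new columns from the test data

print(train_columns, test_ok)
print(refit.feature_names_, test_wrong)
```

With transform, the test rows have the same three columns the classifier saw during training; with fit_transform, they have two unrelated columns, so the trained model's features no longer line up.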
For the memory issue, I recommend TfidfVectorizer, which has numerous options for reducing the dimensionality (by removing rare unigrams etc.).
If the only problem is transforming the test data, simply split it into small chunks. Instead of transforming the whole test set in one call, you can do
# K is the number of chunks and size the number of samples per chunk
for i in range(K):
    x = vect.transform(test[i*size : (i+1)*size])
and record the results/stats for each chunk and analyze them afterwards.
In particular:
from sklearn.metrics import accuracy_score
predictions = []
for i in range(K):
    x = vect.transform(test[i*size : (i+1)*size])
    predictions.extend(rf.predict(x))  # works whether predict returns a list or an array
print(accuracy_score(true_labels, predictions))
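The same idea can be wrapped in a small helper so that the last, shorter chunk is handled automatically. This is only a sketch: iter_chunks and predict_in_chunks are made-up names, and the two lambdas are toy stand-ins for a fitted vectorizer's transform and a classifier's predict so the example runs without scikit-learn.

```python
def iter_chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_in_chunks(items, transform, predict, size):
    """Transform and predict chunk by chunk, collecting all predictions."""
    predictions = []
    for chunk in iter_chunks(items, size):
        features = transform(chunk)            # e.g. a fitted vectorizer's transform
        predictions.extend(predict(features))  # e.g. a fitted classifier's predict
    return predictions


# Toy stand-ins so the sketch runs on its own.
test_items = list(range(10))
double = lambda chunk: [2 * value for value in chunk]
is_large = lambda features: [int(value >= 10) for value in features]

chunked = predict_in_chunks(test_items, double, is_large, size=3)
print(chunked)
```

Only one chunk's feature matrix is held in memory at a time, which is the point of the workaround; the collected predictions are the same as transforming everything at once.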
