I am using the CoViDx dataset. When I try to open images, sometimes it works well, sometimes it prints the error in the title: UnidentifiedImageError: cannot identify image file.
This is the code. My covid dataset class:
from PIL import Image
from torch.utils.data import Dataset

class CovidDataset(Dataset):
    def __init__(self, dataset_df, transform=None):
        self.dataset_df = dataset_df
        self.transform = transform

    def __len__(self):
        return self.dataset_df.shape[0]

    def __getitem__(self, idx):
        image_name = self.dataset_df['filename'][idx]
        img = Image.open(image_name)
        label = self.dataset_df['class'][idx]
        if self.transform:
            img = self.transform(img)
        return img, label
Then train_dataset = CovidDataset(train_df, transform=image_transforms['train']) and when I do train_dataset[0] it prints:
(tensor([[[-0.1433, -0.1312, -0.1072, ..., -1.2137, -1.2017, -1.2017],
[-0.1673, -0.1312, -0.1312, ..., -1.2017, -1.2137, -1.2137],
[-0.1673, -0.1433, -0.1312, ..., -1.1776, -1.2137, -1.2137],
[-0.7687, -0.7807, -0.8048, ..., -0.6725, -0.6965, -0.6725],
[-0.7446, -0.7687, -0.7807, ..., -0.6604, -0.6965, -0.6845],
[-0.7446, -0.7567, -0.7927, ..., -0.6604, -0.6965, -0.6845]]]),
Instead, if I do train_dataset[1] I have this error: UnidentifiedImageError: cannot identify image file 'train/sub-S03144_ses-E06258_run-1_bp-chest_vp-ap_dx-corrected.png'
So some images open and others do not. How can I fix this? I have already seen this post (link); however, when I apply the script from its answer I get this error: NotADirectoryError: [Errno 20] Not a directory: 'train/37ae5f8b-8504-479e-bdbd-58dc6158f0f6.png'
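One way to narrow this down is to scan the file list once, outside the Dataset, and flag files that cannot be valid PNGs. The following is a minimal sketch using only the standard library; the helper name find_suspect_pngs is hypothetical, and it assumes every entry is meant to be a PNG. A flagged file is only a suspect: Pillow identifies images by content, so a JPEG saved with a .png extension would be flagged here even though Image.open can read it.

```python
import os

# The first 8 bytes of every valid PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def find_suspect_pngs(paths):
    """Return (path, reason) pairs for files that cannot be valid PNGs."""
    suspects = []
    for path in paths:
        if not os.path.isfile(path):
            suspects.append((path, "missing or not a regular file"))
            continue
        if os.path.getsize(path) == 0:
            suspects.append((path, "empty file"))
            continue
        # Compare the file header with the PNG signature.
        with open(path, "rb") as fh:
            header = fh.read(len(PNG_SIGNATURE))
        if header != PNG_SIGNATURE:
            suspects.append((path, "no PNG signature"))
    return suspects
```

With the DataFrame from the question, this could be called as find_suspect_pngs(train_df['filename']); empty or truncated downloads are a common cause of this error, and they would show up in the result.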


Pytorch: Add information to images in image prediction

I would like to add information to my current dataset. At the moment, I have six-frame sequences in folders. The DataLoader reads all six frames and uses the first three to predict the last one, two, or three (depending on how many I tell it to). This is the Dataset class used by the DataLoader.
import glob
import cv2
import torch
from torch.utils.data import Dataset
from torchvision.transforms import ToTensor

class TrainFeeder(Dataset):
    def __init__(self, data_set):
        super(TrainFeeder, self).__init__()
        self.input_data = data_set
        if torch.cuda.current_device() == 0:
            print('There are total %d sequences in trainset' % len(self.input_data))

    def __getitem__(self, index):
        path = self.input_data[index]
        imgs_path = sorted(glob.glob(path + '/*.png'))
        imgs = []
        for img_path in imgs_path:
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, (256, 448))
            img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5, interpolation=cv2.INTER_CUBIC)  # has been 0.5 for official data, new is fx = 2.63 and fy = 2.84
            img_tensor = ToTensor()(img).float()
            # collect every frame so that torch.stack has something to stack
            imgs.append(img_tensor)
        imgs = torch.stack(imgs, dim=0)
        return imgs

    def __len__(self):
        return len(self.input_data)
Now I'd like to add one value to these images. It is a boolean, stored in a list in a .json file in the same folder as the six-frame sequences. But I don't know how to add the values from that list to the tensor. Which dimension should I use? Will the system work at all if I change the shape of the input?
The __getitem__ method can return anything, so you can return a tuple instead of just the images:
def __getitem__(self, index):
    path = ...
    # load your 6 images
    imgs = torch.stack( ... )
    # load your boolean metadata
    metadata = load_json_data( ... )
    # return them both
    return (imgs, metadata)
You will need to convert metadata to a tensor before returning it; otherwise I expect PyTorch will complain that it cannot collate (i.e., stack) the values into batches.
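As an illustration of the loading step, here is a minimal sketch that reads the boolean list from a JSON file and converts it to plain floats, which torch.tensor can then wrap. The helper name load_flags and the file name flags.json are assumptions made for the example; the question does not say what the file is called.

```python
import json
import os


def load_flags(sequence_dir, filename="flags.json"):
    """Read a JSON list of booleans and return it as a list of 0.0/1.0 floats.

    Both the function name and the default file name are hypothetical.
    """
    with open(os.path.join(sequence_dir, filename)) as fh:
        flags = json.load(fh)
    # Floats collate cleanly into a batch once wrapped in a tensor.
    return [1.0 if flag else 0.0 for flag in flags]
```

Inside __getitem__ this would be used as metadata = torch.tensor(load_flags(path)), so that every sample returns a tensor of the same length and the default collate function can stack them.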
"Will the system work" is a question only you can answer, since you did not provide the code of your ML model. I would bet on : "no but it won't require significant changes to work". Most likely you currently have a loop like
for imgs in dataloader:
    # do some training
    output = model(imgs)
And you will have to make it like
for imgs, metadata in dataloader:
    # do some training
    output = model(imgs)

Custom dataset Loader pytorch

I am doing COVID-19 classification with a dataset from Kaggle. It has a folder named dataset containing three folders (normal, pneumonia, and covid-19), each holding the images for that class. I am stuck writing __getitem__ for a custom PyTorch Dataset.
The dataset has 189 COVID images, but with this __getitem__ I get 920 COVID images. Kindly help.
class_names = ['normal', 'viral', 'covid']
root_dir = 'COVID-19 Radiography Database'
source_dirs = ['NORMAL', 'Viral Pneumonia', 'COVID-19']

if os.path.isdir(os.path.join(root_dir, source_dirs[1])):
    os.mkdir(os.path.join(root_dir, 'test'))

    for i, d in enumerate(source_dirs):
        os.rename(os.path.join(root_dir, d), os.path.join(root_dir, class_names[i]))

    for c in class_names:
        os.mkdir(os.path.join(root_dir, 'test', c))

    for c in class_names:
        images = [x for x in os.listdir(os.path.join(root_dir, c)) if x.lower().endswith('png')]
        selected_images = random.sample(images, 30)
        for image in selected_images:
            source_path = os.path.join(root_dir, c, image)
            target_path = os.path.join(root_dir, 'test', c, image)
            shutil.move(source_path, target_path)
The code above creates the test set, which has 30 images of each class.
class ChestXRayDataset(torch.utils.data.Dataset):
    def __init__(self, image_dirs, transform):
        def get_images(class_name):
            images = [x for x in os.listdir(image_dirs[class_name]) if x.lower().endswith('png')]
            print(f'Found {len(images)} {class_name} examples')
            return images

        self.images = {}
        self.class_names = ['normal', 'viral', 'covid']
        for class_name in self.class_names:
            self.images[class_name] = get_images(class_name)
        self.image_dirs = image_dirs
        self.transform = transform

    def __len__(self):
        return sum([len(self.images[class_name]) for class_name in self.class_names])

    def __getitem__(self, index):
        class_name = random.choice(self.class_names)
        index = index % len(self.images[class_name])
        image_name = self.images[class_name][index]
        image_path = os.path.join(self.image_dirs[class_name], image_name)
        image = Image.open(image_path).convert('RGB')
        return self.transform(image), self.class_names.index(class_name)
I am stuck on the __getitem__ of this class.
[Image: how the images are arranged in the folders]
[Image: the dataset]
The code for the confusion matrix is:
nb_classes = 3
confusion_matrix = torch.zeros(nb_classes, nb_classes)
with torch.no_grad():
    for data in tqdm_notebook(dl_train, total=len(dl_train), unit='batch'):
        img, lab = data
        img, lab = img.to(device), lab.to(device)
        _, output = torch.max(model(img), 1)
        for t, p in zip(lab.view(-1), output.view(-1)):
            confusion_matrix[t.long(), p.long()] += 1
According to the confusion matrix output, only one class is being learned.
[Image: confusion matrix]
Putting your images in a dictionary complicates things; use a list instead. Also, your Dataset should not contain any randomness: shuffling should be done by the DataLoader, not by the Dataset.
Use something like the following:
class ChestXRayDataset(torch.utils.data.Dataset):
    def __init__(self, image_dirs, transform):
        def get_images(class_name):
            images = [x for x in os.listdir(image_dirs[class_name]) if x.lower().endswith('png')]
            print(f'Found {len(images)} {class_name} examples')
            return images

        self.images = []
        self.labels = []
        self.class_names = ['normal', 'viral', 'covid']
        for class_name in self.class_names:
            images = get_images(class_name)
            # This is a list containing all the images
            self.images.extend(images)
            # This is a list containing all the corresponding image labels
            self.labels.extend([class_name] * len(images))
        self.image_dirs = image_dirs
        self.transform = transform

    def __len__(self):
        return len(self.images)

    # Will return the image and its label at the position `index`
    def __getitem__(self, index):
        # image at index position of all the images
        image_name = self.images[index]
        # Its label
        class_name = self.labels[index]
        image_path = os.path.join(self.image_dirs[class_name], image_name)
        image = Image.open(image_path).convert('RGB')
        return self.transform(image), self.class_names.index(class_name)
If you iterate over it, for example with
ds = ChestXRayDataset(image_dirs, transform)
for x, y in ds:
    print(x.shape, y)
you should see all the images and labels in sequential order.
In practice, however, you would pass the ds object to a torch DataLoader with the shuffle parameter set to True. The DataLoader then takes care of shuffling by calling __getitem__ with shuffled index values.
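To make that division of labour concrete, the sketch below mimics it in plain Python: the dataset side builds one fixed, flat list of (path, label) pairs, and the loader side only permutes indices. The helper names build_index and shuffled_order are hypothetical, and this is an illustration of the idea rather than how DataLoader is implemented.

```python
import os
import random


def build_index(image_dirs, class_names):
    """Return a deterministic list of (image_path, label_index) pairs."""
    samples = []
    for label, class_name in enumerate(class_names):
        folder = image_dirs[class_name]
        for name in sorted(os.listdir(folder)):
            if name.lower().endswith("png"):
                samples.append((os.path.join(folder, name), label))
    return samples


def shuffled_order(num_samples, seed=None):
    """Return every index exactly once, in random order."""
    order = list(range(num_samples))
    random.Random(seed).shuffle(order)
    return order
```

Because every index appears exactly once per pass, each image is visited once per epoch and the number of samples per class matches the files on disk, which is not the case when __getitem__ picks a random class for every index.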

How to batch process with multiple Bounding Boxes in imgaug

I'm trying to set up a data augmentation pipeline with imgaug. Transforming the images works and does not throw any errors. In a second step I tried to transform the N bounding boxes of each image as well, and I get a persistent error.
def image_batch_augmentation(batch_images, batch_bbox, batch_image_shape):
    def create_BoundingBox(bbox):
        return BoundingBox(bbox[0], bbox[1], bbox[2], bbox[3], bbox[4])

    bbox = [[create_BoundingBox(bbox) for bbox in batch if sum(bbox) != 0] for batch in batch_bbox]
    bbox = [BoundingBoxesOnImage(batch, shape=(h, w)) for batch, w, h in zip(bbox, batch_image_shape[0], batch_image_shape[1])]
    seq_det = seq.to_deterministic()
    aug_image = seq_det.augment_images(image.numpy())
    aug_bbox = [seq_det.augment_bounding_boxes(batch) for batch in bbox]
    return aug_image, aug_bbox
In the following line the following error occurs:
aug_bbox = seq_det.augment_bounding_boxes(bbox)
Exception has occurred: InvalidArgumentError
cannot compute Mul as input #1(zero-based) was expected to be a double tensor but is a int64 tensor [Op:Mul] name: mul/
I have already tried several different approaches but I can't get any further. Furthermore, I haven't found any information in the docs or other known platforms that would help me to get the code running.
The problem lies in the data types, as the error message indicates. Adjusting them solved it.
Here is the corresponding code, which actually runs:
def image_batch_augmentation(batch_images, batch_bbox, batch_image_shape):
    def create_BoundingBox(bbox, w, h):
        return BoundingBox(bbox[0]*h, bbox[1]*w, bbox[2]*h, bbox[3]*w, tf.cast(bbox[4], tf.int32))

    bbox = [[create_BoundingBox(bbox, float(w), float(h)) for bbox in batch if sum(bbox) != 0] for batch, w, h in zip(batch_bbox, batch_image_shape[0], batch_image_shape[1])]
    bbox = [BoundingBoxesOnImage(batch, shape=(int(w), int(h))) for batch, w, h in zip(bbox, batch_image_shape[0], batch_image_shape[1])]
    seq_det = seq.to_deterministic()
    images_aug = seq_det.augment_images(image.numpy())
    bbsoi_aug = seq_det.augment_bounding_boxes(bbox)
    return images_aug, bbsoi_aug
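The type adjustment can also be isolated in a small helper that casts everything to plain Python numbers before any arithmetic, so that no int64/double mix can reach the multiplication. This is only a sketch: the name to_pixel_box is hypothetical, and it assumes boxes are stored as normalized (x1, y1, x2, y2, label). The code above scales the entries in a different order, so the factors may need to be swapped to match your data.

```python
def to_pixel_box(box, width, height):
    """Scale a normalized (x1, y1, x2, y2, label) box to pixel units.

    The coordinate order is an assumption; adapt it to your own layout.
    """
    x1, y1, x2, y2, label = box
    # Explicit float()/int() casts keep mixed integer and float inputs
    # from meeting in the multiplication.
    return (
        float(x1) * float(width),
        float(y1) * float(height),
        float(x2) * float(width),
        float(y2) * float(height),
        int(label),
    )
```

The resulting tuple contains only Python floats and an int, which can be passed on to BoundingBox without further casting.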

Tesseract fails to parse text from image

I'm completely new to opencv and tesseract.
I spent all day trying to write code that parses the game duration from images like this: original image (the game duration is in the top-left corner)
I ended up with code that recognizes the duration only sometimes (in about 40% of cases). Here it is:
try:
    from PIL import Image
except ImportError:
    import Image
import os
import cv2
import pytesseract
import re
import json

def non_digit_split(s):
    return filter(None, re.split(r'(\d+)', s))

def time_to_sec(minutes, sec):
    return str(int(minutes) * 60 + int(sec))

def process_img(image_url):
    img = cv2.resize(cv2.imread('./images/' + image_url), None, fx=5, fy=5, interpolation=cv2.INTER_CUBIC)
    text = pytesseract.image_to_string(img)
    if "WIN " in text:
        time = list(non_digit_split(text.split("WIN ", 1)[1][0:6].strip()))
        text = time_to_sec(time[0], time[2])
    else:
        text = 'Not recognized'
    return text

res = {}
img_list = os.listdir('./images')
for i in img_list:
    res[i] = process_img(i)
with open('output.txt', 'w') as file:
    # the body of this block was cut off; dumping res as JSON is assumed
    json.dump(res, file)
Don't even ask how I came up with resizing the image, but it helped a little.
I also tried cropping the image first, like this:
[Image: cropped image]
but Tesseract couldn't find any text in it.
I'm sure the issue I'm trying to solve is pretty easy. Can you please point me in the right direction? How should I preprocess the image so that Tesseract parses it correctly?
Thanks to @DmitriiZ's comment I managed to produce a working piece of code.
I made a preprocessor that outputs something like this:
[Image: preprocessed image]
Tesseract handles it just fine.
Here is the full code:
try:
    from PIL import Image
except ImportError:
    import Image
import os
import pytesseract
import json

def is_dark(image):
    pixels = image.getdata()
    black_thresh = 100
    nblack = 0
    for pixel in pixels:
        if (sum(pixel) / 3) < black_thresh:
            nblack += 1
    n = len(pixels)
    if (nblack / float(n)) > 0.5:
        return True
    return False

def preprocess(img):
    basewidth = 500
    wpercent = (basewidth / float(img.size[0]))
    hsize = int((float(img.size[1]) * float(wpercent)))
    #Enlarging image
    #(Image.LANCZOS is the same filter as Image.ANTIALIAS, which Pillow 10 removed)
    img = img.resize((basewidth, hsize), Image.LANCZOS)
    #Converting image to black and white
    img = img.convert("1", dither=Image.NONE)
    return img

def process_img(image_url):
    img = Image.open('./images/' + image_url)
    #Area we need to crop can be found in one of two different areas,
    #depending on which team won. You can replace that block and is_dark()
    #function by just img.crop().
    top_area = (287, 15, 332, 32)
    crop = img.crop(top_area)
    if is_dark(crop):
        bot_area = (287, 373, 332, 390)
        crop = img.crop(bot_area)
    img = preprocess(crop)
    text = pytesseract.image_to_string(img)
    return text

res = {}
img_list = os.listdir('./images')
for i in img_list:
    res[i] = process_img(i)
with open('output.txt', 'w') as file:
    # the body of this block was cut off; dumping res as JSON is assumed
    json.dump(res, file)
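Once Tesseract returns the cropped text, it still has to be turned into seconds, as time_to_sec did in the question. Here is a minimal sketch of that step; the function name duration_to_seconds is hypothetical, and it assumes the duration looks like minutes, one separator character, and two digits of seconds. Any non-digit separator is accepted because OCR may misread the colon.

```python
import re


def duration_to_seconds(text):
    """Return the first 'M:SS'-style duration found in text as seconds, or None."""
    # One or two digits of minutes, any single non-digit separator, and
    # exactly two digits of seconds.
    match = re.search(r"(\d{1,2})\D(\d{2})", text)
    if match is None:
        return None
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)
```

Returning None rather than a placeholder string makes it easy to count how many images were not recognized.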

Why is training with a custom Python layer in PyCaffe extremely slow?

I created a custom layer in Python so that I can feed the data directly, but I noticed that it runs extremely slowly and GPU usage is at most 1%. The memory is allocated: when I run the script it allocates 2100 MB of VRAM, and terminating the training frees around 1 GB.
I'm not sure if this is expected behavior or if I'm doing something wrong.
Here is the script I wrote (based on this former PR):
import json
import caffe
import numpy as np
from random import shuffle
from PIL import Image


class MyDataLayer(caffe.Layer):
    """
    This is a simple datalayer for training a network on CIFAR10.
    """

    def setup(self, bottom, top):
        self.top_names = ['data', 'label']
        # === Read input parameters ===
        # param_str holds JSON, so json.loads is a safer parser than eval
        params = json.loads(self.param_str)
        # Check the parameters for validity.
        check_params(params)
        # store input as class variables
        self.batch_size = params['batch_size']
        # Create a batch loader to load the images.
        self.batch_loader = BatchLoader(params, None)
        # === reshape tops ===
        # since we use a fixed input image size, we can shape the data layer
        # once. Else, we'd have to do it in the reshape call.
        top[0].reshape(self.batch_size, 3, params['im_height'], params['im_width'])
        # this is for our label, since we only have one label we set this to 1
        top[1].reshape(self.batch_size, 1)
        print_info("MyDataLayer", params)

    def forward(self, bottom, top):
        """
        Load data.
        """
        for itt in range(self.batch_size):
            # Use the batch loader to load the next image.
            im, label = self.batch_loader.load_next_image()
            # Add directly to the caffe data layer
            top[0].data[itt, ...] = im
            top[1].data[itt, ...] = label

    def reshape(self, bottom, top):
        """
        There is no need to reshape the data, since the input is of fixed size
        (rows and columns)
        """
        pass

    def backward(self, top, propagate_down, bottom):
        """
        These layers do not back propagate
        """
        pass
class BatchLoader(object):
    """
    This class abstracts away the loading of images.
    Images can either be loaded singly, or in a batch. The latter is used for
    the asynchronous data layer to preload batches while other processing is
    performed.
    the format is like :
    png_data_batch_1/leptodactylus_pentadactylus_s_000004.png 6
    png_data_batch_1/camion_s_000148.png 9
    png_data_batch_1/tipper_truck_s_001250.png 9
    """

    def __init__(self, params, result):
        self.result = result
        self.batch_size = params['batch_size']
        self.image_root = params['image_root']
        self.im_shape = [params['im_height'], params['im_width']]
        # get list of images and their labels.
        self.image_labels = params['label']
        # getting the list of all image filenames along with their labels
        with open(self.image_labels) as label_file:
            self.imagelist = [line.rstrip('\n\r') for line in label_file]
        self._cur = 0  # current image
        # this class does some simple data-manipulations
        self.transformer = SimpleTransformer()
        print("BatchLoader initialized with {} images".format(len(self.imagelist)))

    def load_next_image(self):
        """
        Load the next image in a batch.
        """
        # Did we finish an epoch?
        if self._cur == len(self.imagelist):
            self._cur = 0
        # Load an image
        image_and_label = self.imagelist[self._cur]  # Get the "filename label" line
        # split on the last space so that the filename has no trailing space
        # and labels with more than one digit are read in full
        image_file_name, label = image_and_label.rsplit(' ', 1)
        # load the image
        im = np.asarray(Image.open(self.image_root + '/' + image_file_name))
        #im = scipy.misc.imresize(im, self.im_shape) # resize
        # do a simple horizontal flip as data augmentation
        flip = np.random.choice(2)*2-1
        im = im[:, ::flip, :]
        # Load and prepare ground truth
        # caffe converts the label into a one-hot encoded vector automatically,
        # so we only need the plain label number
        #one_hot_label = np.eye(10)[label]
        label = int(label)
        self._cur += 1
        return self.transformer.preprocess(im), label
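def check_params(params):
    """
    A utility function to check the parameters for the data layers.
    """
    required = ['batch_size', 'image_root', 'im_width', 'im_height', 'label']
    for r in required:
        assert r in params.keys(), 'Params must include {}'.format(r)


def print_info(name, params):
    """
    Output some info regarding the class
    """
    # The arguments of this call were cut off; the ones below are an
    # assumption based on the keys present in param_str.
    print("{} initialized for split: {}, with bs: {}, im_shape: {}.".format(
        name,
        params['phase'],
        params['batch_size'],
        (params['im_height'], params['im_width'])))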
class SimpleTransformer:
    """
    SimpleTransformer is a simple class for preprocessing and deprocessing
    images for caffe.
    """

    def __init__(self, mean=[125.30, 123.05, 114.06]):
        self.mean = np.array(mean, dtype=np.float32)
        self.scale = 1.0

    def set_mean(self, mean):
        """
        Set the mean to subtract for centering the data.
        """
        self.mean = mean

    def set_scale(self, scale):
        """
        Set the data scaling.
        """
        self.scale = scale

    def preprocess(self, im):
        """
        preprocess() emulates the pre-processing occurring in the vgg16 caffe
        """
        im = np.float32(im)
        im = im[:, :, ::-1]  # change to BGR
        im -= self.mean
        im *= self.scale
        im = im.transpose((2, 0, 1))
        return im

    def deprocess(self, im):
        """
        inverse of preprocess()
        """
        im = im.transpose(1, 2, 0)
        im /= self.scale
        im += self.mean
        im = im[:, :, ::-1]  # change to RGB
        return np.uint8(im)
And in my train_test.prototxt file I have:
name: "CIFAR10_SimpleTest_PythonLayer"
layer {
  name: 'MyPythonLayer'
  type: 'Python'
  top: 'data'
  top: 'label'
  include {
    phase: TRAIN
  }
  python_param {
    #the python script filename
    module: 'mypythonlayer'
    #the class name
    layer: 'MyDataLayer'
    #needed parameters in json
    param_str: '{"phase":"TRAIN", "batch_size":10, "im_height":32, "im_width":32, "image_root": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe", "label": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe/train_cifar10.txt"}'
  }
}
layer {
  name: 'MyPythonLayer'
  type: 'Python'
  top: 'data'
  top: 'label'
  include {
    phase: TEST
  }
  python_param {
    #the python script filename
    module: 'mypythonlayer'
    #the class name
    layer: 'MyDataLayer'
    #needed parameters in json
    param_str: '{"phase":"TEST", "batch_size":10, "im_height":32, "im_width":32, "image_root": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe", "label": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe/test_cifar10.txt"}'
  }
}
What's wrong here?
Your data layer is not efficient enough and it takes up most of the training time (try caffe time ... for more detailed profiling). At each forward pass you wait for the Python layer to read batch_size images from disk, one after the other. This can take forever.
Consider using multiprocessing to do the reading in the background while the net processes the previous batches: this should give you good CPU/GPU utilization.
See this example for a multiprocessing Python data layer.
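The background-loading idea can be sketched with the standard library alone. Note that the sketch below swaps in a background thread and a bounded queue instead of the multiprocessing approach recommended above, purely to keep the example short; the class name PrefetchLoader and its parameters are hypothetical. Disk reads release the GIL, so a thread can already hide I/O latency, but CPU-heavy preprocessing may still need separate processes.

```python
import queue
import threading


class PrefetchLoader:
    """Calls load_fn in a background thread and buffers its results."""

    def __init__(self, load_fn, buffer_size=4):
        self._load_fn = load_fn
        self._queue = queue.Queue(maxsize=buffer_size)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def _worker(self):
        while not self._stop.is_set():
            item = self._load_fn()
            # Retry the put with a timeout so that a stop request is
            # noticed even while the buffer is full.
            while not self._stop.is_set():
                try:
                    self._queue.put(item, timeout=0.1)
                    break
                except queue.Full:
                    continue

    def next(self):
        """Return the next prefetched item, blocking until one is ready."""
        return self._queue.get()

    def close(self):
        """Stop the background thread and wait for it to finish."""
        self._stop.set()
        self._thread.join()
```

In the layer from the question, setup would create PrefetchLoader(self.batch_loader.load_next_image) and forward would call its next() method, so that images for the following batch are read while the net is still busy with the current one.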
Python layers are executed on the CPU, not the GPU, so training is slow because data has to keep moving between the CPU and the GPU. That is also why you see low GPU usage: the GPU is waiting for the CPU to execute the Python layer.
