How to perform image augmentation for sequence of images representing a sample - machine-learning

I want to know how to perform image augmentation for sequence image data.
The shape of my input to the model looks as below:
(None, 30, 112, 112, 3)
where 30 is the number of images present in one sample, 112x112 are the height and width, and 3 is the number of channels.
Currently I have 17 samples (17, 30, 112, 112, 3), which are not enough, so I want to perform some sequence image augmentation to end up with at least 50 samples, i.e. (50, 30, 112, 112, 3).
(Note: my dataset is not video; rather, it consists of sequences of images captured every 3 seconds, so you could say the frames are already extracted.)
The 17 samples, each containing 30 sequence images, are stored in separate folders in a directory:
folder_1
folder_2
.
.
.
folder_17
Can you please let me know the code to perform data augmentation?

Here is an illustration of using the imgaug library on a single image:
# Reading an image using OpenCV
import cv2
import numpy as np
img = cv2.imread('flower.jpg')
# Append the image 5 times to a list and convert the list to an array
images_list = []
for i in range(0, 5):
    images_list.append(img)
images_array = np.array(images_list)
The array images_array has shape (5, 133, 200, 3) => (number of images, height, width, number of channels).
Now our input is set. Let's do some augmentation:
# Import 'imgaug' library
import imgaug as ia
import imgaug.augmenters as iaa
# Prepare a sequence of augmentation functions
seq = iaa.Sequential([
    iaa.Fliplr(0.5),                   # horizontally flip 50% of the images
    iaa.Crop(percent=(0, 0.1)),        # random crops of up to 10%
    iaa.LinearContrast((0.75, 1.5)),   # strengthen or weaken the contrast
    iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5),
    iaa.Multiply((0.8, 1.2), per_channel=0.2)  # make some images brighter, some darker
], random_order=True)
Refer to this page for more functions
# Pass the input images to the augmentation sequence
images_aug = seq(images=images_array)
images_aug is an array that contains the augmented images.
# Display all the augmented images
for img in images_aug:
    cv2.imshow('Augmented Image', img)
    cv2.waitKey()
You can extend the above for your own problem.
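To extend this to your sequence samples, the key point is that all 30 frames of one sample should receive exactly the same transformation, otherwise the temporal consistency between frames is destroyed. Below is a minimal sketch under assumed names (samples holds your loaded data as a (17, 30, 112, 112, 3) array; n_target is the total you want): it uses imgaug's to_deterministic(), which freezes the sampled random parameters so that repeated calls apply an identical transform to every frame.
import numpy as np
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),
    iaa.Crop(percent=(0, 0.1)),
    iaa.LinearContrast((0.75, 1.5)),
], random_order=True)

def augment_sample(sample):
    """Apply one random transform identically to every frame of a (30, 112, 112, 3) sample."""
    seq_det = seq.to_deterministic()  # freeze the sampled random parameters for this sample
    return np.stack([seq_det.augment_image(frame) for frame in sample])

# samples is assumed to hold your original data with shape (17, 30, 112, 112, 3)
n_target = 50  # desired total number of samples
new_samples = np.stack([augment_sample(samples[i % len(samples)])
                        for i in range(n_target - len(samples))])
all_samples = np.concatenate([samples, new_samples], axis=0)  # (50, 30, 112, 112, 3)
Each augmented sample is a transformed copy of one of your 17 originals, cycling through them until you reach the target count.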

Related

How to compare probe face images with gallery images with feature extractor | Python

I have a dataset that contains 1500 face images, and I have selected 150 images as probes.
The 150 probe images are in a probe folder and the other images are in a gallery folder.
I have a FaceNet feature extractor which extracts features from images and saves them into a .npy array to compute Euclidean distances.
How can I compare these 150 probe images with the whole gallery folder, draw an accuracy graph of rank-1, rank-5 and rank-10 between similar images, and compute mAP?
First, I would run the feature extractor on the test image, then compute the distance between each of the stored gallery feature vectors (let's say train_feats) and the test image's feature vector (test_feat):
import numpy as np
all_res = []
for feat in train_feats:
    res = np.linalg.norm(feat - test_feat)  # Euclidean distance between feature vectors
    all_res.append(res)
all_res.sort()
So the smallest values at the front of all_res correspond to the first rank, and the biggest one is the last rank. I hope this can be a good reference. You can also use sklearn to evaluate your model, e.g. SVC, accuracy_score, etc.
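To get the rank-1/5/10 numbers the question asks about, the same idea can be vectorised over all 150 probes with argsort. A minimal sketch, assuming probe_feats (shape (150, d)) and gallery_feats (shape (N, d)) are the saved .npy feature arrays, and probe_ids / gallery_ids are NumPy arrays holding the ground-truth identity of each image (all four names are placeholders):
import numpy as np

# Pairwise Euclidean distances between every probe and every gallery image
dists = np.linalg.norm(probe_feats[:, None, :] - gallery_feats[None, :, :], axis=2)
order = np.argsort(dists, axis=1)  # gallery indices from nearest to farthest, per probe

for k in (1, 5, 10):
    # A probe is a hit at rank k if any of its k nearest gallery images shares its identity
    hits = [probe_ids[i] in gallery_ids[order[i, :k]] for i in range(len(probe_ids))]
    print(f"rank-{k} accuracy: {np.mean(hits):.3f}")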
Suppose that the 1500 face images are stored in a source folder, and the 150 probe images are stored in a target folder.
#!pip install deepface
from deepface import DeepFace
targets = ["img1.jpg", "img2.jpg", "img150.jpg"]
resp = DeepFace.find(img_path = targets, db_path = "source", model_name = "Facenet")
BTW, you can set Facenet, VGG-Face, OpenFace, DeepFace or DeepID as the model name.
The response object will be a list of pandas data frames. Each data frame is sorted from the most similar match to the least similar one. That's why I'll take the first one.
index = 0
for df in resp:
    if df.shape[0] > 0:
        #print(targets[index], ": ", df.head(1))
        df.to_csv("%s" % (targets[index]), index=False)
    index = index + 1
This will match identities across the two folders.

Python parallelization for code to combine multiple images

I am new to Python and am trying to parallelize a program that I somehow pieced together from the internet. The program reads all image files (usually multiple series of images such as abc001,abc002...abc015 and xyz001,xyz002....xyz015) in a specific folder and then combines images in a specified range. Most times, the number of files exceeds 10000, and my latest case requires me to combine 24000 images. Could someone help me with:
Taking 2 sets of images from different directories. Currently I have to move these images into 1 directory and then work in said directory.
Reading only specified files. Currently my program reads all files, saves their names in an array (I think it's an array; it could also be a dictionary) and then uses only the images required for combining. If I specify a range of files, it still checks against all files in the directory, which takes a lot of time.
Parallel Processing - I work with usually 10k files or sometimes more. These are images saved from the fluid simulations that I run at specific times. Currently, I save about 2k files at a time in separate folders and run the program to combine these 2000 files at one time. And then I copy all the output files to a separate folder to keep them together. It would be great if I could use all 16 cores on the processor to combine all files in 1 go.
Image series 1 is like so:
Consider it to be a series of photos of a cat walking towards the camera. Each frame is suffixed with 001,002,...,n.
Image series 2 is like so:
Consider it to be a series of photos of the cat's expression changing with each frame. Each frame is suffixed with 001,002,...,n.
The code currently combines each frame from set1 and set2 to provide output.png as shown in the link here.
import sys
import os
from PIL import Image

keywords = input('Enter initial characters of image series 1 [Ex:Scalar_ , VoF_Scene_]:\n')
keywords2 = input('Enter initial characters of image series 2 [Ex:Scalar_ , VoF_Scene_]:\n')
directory = input('Enter correct folder name where images are present :\n')  # FOLDER WHERE IMAGES ARE LOCATED
result1 = {}
result2 = {}
name_count1 = 0
name_count2 = 0
for filename in os.listdir(directory):
    if keywords in filename:
        name_count1 += 1
        result1[name_count1] = os.path.join(directory, filename)
    if keywords2 in filename:
        name_count2 += 1
        result2[name_count2] = os.path.join(directory, filename)
num1 = int(input('Enter initial number of series:\n'))
num2 = int(input('Enter final number of series:\n'))
if name_count1 == (num2 - num1 + 1):
    a1 = 1
    a2 = name_count1
elif name_count2 == (num2 - num1 + 1):
    a1 = 1
    a2 = name_count2
else:
    a1 = num1
    a2 = num2 + 1
for x in range(a1, a2):
    # '05' signifies the number of digits in the file-name series,
    # Ex: [Scalar_scene_1_00345.png --> 5 digits], [Temperature_section_2_951.jpg --> 3 digits].
    # Change accordingly.
    y = str(format(x, '05'))
    for comparison_name1 in result1:
        for comparison_name2 in result2:
            test1 = result1[comparison_name1]
            test2 = result2[comparison_name2]
            if y in test1 and y in test2:
                test = [test1, test2]
                images = [Image.open(fn) for fn in test]
                widths, heights = zip(*(i.size for i in images))
                total_width = sum(widths)
                max_height = max(heights)
                new_im = Image.new('RGB', (total_width, max_height))
                x_offset = 0
                for im in images:
                    new_im.paste(im, (x_offset, 0))
                    x_offset += im.size[0]
                output_name = 'output' + y + '.png'
                new_im.save(os.path.join(directory, output_name))
I did a Python version as well; it's not quite as fast, but it is maybe closer to your heart :-)
#!/usr/bin/env python3
import cv2
import numpy as np
from multiprocessing import Pool
def doOne(params):
    """Append the two input images side-by-side to output the third."""
    imA = cv2.imread(params[0], cv2.IMREAD_UNCHANGED)
    imB = cv2.imread(params[1], cv2.IMREAD_UNCHANGED)
    res = np.hstack((imA, imB))
    cv2.imwrite(params[2], res)

if __name__ == '__main__':
    # Build the list of jobs - each entry is a tuple with 2 input filenames and an output filename
    jobList = []
    for i in range(1000):
        # Horizontally append a-XXXXX.png to b-XXXXX.png to make c-XXXXX.png
        jobList.append((f'a-{i:05d}.png', f'b-{i:05d}.png', f'c-{i:05d}.png'))
    # Make a pool of processes - 1 per CPU core
    with Pool() as pool:
        # Map the list of jobs to the pool of processes
        pool.map(doOne, jobList)
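To cover the first two points of the question (two separate input directories, and only a specified file range), the job list can be built from explicit paths instead of scanning one folder. A minimal sketch under assumed names - dirA, dirB and outDir are hypothetical directories, and the files are assumed to follow the zero-padded a-XXXXX.png / b-XXXXX.png pattern above:
import os

def buildJobList(dirA, dirB, outDir, first, last, digits=5):
    """Build (inputA, inputB, output) filename tuples for frames first..last only."""
    jobs = []
    for i in range(first, last + 1):
        stem = f'{i:0{digits}d}'
        jobs.append((os.path.join(dirA, f'a-{stem}.png'),
                     os.path.join(dirB, f'b-{stem}.png'),
                     os.path.join(outDir, f'c-{stem}.png')))
    return jobs

# e.g. combine frames 2000..3999 from two folders in one go
jobList = buildJobList('seriesA', 'seriesB', 'combined', 2000, 3999)
The resulting list can be passed to pool.map(doOne, jobList) exactly as above, so no files need to be moved first and no directory scan is required.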
You can do this a little quicker with libvips. To join two images left-right, enter:
vips join left.png out.png result.png horizontal
To test, I made 200 pairs of 1200x800 PNGs like this:
for i in {1..200}; do cp x.png left$i.png; cp x.png right$i.png; done
Then tried a benchmark:
time parallel vips join left{}.png right{}.png result{}.png horizontal ::: {1..200}
real 0m42.662s
user 2m35.983s
sys 0m6.446s
With imagemagick on the same laptop I see:
time parallel convert left{}.png right{}.png +append result{}.png ::: {1..200}
real 0m55.088s
user 3m24.556s
sys 0m6.400s
You can do that much faster without Python, using multi-processing with ImageMagick or libvips.
The first part is all setup:
Make 20 images, called a-000.png ... a-019.png that go from red to blue:
convert -size 64x64 xc:red xc:blue -morph 18 a-%03d.png
Make 20 images, called b-000.png ... b-019.png that go from yellow to magenta:
convert -size 64x64 xc:yellow xc:magenta -morph 18 b-%03d.png
Now append them side-by-side into c-000.png ... c-019.png
for ((f=0;f<20;f++))
do
    z=$(printf "%03d" $f)
    convert a-${z}.png b-${z}.png +append c-${z}.png
done
The resulting c-*.png images show each red-to-blue frame appended to its yellow-to-magenta counterpart.
If that looks good, you can do them all in parallel with GNU Parallel:
parallel convert a-{}.png b-{}.png +append c-{}.png ::: $(seq -f "%03g" 0 19)
Benchmark
I did a quick benchmark and made 20,000 images a-00000.png...a-19999.png and another 20,000 images b-00000.png...b-19999.png, with each image 1200x800 pixels. Then I ran the following command to append each pair horizontally and write 20,000 output images c-00000.png...c-19999.png:
seq -f "%05g" 0 19999 | parallel --eta convert a-{}.png b-{}.png +append c-{}.png
and that takes 16 minutes on my MacBook Pro with all 12 CPU cores pegged at 100% throughout. Note that you can:
add spacers between the images,
write annotation onto the images,
add borders,
resize
if you wish and do lots of other processing - this is just a simple example.
Note also that you can get even quicker times - in the region of 10-12 minutes if you accept JPEG instead of PNG as the output format.
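If you would rather stay inside Python while still using libvips, the pyvips binding exposes the same join operation. A minimal hedged sketch, assuming the a-/b-/c- file-name pattern from above (the loop body could equally be handed to the multiprocessing Pool from the earlier answer):
import pyvips

for i in range(20):
    # 'sequential' access streams the PNGs instead of decoding them fully up front
    a = pyvips.Image.new_from_file(f'a-{i:03d}.png', access='sequential')
    b = pyvips.Image.new_from_file(f'b-{i:03d}.png', access='sequential')
    a.join(b, 'horizontal').write_to_file(f'c-{i:03d}.png')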

How is Spark reading my image using the image format?

It might be a silly question, but I can't figure out how Spark reads my image using the spark.read.format("image").load(....) call.
After importing my image which gives me the following:
>>> image_df.select("image.height","image.width","image.nChannels", "image.mode", "image.data").show()
+------+-----+---------+----+--------------------+
|height|width|nChannels|mode| data|
+------+-----+---------+----+--------------------+
| 430| 470| 3| 16|[4D 55 4E 4C 54 4...|
+------+-----+---------+----+--------------------+
I arrive at the conclusion that:
my image is 430x470 pixels,
my image is colored (RGB, since nChannels = 3), which is an OpenCV-compatible type,
my image mode is 16, which corresponds to a particular OpenCV type/byte order.
Does someone know which website/documentation I could browse to learn more about this?
the data in the data column is of type Binary but:
when I run image_df.select("image.data").take(1), I get an output which seems to be only one array (see below).
>>> image_df.select("image.data").take(1)
# 1/ Here are the last elements of the result
....<<One Eternity Later>>....x92\x89\x8a\x8d\x84\x86\x89\x80\x84\x87~'))]
# 2/ I got also several part of the result which looks like:
.....\x89\x80\x80\x83z|\x7fvz}tpsjqtkrulsvmsvmsvmrulrulrulqtkpsjnqhnqhmpgmpgmpgnqhnqhn
qhnqhnqhnqhnqhnqhmpgmpgmpgmpgmpgmpgmpgmpgnqhnqhnqhnqhnqhnqhnqhnqhknejmdilcilchkbh
kbilcilckneloflofmpgnqhorioripsjsvmsvmtwnvypx{ry|sz}t{~ux{ry|sy|sy|sy|sz}tz}tz}tz}
ty|sy|sy|sy|sz}t{~u|\x7fv|\x7fv}.....
What comes next relates to the results displayed above; my questions might be due to my lack of knowledge of OpenCV (or something else). Nonetheless:
1/ I don't understand why, if I have an RGB image, I get a single array: I would expect 3 matrices, but the output finishes with .......\x84\x87~'))]. I was expecting something more like [(...),(...),(...\x87~')].
2/ Does this part have a special meaning, like a separator between each matrix or something?
To be clearer about what I'm trying to achieve: I want to process images to do pixel comparison between images. Therefore, I want to know the pixel values at a given position in my image (I assume that for an RGB image, I have 3 pixel values at a given position).
Example: let's say I have a webcam pointing at the sky only during the day, and I want to know the values of a pixel at a position corresponding to the top-left sky part. I find that the concatenation of those values gives the colour light blue, which says that the photo was taken on a sunny day. Let's say that the only possibility is that a sunny day gives the colour light blue.
Next I want to compare the previous concatenation with another concatenation of pixel values at exactly the same position, but from a picture taken the next day. If I find that they are not equal, then I conclude that the given picture was taken on a cloudy/rainy day; if equal, a sunny day.
Any help would be highly appreciated. I have simplified my example for better understanding, but my goal is pretty much the same. I know that ML models exist to achieve such things, but I would be happy to try this first. My first goal is to split this column into 3 columns corresponding to each colour code: a red matrix, a green matrix, a blue matrix.
I think I have the logic. I used the keras.preprocessing.image.img_to_array() function to understand how the values are classified (since I have an RGB image, I must have 3 matrices: one for each colour R, G, B). Posting this in case someone wonders how it works; I might be wrong, but I think I have something:
from keras.preprocessing import image
import numpy as np
from PIL import Image
# Using spark built-in data source
first_img = spark.read.format("image").schema(imageSchema).load(".....")
raw = first_img.select("image.data").take(1)[0][0]
np.shape(raw)
(606300,) # which is 470*430*3
# Using keras function
img = image.load_img(".../path/to/img")
yy = image.img_to_array(img)
>>> np.shape(yy)
(430, 470, 3) # the form is good but I have a problem of order since:
>>> raw[0], raw[1], raw[2]
(77, 85, 78)
>>> yy[0][0]
array([78., 85., 77.], dtype=float32)
# Therefore I used the numpy reshape function directly on raw
# to get an array of shape (430, 470, 3): 430 rows x 470 columns x 3 channels
array = np.reshape(raw, (430,470,3))
xx = image.img_to_array(array) # OPTIONAL and not used here
>>> array[0][0] == (raw[0],raw[1],raw[2])
array([ True, True, True])
>>> array[0][1] == (raw[3],raw[4],raw[5])
array([ True, True, True])
>>> array[0][2] == (raw[6],raw[7],raw[8])
array([ True, True, True])
>>> array[0][3] == (raw[9],raw[10],raw[11])
array([ True, True, True])
So if I understood well, Spark reads the image as one big flat array - (606300,) here - where the elements are ordered pixel by pixel, with one value per colour channel. Note the channel order: comparing raw with the Keras output above, the channels come out reversed, which matches OpenCV's BGR convention rather than RGB.
After my little transformation, I obtain 430 matrices (one per height position of the image), each with 470 rows (one per width position) and 3 columns (one per colour channel). Since my image is (470x430) as (width x height), this accounts for every pixel.
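Given that layout, splitting the data column into three per-channel matrices (the first goal stated above) is one reshape plus slicing. A minimal sketch, assuming raw is the bytearray fetched above and the 430x470 size of this particular image; np.frombuffer is used because raw arrives as a bytearray:
import numpy as np

# raw comes from image_df.select("image.data").take(1)[0][0]
pixels = np.frombuffer(raw, dtype=np.uint8).reshape(430, 470, 3)

# Spark stores channels in OpenCV's BGR order, so slice accordingly
blue  = pixels[:, :, 0]   # 430 x 470 blue matrix
green = pixels[:, :, 1]   # 430 x 470 green matrix
red   = pixels[:, :, 2]   # 430 x 470 red matrix

# Pixel comparison at a given (row, col) position, e.g. a top-left sky patch
r, c = 0, 0
print(red[r, c], green[r, c], blue[r, c])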
Hope that helps someone :)!

Using Keras ImageDataGenerator in a regression model

I want to use the flow_from_directory method of the ImageDataGenerator
to generate training data for a regression model, where the target value can be any float value between 1 and -1. flow_from_directory has a "class_mode" parameter with the description
class_mode: one of "categorical", "binary", "sparse" or None. Default:
"categorical". Determines the type of label arrays that are returned:
"categorical" will be 2D one-hot encoded labels, "binary" will be 1D
binary labels, "sparse" will be 1D integer labels.
Which of these values should I take? None of them seems to really fit...
With Keras 2.2.4 you can use flow_from_dataframe, which solves what you want to do by allowing you to flow images from a directory for regression problems. Store all your images in one folder, load a dataframe containing the image IDs in one column and the regression scores (labels) in another, and set class_mode='other' in flow_from_dataframe.
Here is an example where the images are in image_dir and the dataframe with the image IDs and regression scores is loaded with pandas from the "train file":
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator

train_label_df = pd.read_csv(train_file, delimiter=' ', header=None, names=['id', 'score'])
train_datagen = ImageDataGenerator(rescale=1./255, horizontal_flip=True,
                                   fill_mode="nearest", zoom_range=0.2,
                                   width_shift_range=0.2, height_shift_range=0.2,
                                   rotation_range=30)
train_generator = train_datagen.flow_from_dataframe(dataframe=train_label_df, directory=image_dir,
                                                    x_col="id", y_col="score", has_ext=True,
                                                    class_mode="other", target_size=(img_width, img_height),
                                                    batch_size=bs)
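As a hedged usage sketch (not part of the original answer): model is assumed to be a Keras model ending in a single linear output unit and compiled with a regression loss such as 'mse':
# model.compile(optimizer='adam', loss='mse') is assumed to have been called
model.fit_generator(train_generator,
                    steps_per_epoch=len(train_label_df) // bs,
                    epochs=10)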
I think that organizing your data differently, using a DataFrame (without necessarily moving your images to new locations) will allow you to run a regression model. In short, create columns in your DataFrame containing the file path of each image and the target value. This allows your generator to keep regression values and images properly synced even when you shuffle your data at each epoch.
Here is an example showing how to link images with binomial targets, multinomial targets and regression targets just to show that "a target is a target is a target" and only the model might change:
df['path'] = df.object_id.apply(file_path_from_db_id)
df
object_id bi multi path target
index
0 461756 dog white /path/to/imgs/756/61/blah_461756.png 0.166831
1 1161756 cat black /path/to/imgs/756/61/blah_1161756.png 0.058793
2 3303651 dog white /path/to/imgs/651/03/blah_3303651.png 0.582970
3 3367756 dog grey /path/to/imgs/756/67/blah_3367756.png -0.421429
4 3767756 dog grey /path/to/imgs/756/67/blah_3767756.png -0.706608
5 5467756 cat black /path/to/imgs/756/67/blah_5467756.png -0.415115
6 5561756 dog white /path/to/imgs/756/61/blah_5561756.png -0.631041
7 31255756 cat grey /path/to/imgs/756/55/blah_31255756.png -0.148226
8 35903651 cat black /path/to/imgs/651/03/blah_35903651.png -0.785671
9 44603651 dog black /path/to/imgs/651/03/blah_44603651.png -0.538359
10 49557622 cat black /path/to/imgs/622/57/blah_49557622.png -0.295279
11 58164756 dog grey /path/to/imgs/756/64/blah_58164756.png 0.407096
12 95403651 cat white /path/to/imgs/651/03/blah_95403651.png 0.790274
13 95555756 dog grey /path/to/imgs/756/55/blah_95555756.png 0.060669
I describe how to do this in great detail with examples here:
https://techblog.appnexus.com/a-keras-multithreaded-dataframe-generator-for-millions-of-image-files-84d3027f6f43
At this moment (the newest version of Keras, from January 21st 2017) flow_from_directory can only work in the following manner:
You need to have your directories structured in the following manner:
directory with images\
    1st label\
        1st picture from 1st label
        2nd picture from 1st label
        3rd picture from 1st label
        ...
    2nd label\
        1st picture from 2nd label
        2nd picture from 2nd label
        3rd picture from 2nd label
        ...
    ...
flow_from_directory returns batches of a fixed size in the format (picture, label).
So as you can see, it can only be used for a classification case, and all the options provided in the documentation only specify the way in which the class is provided to your classifier. But there is a neat hack which can make flow_from_directory useful for a regression task:
You need to structure your directory in the following manner:
directory with images\
    1st value (e.g. -0.95423)\
        1st picture from 1st value
        2nd picture from 1st value
        3rd picture from 1st value
        ...
    2nd value (e.g. -0.9143242)\
        1st picture from 2nd value
        2nd picture from 2nd value
        3rd picture from 2nd value
        ...
    ...
You also need to have a list list_of_values = [1st value, 2nd value, ...]. Then your generator is defined in the following manner:
def regression_flow_from_directory(flow_from_directory_gen, list_of_values):
    for x, y in flow_from_directory_gen:
        yield x, list_of_values[y]
It's crucial for flow_from_directory_gen to have class_mode='sparse' to make this work. Of course this is a little bit cumbersome, but it works (I used this solution :) )
There's just one glitch in the accepted answer that I would like to point out. The above code fails with an error message like:
TypeError: only integer scalar arrays can be converted to a scalar index
This is because y is an array of labels for the whole batch, not a single index. The fix is simple:
def regression_flow_from_directory(flow_from_directory_gen, list_of_values):
    for x, y in flow_from_directory_gen:
        values = [list_of_values[y[i]] for i in range(len(y))]
        yield x, values
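A hedged sketch of how the pieces fit together; the directory name and image size here are placeholders:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
# class_mode='sparse' yields integer class indices; flow_from_directory assigns
# them alphanumerically by folder name, so list_of_values must follow that order
sparse_gen = datagen.flow_from_directory('directory with images',
                                         target_size=(224, 224),
                                         class_mode='sparse')
train_gen = regression_flow_from_directory(sparse_gen, list_of_values)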
The method to generate the list_of_values can be found in https://stackoverflow.com/a/47944082/4082092

How do I create a dataset with multiple images the same format as CIFAR10?

I have 1750x1750 images and I would like to label them and put them into a file in the same format as CIFAR-10. I have seen a similar question before that gave this answer:
import numpy as np
from PIL import Image

label = [3]
im = Image.open(img)
im = np.array(im)
print(im)
r = im[:,:,0].flatten()
g = im[:,:,1].flatten()
b = im[:,:,2].flatten()
array = np.array(list(label) + list(r) + list(g) + list(b), np.uint8)
array.tofile("info.bin")
but it doesn't include how to add multiple images in a single file. I have looked at CIFAR10 and tried to append the arrays in the same way, but all I got was the following error:
E tensorflow/core/client/tensor_c_api.cc:485] Read less bytes than requested
Note that I am using Tensorflow to do my computations, and I have been able to isolate the problem from the data.
The CIFAR-10 binary format represents each example as a fixed-length record with the following format:
1-byte label.
1 byte per pixel for the red channel of the image.
1 byte per pixel for the green channel of the image.
1 byte per pixel for the blue channel of the image.
Assuming you have a list of image filenames called images, and a list of integers (less than 256) called labels corresponding to their labels, the following code would write a single file containing these images in CIFAR-10 format:
import numpy as np
from PIL import Image

with open(output_filename, "wb") as f:
    for label, img in zip(labels, images):
        label = np.array(label, dtype=np.uint8)
        f.write(label.tostring())  # Write label.
        im = np.array(Image.open(img), dtype=np.uint8)
        f.write(im[:, :, 0].tostring())  # Write red channel.
        f.write(im[:, :, 1].tostring())  # Write green channel.
        f.write(im[:, :, 2].tostring())  # Write blue channel.
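The "Read less bytes than requested" error typically means the reader's fixed record length doesn't match what was written: for HxW images, each record is 1 + 3*H*W bytes, so the TensorFlow reader must be configured with exactly that size. A minimal sketch, assuming 1750x1750 images and the file written above, that reads the records back with NumPy to verify them:
import numpy as np

H = W = 1750
record_bytes = 1 + 3 * H * W  # 1 label byte + one byte per pixel per channel

data = np.fromfile(output_filename, dtype=np.uint8)
assert data.size % record_bytes == 0, "file size is not a whole number of records"
records = data.reshape(-1, record_bytes)

labels = records[:, 0]                        # first byte of each record
images = records[:, 1:].reshape(-1, 3, H, W)  # channel-major, as written
images = images.transpose(0, 2, 3, 1)         # to (N, H, W, 3) if needed
print(labels, images.shape)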
