Uploading and labeling pairs of photos

I created a ResNet18 to detect whether 2 individuals are siblings or not, by giving it an image of each one (the model has input_size = 2).
I need to create my dataset, in which I will specify which pairs are siblings and which are not.
I tried:
training_set = train_datagen.flow_from_directory('training',
                                                 target_size=(28, 28),
                                                 batch_size=32,
                                                 class_mode='binary')
And I got training_set.classes as array([0, 0, 0, 0, 1, 1, 1, 1]), and training_set.filenames as:
'false\\false1\\_DSC5763.jpg',
'false\\false2\\_DSC5751.jpg',
'false\\false2\\_DSC5760.jpg',
'siblings\\siblings1\\_DSC5751.jpg',
'siblings\\siblings1\\_DSC5755_1.jpg',
'siblings\\siblings2\\_DSC5760.jpg',
'siblings\\siblings2\\_DSC5763.jpg'
The training_set.classes should be array([0, 0, 1, 1]), for my purposes.
How can I do this?

I finished my project, so I came back to post the answer I found. I am trying to classify whether 2 individuals are siblings or not.
# Lists used for creating the dataset
categories = []
first_img = []
second_img = []

# Parsing through the images and building the two lists
for filename in filenames:
    category = filename.split('.')[0]
    # Each pair is named <sibling/false> + <nr_of_pair> + 0/1
    if 'sibling' in category:
        if filename.split('_')[1][0] == '0':
            first_img.append(filename)
            categories.append(1)
        else:
            second_img.append(filename)
    else:
        if filename.split('_')[1][0] == '0':
            first_img.append(filename)
            categories.append(0)
        else:
            second_img.append(filename)

# DataFrame of the first individual of each pair and its label
df1 = pd.DataFrame({
    'filename': first_img,
    'category': categories
}).astype('str')

# DataFrame of the second individual of each pair and its label
df2 = pd.DataFrame({
    'filename': second_img,
    'category': categories
}).astype('str')
And for fit_generator I used the following function:
def generate_generator_multiple(datagen):
    train_generator1 = datagen.flow_from_dataframe(df1,
                                                   "../train/input/",
                                                   x_col='filename',
                                                   y_col='category',
                                                   class_mode='binary',
                                                   target_size=(image_size1, image_size2),
                                                   batch_size=batch_size)
    train_generator2 = datagen.flow_from_dataframe(df2,
                                                   "../train/input/",
                                                   x_col='filename',
                                                   y_col='category',
                                                   class_mode='binary',
                                                   target_size=(image_size1, image_size2),
                                                   batch_size=batch_size)
    while True:
        X1i = train_generator1.next()
        X2i = train_generator2.next()
        yield [X1i[0], X2i[0]], X2i[1]  # Yield both images and their mutual label
Here, datagen is an ImageDataGenerator object.
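For completeness, a minimal sketch of how such a generator could be wired into training. The tiny two-input model below is only a placeholder (the real project used a ResNet18-style network), and it reuses image_size1, image_size2, df1 and batch_size from the code above:
from tensorflow.keras import layers, Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Placeholder two-input model; the actual architecture is not part of this answer
in1 = layers.Input(shape=(image_size1, image_size2, 3))
in2 = layers.Input(shape=(image_size1, image_size2, 3))
x = layers.concatenate([layers.Flatten()(in1), layers.Flatten()(in2)])
out = layers.Dense(1, activation='sigmoid')(x)
model = Model([in1, in2], out)
model.compile(optimizer='adam', loss='binary_crossentropy')

datagen = ImageDataGenerator(rescale=1. / 255)
# Each yielded batch is ([first_images, second_images], labels)
model.fit_generator(generate_generator_multiple(datagen),
                    steps_per_epoch=len(df1) // batch_size,
                    epochs=10)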


Pytorch: Deep learning model architecture for smoothing data

What I am trying to do
I want to create a model that smoothes my predictions. My predictions have a shape [num samples, 4, 7], where 4 is the sequence length and 7 is the number of classes. The class values sum to 100.
However, my predictions often fluctuate, for example predicting a value of 50 for class 5 at time step 1 and 89 at time step 2. In reality, a class rarely makes such extreme jumps, so I want to smooth my predictions.
I have training data with a similar shape [num samples, 4, 7]. I want to create a model that learns the behavior of the classes from this data and then applies that to my predictions, hopefully smoothing my results.
I understand that I could just average the results to smooth them, but I am curious whether a deep learning model could learn the underlying probabilities and thereby correct as well as smooth the predictions.
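For reference, the simple averaging baseline I mean could look roughly like this (a minimal sketch; predictions is a hypothetical tensor of shape [num samples, 4, 7]):
import torch
import torch.nn.functional as F

# Hypothetical predictions of shape [num_samples, 4, 7] whose class values sum to 100
predictions = torch.softmax(torch.randn(1000, 4, 7), dim=-1) * 100

# Moving average over the sequence dimension with a window of 3 (zero-padded at the edges)
kernel = torch.ones(7, 1, 3) / 3.0                   # one averaging filter per class channel
smoothed = F.conv1d(predictions.transpose(1, 2),     # [num_samples, 7, 4]
                    kernel, padding=1, groups=7)
smoothed = smoothed.transpose(1, 2)                  # back to [num_samples, 4, 7]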
What I have tried
However, I am struggling to understand how one creates such an architecture. I have tried working with matrices as well as with LSTM:
class SmoothModel(nn.Module):
    def __init__(self, input_size, output_size):
        super(SmoothModel, self).__init__()
        self.input_size = input_size
        self.output_size = output_size
        # Initialize the cooccurrence matrix as a learnable parameter
        self.cooccurrence = nn.Parameter(torch.randn(input_size, output_size))
        # Initialize the transition matrix as a learnable parameter
        self.transition = nn.Parameter(torch.randn(input_size, output_size))
        # Softmax layer
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        sequences_updated = []
        # Update sequence based on transition and cooccurrence matrix
        for i in range(x.shape[0]):
            # Cooccurrence multiplication
            seq_list = []
            for j in range(x.shape[1]):
                predicted_cooc = x[i, j, :].unsqueeze(0)  # shape [1, 7]
                updated_cooc = torch.matmul(predicted_cooc, self.cooccurrence)
                seq_list.append(updated_cooc)
            # Build a sequence of 4 again, where the cooccurrence is updated
            seq = torch.cat(seq_list, dim=0)  # shape [4, 7]
            # Transition multiplication
            updated_seq = torch.matmul(seq, self.transition)  # shape [4, 7]
            # Append the updated sequence
            sequences_updated.append(updated_seq.unsqueeze(0))  # shape [1, 4, 7]
        # Create tensor with all updated sequences
        updated_tensor = torch.cat(sequences_updated, dim=0)  # dim 0 is the number of samples
        # Output should sum to 100
        updated_tensor = self.softmax(updated_tensor) * 100
        return updated_tensor
My idea behind this model was that it would update my predictions based on learned cooccurrence and transition probabilities.
Another model I tried, but with LSTM:
class SmoothModel(nn.Module):
    def __init__(self, input_size, output_size, hidden_size=64):
        super(SmoothModel, self).__init__()
        self.input_size = input_size
        self.output_size = output_size
        # Initialize the cooccurrence as a learnable layer
        self.cooccurrence = nn.Linear(input_size, output_size)
        # Initialize the transition probability as a learnable LSTM + linear layer
        self.transition = nn.LSTM(input_size, hidden_size)
        self.transition_probability = nn.Linear(hidden_size, output_size)
        # Softmax layer
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        sequences_updated = []
        # Update sequence based on transition and cooccurrence matrix
        for i in range(x.shape[0]):
            # Cooccurrence multiplication
            seq_list = []
            for j in range(x.shape[1]):
                predicted_cooc = x[i, j, :].unsqueeze(0)  # shape [1, 7]
                updated_cooc = self.cooccurrence(predicted_cooc)
                seq_list.append(updated_cooc)
            # Build a sequence of 4 again, where the cooccurrence is updated
            seq = torch.cat(seq_list, dim=0)  # shape [4, 7]
            # Transition probability
            _, (hidden, _) = self.transition(seq.unsqueeze(0))  # shape [1, 4, hidden size]
            updated_seq = self.transition_probability(hidden[-1, :, :].unsqueeze(0))  # shape [1, 4, 7]
            # Append the updated sequence
            sequences_updated.append(updated_seq)
        # Create tensor with all updated sequences
        updated_tensor = torch.cat(sequences_updated, dim=0)  # dim 0 is the number of samples
        # Output should sum to 100
        updated_tensor = self.softmax(updated_tensor) * 100
        return updated_tensor
I furthermore tried some variants of this, for example only updating one time step at a time, and approaches along the lines of Markov chain theory. But the current models don't improve the results.
Question
Does anyone have experience with this / know what theory or architecture I could be using? Or should I look at it in a totally different way?
I am happy to provide further (data) information if necessary!

Questions regarding custom multiclass metrics (Keras)

Could anyone explain how to write a custom multiclass metric for Keras? I tried to write one but encountered some issues. The main problem is that I am not familiar with how tensors work during training (I think it is called graph mode?). I am able to create a confusion matrix and derive F1 scores using NumPy or Python lists.
I printed out y_true and y_pred and tried to understand them, but the output was not what I expected:
Below is the function I used:
def f1_scores(y_true, y_pred):
    y_true = K.print_tensor(y_true, message='y_true = ')
    y_pred = K.print_tensor(y_pred, message='y_pred = ')
    print(f"y_true_shape:{K.int_shape(y_true)}")
    print(f"y_pred_shape:{K.int_shape(y_pred)}")
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    gt = K.argmax(y_true_f)
    pred = K.argmax(y_pred_f)
    print(f"pred_print:{pred}")
    print(f"gt_print:{gt}")
    pred = K.print_tensor(pred, message='pred= ')
    gt = K.print_tensor(gt, message='gt =')
    print(f"pred_shape:{K.int_shape(pred)}")
    print(f"gt_shape:{K.int_shape(gt)}")
    pred_f = K.flatten(pred)
    gt_f = K.flatten(gt)
    pred_f = K.print_tensor(pred_f, message='pred_f= ')
    gt_f = K.print_tensor(gt_f, message='gt_f =')
    print(f"pred_f_shape:{K.int_shape(pred_f)}")
    print(f"gt_f_shape:{K.int_shape(gt_f)}")
    conf_mat = tf.math.confusion_matrix(y_true_f, y_pred_f, num_classes=14)
    """
    add codes to find F1 score for each class
    """
    # return an arbitrary number, as F1 scores not found yet.
    return 1
The output when epoch 1 had just started:
y_true_shape:(None, 256, 256, 14)
y_pred_shape:(None, 256, 256, 14)
pred_print:Tensor("ArgMax_1:0", shape=(), dtype=int64)
gt_print:Tensor("ArgMax:0", shape=(), dtype=int64)
pred_shape:()
gt_shape:()
pred_f_shape:(1,)
gt_f_shape:(1,)
Then the output for the rest of the steps and epochs was similar, as below:
y_true = [[[[1 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]
...
y_pred = [[[[0.0889623 0.0624801107 0.0729747042 ... 0.0816219151 0.0735477135 0.0698677748]
[0.0857798532 0.0721047595 0.0754121244 ... 0.0723947287 0.0728530064 0.0676521733]
[0.0825942457 0.0670698211 0.0879610255 ... 0.0721599609 0.0845924541 0.0638583601]
...
pred= 1283828
gt = 0
pred_f= [1283828]
gt_f = [0]
Why is pred a single number instead of a list of numbers, where each number represents a class index? Similarly, why is pred_f a list with only one number instead of a list of indices?
And for gt (and gt_f), why is the value 0? I expected them to be lists of indices.
It looks like argmax() is simply applied to the flattened y.
You need to specify which axis you want argmax() to reduce. Probably it's the last one, in your case 3. Then you'll get pred with shape (None, 256, 256) containing integers between 0 and 13.
Try something like this: pred = K.argmax(y_pred, axis=3)
This is the documentation for TensorFlow's argmax. (But I'm not sure if you're using exactly that, since I cannot see what K is imported as.)
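Building on that, a sketch of how the per-class F1 scores could then be computed from the confusion matrix inside the metric. This is only a suggestion under the assumption that the last axis is the class axis and there are 14 classes; it is not code from the question:
import tensorflow as tf
from tensorflow.keras import backend as K

def f1_scores(y_true, y_pred, num_classes=14):
    # Reduce the one-hot class axis (last axis) to integer class indices
    gt = K.argmax(y_true, axis=-1)    # shape (None, 256, 256)
    pred = K.argmax(y_pred, axis=-1)  # shape (None, 256, 256)

    # Flatten the integer labels before building the confusion matrix
    gt_f = K.flatten(gt)
    pred_f = K.flatten(pred)
    conf_mat = tf.math.confusion_matrix(gt_f, pred_f, num_classes=num_classes)
    conf_mat = tf.cast(conf_mat, tf.float32)

    # Per-class true positives, false positives, false negatives
    tp = tf.linalg.diag_part(conf_mat)
    fp = tf.reduce_sum(conf_mat, axis=0) - tp
    fn = tf.reduce_sum(conf_mat, axis=1) - tp

    # Per-class F1, then macro-average (epsilon avoids division by zero)
    f1 = 2 * tp / (2 * tp + fp + fn + K.epsilon())
    return tf.reduce_mean(f1)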

Pytorch Multiclass Logistic Regression Type Errors

I'm new to ML and even more naive with Pytorch. Here's the problem. (I've skipped certain parts like the random_split() which seem to work just fine)
I have to predict (red) wine quality, which in the dataset is the last column, with 6 classes.
That's what my dataset looks like
The link to the dataset (winequality-red.csv)
features = df.drop(['quality'], axis=1)
targets = df.iloc[:, -1]  # there are 6 classes
dataset = TensorDataset(torch.Tensor(np.array(features)).float(), torch.Tensor(targets).float())
# here's where I think the error might be, but I might be wrong
batch_size = 8
# Dataloader
train_loader = DataLoader(train_ds, batch_size, shuffle = True)
val_loader = DataLoader(val_ds, batch_size)
test_ds = DataLoader(test_ds, batch_size)
input_size = len(df.columns) - 1
output_size = 6
threshold = .5
class WineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, xb):
        out = self.linear(xb)
        return out
model = WineModel()
n_iters = 2000
num_epochs = n_iters / (len(train_ds) / batch_size)
num_epochs = int(num_epochs)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
# the part below returns the error on running
iter = 0
for epoch in range(num_epochs):
    for i, (x, y) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(x)
        loss = criterion(outputs, y)
        loss.backward()
        optimizer.step()
RuntimeError: expected scalar type Long but found Float
Hopefully that is sufficient info
The targets for nn.CrossEntropyLoss are given as the class indices, which are required to be integers; to be precise, they need to be of type torch.long, which is equivalent to torch.int64.
You converted the targets to floats, but you should convert them to longs:
dataset = TensorDataset(torch.Tensor(np.array(features)).float(), torch.Tensor(targets).long())
Since the targets are the indices of the classes, they must be in range [0, num_classes - 1]. As you have 6 classes that would be in range [0, 5]. Having a quick look at your data, the quality uses values in range [3, 8]. Even though you have 6 classes, the values cannot be used directly as the classes. If you list the classes as classes = [3, 4, 5, 6, 7, 8], you can see that the first class is 3, classes[0] == 3, up to the last class being classes[5] == 8.
You need to replace the class values with the indices, just like you would for named classes (e.g. if you had the classes dog and cat, dog would be 0 and cat would be 1), but you can avoid having to look them up, since the values are simply shifted by 3, i.e. index = classes[index] - 3. Therefore you can subtract 3 from the entire target tensor:
torch.Tensor(targets).long() - 3
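Putting both points together, a minimal sketch of the corrected dataset creation (assuming df is the winequality-red.csv DataFrame from the question):
import numpy as np
import torch
from torch.utils.data import TensorDataset

# df is assumed to be the winequality-red.csv DataFrame from the question
features = df.drop(['quality'], axis=1)
targets = df['quality']

# Targets as long class indices, shifted from quality values 3..8 to indices 0..5
dataset = TensorDataset(
    torch.tensor(np.array(features), dtype=torch.float32),
    torch.tensor(np.array(targets), dtype=torch.long) - 3,
)

x, y = dataset[0]
print(x.dtype, y.dtype)  # torch.float32 torch.int64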

Creating and shaping data for 1D CNN

I have a (244, 108) NumPy array. It contains the percentage change of the close value of a trade for each minute in one day, i.e. 108 values, and that for 244 days. Basically each day is a 1D vector. So in order to apply a 1D CNN, how should I shape my data?
What I have done:
x.shape = (244, 108)
x = np.expand_dims(x, axis=2)
x.shape = (243, 108, 1)
y.shape = (243,)
Model:
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer1 = torch.nn.Conv1d(in_channels=108, out_channels=1, kernel_size=1, stride=1)
        self.act1 = torch.nn.ReLU()
        self.act2 = torch.nn.MaxPool1d(kernel_size=1, stride=1)
        self.layer2 = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=1, stride=1)
        self.act3 = torch.nn.ReLU()
        self.act4 = torch.nn.MaxPool1d(kernel_size=1, stride=1)
        self.linear_layers = nn.Linear(1, 1)

    # Defining the forward pass
    def forward(self, x):
        x = self.layer1(x)
        x = self.act1(x)
        x = self.act2(x)
        x = self.layer2(x)
        x = self.act3(x)
        x = self.act4(x)
        x = self.linear_layers(x)
        return x
If each day should be a separate instance for convolution, your data should have the shape (244, 1, 108). This seems more reasonable.
If you want all your days and minutes to be one continuum for the network to learn from, it should be of shape (1, 1, 244*108).
Basically, the first dimension is the batch (how many training samples), the second is the number of channels or features per sample (only one in your case), and the last is the number of timesteps.
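As a concrete illustration of those two layouts (assuming the (244, 108) array from the question; random data is used as a stand-in):
import numpy as np

x = np.random.rand(244, 108)        # 244 days, 108 minutes each (stand-in for the real data)
x_days = x[:, None, :]              # (244, 1, 108): batch, channels, timesteps
x_continuum = x.reshape(1, 1, -1)   # (1, 1, 244*108): one long sequence
print(x_days.shape, x_continuum.shape)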
Edit
Your pooling layer should be torch.nn.AdaptiveMaxPool1d(1). You should also reshape the output from this layer like this: pooled.reshape(x.shape[0], -1) before pushing it through the torch.nn.Linear layer.
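For illustration, a minimal sketch of a model with those suggestions applied; the channel counts and kernel sizes here are illustrative choices, not part of the answer:
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Conv1d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
        self.act1 = nn.ReLU()
        self.layer2 = nn.Conv1d(in_channels=8, out_channels=8, kernel_size=3, padding=1)
        self.act2 = nn.ReLU()
        self.pool = nn.AdaptiveMaxPool1d(1)      # pool the time dimension down to 1
        self.linear = nn.Linear(8, 1)

    def forward(self, x):                        # x: (batch, 1, 108)
        x = self.act1(self.layer1(x))
        x = self.act2(self.layer2(x))
        pooled = self.pool(x)                    # (batch, 8, 1)
        pooled = pooled.reshape(x.shape[0], -1)  # (batch, 8)
        return self.linear(pooled)               # (batch, 1)

x = torch.randn(244, 108).unsqueeze(1)           # (244, 1, 108): 244 days, 1 channel, 108 minutes
net = Net()
print(net(x).shape)                              # torch.Size([244, 1])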

Trying to understand Pytorch's implementation of LSTM

I have a dataset containing 1000 examples where each example has 5 features (a,b,c,d,e). I want to feed 7 examples to an LSTM so it predicts the feature (a) of the 8th day.
Reading PyTorch's documentation of nn.LSTM(), I came up with the following:
input_size = 5
hidden_size = 10
num_layers = 1
output_size = 1
lstm = nn.LSTM(input_size, hidden_size, num_layers)
fc = nn.Linear(hidden_size, output_size)
out, hidden = lstm(X) # Where X's shape is ([7,1,5])
output = fc(out[-1])
output # output's shape is ([7,1])
According to the docs:
The input of the nn.LSTM is "input of shape (seq_len, batch, input_size)" with "input_size – The number of expected features in the input x",
And the output is: "output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t."
In this case, I thought seq_len would be the sequence of 7 examples, batch is 1 and input_size is 5. So the LSTM would consume each example of 5 features, re-feeding the hidden state at every iteration.
What am I missing?
When I extend your code to a full example -- I also added some comments that may help -- I get the following:
import torch
import torch.nn as nn
input_size = 5
hidden_size = 10
num_layers = 1
output_size = 1
lstm = nn.LSTM(input_size, hidden_size, num_layers)
fc = nn.Linear(hidden_size, output_size)
X = [
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
    [[1, 2, 3, 4, 5]],
]
X = torch.tensor(X, dtype=torch.float32)
print(X.shape) # (seq_len, batch_size, input_size) = (7, 1, 5)
out, hidden = lstm(X) # Where X's shape is ([7,1,5])
print(out.shape) # (seq_len, batch_size, hidden_size) = (7, 1, 10)
out = out[-1] # Get output of last step
print(out.shape) # (batch, hidden_size) = (1, 10)
out = fc(out) # Push through linear layer
print(out.shape) # (batch_size, output_size) = (1, 1)
This makes sense to me, given your batch_size = 1 and output_size = 1 (I assume you're doing regression). I don't know where your output.shape = (7, 1) comes from.
Are you sure that your X has the correct dimensions? Did you maybe create nn.LSTM with batch_first=True? There are a lot of little things that can sneak in.
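For what it's worth, one way an output of shape (7, 1) could arise is if the linear layer is applied to every time step rather than only the last one; a small speculative sketch, not something stated in the question:
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=5, hidden_size=10, num_layers=1)
fc = nn.Linear(10, 1)

X = torch.ones(7, 1, 5)            # (seq_len=7, batch=1, input_size=5)
out, hidden = lstm(X)              # out: (7, 1, 10)

last_only = fc(out[-1])            # (1, 10) -> (1, 1): prediction for the last time step only
all_steps = fc(out).squeeze(-1)    # (7, 1, 1) -> (7, 1): one prediction per time step
print(last_only.shape, all_steps.shape)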
