I want to develop a CNN model to identify 24 hand signs in American Sign Language. I created a custom dataset that contains 3,000 images for each hand sign, i.e., 72,000 images in the entire dataset.
For training the model, I would use an 80-20 dataset split (2,400 images per hand sign in the training set and 600 images per hand sign in the validation set).
My question is:
Should I randomly shuffle the images when creating the dataset? And why?
Based on my previous experience, it led to the validation loss being lower than the training loss, and the validation accuracy being higher than the training accuracy. Check this link.
Random shuffling of data is a standard procedure in all machine learning pipelines, and image classification is not an exception; its purpose is to break possible biases during data preparation - e.g. putting all the cat images first and then the dog ones in a cat/dog classification dataset.
Take for example the famous iris dataset:
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
y
# result:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
As you can clearly see, the dataset has been prepared so that the first 50 samples are all of label 0, the next 50 of label 1, and the last 50 of label 2. Try to perform a 5-fold cross-validation on such a dataset without shuffling and you'll find most of your folds contain only a single label; try a 3-fold CV, and every one of your folds will include only one label. Bad... BTW, this is not just a theoretical possibility; it has actually happened.
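To see it concretely, here is a minimal sketch with scikit-learn's KFold showing that, without shuffling, every fold of the iris data holds a single label:
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
import numpy as np

X, y = load_iris(return_X_y=True)
# without shuffle=True, each fold is a contiguous block of 50 samples
for _, test_idx in KFold(n_splits=3).split(X):
    print(np.unique(y[test_idx]))  # prints [0], then [1], then [2]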
Even if no such bias exists, shuffling never hurts, so we always do it just to be on the safe side (you never know...).
Based on my previous experience, it led to the validation loss being lower than the training loss, and the validation accuracy being higher than the training accuracy. Check this link.
As noted in the answer there, it is highly unlikely that this was due to shuffling. Data shuffling is nothing sophisticated - essentially, it is just the equivalent of shuffling a deck of cards; it may have happened once that you insisted on "better" shuffling and subsequently ended up with a straight flush, but obviously that was not due to the "better" shuffling of the cards.
Here are my two cents on the topic.
First of all, make sure to extract a test set with an equal number of samples for each hand sign (hand sign #1 - 500 samples, hand sign #2 - 500 samples, and so on).
I think this is referred to as stratified sampling.
When it comes to the training set, there is no huge mistake in shuffling the entire set. However, when splitting the training set into training and validation sets, make sure the validation set is representative enough of the test set.
One of my personal experiences with shuffling:
After splitting the training set into training and validation sets, the validation set turned out to be very easy to predict, so I saw good learning-metric values. However, the model's performance on the test set was horrible.
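A stratified, shuffled split guards against such an unrepresentative validation set. Here is a minimal sketch with scikit-learn, using placeholder indices in place of the actual images:
import numpy as np
from sklearn.model_selection import train_test_split

# placeholder data: 24 hand signs with 3000 "images" each (indices stand in for paths)
y = np.repeat(np.arange(24), 3000)
X = np.arange(y.size)

X_train, X_val, y_train, y_val = train_test_split(
    X, y,
    test_size=0.2,    # the 80-20 split from the question
    shuffle=True,     # shuffle before splitting
    stratify=y,       # exactly 2400/600 images per hand sign in each set
    random_state=42,  # reproducible
)
print(np.bincount(y_val))  # 600 for every hand sign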
Related
I would like to train an ML model in caret on my training data.
The training data has the following structure:
df <- data.frame(
  Label      = c("A","A","A","B","A", "A","A","B","B","A", "B","B","A","A","A"),
  EXPERIMENT = c("X","X","X","X","X", "Y","Y","Y","Y","Y", "Z","Z","Z","Z","Z"),
  VALUE1     = c( 1,  2,  1,  5,  1,   3,  1,  5,  6,  1,   7,  5,  1,  2,  2),
  VALUE2     = c( 9,  7,  8,  1,  8,   2,  1,  9,  8,  2,   7,  7,  2,  1,  1)
)
I want to use train and have the data split according to EXPERIMENT for cross-validation (here, 3 cross-validation splits).
that is
Split1: training = X,Y and validation = Z
Split2: training = X,Z and validation = Y
Split3: training = Y,Z and validation = X
How can I do that? With trainControl?
I found an index option in trainControl, but I did not understand whether it can do this.
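For reference, here is the kind of split I mean, sketched with scikit-learn's LeaveOneGroupOut in Python (only an illustration of the grouping; I'm after the caret equivalent):
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

X = np.zeros((15, 2))                                 # stand-in for VALUE1/VALUE2
groups = np.array(["X"] * 5 + ["Y"] * 5 + ["Z"] * 5)  # the EXPERIMENT column

# each split holds out exactly one experiment as the validation set
for train_idx, val_idx in LeaveOneGroupOut().split(X, groups=groups):
    print("training:", sorted(set(groups[train_idx])),
          "validation:", sorted(set(groups[val_idx])))
# training: ['Y', 'Z'] validation: ['X']
# training: ['X', 'Z'] validation: ['Y']
# training: ['X', 'Y'] validation: ['Z']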
The idea is simple, but the execution is bothering me.
I've created a small random dungeon generator that creates a grid like this:
000001
000111
000111
001101
011101
011111
This is a sample 6x6 dungeon where 0 is a wall and 1 is an open path.
The conversion from this to some sort of tile-id map is simple and trivial, but creating the image itself is the hard part.
I want to know if there's a library or method to achieve that. If not, what would you do?
This is not part of a game; it's only a dungeon generator for D&D. Any language is OK, but the generator was made in Go.
You can use OpenCV for this task. PIL can probably do the same, but I don't have experience with it.
import cv2
import numpy as np

data_list = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1, 1]
]
# 0 -> black (wall), 255 -> white (open path)
arr = np.array(data_list, dtype=np.uint8) * 255
# scale each cell up to 50x50 pixels; nearest-neighbor keeps the edges crisp
arr = cv2.resize(arr, (0, 0), fx=50, fy=50, interpolation=cv2.INTER_NEAREST)
cv2.imshow("img", arr)
cv2.waitKey()
# or you can save it to disk
cv2.imwrite("img.png", arr)
Use np.block():
# a bunch of sprites/images, all the same size
# load them however you like
tiles = [...]
data_list = [
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 1, 1, 1],
[0, 0, 0, 1, 1, 1],
[0, 0, 1, 1, 0, 1],
[0, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1]
]
picture = np.block([
[tiles[k] for k in row]
for row in data_list
])
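For a runnable illustration with made-up solid-color tiles (the 16-px tile size is arbitrary):
import numpy as np

TILE = 16  # arbitrary tile size in pixels
wall = np.zeros((TILE, TILE), dtype=np.uint8)      # tile id 0: black wall
path = np.full((TILE, TILE), 255, dtype=np.uint8)  # tile id 1: white path
tiles = [wall, path]

data_list = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 1, 1],
]
# np.block stitches the per-cell tiles into one big image array
picture = np.block([[tiles[k] for k in row] for row in data_list])
print(picture.shape)  # (96, 96)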
Or, if you use any kind of game engine, or something even more trivial, like SDL/PyGame, simply "blit" each tile.
PIL, as you found out, is perfectly capable of blitting one image (tile) onto another (whole map).
I kind of managed to get a solution, but it's Python-only.
Using PIL, I can make a mosaic of tile images and create the map. It's not a solid from-scratch solution, but it does the job.
I'm still open to other approaches.
My solution is this method here:
import glob
import numpy as np
from PIL import Image

# the grid is square: matrix_square rows/columns of tile ids
matrix = np.loadtxt(input_file, usecols=range(matrix_square), dtype=int)
matrix_height, matrix_width = matrix.shape
image_width, image_height = matrix_width * tile_size, matrix_height * tile_size

# load one tile image per tile id (sorted so ids map to files deterministically)
tiles = [Image.open(file) for file in sorted(glob.glob("./tiles/*"))]

output = Image.new('RGB', (image_width, image_height))
for i in range(matrix_width):
    for j in range(matrix_height):
        x, y = i * tile_size, j * tile_size
        index = matrix[j][i]
        output.paste(tiles[index], (x, y))
output.save(output_file)
The matrix_square is the dimension of the (square) matrix, and tile_size must match the pixel resolution of the tiles you're using. I'm still working on a better solution, but this is working fine for me.
This is a dungeon generated with this method:
The tiles are bad, but the grid is good enough.
After using k-means I have 3 clusters.
I used 10 features (marks) in k-means for this data set.
I understand that we can't draw a 10-D chart, but how can I visualize these clusters?
Should I separate the data by 2 or 3 features instead of 10?
Which axes should I use in my case?
For drawing I'm using JS and highcharts.js on the client side.
Example code (just to satisfy Stack Overflow's code requirement) - but note I have 10 coordinates for every point:
const kmeans = require('ml-kmeans');
let data = [[1, 1, 1, 1, 1], [1, 2, 1, 1, 1], [-1, -1, -1, 1, 1], [-1, -1, -1.5, 1, 1]];
let centers = [[1, 2, 1, 1, 1], [-1, -1, -1, 1, 1]];
let ans = kmeans(data, 2, { initialization: centers });
console.log(ans);
/* KMeansResult {
     clusters: [ 0, 0, 1, 1 ],
     centroids:
      [ { centroid: [ 1, 1.5, 1, 1, 1 ], error: 0.25, size: 2 },
        { centroid: [ -1, -1, -1.25, 1, 1 ], error: 0.0625, size: 2 } ],
     converged: true,
     iterations: 1
   } */
Use your favorite generic visualization approach. Clusterings do not have very special requirements.
E.g.
Scatterplot matrix
Dimensionality reduction with PCA
t-SNE embeddings
MDS
UMAP
Boxplots
Violin plots
...
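For example, PCA down to two dimensions colored by cluster label - a sketch in Python with scikit-learn and matplotlib, using the iris data as a stand-in for your 10-feature points (the resulting 2-D coordinates could just as well be handed to highcharts.js):
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # stand-in for your 10-feature data
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# project the high-dimensional points onto the first two principal components
coords = PCA(n_components=2).fit_transform(X)
plt.scatter(coords[:, 0], coords[:, 1], c=labels)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.show()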
Why is the structuring element asymmetric in OpenCV?
cv2.getStructuringElement(cv2.MORPH_ELLIPSE, ksize=(4,4))
returns
array([[0, 0, 1, 0],
[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]], dtype=uint8)
Why isn't it
array([[0, 1, 1, 0],
[1, 1, 1, 1],
[1, 1, 1, 1],
[0, 1, 1, 0]], dtype=uint8)
instead?
Odd-sized structuring elements are also asymmetric with respect to 90-degree rotations:
array([[0, 0, 1, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 1, 0, 0]], dtype=uint8)
What's the purpose of that?
There's no purpose for it other than that it's one of many possible discretizations of the shape. In the case of the size-5 ellipse, if it were full it would be the same as MORPH_RECT, and if the same two pixels were removed from the sides as from the top, it would be a diamond. Either way, the actual implementation in the source code is what you would expect: it creates a circle via the distance function and rounds to nearby integers to get the binary pixels. Search the OpenCV source for cv::getStructuringElement and you'll find the implementation; it's nothing too fancy.
If you think this function should be updated, open a PR on GitHub with the implemented version, or an issue to discuss it first. I think a successful contribution would be easy here, and I'd venture that the case for symmetry is strong: one would expect that the result of processing a symmetric image with an elliptical kernel would not depend on the orientation of the image.
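In the meantime, if you need a kernel that is symmetric under 90-degree rotation, one workaround is to threshold the ellipse equation around the kernel's true geometric center yourself - a sketch, not OpenCV's own algorithm:
import numpy as np

def symmetric_ellipse(ksize):
    """Binary elliptical kernel, symmetric under 90-degree rotation."""
    w, h = ksize
    # pixel-center coordinates measured from the kernel's geometric center
    xs = np.arange(w) - (w - 1) / 2
    ys = np.arange(h) - (h - 1) / 2
    # keep pixels whose centers fall inside the inscribed ellipse
    inside = (xs[None, :] / (w / 2)) ** 2 + (ys[:, None] / (h / 2)) ** 2 <= 1
    return inside.astype(np.uint8)

print(symmetric_ellipse((4, 4)))
# [[0 1 1 0]
#  [1 1 1 1]
#  [1 1 1 1]
#  [0 1 1 0]]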
I was trying to get the nullity and kernel of a matrix over the complex field in Maxima.
I get strange results, though.
I can define a matrix A:
M : matrix([0, 1, 1, 0], [-1, 0, 0, 1], [0, 0, 0, 1], [0, 0, -1, 0]);
A : M + %i * ident(4);
... for reference, it looks like this:
[ %i   1    1    0  ]
[ -1   %i   0    1  ]
[  0   0    %i   1  ]
[  0   0    -1   %i ]
If I then compute the nullity with nullity(A), I get 3.
If I compute the rank with rank(A), I also get 3.
(Those can't both be right: for a 4x4 matrix, rank + nullity must equal 4.)
And if I compute the nullspace with nullspace(A), I get:
span([-1, %i, 0, 0], [-%i, -1, 0, 0], [2*%i, 2, 0, 0])
But this is pretty weird, because -%i * second(...) is [-1, %i, 0, 0], which is the first vector - so these spanning vectors are not even linearly independent.
And indeed, when I do NullSpace[{{i, 1, 1, 0}, {-1, i, 0, 1}, {0, 0, i, 1}, {0, 0, -1, i}}] in Mathematica, I get that the nullspace has basis {i, 1, 0, 0} and is 1-dimensional (not 3-dimensional).
What am I doing wrong?
You are doing everything right, as far as I can tell. The problem is a bug in Maxima, which I have reported: https://sourceforge.net/p/maxima/bugs/3158/
I don't see any simple way to work around it. I am working on fixing the bug.
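As an independent cross-check outside Maxima, here is the same computation sketched in Python with SymPy, which agrees with Mathematica:
from sympy import I, Matrix, eye

M = Matrix([[0, 1, 1, 0],
            [-1, 0, 0, 1],
            [0, 0, 0, 1],
            [0, 0, -1, 0]])
A = M + I * eye(4)

print(A.rank())       # 3
print(A.nullspace())  # [Matrix([[I], [1], [0], [0]])] -- 1-dimensional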