Model Training Freezing: StyleGAN2 in Google Colab

I want to train a model using this dataset: https://decode.mit.edu/projects/biked/
I followed this tutorial to do so: https://github.com/jeffheaton/present/blob/master/youtube/gan/colab_gan_train.ipynb
The problem is that once I run the training command, the tick counter freezes at 0.
Is that normal? Should I just let it keep running? It's taking forever. I tried changing the number of workers to 2, but the problem persists. I'm using the free tier of Colab, so I'm not sure if that matters. I even tried my own dataset, and it shows the same problem.
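For reference, a hedged sketch of the launch command with the worker count lowered and metric evaluation disabled; the script path is an assumption, while --workers and --metrics are flags of the stylegan2-ada-pytorch train.py that this tutorial uses:
# Sketch only: the paths mirror the config printed below; adjust as needed.
!python /content/stylegan2-ada-pytorch/train.py \
  --outdir=/content/drive/MyDrive/exp \
  --data=/content/drive/MyDrive/SquareImages.zip \
  --gpus=1 --snap=10 \
  --workers=2 \
  --metrics=none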
Training options:
{
  "num_gpus": 1,
  "image_snapshot_ticks": 10,
  "network_snapshot_ticks": 10,
  "metrics": [
    "fid50k_full"
  ],
  "random_seed": 0,
  "training_set_kwargs": {
    "class_name": "training.dataset.ImageFolderDataset",
    "path": "/content/drive/MyDrive/SquareImages.zip",
    "use_labels": false,
    "max_size": 42799,
    "xflip": false,
    "resolution": 1024
  },
  "data_loader_kwargs": {
    "pin_memory": true,
    "num_workers": 3,
    "prefetch_factor": 2
  },
  "G_kwargs": {
    "class_name": "training.networks.Generator",
    "z_dim": 512,
    "w_dim": 512,
    "mapping_kwargs": {
      "num_layers": 2
    },
    "synthesis_kwargs": {
      "channel_base": 32768,
      "channel_max": 512,
      "num_fp16_res": 4,
      "conv_clamp": 256
    }
  },
  "D_kwargs": {
    "class_name": "training.networks.Discriminator",
    "block_kwargs": {},
    "mapping_kwargs": {},
    "epilogue_kwargs": {
      "mbstd_group_size": 4
    },
    "channel_base": 32768,
    "channel_max": 512,
    "num_fp16_res": 4,
    "conv_clamp": 256
  },
  "G_opt_kwargs": {
    "class_name": "torch.optim.Adam",
    "lr": 0.002,
    "betas": [
      0,
      0.99
    ],
    "eps": 1e-08
  },
  "D_opt_kwargs": {
    "class_name": "torch.optim.Adam",
    "lr": 0.002,
    "betas": [
      0,
      0.99
    ],
    "eps": 1e-08
  },
  "loss_kwargs": {
    "class_name": "training.loss.StyleGAN2Loss",
    "r1_gamma": 52.4288
  },
  "total_kimg": 25000,
  "batch_size": 4,
  "batch_gpu": 4,
  "ema_kimg": 1.25,
  "ema_rampup": 0.05,
  "ada_target": 0.6,
  "augment_kwargs": {
    "class_name": "training.augment.AugmentPipe",
    "xflip": 1,
    "rotate90": 1,
    "xint": 1,
    "scale": 1,
    "rotate": 1,
    "aniso": 1,
    "xfrac": 1,
    "brightness": 1,
    "contrast": 1,
    "lumaflip": 1,
    "hue": 1,
    "saturation": 1
  },
  "run_dir": "/content/drive/MyDrive/exp/00003-SquareImages-auto1"
}
Output directory: /content/drive/MyDrive/exp/00003-SquareImages-auto1
Training data: /content/drive/MyDrive/SquareImages.zip
Training duration: 25000 kimg
Number of GPUs: 1
Number of images: 42799
Image resolution: 1024
Conditional model: False
Dataset x-flips: False
Creating output directory...
Launching processes...
Loading training set...
/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py:474: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Num images: 42799
Image shape: [3, 1024, 1024]
Label shape: [0]
Constructing networks...
Setting up PyTorch plugin "bias_act_plugin"... Done.
Setting up PyTorch plugin "upfirdn2d_plugin"... Done.
Generator Parameters Buffers Output shape Datatype
--- --- --- --- ---
mapping.fc0 262656 - [4, 512] float32
mapping.fc1 262656 - [4, 512] float32
mapping - 512 [4, 18, 512] float32
synthesis.b4.conv1 2622465 32 [4, 512, 4, 4] float32
synthesis.b4.torgb 264195 - [4, 3, 4, 4] float32
synthesis.b4:0 8192 16 [4, 512, 4, 4] float32
synthesis.b4:1 - - [4, 512, 4, 4] float32
synthesis.b8.conv0 2622465 80 [4, 512, 8, 8] float32
synthesis.b8.conv1 2622465 80 [4, 512, 8, 8] float32
synthesis.b8.torgb 264195 - [4, 3, 8, 8] float32
synthesis.b8:0 - 16 [4, 512, 8, 8] float32
synthesis.b8:1 - - [4, 512, 8, 8] float32
synthesis.b16.conv0 2622465 272 [4, 512, 16, 16] float32
synthesis.b16.conv1 2622465 272 [4, 512, 16, 16] float32
synthesis.b16.torgb 264195 - [4, 3, 16, 16] float32
synthesis.b16:0 - 16 [4, 512, 16, 16] float32
synthesis.b16:1 - - [4, 512, 16, 16] float32
synthesis.b32.conv0 2622465 1040 [4, 512, 32, 32] float32
synthesis.b32.conv1 2622465 1040 [4, 512, 32, 32] float32
synthesis.b32.torgb 264195 - [4, 3, 32, 32] float32
synthesis.b32:0 - 16 [4, 512, 32, 32] float32
synthesis.b32:1 - - [4, 512, 32, 32] float32
synthesis.b64.conv0 2622465 4112 [4, 512, 64, 64] float32
synthesis.b64.conv1 2622465 4112 [4, 512, 64, 64] float32
synthesis.b64.torgb 264195 - [4, 3, 64, 64] float32
synthesis.b64:0 - 16 [4, 512, 64, 64] float32
synthesis.b64:1 - - [4, 512, 64, 64] float32
synthesis.b128.conv0 1442561 16400 [4, 256, 128, 128] float16
synthesis.b128.conv1 721409 16400 [4, 256, 128, 128] float16
synthesis.b128.torgb 132099 - [4, 3, 128, 128] float16
synthesis.b128:0 - 16 [4, 256, 128, 128] float16
synthesis.b128:1 - - [4, 256, 128, 128] float32
synthesis.b256.conv0 426369 65552 [4, 128, 256, 256] float16
synthesis.b256.conv1 213249 65552 [4, 128, 256, 256] float16
synthesis.b256.torgb 66051 - [4, 3, 256, 256] float16
synthesis.b256:0 - 16 [4, 128, 256, 256] float16
synthesis.b256:1 - - [4, 128, 256, 256] float32
synthesis.b512.conv0 139457 262160 [4, 64, 512, 512] float16
synthesis.b512.conv1 69761 262160 [4, 64, 512, 512] float16
synthesis.b512.torgb 33027 - [4, 3, 512, 512] float16
synthesis.b512:0 - 16 [4, 64, 512, 512] float16
synthesis.b512:1 - - [4, 64, 512, 512] float32
synthesis.b1024.conv0 51297 1048592 [4, 32, 1024, 1024] float16
synthesis.b1024.conv1 25665 1048592 [4, 32, 1024, 1024] float16
synthesis.b1024.torgb 16515 - [4, 3, 1024, 1024] float16
synthesis.b1024:0 - 16 [4, 32, 1024, 1024] float16
synthesis.b1024:1 - - [4, 32, 1024, 1024] float32
--- --- --- --- ---
Total 28794124 2797104 - -
Discriminator Parameters Buffers Output shape Datatype
--- --- --- --- ---
b1024.fromrgb 128 16 [4, 32, 1024, 1024] float16
b1024.skip 2048 16 [4, 64, 512, 512] float16
b1024.conv0 9248 16 [4, 32, 1024, 1024] float16
b1024.conv1 18496 16 [4, 64, 512, 512] float16
b1024 - 16 [4, 64, 512, 512] float16
b512.skip 8192 16 [4, 128, 256, 256] float16
b512.conv0 36928 16 [4, 64, 512, 512] float16
b512.conv1 73856 16 [4, 128, 256, 256] float16
b512 - 16 [4, 128, 256, 256] float16
b256.skip 32768 16 [4, 256, 128, 128] float16
b256.conv0 147584 16 [4, 128, 256, 256] float16
b256.conv1 295168 16 [4, 256, 128, 128] float16
b256 - 16 [4, 256, 128, 128] float16
b128.skip 131072 16 [4, 512, 64, 64] float16
b128.conv0 590080 16 [4, 256, 128, 128] float16
b128.conv1 1180160 16 [4, 512, 64, 64] float16
b128 - 16 [4, 512, 64, 64] float16
b64.skip 262144 16 [4, 512, 32, 32] float32
b64.conv0 2359808 16 [4, 512, 64, 64] float32
b64.conv1 2359808 16 [4, 512, 32, 32] float32
b64 - 16 [4, 512, 32, 32] float32
b32.skip 262144 16 [4, 512, 16, 16] float32
b32.conv0 2359808 16 [4, 512, 32, 32] float32
b32.conv1 2359808 16 [4, 512, 16, 16] float32
b32 - 16 [4, 512, 16, 16] float32
b16.skip 262144 16 [4, 512, 8, 8] float32
b16.conv0 2359808 16 [4, 512, 16, 16] float32
b16.conv1 2359808 16 [4, 512, 8, 8] float32
b16 - 16 [4, 512, 8, 8] float32
b8.skip 262144 16 [4, 512, 4, 4] float32
b8.conv0 2359808 16 [4, 512, 8, 8] float32
b8.conv1 2359808 16 [4, 512, 4, 4] float32
b8 - 16 [4, 512, 4, 4] float32
b4.mbstd - - [4, 513, 4, 4] float32
b4.conv 2364416 16 [4, 512, 4, 4] float32
b4.fc 4194816 - [4, 512] float32
b4.out 513 - [4, 1] float32
--- --- --- --- ---
Total 29012513 544 - -
Setting up augmentation...
Distributing across 1 GPUs...
Setting up training phases...
Exporting sample images...
Initializing logs...
Training for 25000 kimg...
tick 0 kimg 0.0 time 1m 31s sec/tick 14.4 sec/kimg 3595.75 maintenance 76.8 cpumem 4.82 gpumem 11.32 augment 0.000
Evaluating metrics...
/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py:474: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(

Related

Yolov5 model not able to train

I'm making a model to detect potholes in an image. I've done everything right or so it seems to me, but I can't train the model for some reason. What might be the problem here?
!python train.py --img 640 --cfg yolov5m.yaml --hyp data/hyps/hyp.scratch-med.yaml --batch 20 --epochs 300 --data data/potholeData.yaml --weights yolov5m.pt --workers 4 --name yolo_pothole_det_m
This is the final line of the code, which outputs the following.
train: weights=yolov5m.pt, cfg=yolov5m.yaml, data=data/potholeData.yaml, hyp=data/hyps/hyp.scratch-med.yaml, epochs=300, batch_size=20, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=4, project=runs/train, name=yolo_pothole_det_m, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v7.0-23-g5dc1ce4 Python-3.9.13 torch-1.13.0 CPU
hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.3, cls_pw=1.0, obj=0.7, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.9, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.1, copy_paste=0.0
ClearML: run 'pip install clearml' to automatically track, visualize and remotely train YOLOv5 🚀 in ClearML
Comet: run 'pip install comet_ml' to automatically track and visualize YOLOv5 🚀 runs in Comet
TensorBoard: Start with 'tensorboard --logdir runs/train', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=1
from n params module arguments
0 -1 1 5280 models.common.Conv [3, 48, 6, 2, 2]
1 -1 1 41664 models.common.Conv [48, 96, 3, 2]
2 -1 2 65280 models.common.C3 [96, 96, 2]
3 -1 1 166272 models.common.Conv [96, 192, 3, 2]
4 -1 4 444672 models.common.C3 [192, 192, 4]
5 -1 1 664320 models.common.Conv [192, 384, 3, 2]
6 -1 6 2512896 models.common.C3 [384, 384, 6]
7 -1 1 2655744 models.common.Conv [384, 768, 3, 2]
8 -1 2 4134912 models.common.C3 [768, 768, 2]
9 -1 1 1476864 models.common.SPPF [768, 768, 5]
10 -1 1 295680 models.common.Conv [768, 384, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 2 1182720 models.common.C3 [768, 384, 2, False]
14 -1 1 74112 models.common.Conv [384, 192, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 2 296448 models.common.C3 [384, 192, 2, False]
18 -1 1 332160 models.common.Conv [192, 192, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 2 1035264 models.common.C3 [384, 384, 2, False]
21 -1 1 1327872 models.common.Conv [384, 384, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 2 4134912 models.common.C3 [768, 768, 2, False]
24 [17, 20, 23] 1 24246 models.yolo.Detect [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [192, 384, 768]]
Isn't it supposed to start training the model after that? What am I doing wrong that makes it stop right here?
In the console output you can see that no image dataset was read. Make sure your potholeData.yaml file is in the right location and that the paths inside it are correct. The file should look like this:
train: ../train/images   # path to training images
val: ../valid/images     # path to validation images
nc: 1                    # number of classes
names: ['pothole']       # class names (must match your dataset's class)
After this you can run the command again and training should continue.
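As a quick sanity check, a hedged Python sketch (the relative paths are assumptions mirroring the YAML above) to confirm the image folders resolve and are non-empty:
from pathlib import Path

# Hypothetical paths matching the data YAML above; adjust to your layout.
for split in ("../train/images", "../valid/images"):
    p = Path(split)
    n = sum(1 for ext in ("*.jpg", "*.jpeg", "*.png") for _ in p.glob(ext)) if p.exists() else 0
    print(f"{split}: exists={p.exists()}, images={n}")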

Output of vgg16 layer doesn't make sense

I have a vgg16 network without the last max pooling, fully connected and softmax layers. The network summary says that the last layer's output is going to have a size of (batchsize, 512, 14, 14). Putting an image into the network gives me an output of (batchsize, 512, 15, 15). How do I fix this?
import torch
import torch.nn as nn
from torchsummary import summary
vgg16 = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16', pretrained=True)
vgg16withoutLastFewLayers = nn.Sequential(*list(vgg16.children())[:-2][0][0:30]).cuda()
image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)
summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 64, 224, 224] 1,792
ReLU-2 [-1, 64, 224, 224] 0
Conv2d-3 [-1, 64, 224, 224] 36,928
ReLU-4 [-1, 64, 224, 224] 0
MaxPool2d-5 [-1, 64, 112, 112] 0
Conv2d-6 [-1, 128, 112, 112] 73,856
ReLU-7 [-1, 128, 112, 112] 0
Conv2d-8 [-1, 128, 112, 112] 147,584
ReLU-9 [-1, 128, 112, 112] 0
MaxPool2d-10 [-1, 128, 56, 56] 0
Conv2d-11 [-1, 256, 56, 56] 295,168
ReLU-12 [-1, 256, 56, 56] 0
Conv2d-13 [-1, 256, 56, 56] 590,080
ReLU-14 [-1, 256, 56, 56] 0
Conv2d-15 [-1, 256, 56, 56] 590,080
ReLU-16 [-1, 256, 56, 56] 0
MaxPool2d-17 [-1, 256, 28, 28] 0
Conv2d-18 [-1, 512, 28, 28] 1,180,160
ReLU-19 [-1, 512, 28, 28] 0
Conv2d-20 [-1, 512, 28, 28] 2,359,808
ReLU-21 [-1, 512, 28, 28] 0
Conv2d-22 [-1, 512, 28, 28] 2,359,808
ReLU-23 [-1, 512, 28, 28] 0
MaxPool2d-24 [-1, 512, 14, 14] 0
Conv2d-25 [-1, 512, 14, 14] 2,359,808
ReLU-26 [-1, 512, 14, 14] 0
Conv2d-27 [-1, 512, 14, 14] 2,359,808
ReLU-28 [-1, 512, 14, 14] 0
Conv2d-29 [-1, 512, 14, 14] 2,359,808
ReLU-30 [-1, 512, 14, 14] 0
================================================================
torch.Size([1, 512, 15, 15])
The output shape should be [512, 14, 14], assuming that the input image is [3, 224, 224]. Your input image size is [3, 244, 244]. For example,
image = torch.zeros((1,3,224,224)).cuda()
output = vgg16withoutLastFewLayers(image)
print(output.shape)  # torch.Size([1, 512, 14, 14])
Therefore, by increasing the image size, the spatial size [W, H] of your output tensor also increases.
Your inputs are not the same size:
image = torch.zeros((1,3,244,244)).cuda()
output = vgg16withoutLastFewLayers(image)
summary(vgg16withoutLastFewLayers, (3,224,224))
print(output.shape)
Difference: 244 vs 224.
Because those VGG layers are all convolutional, increasing the size of the input image also increases the size of the output. This would cause issues if a classification head (with no global pooling, etc.) were applied directly on top, since such heads expect a fixed-size input. You're not doing that here, but it's something to keep in mind.
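As an aside, a hedged sketch (not from either answer) of one way to make the extractor's output size fixed regardless of input resolution, using adaptive pooling:
import torch
import torch.nn as nn

vgg16 = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16', pretrained=True)
features = nn.Sequential(*list(vgg16.features.children())[:30])

# Adaptive pooling forces a fixed spatial size regardless of input resolution.
extractor = nn.Sequential(features, nn.AdaptiveAvgPool2d((14, 14)))

out224 = extractor(torch.zeros(1, 3, 224, 224))
out244 = extractor(torch.zeros(1, 3, 244, 244))
print(out224.shape, out244.shape)  # both torch.Size([1, 512, 14, 14])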

How to prevent my model from outputting zero-vectors while training using one-hot encoded vectors?

I have been training a model for a study on one-shot learning.
It has 19280 examples in the training dataset (basically the popular Omniglot dataset), and a 300-length vector for each data sample.
The model consists of the following architecture-
----------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================
Conv2d-1 [-1, 32, 50, 50] 1,600
BatchNorm2d-2 [-1, 32, 50, 50] 64
ReLU-3 [-1, 32, 50, 50] 0
Conv2d-4 [-1, 32, 50, 50] 9,248
BatchNorm2d-5 [-1, 32, 50, 50] 64
ReLU-6 [-1, 32, 50, 50] 0
Conv2d-7 [-1, 32, 50, 50] 9,248
BatchNorm2d-8 [-1, 32, 50, 50] 64
ReLU-9 [-1, 32, 50, 50] 0
Conv2d-10 [-1, 64, 24, 24] 18,496
BatchNorm2d-11 [-1, 64, 24, 24] 128
ReLU-12 [-1, 64, 24, 24] 0
Conv2d-13 [-1, 64, 24, 24] 36,928
BatchNorm2d-14 [-1, 64, 24, 24] 128
ReLU-15 [-1, 64, 24, 24] 0
Conv2d-16 [-1, 256, 11, 11] 147,712
BatchNorm2d-17 [-1, 256, 11, 11] 512
ReLU-18 [-1, 256, 11, 11] 0
Conv2d-19 [-1, 512, 5, 5] 1,180,160
BatchNorm2d-20 [-1, 512, 5, 5] 1,024
ReLU-21 [-1, 512, 5, 5] 0
Conv2d-22 [-1, 1024, 2, 2] 4,719,616
BatchNorm2d-23 [-1, 1024, 2, 2] 2,048
ReLU-24 [-1, 1024, 2, 2] 0
Linear-25 [-1, 300] 1,229,100
================================================================
Total params: 7,356,140
Trainable params: 7,356,140
Non-trainable params: 0
----------------------------------------------------------------
Basically, the input is a single-channel (grayscale) 105x105 image, and I apply a sigmoid to the output.
For training I use a simple mean-squared error loss and the Adam optimizer with the learning rate set to $10^{-5}$ to help with tuning.
The model gets stuck at the same constant loss after 2 or 3 epochs.
On further investigation, I found that the model degenerates and outputs a zero vector in every case. I am assuming it is stuck in a local minimum, but how do I go about training my model successfully?
Also, the architecture was chosen somewhat arbitrarily (not literally at random, but with no particular logic behind the dimensionality kept at the end of each layer), so please point out any irregularity in the layers that, once fixed, would improve training.
I would love to hear some tips :) .
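For concreteness, a minimal sketch of the loss/optimizer setup described above; the model here is only a stand-in for the CNN in the summary, and all names and shapes beyond those stated are placeholders:
import torch
import torch.nn as nn

# Stand-in for the real CNN: it just maps a 1x105x105 input to a 300-length vector.
model = nn.Sequential(nn.Flatten(), nn.Linear(105 * 105, 300))
criterion = nn.MSELoss()                                   # MSE on one-hot targets
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # lr = 10^-5 as described

images = torch.rand(32, 1, 105, 105)                       # grayscale 105x105 batch
targets = torch.zeros(32, 300)                             # one-hot, length 300
targets[torch.arange(32), torch.randint(0, 300, (32,))] = 1.0

outputs = torch.sigmoid(model(images))                     # sigmoid on the output
loss = criterion(outputs, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()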

how to extract matrix from big matrix in torch/lua

In torch7 / lua
There is a matrix:
[ 1, 2, 3, 4;
5, 6, 7, 8;
9, 10, 11, 12;
13, 14, 15, 16 ]
how to extract this:
[ 6, 7;
10, 11 ]
and how do I overwrite that block with a matrix operation, to get:
[ 1, 2, 3, 4;
5, 78, 66, 8;
9, 45, 21, 12;
13, 14, 15, 16 ]
Thanks in advance.
-- extract the 2x2 block (rows 2-3, columns 2-3)
matrix:sub(2,3,2,3)
-- overwrite that block in place with new values
z = torch.Tensor({{78,66},{45,21}})
matrix:sub(2,3,2,3):copy(z)
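For readers using PyTorch rather than torch7, a hedged equivalent with zero-based, end-exclusive slicing (not part of the original Lua answer):
import torch

matrix = torch.arange(1, 17).reshape(4, 4).float()
print(matrix[1:3, 1:3])                                      # rows 2-3, cols 2-3: [[6, 7], [10, 11]]
matrix[1:3, 1:3] = torch.tensor([[78., 66.], [45., 21.]])    # overwrite the block in place
print(matrix)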

Image Vectorizer [closed]

I'm looking for a library/tool/image processing technique which can create vectors out of images (similar to text vectorization techniques like TF-IDF). Can anybody share some ideas on how to proceed?
I am not sure what programming language you are using. Below is a sample I use in R.
This is how to use the pixmap library to read in an image as a matrix:
library(pixmap)
# the next command may only work on Linux
system("convert foo.tiff foo.ppm")
img <- read.pnm("foo.ppm")
To get info on your new object:
str(img)
Although included in the previous output, the size of the image can be extracted by:
img@size
Then to extract the red channel from the image for the first ten rows:
myextract <- img@red[1:10,]
Or to extract the entire red channel to an actual matrix:
red.mat <- matrix(NA, img@size[1], img@size[2])
red.mat <- img@red
Refer to this: how to convert a JPEG to an image matrix in R
You can also use Python with NumPy:
>>> arr = np.array(im)
>>> arr = np.arange(150).reshape(5, 10, 3)
>>> x, y, z = arr.shape
>>> indices = np.vstack(np.unravel_index(np.arange(x*y), (y, x))).T
#or indices = np.hstack((np.repeat(np.arange(y), x)[:,np.newaxis], np.tile(np.arange(x), y)[:,np.newaxis]))
>>> np.hstack((arr.reshape(x*y, z), indices))
array([[ 0, 1, 2, 0, 0],
[ 3, 4, 5, 0, 1],
[ 6, 7, 8, 0, 2],
[ 9, 10, 11, 0, 3],
[ 12, 13, 14, 0, 4],
[ 15, 16, 17, 1, 0],
[ 18, 19, 20, 1, 1],
[ 21, 22, 23, 1, 2],
[ 24, 25, 26, 1, 3],
[ 27, 28, 29, 1, 4],
[ 30, 31, 32, 2, 0],
[ 33, 34, 35, 2, 1],
[ 36, 37, 38, 2, 2],
...
[129, 130, 131, 8, 3],
[132, 133, 134, 8, 4],
[135, 136, 137, 9, 0],
[138, 139, 140, 9, 1],
[141, 142, 143, 9, 2],
[144, 145, 146, 9, 3],
[147, 148, 149, 9, 4]])
Where arr = np.array(im) is my image
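In the same spirit, a hedged minimal sketch of turning an image file into a flat feature vector with Pillow and NumPy (the file name and target size are placeholders):
import numpy as np
from PIL import Image

# Placeholder path; load the image, resize so every vector has the same length, flatten.
img = Image.open("foo.jpg").convert("L").resize((64, 64))
vec = np.asarray(img, dtype=np.float32).flatten() / 255.0
print(vec.shape)  # (4096,)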
