Does Electron support 4K USB Video resolution?

We are developing a Windows application using the Electron framework. The application is intended to show a video stream from a USB camera, and we are using the "navigator.mediaDevices" APIs to access the USB video streams. The problem is that the Electron application does not work with 4K USB video resolution and throws an error.
We are able to see the video stream in the Electron application, but it works only at lower resolutions, e.g. 480p, so the quality is not up to the mark. The USB camera supports up to 4K resolution. However, when we request a higher resolution it works up to 720p, but beyond that it gives an "OverconstrainedError". (Code snippet below for reference.)
navigator.mediaDevices.getUserMedia({
    video: {
        width: {
            exact: usbWidth // works till 1280
        },
        height: {
            exact: usbHeight // works till 720
        },
        deviceId: {
            exact: deviceid
        }
    }
}).then(function (stream) {
    stream.getTracks().forEach(function (track) {
        console.log(track.getSettings());
    });
});
My questions are:
Does Electron, which is built on top of Chromium, have any limitations with respect to USB video resolution?
Can 4K USB video resolution be supported in an Electron app?
Is there any reference code that could point us in the right direction?
Here is the output of the FFmpeg command:
ffmpeg version N-94112-gbb11584924 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 9.1.1 (GCC) 20190621
configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
libavutil 56. 29.100 / 56. 29.100
libavcodec 58. 53.100 / 58. 53.100
libavformat 58. 28.100 / 58. 28.100
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 55.100 / 7. 55.100
libswscale 5. 4.101 / 5. 4.101
libswresample 3. 4.100 / 3. 4.100
libpostproc 55. 4.100 / 55. 4.100
[dshow # 0000016c508c8dc0] DirectShow video device options (from video devices)
[dshow # 0000016c508c8dc0] Pin "Capture" (alternative pin name "0")
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1920x1080 fps=30 max s=1920x1080 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1920x1080 fps=30 max s=1920x1080 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x720 fps=30 max s=1280x720 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x720 fps=30 max s=1280x720 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=640x360 fps=30 max s=640x360 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=640x360 fps=30 max s=640x360 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x960 fps=30 max s=1280x960 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x960 fps=30 max s=1280x960 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1024x768 fps=30 max s=1024x768 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1024x768 fps=30 max s=1024x768 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=640x480 fps=30 max s=640x480 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=640x480 fps=30 max s=640x480 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x1024 fps=30 max s=1280x1024 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x1024 fps=30 max s=1280x1024 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x800 fps=30 max s=1280x800 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=h264 min s=1280x800 fps=30 max s=1280x800 fps=60.0002
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1920x1080 fps=15 max s=1920x1080 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1920x1080 fps=15 max s=1920x1080 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x720 fps=15 max s=1280x720 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x720 fps=15 max s=1280x720 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=640x360 fps=15 max s=640x360 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=640x360 fps=15 max s=640x360 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x960 fps=15 max s=1280x960 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x960 fps=15 max s=1280x960 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1024x768 fps=15 max s=1024x768 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1024x768 fps=15 max s=1024x768 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=640x480 fps=15 max s=640x480 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=640x480 fps=15 max s=640x480 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x1024 fps=15 max s=1280x1024 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x1024 fps=15 max s=1280x1024 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x800 fps=15 max s=1280x800 fps=30
[dshow # 0000016c508c8dc0] vcodec=mjpeg min s=1280x800 fps=15 max s=1280x800 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=640x480 fps=15 max s=640x480 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=640x480 fps=15 max s=640x480 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=640x360 fps=15 max s=640x360 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=640x360 fps=15 max s=640x360 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=320x240 fps=15 max s=320x240 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=320x240 fps=15 max s=320x240 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=320x180 fps=15 max s=320x180 fps=30
[dshow # 0000016c508c8dc0] pixel_format=yuyv422 min s=320x180 fps=15 max s=320x180 fps=30
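For what it's worth, the DirectShow listing above never advertises a mode larger than 1920x1080 (h264/mjpeg), so an exact 3840x2160 constraint would be rejected with an OverconstrainedError by any Chromium-based app, Electron included; at least via this enumeration, the 4K mode is not visible to the capture pipeline. Below is a minimal diagnostic sketch, not the original app code, assuming the same usbWidth, usbHeight and deviceid variables as above: it requests the resolution as a preference rather than a hard requirement and reports what the browser actually delivers.

navigator.mediaDevices.getUserMedia({
    video: {
        deviceId: { exact: deviceid },
        width: { ideal: usbWidth },   // preference, not a hard requirement, so no OverconstrainedError
        height: { ideal: usbHeight }
    }
}).then(function (stream) {
    var track = stream.getVideoTracks()[0];
    console.log('applied settings:', track.getSettings());   // resolution actually delivered
    console.log('capabilities:', track.getCapabilities());   // max width/height Chromium reports for this device
}).catch(function (err) {
    console.error(err.name, err.message);
});

If getCapabilities() never reports more than 1920x1080 here either, the camera's 4K mode is not being exposed to Chromium's capture stack, and no constraint combination in the renderer will reach it.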

Related

How to get each class's results separately in a multiclass confusion matrix

I have the actual class and the predicted class here - https://extendsclass.com/csv-editor.html#46eaa9e
I want to calculate the sensitivity, specificity and positive predictivity for each of the classes A, N, O. Here is my code:
from sklearn.metrics import multilabel_confusion_matrix, classification_report
import numpy as np

mcm = multilabel_confusion_matrix(act_class, pred_class)
tps = mcm[:, 1, 1]
tns = mcm[:, 0, 0]
recall = tps / (tps + mcm[:, 1, 0])       # Sensitivity
specificity = tns / (tns + mcm[:, 0, 1])  # Specificity
precision = tps / (tps + mcm[:, 0, 1])    # PPV
print(recall)
print(specificity)
print(precision)
print(classification_report(act_class, pred_class))
Which gives me results like this
[0.31818182 0.96186441 nan nan]
[0.99576271 0.86363636 0.86092715 0.99337748]
[0.95454545 0.96186441 0. 0. ]
              precision    recall  f1-score   support

           A       0.95      0.32      0.48        66
           N       0.96      0.96      0.96       236
           O       0.00      0.00      0.00         0
           ~       0.00      0.00      0.00         0

    accuracy                           0.82       302
   macro avg       0.48      0.32      0.36       302
weighted avg       0.96      0.82      0.86       302
The problem here is that I cannot clearly deduce which sensitivity, specificity and positive predictivity values belong to each of the classes A, N, O.
This might be quicker to explain visually:
By default the labels should occur in sorted order (for your problem: A, N, O, ~).
If you want a different order, you can specify one with the labels= parameter. The following example has two classes and orders them as [3, 2]:
from sklearn.metrics import multilabel_confusion_matrix
from sklearn.metrics import classification_report
y_true = [2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3]
y_pred = [2, 2, 2, 3, 3, 2, 2, 2, 3, 3, 3, 3]
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=[3, 2])
tps = mcm[:, 1, 1]
precision = tps / (tps + mcm[:, 0, 1])
print(precision)
print(f"Precision class 3: {precision[0]}. Precision class 2: {precision[1]}")
print(classification_report(y_true, y_pred, labels=[3, 2]))
Output:
[0.66666667 0.5 ]
Precision class 3: 0.6666666666666666. Precision class 2: 0.5
              precision    recall  f1-score   support

           3       0.67      0.57      0.62         7
           2       0.50      0.60      0.55         5

    accuracy                           0.58        12
   macro avg       0.58      0.59      0.58        12
weighted avg       0.60      0.58      0.59        12
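To make the class-to-metric mapping explicit, here is a small, hedged extension of the example above (toy labels, not the asker's CSV data) that pairs each row of the multilabel confusion matrix with its class name:

from sklearn.metrics import multilabel_confusion_matrix
import numpy as np

y_true = ['A', 'N', 'N', 'O', 'A', 'N']
y_pred = ['A', 'N', 'O', 'O', 'N', 'N']
labels = ['A', 'N', 'O']  # fixed label order; the rows of mcm follow this order

mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)
tn, fp, fn, tp = mcm[:, 0, 0], mcm[:, 0, 1], mcm[:, 1, 0], mcm[:, 1, 1]

with np.errstate(divide='ignore', invalid='ignore'):
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # positive predictivity / precision

for i, lab in enumerate(labels):
    print(lab, 'sensitivity =', sensitivity[i], 'specificity =', specificity[i], 'PPV =', ppv[i])

With the asker's data, the same loop over labels=['A', 'N', 'O', '~'] prints one labelled line per class, which removes the ambiguity of the bare arrays printed earlier.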

DCGAN error: Using a target size (torch.Size([128])) that is different to the input size (torch.Size([128, 1, 5, 5])) is deprecated

I keep getting this error when trying to run a DCGAN using the EMNIST dataset. I'm fairly new to this and am struggling to debug the issue.
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            #nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            #nn.BatchNorm2d(ndf * 8),
            #nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 4, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, data in enumerate(dataloader, 0):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        real_cpu = data[0].to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch, accumulated (summed) with previous gradients
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Compute error of D as sum over the fake and the real batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake).view(-1)
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs - 1) and (i == len(dataloader) - 1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
I want to do this with just the Conv2d layers that are currently in use, and I have tried using the .squeeze() function to get it working, but had no luck.
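For reference, a hedged sketch of where the (128, 1, 5, 5) shape comes from (my reading of the code above, not a verified fix): with the fourth conv block commented out, a 64x64 input reaches the last Conv2d as an 8x8 feature map, and a 4x4 kernel with stride 1 and no padding turns 8x8 into 5x5 rather than 1x1, so the discriminator output no longer matches the (128,) label tensor. Keeping only the Conv2d layers already in use, one option is to enlarge the final kernel so it covers the whole 8x8 map (BatchNorm layers omitted for brevity; nc and ndf are assumed values):

import torch
import torch.nn as nn

nc, ndf = 1, 64  # assumed values for illustration

netD_main = nn.Sequential(
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),            # 64 -> 32
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),        # 32 -> 16
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),    # 16 -> 8
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 4, 1, 8, 1, 0, bias=False),          # 8 -> 1 (kernel enlarged from 4 to 8)
    nn.Sigmoid(),
)

x = torch.randn(128, nc, 64, 64)
print(netD_main(x).view(-1).shape)  # torch.Size([128]) - now matches the label tensor

The other option is to restore the commented-out (ndf*4 -> ndf*8) block and change the final layer back to nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False), which reduces the 4x4 map to 1x1 as in the original tutorial.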

How to repeat a group (pattern of shapes) across the canvas?

I've been trying to achieve a seamlessly continuing brick wall across the canvas using Konva React. At the moment I have the following code. Don't mind the repeated code; there is still clean-up to do using the Group tag. The only thing is, I do not know how I can repeat those shapes across the canvas. Does anyone have any idea?
import { Stage, Group, Layer, Path, Image, Rect, Text, Circle, Line } from 'react-konva';
import useImage from 'use-image';

export default function Main() {
    const [pattern, status] = useImage('https://res.cloudinary.com/dhmpnnhd0/image/upload/v1666003977/materials/mocpex-texture.jpg')
    const rows = 7
    const columns = 7
    const scale = 0.5
    const line = 5

    if (status == 'loaded') {
        const baseWidth = pattern.width / columns
        const baseHeight = pattern.height / rows
        return (
            <Stage width={window.innerWidth} height={window.innerHeight}>
                <Layer>
                    <Group scaleX={0.5} scaleY={0.5}>
                        <Rect width={baseWidth * 2} height={baseHeight * 2} x={0} y={0} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 1 - line / 2} height={baseHeight * 2} x={0} y={baseHeight + line} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 2 + line * 2} height={baseHeight * 2} x={baseWidth + line} y={baseHeight / 2 + line} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 1 - line * 2} height={baseHeight * 1} x={baseWidth / 2 + line} y={baseHeight + line} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)}></Rect>
                        <Rect width={baseWidth * 2} height={baseHeight * 2} x={baseWidth / 2 + line} y={baseHeight * 1.50 + line * 2} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 1} height={baseHeight - line} x={baseWidth / 2 + line * 1} y={baseHeight * 2.5 + line * 3} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 3} height={baseHeight * 2} x={baseWidth + line * 2} y={baseHeight * 2.5 + line * 3} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)}></Rect>
                        <Rect width={baseWidth * 1} height={baseHeight * 1} x={baseWidth * 1.5 + line * 2} y={baseHeight * 1.5 + line * 2} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 2} height={baseHeight - line * 2} x={baseWidth * 1.5 + line * 2} y={baseHeight * 2 + line * 3} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth} height={baseHeight} x={baseWidth * 2.5 + line * 3} y={baseHeight * 3 + line * 3} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)}></Rect>
                        <Rect width={baseWidth * 2} height={baseHeight * 2 - line * 2} x={baseWidth * 2.5 + line * 3} y={baseHeight * 2 + line * 3} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                        <Rect width={baseWidth * 2} height={baseHeight * 3 + line * 2} x={baseWidth * 2 + line * 3} y={baseHeight * 0.5 + line} fillPatternImage={pattern} scaleX={scale} scaleY={scale} fillPatternOffsetX={getRandomNumber(pattern.width)} ></Rect>
                    </Group>
                </Layer>
            </Stage>
        )
    }
}

function getRandomNumber(dimension) {
    const min = 0;
    const max = dimension;
    const rand = min + Math.random() * (max - min);
    return rand;
}
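Not a full answer, but a hedged sketch of one way to repeat the cluster: treat the hand-built group of Rects as one tile and render it once per row/column offset inside the loaded branch of Main. The tile size below is an assumption (the footprint of one cluster after scaling) and would need to be adjusted to the actual layout:

const tileWidth = pattern.width * scale;    // assumed footprint of one brick cluster
const tileHeight = pattern.height * scale;
const tileCols = Math.ceil(window.innerWidth / tileWidth);
const tileRows = Math.ceil(window.innerHeight / tileHeight);

return (
    <Stage width={window.innerWidth} height={window.innerHeight}>
        <Layer>
            {Array.from({ length: tileRows }).map((_, row) =>
                Array.from({ length: tileCols }).map((_, col) => (
                    <Group key={row + '-' + col} x={col * tileWidth} y={row * tileHeight} scaleX={0.5} scaleY={0.5}>
                        {/* ...the same <Rect> elements as in the Group above... */}
                    </Group>
                ))
            )}
        </Layer>
    </Stage>
);

Extracting the Rects into their own component (a hypothetical BrickCluster, say) would avoid duplicating them in every tile.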

how to build a generator and discriminator for a DCGAN with images of size 256x256

I have the following generator and discriminator for a DCGAN with images of size 128x128, and they work very well.
However, I would like to use the same code to generate images with a size of 256x256, but I cannot build the generator and discriminator.
# Path to the training directory
dataroot = "./dataset 128x128"
# Number of workers for dataloader
workers = 6
# Batch size during training
batch_size = 1
# Spatial size of training images. All images will be resized to this
# size using a transformer.
image_size = 128
# Number of channels in the training images. For color images this is 3
nc = 3
# Size of z latent vector (i.e. size of generator input)
nz = 100
# Size of feature maps in generator
ngf = 32
# Size of feature maps in discriminator
ndf = 32
# Number of training epochs
num_epochs = 20
# Learning rate for optimizers
lr = 0.0002
# Beta1 hyperparam for Adam optimizers
beta1 = 0.5
# Number of GPUs available. Use 0 for CPU mode.
ngpu = 2
print("Dataset done")
# Generator Code
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 16, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 16),
            nn.ReLU(True),
            # state size. (ngf*16) x 4 x 4
            nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 8 x 8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 16 x 16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 32 x 32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 64 x 64
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 128 x 128
        )

    def forward(self, input):
        return self.main(input)

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 128 x 128
            nn.Conv2d(nc, ndf, 4, stride=2, padding=1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 64 x 64
            nn.Conv2d(ndf, ndf * 2, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 32 x 32
            nn.Conv2d(ndf * 2, ndf * 4, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 16 x 16
            nn.Conv2d(ndf * 4, ndf * 8, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 8 x 8
            nn.Conv2d(ndf * 8, ndf * 16, 4, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(ndf * 16),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*16) x 4 x 4
            nn.Conv2d(ndf * 16, 1, 4, stride=1, padding=0, bias=False),
            nn.Sigmoid()
            # state size. 1
        )

    def forward(self, input):
        return self.main(input)

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, data in enumerate(dataloader, 0):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        real_cpu = data[0].to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch, accumulated (summed) with previous gradients
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Compute error of D as sum over the fake and the real batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake).view(-1)
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs - 1) and (i == len(dataloader) - 1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
How should I modify this generator and discriminator for images of size 256x256?
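Not the only way to do it, but a hedged sketch of the usual extension: each network gains one more stride-2 stage, so the generator starts from a 4x4 map with ngf*32 channels and upsamples six times to 256x256, and the discriminator mirrors it. The ngf*32 / ndf*32 widths are my choice rather than anything from the original code, image_size would also need to be set to 256, and the networks are written as plain nn.Sequential for brevity; the shape check at the end verifies the sizes.

import torch
import torch.nn as nn

nz, ngf, ndf, nc = 100, 32, 32, 3

netG_main = nn.Sequential(
    nn.ConvTranspose2d(nz, ngf * 32, 4, 1, 0, bias=False), nn.BatchNorm2d(ngf * 32), nn.ReLU(True),        # 4x4
    nn.ConvTranspose2d(ngf * 32, ngf * 16, 4, 2, 1, bias=False), nn.BatchNorm2d(ngf * 16), nn.ReLU(True),  # 8x8
    nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False), nn.BatchNorm2d(ngf * 8), nn.ReLU(True),    # 16x16
    nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False), nn.BatchNorm2d(ngf * 4), nn.ReLU(True),     # 32x32
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False), nn.BatchNorm2d(ngf * 2), nn.ReLU(True),     # 64x64
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False), nn.BatchNorm2d(ngf), nn.ReLU(True),             # 128x128
    nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False), nn.Tanh(),                                           # 256x256
)

netD_main = nn.Sequential(
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),                                       # 128x128
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False), nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),         # 64x64
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False), nn.BatchNorm2d(ndf * 4), nn.LeakyReLU(0.2, inplace=True),     # 32x32
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False), nn.BatchNorm2d(ndf * 8), nn.LeakyReLU(0.2, inplace=True),     # 16x16
    nn.Conv2d(ndf * 8, ndf * 16, 4, 2, 1, bias=False), nn.BatchNorm2d(ndf * 16), nn.LeakyReLU(0.2, inplace=True),   # 8x8
    nn.Conv2d(ndf * 16, ndf * 32, 4, 2, 1, bias=False), nn.BatchNorm2d(ndf * 32), nn.LeakyReLU(0.2, inplace=True),  # 4x4
    nn.Conv2d(ndf * 32, 1, 4, 1, 0, bias=False), nn.Sigmoid(),                                                      # 1x1
)

z = torch.randn(2, nz, 1, 1)
img = netG_main(z)
print(img.shape)             # torch.Size([2, 3, 256, 256])
print(netD_main(img).shape)  # torch.Size([2, 1, 1, 1])

Dropping these sequences into the Generator and Discriminator classes above as self.main keeps the rest of the training loop unchanged, although memory use grows noticeably at 256x256.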

Pytorch DCGAN example for kernel 3

I tried out the PyTorch DCGAN example and it worked fine, but then I tried to change the kernel from 4x4 to 3x3. I only changed the kernel size, and it gave the following error. Why is this error occurring, and what changes are needed to solve it?
ValueError: Using a target size (torch.Size([64])) that is different to the input size (torch.Size([256])) is deprecated. Please ensure they have the same size.
Here is the generator:
class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, kernel_size, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(ngf, nc, kernel_size, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        return self.main(input)
discriminator code:
class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, kernel_size, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, kernel_size, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, kernel_size, 1, 0, bias=False),
            # 1x1x1
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)
The line that raises the error is in the training loop:
    # Calculate loss on all-real batch
    errD_real = criterion(output, label)
Here, criterion = nn.BCELoss().
Whole training loop:
img_list = []
G_losses = []
D_losses = []
iters = 0

# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, data in enumerate(dataloader, 0):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        real_cpu = data[0].to(device)
        #real_cpu = (data.unsqueeze(dim=1).type(torch.FloatTensor)).to(device)
        b_size = real_cpu.size(0)
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        # Forward pass real batch through D
        output = netD(real_cpu).view(-1)
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)  # ERROR
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise)
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach()).view(-1)
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Add the gradients from the all-real and all-fake batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z))) using 'log D' trick
        ###########################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake).view(-1)
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs - 1) and (i == len(dataloader) - 1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
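A hedged sketch of where the 256 comes from and one possible fix, not the only one: with kernel_size=3, stride=2, padding=1 the discriminator maps 64 -> 32 -> 16 -> 8 -> 4, and the final Conv2d(ndf * 8, 1, 3, 1, 0) turns the 4x4 map into 2x2, so .view(-1) produces 64 * 2 * 2 = 256 values against 64 labels. Collapsing the remaining spatial map explicitly restores one score per image (BatchNorm layers omitted for brevity; nc and ndf are assumed values):

import torch
import torch.nn as nn

nc, ndf, kernel_size = 3, 64, 3

netD_main = nn.Sequential(
    nn.Conv2d(nc, ndf, kernel_size, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),          # 64 -> 32
    nn.Conv2d(ndf, ndf * 2, kernel_size, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),      # 32 -> 16
    nn.Conv2d(ndf * 2, ndf * 4, kernel_size, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),  # 16 -> 8
    nn.Conv2d(ndf * 4, ndf * 8, kernel_size, 2, 1, bias=False), nn.LeakyReLU(0.2, inplace=True),  # 8 -> 4
    nn.Conv2d(ndf * 8, 1, kernel_size, 1, 0, bias=False),                                         # 4 -> 2
    nn.AdaptiveAvgPool2d(1),  # 2x2 -> 1x1, so each image contributes exactly one value
    nn.Sigmoid(),
)

x = torch.randn(64, nc, 64, 64)
print(netD_main(x).view(-1).shape)  # torch.Size([64]) - now matches the label tensor

Note that the generator has the matching problem: with 3x3 kernels its feature maps grow as 3 -> 5 -> 9 -> 17 -> 33 instead of 4 -> 8 -> 16 -> 32 -> 64, so its layers would also need adjusting (for example via output_padding in ConvTranspose2d) if real 64x64 images are used.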
