StyleGAN3 images become under- or overexposed with an evolving dataset - nvidia

I am trying to use StyleGAN3 with an evolving dataset, my current logic is as follows:
Train for about half an hour
Regenerate dataset with added images
Resume training with --resume
For testing, I've been using the metfaces dataset and started out with around 100 images, adding about 50 new images to the dataset every half an hour.
Upon resuming, the results gradually become very dark or very light.
I am training with the parameters suggested on the StyleGAN3 GitHub page:
train.py --cfg=stylegan3-r --gpus=1 --batch=32 --gamma=2 --batch-gpu=8 --mirror=1
Any suggestions on how to go about training with a growing dataset?
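For reference, a minimal sketch of the loop described above, assuming the official stylegan3 repo layout (its dataset_tool.py and train.py); the ~/raw-images source folder and the --kimg budget are placeholders standing in for "about half an hour" of training:

```python
import glob
import os
import subprocess

DATASET = os.path.expanduser("~/datasets/metfaces-growing.zip")
OUTDIR = os.path.expanduser("~/training-runs")

while True:
    # Rebuild the dataset zip from the (growing) raw image folder.
    # "~/raw-images" is a hypothetical path for the evolving image collection.
    subprocess.run([
        "python", "dataset_tool.py",
        "--source", os.path.expanduser("~/raw-images"),
        "--dest", DATASET,
    ], check=True)

    # Resume from the most recent snapshot if one exists, otherwise start fresh.
    snapshots = sorted(glob.glob(os.path.join(OUTDIR, "*", "network-snapshot-*.pkl")))
    resume = ["--resume", snapshots[-1]] if snapshots else []

    # Train for a limited budget; --kimg 80 is a stand-in for whatever
    # corresponds to roughly 30 minutes on the available GPU.
    subprocess.run([
        "python", "train.py",
        "--outdir", OUTDIR,
        "--data", DATASET,
        "--cfg", "stylegan3-r", "--gpus", "1", "--batch", "32",
        "--gamma", "2", "--batch-gpu", "8", "--mirror", "1",
        "--kimg", "80",
    ] + resume, check=True)
```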

100 images to start with is not enough and the network will collapse. Start with at least 1,000 images but preferably 10,000.

Related

Using U2net (trained on the COCO + DUTS datasets) for foreground segmentation leads to the foreground object being contaminated with colors

I am currently training the U-2-Net on a larger dataset of 50k images from scratch.
The results I am getting are already promising. However, they are not perfect. Since my train loss is at around 1.1, further training might increase accuracy.
In the cases where my segmented images are still contaminated with colors from the surroundings, what options do I have in Python to get rid of them?
Would this be another use case for deep learning, or are there ways to improve the outcome without machine learning?
I am aiming for similar results to remove.bg. Their results are absolutely perfect.
Here are some foreground objects after the background has been removed (Have a look at the edges of the objects: the greenish and skin-colored contamination around the T-shirt):
Thanks a lot!
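One non-ML option (my own suggestion, not something from this thread) is to shrink and feather the predicted mask so that the contaminated rim right at the object boundary is pushed outside the foreground. A minimal OpenCV sketch:

```python
import cv2
import numpy as np

def refine_mask(image_bgr, mask, erode_px=3, feather_px=5):
    """Suppress color fringing by eroding the mask slightly and feathering its edge.

    image_bgr: original image (H, W, 3), uint8
    mask:      predicted foreground mask (H, W), uint8 in [0, 255]
    """
    # Binarize the network output.
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

    # Erode a few pixels so the contaminated rim falls outside the mask.
    kernel = np.ones((erode_px, erode_px), np.uint8)
    eroded = cv2.erode(binary, kernel, iterations=1)

    # Feather the hard edge with a Gaussian blur to get a soft alpha channel.
    alpha = cv2.GaussianBlur(eroded, (0, 0), sigmaX=feather_px).astype(np.float32) / 255.0

    # Composite onto a white background (or keep alpha for a transparent PNG).
    white = np.full_like(image_bgr, 255)
    out = (image_bgr * alpha[..., None] + white * (1.0 - alpha[..., None])).astype(np.uint8)
    return out, (alpha * 255).astype(np.uint8)

# Hypothetical usage:
# img = cv2.imread("photo.jpg")
# mask = cv2.imread("u2net_mask.png", cv2.IMREAD_GRAYSCALE)
# composed, alpha = refine_mask(img, mask)
# cv2.imwrite("result.png", np.dstack([img, alpha]))  # 4-channel PNG with soft edge
```

Alpha matting would be the heavier-weight non-ML alternative if a simple erode-and-feather still leaves halos.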

I have built an image search using VGG16. It takes 4 minutes to run a search. What techniques can I use to shorten this time?

I have built an image search engine using VGG16, with a dataset of about 20,000 images. It takes 4 minutes to go through a search. What techniques can I use to shorten this time?
I have reduced the latency to about one fourth of what it was. This can be done by running a classifier on the images prior to computing the Euclidean distance between the image embeddings. I used an SVM trained on about 10 different categories; the average latency dropped to about one fourth of the original because far fewer comparisons are required.
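A minimal sketch of that pre-filtering idea (the file names, array shapes and sklearn usage here are my own assumptions, not code from the answer):

```python
import numpy as np
from sklearn.svm import SVC

# Precomputed VGG16 embeddings for the gallery, e.g. shape (20000, 4096),
# plus a coarse category label per image (about 10 categories).
gallery = np.load("vgg16_embeddings.npy")    # hypothetical file
labels = np.load("gallery_categories.npy")   # hypothetical file

# Train the coarse classifier once, offline.
svm = SVC(kernel="linear")
svm.fit(gallery, labels)

def search(query_embedding, top_k=10):
    """Restrict the Euclidean search to the query's predicted category."""
    category = svm.predict(query_embedding[None, :])[0]
    candidate_idx = np.where(labels == category)[0]

    # Euclidean distance only against roughly 1/10th of the gallery.
    dists = np.linalg.norm(gallery[candidate_idx] - query_embedding, axis=1)
    order = np.argsort(dists)[:top_k]
    return candidate_idx[order], dists[order]
```

An approximate-nearest-neighbour index (e.g. FAISS or Annoy) is the other common way to cut the number of comparisons without training a classifier.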

Is there any dynamic neural network design which can handle an output that grows over time?

I am working on a case where the dimension of the labels increases with time. For example, at time t, the output is a 10 by 1 vector. Later, at time t+5, the output becomes a 15 by 1 vector.
In this case, for the same input, the first 10 entries of the output at time t are the same as the ones at time t+5, but the remaining 5 are different. The reason the dimension of the output vector increases is that every time we are given a new training sample, the dimension of the labels of all previous training samples increases by 1. So the expected output of the neural network changes correspondingly.
The trivial solution is to re-train the whole model so that it can handle the desired output dimension. I know it might sound odd, but I am wondering whether there is any smart design for a dynamic network such that it can be trained incrementally with incrementally changing labels.
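One way to sketch such a design (my own illustration, not an established recipe) is to keep the feature extractor fixed in size and only grow the final linear layer, copying the old weights into the enlarged layer so that the first 10 outputs are preserved:

```python
import torch
import torch.nn as nn

def grow_output_layer(old_head: nn.Linear, new_out_dim: int) -> nn.Linear:
    """Return a larger output layer whose first rows reuse the old weights."""
    assert new_out_dim >= old_head.out_features
    new_head = nn.Linear(old_head.in_features, new_out_dim)
    with torch.no_grad():
        new_head.weight[: old_head.out_features] = old_head.weight
        new_head.bias[: old_head.out_features] = old_head.bias
    return new_head

# Hypothetical usage: a shared trunk plus a growable head.
trunk = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
head = nn.Linear(64, 10)            # output is 10-dimensional at time t

# ... later, the labels grow to 15 dimensions ...
head = grow_output_layer(head, 15)  # first 10 outputs keep their learned weights

x = torch.randn(4, 32)
y = head(trunk(x))                  # shape (4, 15)
```

Whether the old rows stay frozen or keep training on the new samples is a design choice this sketch leaves open.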

Caffe accuracy increases too fast

I'm doing AlexNet fine-tuning for face detection following this: link
The only difference from the link is that I am using another dataset (FaceScrub, plus some images from ImageNet as negative examples).
I noticed the accuracy increasing too fast: in 50 iterations it goes from 0.308 to 0.967, and when it is around 0.999 I stop training and use the model with the same Python script as in the link above.
For testing I use an image from the dataset, and the result is nowhere near good: test image result. As you can see, the box around the faces is too big (and the dataset images are tightly cropped), not to mention that the box does not contain a face.
My solver and train_val files are exactly the same; the only differences are the batch sizes and the max iteration count.
The reason was that my dataset has way more face examples than non-face examples. I tried the same setup with the same number of positive and negative examples, and now the accuracy increases more slowly.
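A quick sanity check that makes this visible (the counts below are made up, not from the question): compare the reported accuracy against the accuracy of always predicting the majority class.

```python
# Hypothetical class counts for an imbalanced training set.
n_face, n_nonface = 90_000, 5_000

majority_baseline = max(n_face, n_nonface) / (n_face + n_nonface)
print(f"always-predict-face accuracy: {majority_baseline:.3f}")  # ~0.947

# If training accuracy shoots toward this value within a few iterations,
# the network may simply be learning to output the majority class.
```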

How can I train a NaiveBayes classifier incrementally?

Using Accord.NET I've created a NaiveBayes classifier. It will classify a pixel based on 6 or so sets of image processing results. My images are 5MP, so a training set of 50 images creates a very large set of training data.
A 6-int array per pixel × 5 million pixels × 50 images.
Instead of trying to store all that data in memory, is there a way to incrementally train the NaiveBayes classifier? Calling Learn() multiple times overwrites the old data each time rather than adding to it.
Right now it is not possible to train a Naive Bayes model incrementally using Accord.NET.
However, since all that Naive Bayes is going to do is to try to fit some distributions to your data, and since your data has very few dimensions, maybe you could try to learn your model on a subsample of your data rather than all of it at once.
When you load images to build your training set, you can try to randomly discard x% of the pixels in each image. You can also plot the classifier accuracy for different values of x to find the best balance between memory and accuracy for your model (hint: for such a small model and this large amount of training data, I expect it won't make much of a difference even if you drop 50% of your data).
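Accord.NET is C#, but the subsampling idea itself is language-agnostic; a rough sketch in Python/NumPy (the array shapes and the load_images() helper are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def subsample_pixels(features, labels, keep_fraction=0.5):
    """Randomly keep a fraction of the pixels from one image.

    features: (n_pixels, 6) array of per-pixel image-processing results
    labels:   (n_pixels,) array of per-pixel class labels
    """
    keep = rng.random(features.shape[0]) < keep_fraction
    return features[keep], labels[keep]

# Hypothetical loop over the 50 training images: build a much smaller
# training set by discarding (1 - keep_fraction) of each image's pixels.
# all_x, all_y = [], []
# for feats, labs in load_images():          # load_images() is assumed
#     x, y = subsample_pixels(feats, labs, keep_fraction=0.25)
#     all_x.append(x); all_y.append(y)
# X, Y = np.concatenate(all_x), np.concatenate(all_y)
```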
