I am using StyleGAN2-ADA in Google Colab. I have managed to train a GAN nicely, but every time I run generate.py I am unable to generate images because it is set to greyscale. I have tried changing "RGB" to "L" in the generate.py file; is there anything else I can do?
I am trying to do my own object detection using my own dataset. I started my first machine learning program with the Google TensorFlow Object Detection API; the link is here: eager_few_shot_od_training_tf2_colab.ipynb
In the Colab tutorial, the author uses JavaScript to label the images, and the result looks like this:
gt_boxes = [
np.array([[0.436, 0.591, 0.629, 0.712]], dtype=np.float32),
np.array([[0.539, 0.583, 0.73, 0.71]], dtype=np.float32),
np.array([[0.464, 0.414, 0.626, 0.548]], dtype=np.float32),
np.array([[0.313, 0.308, 0.648, 0.526]], dtype=np.float32),
np.array([[0.256, 0.444, 0.484, 0.629]], dtype=np.float32)
]
When I run my own program, I use labelImg instead of the JavaScript tool, but the dataset is not compatible.
Now I have two questions. First, what is the dataset type in the Colab tutorial: COCO, YOLO, VOC, or something else? Second, how do I convert between labelImg data and the Colab tutorial data? My goal is to label data with labelImg and then substitute it into the Colab tutorial.
The "data type" are just ratio values based on the height and width of the image. So the coordinates are just ratio values for where to start and end the bounding box. Since each image is going to be preprocessed, that is, it's dimensions are changed when fed into the model (batch,height,width,channel) the bounding box coordinates must have the correct ratio as the image might change dimensions from it's original size.
Like for the example, the model expects images to be 640x640. So if you provide an image of 800x600 it has to be resized. Now if the model gave back the coordinates [100,100,150,150] for an 640x640, clearly that would not be the same for 800x600 images.
However, to get this data format you should use the PascalVOC format when labelling with labelImg.
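For instance, here is a minimal sketch of turning a labelImg PascalVOC .xml annotation into the tutorial's normalized [ymin, xmin, ymax, xmax] format (the file names are placeholders, and a single object per annotation is assumed):

import numpy as np
import xml.etree.ElementTree as ET

def voc_to_normalized_box(xml_path):
    # read the labelImg PascalVOC annotation and normalize the pixel
    # coordinates by the image size, as the tutorial's gt_boxes expect
    root = ET.parse(xml_path).getroot()
    width = float(root.find('size/width').text)
    height = float(root.find('size/height').text)
    box = root.find('object/bndbox')
    ymin = float(box.find('ymin').text) / height
    xmin = float(box.find('xmin').text) / width
    ymax = float(box.find('ymax').text) / height
    xmax = float(box.find('xmax').text) / width
    return np.array([[ymin, xmin, ymax, xmax]], dtype=np.float32)

gt_boxes = [voc_to_normalized_box(p) for p in ['img1.xml', 'img2.xml']]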
The typical way to do this is to create TFRecord files and decode them in your training script in order to create datasets. However, you are free to choose whatever method you like to build a TensorFlow dataset for training your model.
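As a rough sketch of the TFRecord route (the feature keys and file names below are placeholders, not the exact schema the Object Detection API expects):

import tensorflow as tf

def make_example(image_bytes, box):
    # box is [ymin, xmin, ymax, xmax] as ratios, like the gt_boxes above
    return tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        'image/object/bbox': tf.train.Feature(float_list=tf.train.FloatList(value=box)),
    }))

# write the records once
with tf.io.TFRecordWriter('train.tfrecord') as writer:
    with open('img1.jpg', 'rb') as f:
        writer.write(make_example(f.read(), [0.436, 0.591, 0.629, 0.712]).SerializeToString())

# decode them into a tf.data.Dataset in the training script
def parse(serialized):
    parsed = tf.io.parse_single_example(serialized, {
        'image/encoded': tf.io.FixedLenFeature([], tf.string),
        'image/object/bbox': tf.io.FixedLenFeature([4], tf.float32),
    })
    return tf.io.decode_jpeg(parsed['image/encoded']), parsed['image/object/bbox']

dataset = tf.data.TFRecordDataset('train.tfrecord').map(parse)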
Hope this answered your questions.
I would like to generate augmented image data with random rotations, shifts, shears and flips.
I have found the Keras function keras.preprocessing.image.ImageDataGenerator, but I've only seen it being used to train networks directly.
Is there a way to input images and then save the transformed images to HDD, instead of how it currently works in the examples in this link?
Or is there another simple plug-and-play Python package I can use instead of implementing everything with NumPy or OpenCV?
Basically, this is a generator which infinitely returns batches of images. One could do the following:
def save_images_from_generator(maximal_nb_of_images, generator):
    nb_of_images_processed = 0
    for x, _ in generator:  # the generator yields (images, labels) batches forever
        nb_of_images_processed += x.shape[0]
        if nb_of_images_processed <= maximal_nb_of_images:
            for image_nb in range(x.shape[0]):
                your_custom_save(x[image_nb])  # your custom function for saving images
        else:
            break
to save images from the Keras image generator.
You can save the images output by ImageDataGenerator to HDD. One option is to use datagen.flow as follows:
for i, (X_batch, y_batch) in enumerate(datagen.flow(X_train, y_train, batch_size=9, save_to_dir='images', save_prefix='aug', save_format='png')):
    if i >= 10: break  # flow() loops forever, so stop after the desired number of batches
A second option is to manually loop over each image, load it, and apply a random transformation. Once you have instantiated your ImageDataGenerator, just call:
img_trans = datagen.random_transform(img)
Then, save the transformed image to HDD using PIL etc.
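A minimal sketch of this second option (the augmentation parameters and file names are just example assumptions):

import numpy as np
from PIL import Image
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.1,
                             height_shift_range=0.1, shear_range=0.2,
                             horizontal_flip=True)

img = np.asarray(Image.open('input.jpg'), dtype=np.float32)  # (height, width, channels)
img_trans = datagen.random_transform(img)                    # one random augmentation
Image.fromarray(img_trans.astype(np.uint8)).save('augmented.png')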
A third option is to manually loop over each image, load it, and apply a random transformation using a third-party library. I recommend imgaug, found here.
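Something like the following should cover the rotations, shifts, shears and flips you mention (the exact augmenter settings are assumptions; see the imgaug docs for the full list):

import imageio
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),                                  # horizontal flips
    iaa.Affine(rotate=(-20, 20),                      # random rotation
               translate_percent={'x': (-0.1, 0.1),   # random shifts
                                  'y': (-0.1, 0.1)},
               shear=(-10, 10)),                      # random shear
])

image = imageio.imread('input.jpg')
image_aug = seq.augment_image(image)
imageio.imwrite('augmented.png', image_aug)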
I am new to using TensorFlow, so I was trying the MNIST tutorial in ML for beginners. The code runs just fine. But what if I want to input an image of my own, which has, say, a handwritten number on it, and see if it predicts what number it might be? How do I feed my own image into the TensorFlow program?
Assuming you're using this file.
If you look at x, the shape is [None, 784]. To feed your own image in, you'll have to store the image as a variable (loading it using PIL or OpenCV or something), flatten it, wrap it in a list, and pass it to the graph in the feed_dict, looking something like this:
sess.run(y, feed_dict={x: [image_you_loaded_in.flatten()]})
It will need to be a 28x28 image in order for this code to work without modification.
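As a rough sketch of the whole thing (the file name is a placeholder, and sess, x and y are assumed to be the session, input placeholder and model output from the tutorial script; MNIST stores white digits on a black background, hence the inversion):

import numpy as np
from PIL import Image

# load your own digit, convert to greyscale and resize to 28x28
img = Image.open('my_digit.png').convert('L').resize((28, 28))
# invert (dark digit on light paper -> white digit on black) and scale to [0, 1]
arr = 1.0 - np.asarray(img, dtype=np.float32) / 255.0
# sess, x and y come from the tutorial script
prediction = sess.run(y, feed_dict={x: [arr.flatten()]})
print(np.argmax(prediction))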
I am trying to implement the OpenCV LBPHFaceRecognizer() and make it work for the images of digits from the MNIST dataset. These images are 28 x 28 px and look like this:
But for this task I need a haarcascade .xml file which is able to recognize digits. In the OpenCV package I only find .xml files suited for face recognition and Russian plate numbers.
Here is my code. I basically just need to replace the cascadePath = "haarcascade_frontalface_default.xml" with an appropriate .xml for digits, but where do I get one?
All in all, I want to test face recognition with numbers instead of faces. So given an input image showing a "1", it should be able to recognize all other "1"s in the dataset.
For this, you need to train a cascade yourself. Here are two links that explain how to do this:
1 This is the OpenCV documentation for opencv_traincascade, which is the OpenCV application used to train a cascade (it generates the .xml file).
2 This is a useful tutorial on training a cascade with OpenCV. It explains what to do and gives some tricks for generating the input file.
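Once you have trained a cascade, using it works the same way as with the face cascades. Here is a minimal sketch (the .xml and image names are placeholders for whatever opencv_traincascade and your dataset produce):

import cv2

digit_cascade = cv2.CascadeClassifier('cascade_digit_1.xml')  # the trained cascade
img = cv2.imread('test_digits.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
detections = digit_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in detections:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detections.png', img)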
I have decided to use OpenCV to build a 3D scene from a series of 2D images. I found the example code that came with OpenCV [ build3dmodel.cpp Here ].
I just want to run this once and see what kind of output it gives. My knowledge of OpenCV is limited; I don't want to understand the whole code, I just want to know how to give inputs to this program (the image set) to see the output.
The command line of this code example requires the following parameters:
build3dmodel -i intrinsics_filename.yml [-d detector] [-de descriptor_extractor] -m model_name.yml
The first file is the camera matrix, which you obtain from the calibration process (there is a specific example for it). The detector and descriptor extractor must match valid FeatureDetector and DescriptorExtractor names. The model name is a bit confusing; it looks like part of the .yml file name where the data will be saved.
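So an invocation would look something like this (all file names are placeholders, and SURF is just one valid detector/extractor name):

build3dmodel -i intrinsics.yml -d SURF -de SURF -m model.yml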
First, see some tutorials such as introduction to OpenCV or OpenCV tutorial. Also, see input and output with OpenCV.