Running Detectron2 inference in Caffe2 - machine-learning

I have a Detectron2 .pth model that I converted successfully to Caffe2 .pb via the Detectron2 tools functionality located here: https://github.com/facebookresearch/detectron2/blob/master/tools/caffe2_converter.py
As recommended, I used the --run-eval flag to confirm results while converting, and they are very similar to the original Detectron2 results.
To run inference on a new image using the resulting model.pb and model_init.pb files, I used the functionality located here:
https://github.com/facebookresearch/detectron2/blob/master/detectron2/export/api.py (mostly)
https://github.com/facebookresearch/detectron2/blob/master/detectron2/export/caffe2_inference.py
However, the inference results are not even close. Can anybody suggest reasons why this might happen? The Detectron2 repo says all preprocessing is done in the Caffe2 scripts, but am I missing something?
Here is my inference code:
import cv2
import torch
from detectron2.export import Caffe2Model

caffe2_model = Caffe2Model.load_protobuf(input_directory)
img = cv2.imread(input_image)
image = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
data = {'image': image, 'height': image.shape[1], 'width': image.shape[2]}
output = caffe2_model([data])

Your input image dimensions should be a multiple of 32, so you probably need to resize your input image. Something like:
caffe2_model = Caffe2Model.load_protobuf(input_directory)
img = cv2.imread(input_image)
img = cv2.resize(img, (64, 64))
image = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
data = {'image': image, 'height': image.shape[1], 'width': image.shape[2]}
output = caffe2_model([data])
See the class detectron2.export.Caffe2Tracer, documented here:
https://detectron2.readthedocs.io/en/latest/modules/export.html#detectron2.export.Caffe2Tracer
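Building on that suggestion, a fixed (64, 64) resize will shrink and distort most real images, so one variant is to round each side to the nearest multiple of 32 instead. This is only a sketch of that idea, not Detectron2's own preprocessing; the round_to_multiple helper is illustrative:

import cv2
import torch

def round_to_multiple(value, base=32):
    # round a dimension to the nearest multiple of `base`, never below `base`
    return max(base, int(round(value / base)) * base)

img = cv2.imread(input_image)
h, w = img.shape[:2]
img = cv2.resize(img, (round_to_multiple(w), round_to_multiple(h)))  # cv2.resize takes (width, height)
image = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
data = {'image': image, 'height': image.shape[1], 'width': image.shape[2]}
output = caffe2_model([data])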

Related

SageMaker Serverless Inference Preprocessing Question

I've recently seen that there is a serverless version of SageMaker and I wanted to use it for a personal project (my first time using SageMaker). I used the guide below to try to deploy my model, only modifying some preprocessing steps (steps which I also apply when doing predictions locally and on Lambda).
def input_handler(data, context):
    if context.request_content_type == 'application/x-image':
        image_as_bytes = io.BytesIO(data.read())
        image = Image.open(image_as_bytes)
        image = image.convert('RGB')
        image = image.resize((150,150))
        instance = np.array(image, dtype='f')
        instance = instance / 255
        instance = np.expand_dims(image, axis=0)
        payload = json.dumps({"instances": instance.tolist()})
        return payload
    else:
        _return_error(415, 'Unsupported content type "{}"'.format(context.request_content_type or 'Unknown'))
with open(file_name, 'rb') as f:
    image_data = f.read()

response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType='application/x-image',
                                   Body=image_data)
When invoking the endpoint via the runtime (as in the guide), I always get the same prediction for images from different classes.
I hope my explanation is OK, as this is the first time I am asking a question. I'm not sure what I am missing, but any help is appreciated.
https://github.com/shashankprasanna/sagemaker-video-examples/tree/master/sagemaker-serverless-inference
I tried doing predictions locally (same model, same preprocessing steps) via the predict API and it worked as expected. Here are the logs from CloudWatch (I deleted the logging statements from the code above so it's not cluttered):
Opened image for inference -> Image mode set to: RGB -> Resized to: (150, 150) -> Converted image to: float32 -> Normalized array: [[[0.7529412 0.7019608 0.67058825] ... -> Expanded to: (1, 150, 150, 3) -> payload: {"instances": [[[[192, 179, 171], [187, 174, 166], [188, 175, 167], [195, 182, 174] ...
I'm not sure if it has something to do with the problem, but shouldn't the list be the same as the normalized np array?
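One thing worth checking, going by the log above: the serialized payload contains raw 0-255 pixel values while the "Normalized array" line shows floats, so whatever ends up in json.dumps may not be the normalized array. A minimal local check of that, assuming the input_handler above and any sample image file, might look like this:

import io
import json
import numpy as np
from PIL import Image

with open(file_name, 'rb') as f:
    image = Image.open(io.BytesIO(f.read())).convert('RGB').resize((150, 150))

normalized = np.array(image, dtype='f') / 255      # what the log calls "Normalized array"
serialized = np.expand_dims(image, axis=0)         # what the handler above expands and dumps
payload = json.dumps({"instances": serialized.tolist()})

sent = np.array(json.loads(payload)["instances"])[0]
print(sent.dtype, sent.max())                      # raw 0-255 values, as in the payload log
print(np.allclose(sent, normalized))               # False would confirm the mismatch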

pytorch model predicts fixed label when it exports to onnx

I trained a ResNet-18 model in PyTorch, and it works well in PyTorch.
But when I convert it to ONNX and run predictions in cv2, the model predicts only 1-2 labels (it should predict labels 0-17).
This is my model export code:
model.eval()
x = torch.randn(1, 3, 512, 384, requires_grad=True)
# export model
torch.onnx.export(model, x, "model.onnx", export_params=True, opset_version=10, do_constant_folding=True, input_names = ['input'], output_names = ['output'])
And this is my code for inference in cv2:
self.transform = albumentations.Compose([
    albumentations.Resize(512, 384, cv2.INTER_LINEAR),
    albumentations.GaussianBlur(3, sigma_limit=(0.1, 2)),
    albumentations.Normalize(mean=(0.5), std=(0.2)),
    albumentations.ToFloat(max_value=255)
])
...
#image crop code: works fine in pytorch
image = frame[ymin:ymax, xmin:xmax] #type(frame)==numpy.array, RGB form
augmented = self.transform(image=image)
image = augmented["image"]
...
#inference code: does not work well
net=cv2.dnn.readNet("Model.onnx")
blob = cv2.dnn.blobFromImage(image, swapRB=False, crop=False)
net.setInput(blob)
label = np.array(net.forward())
text = 'Label: '+str(np.argmax(label[0]))
All transform settings work well in PyTorch.
What could be the problem in this code?
The problem with your code probably has to do with the images being preprocessed differently: self.transform rescales and normalizes the image, but when you create the blob, you are not doing that. To verify this, you can read the same image both ways and check whether the transformed image and the blob are equal (e.g. using torch.allclose), with the actual (random) augmentations disabled.
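A minimal sketch of that check, with the random GaussianBlur dropped and the blob resized to the same shape so the two arrays are directly comparable (crop.png is just a stand-in for the cropped frame):

import cv2
import numpy as np
import albumentations

# deterministic version of the transform above (random GaussianBlur removed)
transform = albumentations.Compose([
    albumentations.Resize(512, 384, cv2.INTER_LINEAR),
    albumentations.Normalize(mean=(0.5), std=(0.2)),
    albumentations.ToFloat(max_value=255)
])

image = cv2.imread("crop.png")                       # stand-in for the cropped frame
transformed = transform(image=image)["image"]        # (512, 384, 3), normalized floats
blob = cv2.dnn.blobFromImage(image, size=(384, 512), swapRB=False, crop=False)  # (1, 3, 512, 384), raw pixel values

print(np.allclose(np.transpose(transformed, (2, 0, 1)), blob[0]))  # False here means the two pipelines differ

If this prints False, the blob needs the same resizing and normalization applied to it before inference.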

Image preprocessing - Train and test image data are not being read in corresponding order

I am trying to train a CNN model for an image processing problem. I am facing a major issue in the preprocessing stage: the training datasets train_rain and train_no_rain are not read in the order I want them to be. This affects the performance of my model, as it is important for the model to see an image with rain streaks and then the same image without them.
Any solutions to this issue?
Here is a sample of what I mean. Say I read the datasets as shown below:
path_1 = "gdrive/My Drive/Rain100H/train/rainy"
train_rain = []
no_train_rain = 0
gauss_img = []
for img in glob.glob(path_1+"/*.png"):
    im = cv.imread(img)
    im = cv.resize(im,(128,128))
    # Gaussian Blur
    im_gb = cv.GaussianBlur(im,(5,5),0)
    gauss_img.append(im_gb)
    cv.waitKey()
    no_train_rain+=1
    train_rain.append(im)

train_no_rain = []
no_train_no_rain = 0
path_2 = "gdrive/My Drive/Rain100H/train/no rain"
for img in glob.glob(path_2+"/*.png"):
    im = cv.imread(img)
    im = cv.resize(im,(128,128))
    cv.waitKey()
    no_train_no_rain+=1
    train_no_rain.append(im)
Now I want to view the first images from train_rain and train_no_rain, after converting them to numpy arrays, and I did that using this:
import matplotlib.pyplot as plt
# first image from train_rain
plt.imshow(train_rain[1])
# first image from train_no_rain
plt.imshow(train_no_rain[1])
But ideally, the first image in train_no_rain should be the rain-free counterpart of the first image in train_rain.
PS: The datasets have all the images beforehand, it's just that they are not being read in a particular order.
Any sort of help would be much appreciated :)
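One common way to keep the two folders aligned, assuming each rainy image and its rain-free counterpart share the same filename, is to iterate over sorted(glob.glob(...)), so both lists are built in the same deterministic order (glob by itself does not guarantee any particular order). A minimal sketch:

import glob
import cv2 as cv

def load_images(path):
    images = []
    # sorted() makes the read order deterministic; glob alone does not guarantee any order
    for img_path in sorted(glob.glob(path + "/*.png")):
        im = cv.imread(img_path)
        im = cv.resize(im, (128, 128))
        images.append(im)
    return images

train_rain = load_images("gdrive/My Drive/Rain100H/train/rainy")
train_no_rain = load_images("gdrive/My Drive/Rain100H/train/no rain")
# train_rain[i] and train_no_rain[i] now refer to the same scene,
# provided the two folders use matching filenames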

In TensorFlow 2, are Datasets less efficient when doing image augmentation and combining numpy and ImageDataGenerator?

tf.keras.preprocessing.image.ImageDataGenerator is super simple and easy to use for data augmentation. However, it seems to be much slower than using tf.data.Dataset to load data. I tried to load images with tf.data.Dataset but couldn't figure out how to do the same data augmentation, such as randomly shifting the width and height, randomly rotating, etc. I see on GitHub that all of these tf.keras.preprocessing.image.ImageDataGenerator augmentations seem to use the function tf.keras.preprocessing.image.apply_affine_transform; however, its input must be a numpy array.
For the best performance, do I have to rewrite tf.keras.preprocessing.image.apply_affine_transform so it takes TensorFlow tensors, or can I just convert the image to numpy during the preprocessing phase in tf.data.Dataset, use apply_affine_transform for data augmentation, and cast it back to a Tensor?
Additionally, there is also tf.data.Dataset.from_generator() which looks like it can take an ImageDataGenerator.
Which is faster and more efficient for data loading: option 1, 2, or 3?
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE
batch_size = 32

# option 1
def preprocess(file_path):
    tfImg = tf.io.read_file(file_path)
    numpyImg = tf.image.decode_jpeg(tfImg, channels=3).numpy()
    augNumpy = tf.keras.preprocessing.image.apply_affine_transform(numpyImg, change_some_arguments_here_for_augmentation)
    return tf.cast(augNumpy, tf.float32)  # return the augmented image as a Tensor

list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))
newAugmentedDataset = list_ds.map(preprocess, num_parallel_calls=AUTOTUNE).shuffle(buffer_size=1000).repeat().batch(batch_size).prefetch(AUTOTUNE)

# option 2
def preprocess(file_path):
    tfImg = tf.io.read_file(file_path)
    tfImg = tf.image.decode_jpeg(tfImg, channels=3)
    return rewritten_apply_affine_function(tfImg, ...)

list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))
newAugmentedDataset = list_ds.map(preprocess, num_parallel_calls=AUTOTUNE).shuffle(buffer_size=1000).repeat().batch(batch_size).prefetch(AUTOTUNE)

# option 3
train_data_generator = ImageDataGenerator(augmentations_here)
ftrain_generator = train_data_generator.flow_from_directory(directory_here, shuffle=False)
ftrain_generator_ds = tf.data.Dataset.from_generator(
    lambda: ftrain_generator,
    output_types=tf.float32,
    output_shapes=tf.TensorShape([None, height_here, width_here, num_channel])
).prefetch(AUTOTUNE)

# any of the options above goes into .fit
model.fit(dataset_here, steps_per_epoch=sample_count/batch_size)
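One note on option 1 as written: the function passed to Dataset.map is traced into a TensorFlow graph, so calling .numpy() there will generally fail. A common workaround that still lets you call the numpy-based apply_affine_transform is to wrap it in tf.py_function; here is a rough sketch, with the specific augmentation arguments purely illustrative:

import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE
batch_size = 32

def augment_numpy(img):
    # executed eagerly by tf.py_function, so img is an EagerTensor and .numpy() works
    arr = img.numpy()
    arr = tf.keras.preprocessing.image.apply_affine_transform(
        arr, theta=15, tx=5, ty=5)  # illustrative augmentation parameters
    return arr.astype("float32")

def preprocess(file_path):
    img = tf.image.decode_jpeg(tf.io.read_file(file_path), channels=3)
    img = tf.py_function(augment_numpy, [img], tf.float32)
    img.set_shape([None, None, 3])  # py_function drops static shape information
    return img

list_ds = tf.data.Dataset.list_files(str(data_dir/'*/*'))
dataset = (list_ds
           .map(preprocess, num_parallel_calls=AUTOTUNE)
           .shuffle(buffer_size=1000)
           .repeat()
           .batch(batch_size)
           .prefetch(AUTOTUNE))

The trade-off is that tf.py_function runs Python code under the GIL, so it usually will not parallelize as well as a pure-TensorFlow transform (the option 2 approach).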

Tensorflow Image Shape Error

I have trained a classifier and I now want to pass any single image through it.
I'm using the Keras library with TensorFlow as the backend.
I'm getting an error I can't seem to get past.
img_path = '/path/to/my/image.jpg'
import numpy as np
from keras.preprocessing import image
x = image.load_img(img_path, target_size=(250, 250))
x = image.img_to_array(x)
x = np.expand_dims(x, axis=0)
preds = model.predict(x)
Do I need to reshape my data to have None as the first dimension? I'm confused about why TensorFlow would expect None as the first dimension.
Error when checking : expected convolution2d_input_1 to have shape (None, 250, 250, 3) but got array with shape (1, 3, 250, 250)
I'm wondering if there is an issue with the architecture of my trained model?
Edit: if I call model.summary(), it gives convolution2d_input_1 as...
Edit: I did play around with the suggestion below, but used numpy to transpose instead of tf; I still seem to be hitting the same issue!
None matches any number. Usually, when you pass some data to a model, it is expected that you pass a tensor of dimensions None x data_size, meaning the first dimension can be anything and denotes the batch size. In your case, the problem is that you pass 250 x 250 x 3 while 3 x 250 x 250 is expected. Try:
x = image.load_img(img_path, target_size=(250, 250))
x_trans = tf.transpose(x, perm=[2, 0, 1])
x_expanded = np.expand_dims(x_trans, axis=0)
preds = model.predict(x_expanded)
OK, so using feedback from Sygi I think I have half solved it.
The error was actually telling me I needed to pass in my dimensions as [1, 250, 250, 3], so that was an easy fix; I must say I'm not sure why TF expects the dimensions in this order, as looking at the docs it doesn't seem right, so more research is required here.
Moving ahead, I'm not sure transpose is the way to go, because if I use a different input image the dimensions may not be in the same order, meaning the transpose wouldn't work properly.
Instead of transpose, I'm probably going to call x_reshape = img.reshape((1, 250, 250, 3)), depending on what I find out about dimension order when reshaping for TF.
Thanks for the hints, Sygi :)
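For reference, a small sketch of getting the (1, 250, 250, 3) shape from the Keras image utilities used above. Since reshape only reinterprets memory without reordering values, moving the channel axis (rather than reshaping) is the safer route if the array ever comes out channels-first:

import numpy as np
from keras.preprocessing import image

x = image.img_to_array(image.load_img(img_path, target_size=(250, 250)))

if x.shape == (250, 250, 3):          # already channels-last
    x = np.expand_dims(x, axis=0)     # -> (1, 250, 250, 3)
elif x.shape == (3, 250, 250):        # channels-first: move the channel axis, don't reshape
    x = np.expand_dims(np.moveaxis(x, 0, -1), axis=0)

preds = model.predict(x)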
