How to configure Caffe deploy.prototxt? - image-processing

I followed ypx's instructions on this question. Now I want to predict some pictures, so I'm using:
MODEL_FILE = '/tmp/deploy.prototxt'
PRETRAINED = '/tmp/ck.caffemodel'
IMAGE_FILE = '/tmp/img.png'
net = caffe.Classifier(MODEL_FILE, PRETRAINED, image_dims=(200, 200))
But I get this message:
I1002 13:49:24.331648 28172 net.cpp:435] Input 0 -> data
I1002 13:49:24.331667 28172 layer_factory.hpp:76] Creating layer data
I1002 13:49:24.332259 28172 net.cpp:110] Creating Layer data
F1002 13:49:24.332272 28172 net.cpp:427] Top blob 'data' produced by multiple sources.
*** Check failure stack trace: ***
I think the problem is in my deploy.prototxt file. This is my deploy.prototxt and this is my train.prototxt.
Can someone help me to configure my deploy file?

You should remove the training input layer (lines 8--21) from your deploy net.
That is, discard this:
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00392156862745
  }
  data_param {
    source: "/tmp/db"
    batch_size: 64
    backend: LMDB
  }
}
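In its place, the deploy net only declares the "data" input (no "label" top and no LMDB source). A minimal sketch, assuming a single-image batch of 3-channel 200x200 inputs to match the image_dims you pass to caffe.Classifier; adjust the dims to your data:
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 200 dim: 200 } }
}
Note that the transform_param scale (0.00392156862745 = 1/255) no longer applies at deploy time, so any input scaling has to be done on the Python side before prediction.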

OCR confidence score from Google Vision API

I am using Google Vision OCR for extracting text from images in Python, using the following code snippet.
However, the confidence score always shows 0.0, which is definitely incorrect.
How do I extract the OCR confidence score for an individual character or word from the Google response?
import cv2
from google.cloud import vision
from google.cloud.vision import types

client = vision.ImageAnnotatorClient()
content = cv2.imencode('.jpg', cv2.imread(file_name))[1].tostring()
img = types.Image(content=content)
response1 = client.text_detection(image=img, image_context={"language_hints": ["en"]})
response_annotations = response1.text_annotations
for x in response1.text_annotations:
    print(x)
    print(f'confidence:{x.confidence}')
Example output for one iteration:
description: "Date:"
bounding_poly {
  vertices {
    x: 127
    y: 11
  }
  vertices {
    x: 181
    y: 10
  }
  vertices {
    x: 181
    y: 29
  }
  vertices {
    x: 127
    y: 30
  }
}
confidence:0.0
I managed to reproduce your issue. I used the following function and obtained confidence 0.0 for all items.
from google.cloud import vision

def detect_text_uri(uri):
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = (['({},{})'.format(vertex.x, vertex.y)
                     for vertex in text.bounding_poly.vertices])
        print('bounds: {}'.format(','.join(vertices)))
        print("confidence: {}".format(text.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
However, when using the same image with the "Try the API" option in the documentation, I obtained results with non-zero confidences. The same happened when detecting text from a local image.
One would expect the confidences to be the same with both methods. I've opened an issue on the issue tracker; check it here.
Here is working code that retrieves the right confidence values from the GOCR response (using document_text_detection() instead of text_detection()):
def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    # [START vision_python_migration_document_text_detection]
    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print('\nBlock confidence: {}\n'.format(block.confidence))

            for paragraph in block.paragraphs:
                print('Paragraph confidence: {}'.format(
                    paragraph.confidence))

                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

                    for symbol in word.symbols:
                        print('\tSymbol: {} (confidence: {})'.format(
                            symbol.text, symbol.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
    # [END vision_python_migration_document_text_detection]
# [END vision_fulltext_detection]

# add your own path
path = "gocr_vision.png"
detect_document(path)

Unexpected exception happened during extracting attributes (OpenVINO)

I was trying to convert a Caffe model using the mo_caffe.py script. I always get errors like the one below, but on random nodes (all of them have the "BatchNorm" op in common). I trained the model using NVIDIA DIGITS (https://github.com/NVIDIA/DIGITS).
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: /home/deploy.caffemodel
- Path for generated IR: /dldt/model-optimizer/.
- IR output name: deploy
- Log level: INFO
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: Not specified, inherited from the model
- Input shapes: Not specified, inherited from the model
- Mean values: Not specified
- Scale values: Not specified
- Scale factor: Not specified
- Precision of IR: FP32
- Enable fusing: True
- Enable grouped convolutions fusing: True
- Move mean values to preprocess section: False
- Reverse input channels: False
Caffe specific parameters:
- Path to Python Caffe* parser generated from caffe.proto: mo/front/caffe/proto
- Enable resnet optimization: True
- Path to the Input prototxt: /home/deploy.prototxt
- Path to CustomLayersMapping.xml: Default
- Path to a mean file: Not specified
- Offsets for a mean file: Not specified
Model Optimizer version: unknown version
[ INFO ] Importing extensions from: /dldt/model-optimizer/mo
[ INFO ] New subclass: <class 'mo.ops.crop.Crop'>
[ INFO ] Registered a new subclass with key: Crop
[ INFO ] New subclass: <class 'mo.ops.deformable_convolution.DeformableConvolution'>
[ INFO ] Registered a new subclass with key: DeformableConvolution
[ INFO ] New subclass: <class 'mo.ops.concat.Concat'>
[ INFO ] Registered a new subclass with key: Concat
[ INFO ] New subclass: <class 'mo.ops.split.Split'>
...
Some log info; I don't think there is anything interesting here.
...
[ WARNING ] Skipped <class 'extensions.front.override_batch.OverrideBatch'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.TopKNormalize.TopKNormalize'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.reshape_dim_normalizer.ReshapeDimNormalizer'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.ArgMaxSqueeze.ArgMaxSqueeze'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.standalone_const_eraser.StandaloneConstEraser'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.TransposeOrderNormalizer.TransposeOrderNormalizer'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'mo.front.common.replacement.FrontReplacementOp'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.restore_ports.RestorePorts'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.softmax.SoftmaxFromKeras'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.reduce_axis_normalizer.ReduceAxisNormalizer'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.freeze_placeholder_value.FreezePlaceholderValue'> registration because it was already registered or it was disabled.
[ WARNING ] Skipped <class 'extensions.front.no_op_eraser.NoOpEraser'> registration because it was already registered or it was disabled.
[ WARNING ] Node attributes: {'_in_ports': {}, 'model_pb': name: "conv2_3_sep_bn_left"
type: "BatchNorm"
bottom: "conv2_3_sep_left"
top: "conv2_3_sep_left"
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
blobs {
shape {
dim: 32
}
}
blobs {
shape {
dim: 32
}
}
blobs {
shape {
dim: 1
}
}
phase: TRAIN
, 'kind': 'op', '_out_ports': {}, 'pb': name: "conv2_3_sep_bn_left"
type: "BatchNorm"
bottom: "conv2_3_sep_left"
top: "conv2_3_sep_left"
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
param {
lr_mult: 0.0
decay_mult: 0.0
}
, 'type': 'Parameter'}
[ ERROR ] Unexpected exception happened during extracting attributes for node relu1.
Original exception message: list index (0) out of range
This is the error message I get. I have also created a GitHub issue.
Here is the model that I want to convert from Caffe to OpenVINO.
The problem was that NVIDIA DIGITS uses a customized fork of Caffe, which is why the model weights were not read properly by OpenVINO.
I had to use this script to convert the model before converting it with OpenVINO.

Loading custom dataset of images using PyTorch

I'm using the COIL-100 dataset, which has images of 100 objects, 72 images per object, taken with a fixed camera by turning the object 5 degrees per image. The following is the folder structure I'm using:
data/train/obj1/obj01_0.png, obj01_5.png ... obj01_355.png
.
.
data/train/obj85/obj85_0.png, obj85_5.png ... obj85_355.png
.
.
data/test/obj86/obj86_0.png, obj86_5.png ... obj86_355.png
.
.
data/test/obj100/obj100_0.png, obj100_5.png ... obj100_355.png
I have used the ImageFolder and DataLoader classes. The train and test datasets load properly and I can print the class names.
import torch
import torchvision
from torchvision import transforms

train_path = 'data/train/'
test_path = 'data/test/'
data_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
train_data = torchvision.datasets.ImageFolder(
    root=train_path,
    transform=data_transforms
)
test_data = torchvision.datasets.ImageFolder(
    root=test_path,
    transform=data_transforms
)
train_loader = torch.utils.data.DataLoader(
    train_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)
test_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)
print(len(train_data))
print(len(test_data))
classes = train_data.class_to_idx
print("detected classes: ", classes)
In my model I want to pass every image through a pretrained resnet and build a dataset from the resnet outputs to feed into a bidirectional LSTM.
For that I need to access the images by class name and index.
For example, pre_resnet_train_data['obj01'][0] should be obj01_0.png, and post_resnet_train_data['obj01'][0] should be the resnet output of obj01_0.png, and so on.
I'm a beginner in PyTorch and for the past 2 days I have read many tutorials and Stack Overflow questions about creating a custom dataset class, but I couldn't figure out how to achieve what I want.
Please help!
Assuming you only plan on running resnet on the images once and saving the output for later use, I suggest you write your own dataset class derived from ImageFolder.
Save each resnet output at the same location as the image file, with a .pth extension.
import os
import torch
import torchvision

class MyDataset(torchvision.datasets.ImageFolder):
    def __init__(self, root, transform):
        super(MyDataset, self).__init__(root, transform)

    def __getitem__(self, index):
        # override ImageFolder's method
        """
        Args:
            index (int): Index
        Returns:
            tuple: (sample, resnet, target) where target is class_index of the target class.
        """
        path, target = self.samples[index]
        sample = self.loader(path)
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        # this is where you load your resnet data
        resnet_path = os.path.splitext(path)[0] + '.pth'  # replace image extension with .pth
        resnet = torch.load(resnet_path)  # load the stored features
        return sample, resnet, target
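To produce those .pth files in the first place, one option is a single offline pass over the dataset. This is only a rough sketch under some assumptions (a resnet18 with its final fc layer replaced by an identity, and the train_data ImageFolder and transform from the question); adapt the model and feature size to whatever resnet you actually use:
import os
import torch
import torchvision

resnet = torchvision.models.resnet18(pretrained=True)
resnet.fc = torch.nn.Identity()  # keep the 512-d pooled features instead of class scores
resnet.eval()

with torch.no_grad():
    for path, _ in train_data.samples:               # ImageFolder keeps (path, class_index) pairs
        img = train_data.loader(path)                # PIL image
        x = train_data.transform(img).unsqueeze(0)   # shape (1, 3, 224, 224)
        feats = resnet(x).squeeze(0)                 # shape (512,)
        torch.save(feats, os.path.splitext(path)[0] + '.pth')
After this pass, MyDataset.__getitem__ above finds a matching .pth file next to every image, and per-class access like post_resnet_train_data['obj01'][0] can be built by grouping the (sample, resnet, target) tuples by target.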

Why is training using a custom Python layer in PyCaffe extremely slow?

I created a custom layer in Python so that I can feed the data directly,
but I noticed it runs extremely slowly and GPU usage is at most 1% (the memory is allocated, i.e. I can see that when I run the script it allocates 2100 MB of VRAM, and terminating the training frees around 1 GB).
I'm not sure if this is expected behavior or if I'm doing something wrong.
Here is the script I wrote (based on this former PR):
import json
import caffe
import numpy as np
from random import shuffle
from PIL import Image

class MyDataLayer(caffe.Layer):
    """
    This is a simple datalayer for training a network on CIFAR10.
    """
    def setup(self, bottom, top):
        self.top_names = ['data', 'label']
        # === Read input parameters ===
        params = eval(self.param_str)
        # Check the parameters for validity.
        check_params(params)
        # store input as class variables
        self.batch_size = params['batch_size']
        # Create a batch loader to load the images.
        self.batch_loader = BatchLoader(params, None)
        # === reshape tops ===
        # since we use a fixed input image size, we can shape the data layer
        # once. Else, we'd have to do it in the reshape call.
        top[0].reshape(self.batch_size, 3, params['im_height'], params['im_width'])
        # this is for our label, since we only have one label we set this to 1
        top[1].reshape(self.batch_size, 1)
        print_info("MyDataLayer", params)

    def forward(self, bottom, top):
        """
        Load data.
        """
        for itt in range(self.batch_size):
            # Use the batch loader to load the next image.
            im, label = self.batch_loader.load_next_image()
            # Add directly to the caffe data layer
            top[0].data[itt, ...] = im
            top[1].data[itt, ...] = label

    def reshape(self, bottom, top):
        """
        There is no need to reshape the data, since the input is of fixed size
        (rows and columns)
        """
        pass

    def backward(self, top, propagate_down, bottom):
        """
        These layers do not back propagate
        """
        pass
class BatchLoader(object):
    """
    This class abstracts away the loading of images.
    Images can either be loaded singly, or in a batch. The latter is used for
    the asynchronous data layer to preload batches while other processing is
    performed.
    labels:
        the format is like:
            png_data_batch_1/leptodactylus_pentadactylus_s_000004.png 6
            png_data_batch_1/camion_s_000148.png 9
            png_data_batch_1/tipper_truck_s_001250.png 9
    """
    def __init__(self, params, result):
        self.result = result
        self.batch_size = params['batch_size']
        self.image_root = params['image_root']
        self.im_shape = [params['im_height'], params['im_width']]
        # get list of images and their labels.
        self.image_labels = params['label']
        # getting the list of all image filenames along with their labels
        self.imagelist = [line.rstrip('\n\r') for line in open(self.image_labels)]
        self._cur = 0  # current image
        # this class does some simple data-manipulations
        self.transformer = SimpleTransformer()
        print("BatchLoader initialized with {} images".format(len(self.imagelist)))

    def load_next_image(self):
        """
        Load the next image in a batch.
        """
        # Did we finish an epoch?
        if self._cur == len(self.imagelist):
            self._cur = 0
            shuffle(self.imagelist)
        # Load an image
        image_and_label = self.imagelist[self._cur]  # Get the image index
        # read the image filename
        image_file_name = image_and_label[0:-1]
        # load the image
        im = np.asarray(Image.open(self.image_root + '/' + image_file_name))
        # im = scipy.misc.imresize(im, self.im_shape)  # resize
        # do a simple horizontal flip as data augmentation
        flip = np.random.choice(2)*2 - 1
        im = im[:, ::flip, :]
        # Load and prepare ground truth
        # read the label
        label = image_and_label[-1]
        # convert to onehot encoded vector
        # fix: caffe automatically converts the label into one hot encoded vector. so we only need to simply use the decimal number (i.e. the plain label number)
        # one_hot_label = np.eye(10)[label]
        self._cur += 1
        return self.transformer.preprocess(im), label
def check_params(params):
    """
    A utility function to check the parameters for the data layers.
    """
    required = ['batch_size', 'image_root', 'im_width', 'im_height', 'label']
    for r in required:
        assert r in params.keys(), 'Params must include {}'.format(r)

def print_info(name, params):
    """
    Output some info regarding the class
    """
    print("{} initialized with image_root: {}, bs: {}, im_shape: {}x{}, label file: {}.".format(
        name,
        params['image_root'],
        params['batch_size'],
        params['im_height'],
        params['im_width'],
        params['label']))
class SimpleTransformer:
    """
    SimpleTransformer is a simple class for preprocessing and deprocessing
    images for caffe.
    """
    def __init__(self, mean=[125.30, 123.05, 114.06]):
        self.mean = np.array(mean, dtype=np.float32)
        self.scale = 1.0

    def set_mean(self, mean):
        """
        Set the mean to subtract for centering the data.
        """
        self.mean = mean

    def set_scale(self, scale):
        """
        Set the data scaling.
        """
        self.scale = scale

    def preprocess(self, im):
        """
        preprocess() emulates the pre-processing occurring in the vgg16 caffe
        prototxt.
        """
        im = np.float32(im)
        im = im[:, :, ::-1]  # change to BGR
        im -= self.mean
        im *= self.scale
        im = im.transpose((2, 0, 1))
        return im

    def deprocess(self, im):
        """
        inverse of preprocess()
        """
        im = im.transpose(1, 2, 0)
        im /= self.scale
        im += self.mean
        im = im[:, :, ::-1]  # change to RGB
        return np.uint8(im)
And in my train_test.prototxt file I have:
name: "CIFAR10_SimpleTest_PythonLayer"
layer {
name: 'MyPythonLayer'
type: 'Python'
top: 'data'
top: 'label'
include {
phase: TRAIN
}
python_param {
#the python script filename
module: 'mypythonlayer'
#the class name
layer: 'MyDataLayer'
#needed parameters in json
param_str: '{"phase":"TRAIN", "batch_size":10, "im_height":32, "im_width":32, "image_root": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe", "label": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe/train_cifar10.txt"}'
}
}
layer {
name: 'MyPythonLayer'
type: 'Python'
top: 'data'
top: 'label'
include {
phase: TEST
}
python_param {
#the python script filename
module: 'mypythonlayer'
#the class name
layer: 'MyDataLayer'
#needed parameters in json
param_str: '{"phase":"TEST", "batch_size":10, "im_height":32, "im_width":32, "image_root": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe", "label": "G:/Caffe/examples/cifar10/testbed/Train and Test using Pycaffe/test_cifar10.txt"}'
}
}
What's wrong here?
Your data layer is not efficient enough; it takes most of the training time (you should try caffe time ... to get more detailed profiling). At each forward pass you are waiting for the Python layer to read batch_size images from disk, one after the other. This can take forever.
You should consider using multiprocessing to perform the reading in the background while the net is processing the previous batches: this should give you good CPU/GPU utilization.
See this example for a multiprocessing Python data layer.
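As a rough illustration of that idea (this is a sketch, not the linked example; the PrefetchingLoader name and queue size are made up), a background process can keep a queue of preprocessed samples filled while the net is busy, reusing the BatchLoader from the question:
import multiprocessing

def _fill_queue(queue, params):
    """Runs in a background process and keeps `queue` filled with ready (im, label) pairs."""
    loader = BatchLoader(params, None)  # the BatchLoader defined in the question
    while True:
        # put() blocks when the queue is full, so the worker never runs too far ahead
        queue.put(loader.load_next_image())

class PrefetchingLoader(object):
    """Owns the queue and the worker process; the data layer only pulls ready samples."""
    def __init__(self, params, queue_size=256):
        self.queue = multiprocessing.Queue(maxsize=queue_size)
        self.worker = multiprocessing.Process(target=_fill_queue, args=(self.queue, params))
        self.worker.daemon = True  # stop the worker together with the main process
        self.worker.start()

    def load_next_image(self):
        # usually returns immediately; blocks only if the worker fell behind
        return self.queue.get()
Because it exposes the same load_next_image() interface, setup() only needs to construct PrefetchingLoader(params) instead of BatchLoader(params, None) and forward() stays unchanged; disk reads and preprocessing then overlap with the GPU computation instead of stalling every forward pass.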
Python layers are executed on the CPU, not the GPU, so they are slow because data has to keep moving between the CPU and GPU during training. That's also why you see low GPU usage: the GPU is waiting on the CPU to execute the Python layer.

Theano: expected 2, got 1 with shape (128,)

I'm trying to make a neural network. I have followed the video at
https://www.youtube.com/watch?v=S75EdAcXHKk
and loaded the adult.data training set.
I am now training, and these are the lines where the code fails:
while epocs < 5:
    i = 0
    for start, end in zip(range(0, len(trX), 128), range(128, len(trX), 128)):
        print(trX.shape)
        tr = trX[start:end]
        print(tr.shape[0])
        print(tr.shape[1])
        self.cost = train(tr.reshape(tr.shape[0], tr.shape[1]), trY[start:end])
    epocs += 1
I am struggling with this error message:
n.training()
File "C:\Users\Bjornars\PycharmProjects\cogs-118a\Project\NN\Network.py", line 101, in training
self.cost = train(tr.reshape(128,106), trY[start:end])
File "C:\Anaconda3\lib\site-packages\theano\compile\function_module.py", line 513, in call
allow_downcast=s.allow_downcast)
File "C:\Anaconda3\lib\site-packages\theano\tensor\type.py", line 169, in filter
data.shape))
TypeError: ('Bad input argument to theano function with name "C:\Users\Bjornars\PycharmProjects\cogs-118a\Project\NN\Network.py:84" at index 1(0-based)', 'Wrong number of dimensions: expected 2, got 1 with shape (128,).')
The shape of the array I'm sending in is (5000, 106).
--- Solved ---
I used this; train() expected an array, not a number, for each entry of trY:
def preprocess(self, trDmatrix, labels):
    for i in range(len(trDmatrix)):
        numbers = [0.0] * 2
        numbers[int(labels[i])] = 1.0
        labels[i] = numbers
    return trDmatrix, labels
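Presumably preprocess() is applied to the labels before the training loop; the exact call site below is an assumption:
trX, trY = self.preprocess(trX, trY)
Each entry of trY then becomes a length-2 one-hot vector, so a slice trY[start:end] has shape (128, 2) and matches the 2-D input the compiled Theano function expects.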
