Using Torch trained VGG face - machine-learning

Is there any way I can pass existing images on my system through a trained VGG network with Torch? I am using Ubuntu 14.04 and unfortunately do not have a GPU. I have searched quite extensively, but all the examples I have found require a GPU. Are there other ways to use VGG without Torch? I'm open to suggestions, but the method should not require a GPU.

While running the network on a GPU will make things a lot faster, you can run the network in CPU-only mode.
Once you load the model, pretrained on a GPU, you can simply convert it to CPU as follows:
model = model:float()  -- convert the network's parameters to CPU FloatTensors
You can easily load an image from your computer with the help of the image library and then do a forward pass:
local image = require 'image'
local img = image.load(imagefile, 3, 'byte'):float()  -- load 3 channels, convert to a FloatTensor to match the CPU model
local output = model:forward(img)

Related

Efficient inference of 3D deep learning model (pytorch)

I am trying to use a PyTorch 3D UNet for inference (from here: https://github.com/wolny/pytorch-3dunet), which receives images of size (96, 96, 96). I would like to use it on CPU instances, but I am getting very high memory usage (~18 GB). After researching the subject, I found out that this was due to the way convolutions are implemented on CPU (see https://discuss.pytorch.org/t/pytorch-high-memory-demand/2798/5). I thus have the following questions:
Is there a way to use a more memory-efficient implementation of the convolution in Pytorch?
How can I optimize my model for CPU inference? I saw that some tools like AWS Neo, Intel OpenVINO, etc. exist; could they solve my problem?
Does Tensorflow have a similar problem for using convolutions on CPU?
Any other tips or links on how to deploy such models efficiently are welcome!
Thanks!
You could benchmark your model's performance with DNN-Bench and choose the best inference engine for your application and your hardware. You might need to convert your model to ONNX first.
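If you go the ONNX route, a minimal export sketch for a PyTorch model could look like the following; the stand-in network, file name, and input shape here are assumptions rather than details from the question:
import torch
import torch.nn as nn

# Stand-in for the real 3D UNet from pytorch-3dunet; swap in your loaded model.
model = nn.Sequential(nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv3d(8, 1, 3, padding=1))
model.eval()

dummy = torch.randn(1, 1, 96, 96, 96)  # (batch, channels, depth, height, width)
torch.onnx.export(model, dummy, "model.onnx",
                  input_names=["input"], output_names=["output"],
                  opset_version=11)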

Can we use CPU instead of GPU to train custom YOLO model for object detection

I want to train a YOLO model on my custom object dataset. I have read about it on various sites, and everyone says a GPU should be used to train and run a custom YOLO model.
But since I don't have a GPU, I am confused about what to do, because I cannot buy one. I also read about Google Colab, but I can't use it, since I want to run my model on an offline system.
I am worried after seeing the system utilization of the YOLO implementation used in this GitHub project:
https://github.com/AhmadYahya97/Fully-Automated-red-light-Violation-Detection.git.
I was running this on my laptop with the following configuration:
RAM: 4 GB
Processor: Intel i3, 2.40 GHz
OS: Ubuntu 18.04 LTS
Although it is going to be a lot slower, yes, you can use the CPU only for both training and prediction. If you are using the original Darknet framework, set the GPU flag to GPU=0 in the Makefile when installing Darknet.
How to install Darknet: https://pjreddie.com/darknet/install/
Then you can start training or predicting by following this guide: https://pjreddie.com/darknet/yolo/
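For reference, the relevant flags sit at the top of the Darknet Makefile (names from the pjreddie repository); leaving GPU and CUDNN at 0 gives a CPU-only build:
GPU=0
CUDNN=0
OPENCV=0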

How does MTCNN perform vs DLIB for face detection?

I saw MTCNN being recommended but haven't seen a direct comparison of DLIB and MTCNN.
I assume that since MTCNN uses a neural network it might work better for more use cases, but also have some surprisingly horrible edge cases?
Has anyone done an analysis of error rate, performance under different conditions (GPU and CPU), and general eyeball observations of the two?
You can have a look at this amazing Kaggle notebook by timesler. A comparison is made between facenet-pytorch, DLIB, and MTCNN.
https://www.kaggle.com/timesler/comparison-of-face-detection-packages
"Each package is tested for its speed in detecting the faces in a set of 300 images (all frames from one video), with GPU support enabled. Detection is performed at 3 different resolutions.
Any one-off initialization steps, such as model instantiation, are performed prior to performance testing."
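If you want to try a quick check of your own with facenet-pytorch's MTCNN, a minimal CPU-only sketch might look like this (the image path is a placeholder):
from facenet_pytorch import MTCNN
from PIL import Image

mtcnn = MTCNN(keep_all=True, device='cpu')  # keep_all=True returns every detected face
img = Image.open('img.jpg')
boxes, probs = mtcnn.detect(img)            # bounding boxes and detection confidences
print(boxes, probs)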
You can test it within deepface easily. My experiments show that mtcnn outperforms dlib.
#!pip install deepface
from deepface import DeepFace

backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
# pick a detector by index, e.g. backends[3] for mtcnn
DeepFace.detectFace("img.jpg", detector_backend = backends[0])

How do I run a Tensorflow Object Detection API model in iOS?

I have just trained a model with satisfactory results and I have the frozen_inference_graph.pb. How would I go about running this on iOS? It was trained on SSD MobileNet V1, if that helps. Ideally I'd like to run it using the GPU (I know the TensorFlow API can't do that on iOS), but it would be great to just have it on CPU first.
Support was just announced for importing TensorFlow models into Core ML. This is accomplished using the tfcoreml converter, which should take in your .pb graph and output a Core ML model. From there, you can use this model with Core ML and either take in still images or video frames for processing.
At that point, it's up to you to make sure you're providing the correct input colorspace and size, then extracting and processing the SSD results correctly to get your object classes and bounding boxes.
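A rough sketch of that tfcoreml conversion step is below; the tensor names and input shape are assumptions, so inspect your frozen graph for the real ones, and note that the SSD post-processing ops may need to be stripped before conversion:
import tfcoreml

tfcoreml.convert(
    tf_model_path='frozen_inference_graph.pb',
    mlmodel_path='ssd_mobilenet.mlmodel',
    input_name_shape_dict={'image_tensor:0': [1, 300, 300, 3]},  # assumed input tensor and size
    image_input_names=['image_tensor:0'],
    output_feature_names=['concat:0', 'concat_1:0'])  # assumed raw box/score tensors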

Speeding up inference of Keras models

I have a Keras model which is doing inference on a Raspberry Pi (with a camera). The Raspberry Pi has a really slow CPU (1.2 GHz) and no CUDA GPU, so the model.predict() stage is taking a long time (~20 seconds). I'm looking for ways to reduce that by as much as possible. I've tried:
Overclocking the CPU (+200 MHz), which shaved off a few seconds.
Using float16's instead of float32's.
Reducing the image input size as much as possible.
Is there anything else I can do to increase the speed during inference? Is there a way to simplify a model.h5 and take a drop in accuracy? I've had success with simpler models, but for this project I need to rely on an existing model so I can't train from scratch.
The VGG16 / VGG19 architectures are very slow since they have lots of parameters. Check this answer.
Before any other optimization, try to use a simpler network architecture.
Google's MobileNet seems like a good candidate since it's implemented in Keras and it was designed for more limited devices.
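As a rough sketch (not from the answer itself), instantiating a small MobileNet in Keras with a reduced width multiplier and input size could look like this:
from keras.applications import MobileNet

# alpha < 1.0 shrinks the number of filters; a smaller input_shape cuts compute further
model = MobileNet(input_shape=(128, 128, 3), alpha=0.5, weights='imagenet')
model.summary()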
If you can't use a different network, you may compress the network with pruning. This blog post specifically shows how to do pruning with Keras.
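The linked post may use a different toolchain; as one concrete option, a magnitude-pruning sketch with the tensorflow_model_optimization package (the file name and training details are assumptions) could look like this:
import tensorflow_model_optimization as tfmot
from tensorflow import keras

base_model = keras.models.load_model("model.h5")             # the existing model from the question
pruned = tfmot.sparsity.keras.prune_low_magnitude(base_model)

pruned.compile(optimizer="adam", loss="categorical_crossentropy")
# Fine-tune briefly so accuracy can recover while sparsity is applied, e.g.:
# pruned.fit(x_train, y_train, epochs=2,
#            callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])

final_model = tfmot.sparsity.keras.strip_pruning(pruned)     # drop pruning wrappers before saving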
Maybe OpenVINO will help. OpenVINO is an open-source toolkit for network inference, and it optimizes the inference performance by, e.g., graph pruning and fusing some operations. The ARM support is provided by the contrib repository.
Here are the instructions on how to build an ARM plugin to run OpenVINO on Raspberry Pi.
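Once the model has been converted to OpenVINO IR (model.xml / model.bin) with the Model Optimizer, loading it from the Python runtime might look roughly like this; the file name and input shape are placeholders:
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
compiled = core.compile_model(model, "CPU")  # with the ARM plugin, "CPU" targets the Pi's ARM cores
result = compiled([np.zeros((1, 3, 224, 224), dtype=np.float32)])  # dummy input to illustrate the call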
Disclaimer: I work on OpenVINO.
