I'm working on a project that requires object detection and recognition on images fed by a live camera.
For it to work well, it should perform the evaluation for a single frame as quickly as possible.
Will going straight for TensorFlow, instead of using Keras with the TF backend, improve the performance of the whole model during evaluation?
I am developing an image segmentation algorithm. I used DeepLabv3+ to train a model, which performs fine. The question is whether I can improve its performance; I would like to know if there are preprocessing methods I should apply before feeding in the image.
I want to detect the charging connector, as depicted in the image:
I can detect it without any problem, but I am just looking for possible improvements, if any exist.
I am using the Inception Model to extract features for Transfer Learning in TensorFlow. The major issue is that it only extracts features for one image at a time. How can I use it in batch mode to make it faster?
Is it possible to feed image features, say SIFT features, to a convolutional neural network model in TensorFlow? I am trying a TensorFlow implementation of this project, in which a grayscale image is coloured. Will image features be a better choice than feeding the images as-is to the model?
P.S. I am a novice at machine learning and am not familiar with creating neural network models.
You can feed a TensorFlow neural net almost anything.
If you have extra features for each pixel, then instead of using one channel (intensity) you would use multiple channels.
If you have extra features that describe the whole image, you can create a separate input and merge the features at some upper layer (see the sketch below).
As for which gives better performance, you should try both approaches.
The general intuition is that extra features help if you don't have many samples; their effect diminishes once you have many samples and the network can learn the features by itself.
One more point: if you are a novice, I strongly recommend using a higher-level framework like keras.io (which is a layer over TensorFlow) instead of raw TensorFlow.
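If you go the Keras route, a two-input model that merges a raw-image branch with a precomputed feature vector at an upper layer might look roughly like the sketch below. The input shapes, layer sizes and the 128-dimensional feature length are placeholder assumptions, not values from the question:

    # Minimal sketch: one branch takes the raw grayscale image, the other a
    # fixed-length vector of precomputed features (e.g. aggregated SIFT
    # descriptors); both branches are merged before the output layer.
    from keras.models import Model
    from keras.layers import Input, Conv2D, Flatten, Dense, concatenate

    image_in = Input(shape=(32, 32, 1))     # placeholder image size, 1 channel
    feats_in = Input(shape=(128,))          # placeholder feature-vector length

    x = Conv2D(16, (3, 3), activation='relu', padding='same')(image_in)
    x = Flatten()(x)

    merged = concatenate([x, feats_in])     # merge at an upper layer
    out = Dense(10, activation='softmax')(merged)

    model = Model(inputs=[image_in, feats_in], outputs=out)
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    # model.fit([images, features], labels, ...)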
I am working on creating a real-time image processor for a small-scale self-driving car project for uni. It uses a Raspberry Pi to gather various pieces of information, which are sent to the program to base its decisions on.
The only stage I have left is to create a neural network which will view the image from the camera (I already have the code to send the array of CV_32F values between 0 and 255).
I have been scouring the internet and cannot seem to find any example code related to my specific issue, or to my kind of task in general (how to implement a neural network of this kind). So my question is: is it possible to create a NN of this size in C++ without hard-coding it (i.e. utilising OpenCV's capabilities)? It will need 400 input nodes, one for each value of a 20x20 image, and produce 4 outputs: left, right, forward or backward respectively.
How would one create a neural network in OpenCV?
Does OpenCV provide a backpropagation (training) interface/function, or would I have to write this myself?
Once it is trained, am I correct in assuming I can load the neural network using ANN_MLP load, etc.? Following this, I would pass the live-stream frame (as an array of values) to it, and it should be able to produce the correct output.
Edit: I have found this: OpenCV image recognition - setting up ANN MLP. It is very simple in comparison to what I want to do, and I am not sure how to adapt it to my problem.
OpenCV is not a neural network framework, and accordingly you won't find any advanced features in it. It's far more common to use a dedicated ANN library and combine it with OpenCV. Caffe is a great choice as a deep learning framework dedicated to computer vision (with a C++ API), and it can be combined with OpenCV.
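That said, OpenCV's ml module does expose a basic multilayer perceptron trained with backpropagation, which covers the train/save/load/predict workflow described in the question. A rough Python sketch, assuming an OpenCV 3.x build where cv2.ml.ANN_MLP_load is available; the hidden-layer size and the random training data are placeholders:

    import numpy as np
    import cv2

    # 400 inputs (flattened 20x20 frame), one hidden layer, 4 outputs
    # (left / right / forward / backward as a one-hot vector).
    ann = cv2.ml.ANN_MLP_create()
    ann.setLayerSizes(np.array([400, 32, 4], dtype=np.int32))
    ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM, 1.0, 1.0)
    ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)

    samples = np.random.rand(100, 400).astype(np.float32)    # placeholder frames
    responses = np.zeros((100, 4), dtype=np.float32)         # one-hot labels
    responses[np.arange(100), np.random.randint(0, 4, 100)] = 1.0

    ann.train(samples, cv2.ml.ROW_SAMPLE, responses)
    ann.save('ann_mlp.xml')

    # Later: load the trained net and feed a flattened live frame (1x400 float32).
    ann2 = cv2.ml.ANN_MLP_load('ann_mlp.xml')
    _, output = ann2.predict(np.random.rand(1, 400).astype(np.float32))
    direction = int(np.argmax(output))    # index 0..3 -> left/right/fwd/back

The equivalent calls exist in the C++ API (cv::ml::ANN_MLP) as well, so the sketch translates directly.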
I need to do real-time augmentation on my dataset as input to a CNN, but I am having a really tough time finding suitable libraries for it. I tried Caffe, but its DataTransformer doesn't support many real-time augmentations like rotation, etc. So for ease of implementation I settled on Lasagne, but it seems that it also doesn't support real-time augmentation. I have seen some posts related to facial keypoints detection where the author uses the BatchIterator of nolearn.lasagne, but I am not sure whether it is real-time or not, and there is no proper tutorial for it. So, finally, how should I do real-time augmentation in Lasagne, either through nolearn or otherwise?
You can use the Keras framework for real-time data augmentation during CNN training. Here is the example code for the CIFAR10 dataset from GitHub. You can also adapt it to your needs, or copy the source code and add it to a Lasagne project, though I have not tried importing it into Lasagne before. The basic idea is to randomly augment the data in every batch: if you have a for loop over the batches fed to the network, you can call your augmentation function before sending the data to the network.
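A minimal sketch of that idea with Keras' ImageDataGenerator (Keras 2 fit_generator signature; x_train, y_train and model are assumed to already exist, and the augmentation parameters are only illustrative):

    from keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rotation_range=15,        # random rotations
        width_shift_range=0.1,    # random horizontal shifts
        height_shift_range=0.1,   # random vertical shifts
        horizontal_flip=True)     # random horizontal flips

    # Each batch drawn from the generator is augmented on the fly.
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=32),
                        steps_per_epoch=len(x_train) // 32,
                        epochs=10)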
Yes, you can do real-time data augmentation in Lasagne. The simplest way is to use the GaussianNoiseLayer: simply insert it after your input layer. If Gaussian noise is not what you need, then at least you have GaussianNoiseLayer as an example of how to implement your own.
Note how the deterministic parameter is used in Lasagne. It is off by default, so during training the noise is added. During testing you set deterministic=True and the augmentation is simply avoided.
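A minimal sketch of that setup (the network shape and sigma are placeholder values):

    import lasagne
    from lasagne.layers import InputLayer, GaussianNoiseLayer, DenseLayer, get_output

    l_in = InputLayer(shape=(None, 1, 28, 28))
    l_noise = GaussianNoiseLayer(l_in, sigma=0.1)   # noise added right after the input
    l_out = DenseLayer(l_noise, num_units=10,
                       nonlinearity=lasagne.nonlinearities.softmax)

    # Training expression: deterministic defaults to False, so noise is applied.
    train_out = get_output(l_out)
    # Test expression: deterministic=True disables the noise layer.
    test_out = get_output(l_out, deterministic=True)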
Yes, the Facial Keypoints Recognition tutorial that you mention does use real-time (on the fly) augmentation to flip the input images (and target coordinates) at random.
The nolearn-utils library has a ton of examples of iterators that do several types of augmentation. E.g. AffineTransformBatchIteratorMixin does random affine transforms on the fly.
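For illustration, a sketch of that pattern: a BatchIterator subclass that mirrors half of each batch horizontally on the fly, as in the Facial Keypoints tutorial (if your targets encode coordinates, they would need to be flipped as well):

    import numpy as np
    from nolearn.lasagne import BatchIterator

    class FlipBatchIterator(BatchIterator):
        def transform(self, Xb, yb):
            Xb, yb = super(FlipBatchIterator, self).transform(Xb, yb)
            # Pick half of the batch at random and flip it left-right
            # (Xb has shape (batch, channels, rows, cols)).
            idx = np.random.choice(Xb.shape[0], Xb.shape[0] // 2, replace=False)
            Xb[idx] = Xb[idx, :, :, ::-1]
            return Xb, yb

    # net = NeuralNet(..., batch_iterator_train=FlipBatchIterator(batch_size=128))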