I have retrained the TensorFlow Inception image classification model on my own collected dataset, and it is working fine. Now I want to build a continuous image classifier on live camera video. I have a Raspberry Pi camera for input.
Here is the Google I/O 2017 talk (https://www.youtube.com/watch?v=ZvccLwsMIWg&index=18&list=PLOU2XLYxmsIJqntMn36kS3y_lxKmQiaAS); I want to do the same as what is shown at 3:20/8:49 in the video.
Is there any tutorial to achieve this?
Step one
Put your TensorFlow model aside for this first step. Follow one of the many online tutorials, like this one, that show how to get an image from your Raspberry Pi.
You should be able to prove to yourself that your code works by displaying the images on a device or FTPing them to another computer that has a screen.
You should also be able to benchmark the rate at which you can capture images; it should be about five per second or faster.
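For example, a minimal capture-rate benchmark with the picamera library might look like the sketch below (the resolution, frame count, and warm-up time are placeholder choices, not requirements):

    import io
    import time
    import picamera

    # Grab frames continuously into an in-memory buffer and time them.
    with picamera.PiCamera(resolution=(640, 480), framerate=30) as camera:
        time.sleep(2)  # give the sensor a moment to warm up
        stream = io.BytesIO()
        frames, start = 0, time.time()
        for _ in camera.capture_continuous(stream, format="jpeg",
                                           use_video_port=True):
            frames += 1
            stream.seek(0)
            stream.truncate()
            if frames == 50:
                break
        print("capture rate: %.1f fps" % (frames / (time.time() - start)))

Using the video port trades a little image quality for much faster captures, which is what you want for a live classifier.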
Step two
Look up and integrate image resizing as needed; Google and Stack Overflow are great places to search for how to do that. Again, verify that you are able to resize the image to exactly the dimensions your TensorFlow model needs.
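As a sketch in Python with OpenCV (assuming the 299x299 input that the retrained Inception graph usually expects; check your model's actual input size, and the file names here are placeholders):

    import cv2

    # Resize a captured frame to the model's expected input dimensions.
    frame = cv2.imread("frame.jpg")
    resized = cv2.resize(frame, (299, 299), interpolation=cv2.INTER_AREA)
    cv2.imwrite("frame_299.jpg", resized)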
Step three
Copy some of the images over to your dev environment and verify that they work with your model as-is.
Step four
FTP your trained TensorFlow model to the Pi and install the supporting libraries. Integrate the pieces into one codebase and turn it on.
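A rough sketch of what the integrated loop could look like, assuming the output_graph.pb / output_labels.txt files and tensor names produced by the classic Inception retraining script (adjust all of these to match your own graph):

    import cv2
    import numpy as np
    import tensorflow as tf

    # Load the retrained graph (TensorFlow 1.x style).
    with tf.gfile.FastGFile("output_graph.pb", "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name="")

    labels = [line.strip() for line in open("output_labels.txt")]

    with tf.Session() as sess:
        softmax = sess.graph.get_tensor_by_name("final_result:0")
        cap = cv2.VideoCapture(0)  # or pull frames via picamera as in step one
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Feed the frame as JPEG bytes; the graph decodes and resizes it.
            _, jpeg = cv2.imencode(".jpg", frame)
            preds = sess.run(softmax, {"DecodeJpeg/contents:0": jpeg.tobytes()})
            print(labels[int(np.argmax(preds))])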
I have one image stored in my bundle or in the application.
Now I want to scan images with the camera and compare them with my locally stored image. When the image is matched, I want to play a video, and if the user moves the camera away from that particular image to somewhere else, I want to stop the video.
For that I have tried the Wikitude SDK for iOS, but it is not working properly; it keeps crashing because of memory issues or other reasons.
Core ML and ARKit also came to mind, but Core ML detects an image's properties like name, type, colors, etc., and I want to match the image itself. ARKit does not support all devices and iOS versions, and I have no idea whether image matching as per my requirement is even possible with it.
If anybody has any idea how to achieve this, please share it. Every bit of help will be appreciated. Thanks :)
The easiest way is ARKit's image detection. You already know the limitations of the devices it supports, but the results it gives are good and it is really easy to implement. Here is an example.
Next is Core ML, which is the hardest way: you need to understand machine learning, at least briefly, and then comes the tough part, training with your own dataset. The biggest drawback is that you have only a single image, so I would discard this method.
Finally, the middle-ground solution is to use OpenCV. It might be hard, but it suits your need: you can use different feature-matching methods to find your image in the camera feed (example here). You can use Objective-C++ to write the C++ code for iOS.
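To illustrate the feature-matching idea, here is a rough ORB-based sketch (shown in Python for brevity; the same calls exist in OpenCV's C++ API for Objective-C++, and the file names and thresholds here are made up):

    import cv2

    # Try to find the stored reference image inside a camera frame.
    ref = cv2.imread("reference.jpg", cv2.IMREAD_GRAYSCALE)
    frame = cv2.imread("camera_frame.jpg", cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(ref, None)
    kp2, des2 = orb.detectAndCompute(frame, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    good = [m for m in matches if m.distance < 40]  # threshold to tune

    # Enough good matches => the reference image is probably in view,
    # so start the video; too few => stop it.
    print("match" if len(good) > 25 else "no match")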
Your task is image similarity, and you can do it simply, with more reliable results, using machine learning. Since your task involves camera scanning, the better option is Core ML. You can refer to this link by Apple for image similarity, and you can improve your results by training with your own datasets. If you need any more clarification, comment.
Another approach is to use a so-called "siamese network", which really means that you take a model such as Inception-v3 or MobileNet, run both images through it, and compare their outputs.
However, these models usually give a classification output, i.e. "this is a cat". But if you remove that classification layer from the model, it gives an output that is just a bunch of numbers that describe what sort of things are in the image but in a very abstract sense.
If these numbers for two images are very similar -- if the "distance" between them is very small -- then the two images are very similar too.
So you can take an existing Core ML model, remove the classification layer, run it twice (once on each image), which gives you two sets of numbers, and then compute the distance between these numbers. If this distance is lower than some kind of threshold, then the images are similar enough.
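A sketch of that pipeline, using Keras and MobileNetV2 rather than Core ML (the file names and the 0.8 threshold are illustrative; a Keras model like this can later be converted with coremltools):

    import numpy as np
    import tensorflow as tf

    # MobileNetV2 with the classification layer removed; global average
    # pooling turns each image into a single feature vector.
    base = tf.keras.applications.MobileNetV2(include_top=False,
                                             pooling="avg",
                                             weights="imagenet")

    def embed(path):
        img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
        x = tf.keras.preprocessing.image.img_to_array(img)[np.newaxis, ...]
        x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
        return base.predict(x)[0]

    a, b = embed("stored_image.jpg"), embed("camera_frame.jpg")
    # Cosine similarity as the "distance" between the two descriptions.
    cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    print("similar" if cos > 0.8 else "different")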
I am supposed to do traffic symbol recognition on live streaming data. Please tell me how to automate the segmentation process. I am able to recognize the symbols from segmented data using neural networks, but I am stuck on the segmentation part.
I have tried it using YOLO, but I think I am lacking something.
I have also tried it with OpenCV.
Please help.
(The question included two images: an input frame from the live stream and the expected segmented output.)
I would suggest you follow this link:
https://github.com/AlexeyAB/darknet/tree/47c7af1cea5bbdedf1184963355e6418cb8b1b4f#how-to-train-pascal-voc-data
It's very simple to follow. You basically need to do two steps: installing darknet and creating the data you want to use (road signs in your case).
So follow the installation guide, and then try to find a dataset of road signs, use your own, or create your own. You will need the annotation files as well (you can easily generate them yourself if you use your own dataset; this is explained in the link, too). You don't need a huge number of pictures, because darknet will augment the images automatically (just resizing, though). If you use a pretrained version, you should get "ok" results pretty fast, after roughly 500 iterations.
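For reference, darknet expects one plain-text annotation file per image, one line per object, in the form <class-id> <x_center> <y_center> <width> <height>, with all coordinates relative to the image width and height. The values below are made-up examples:

    0 0.512 0.430 0.160 0.240
    2 0.250 0.660 0.090 0.120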
I need to recognize specific images using the iPhone camera. My goal is to have a set of 20 images, that when a print or other display of one of them is present in front of the camera, the app recognizes that image.
I thought about using classifiers (Core ML), but I don't think it would give the intended result. For example, if I had a model that recognizes fruits, and then I showed it two different pictures of a banana, it would recognize them both as bananas, which is not what I want. I want my app to recognize specific images, regardless of their content.
The behavior I want is exactly what ARToolKit does (https://www.artoolkit.org/documentation/doku.php?id=3_Marker_Training:marker_nft_training), but I do not wish to use this library.
So my question is: are there any other libraries, or other ways, for me to recognize specific images from the camera on iOS (preferably in Swift)?
Since you are using images specific to your use case, there isn't going to be an existing model that you can use. You'd have to create a model, train it, and then import it into Core ML. It's hard to provide specific advice, since I know nothing about your images.
As far as libraries are concerned, check out this list and Swift-AI.
Swift-AI has a neural network that you might be able to train if you had enough images.
Most likely you will have to create the model in another language, such as Python, and then import it into your Xcode project.
Take a look at this question.
This blog post goes into some detail about how to train your own model for CoreML.
Keras is probably your best bet to build your model. Take a look at this tutorial.
There are other problems too, though: you only have 20 images, which is certainly not enough to train an accurate model, and the user can present modified versions of these images. You'd have to generate realistic samples of each possible image and then use that entire set to train the model. I'd say you need a minimum of 20 variants of each image (400 total).
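One way to generate those variants offline is Keras' ImageDataGenerator; here is a minimal sketch (the file names and perturbation ranges are placeholders to tune against how users will actually hold the prints):

    import os
    import numpy as np
    from tensorflow.keras.preprocessing.image import (ImageDataGenerator,
                                                      img_to_array, load_img)

    # Random rotations, shifts, zoom, and brightness changes approximate
    # the distortions of a photographed print.
    datagen = ImageDataGenerator(rotation_range=15,
                                 width_shift_range=0.1,
                                 height_shift_range=0.1,
                                 brightness_range=(0.7, 1.3),
                                 zoom_range=0.1)

    os.makedirs("augmented", exist_ok=True)
    img = img_to_array(load_img("reference_01.jpg"))[np.newaxis, ...]
    for i, _ in enumerate(datagen.flow(img, batch_size=1,
                                       save_to_dir="augmented",
                                       save_prefix="ref01",
                                       save_format="jpeg")):
        if i >= 19:  # stop after 20 generated variants
            break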
You'll want to pre-process the image and extract features that you can compare to the known features of your images. This is how facial recognition works. Here is a guide for facial recognition that might be able to help you with feature extraction.
Simply put, without a model that is based on your images, you can't do much.
Answering my own question.
I ended up following this awesome tutorial that uses OpenCV to recognize specific images, and teaches how to make a wrapper so this code can be accessed by Swift.
I am working on a use case where a bunch of images of devices in the field are provided. Some of the images show bad devices, and some show good ones. This data has to be stored on Hadoop. I have been trying to identify the images with bad devices based on certain criteria, such as "white spots" in the image, using OpenCV-Python libraries.
However, it is not giving accurate results. Which algorithm and software package would be good for image classification in the above scenario?
Here is what I am planning to try, but I am looking for guidance so that I don't tread the wrong path too far.
Use OpenCV with the Python API; I have tried loading the images as Spark RDDs of NumPy arrays.
Get SIFT keypoints for the images and use them as features, then apply an SVM algorithm to classify images with good devices vs. images with bad devices (I am new to OpenCV); see the sketch after the alternatives below.
(or)
Use TensorFlow and build a convolutional neural network to classify the images into ones with good devices and ones with bad devices (I am completely new to this).
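A rough sketch of the first option (SIFT plus an OpenCV SVM; the mean-pooling of descriptors is a crude stand-in for a proper bag-of-visual-words vocabulary, and the file names are placeholders — a real run needs many more training images):

    import cv2
    import numpy as np

    sift = cv2.SIFT_create()  # older builds: cv2.xfeatures2d.SIFT_create()

    def features(path):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, des = sift.detectAndCompute(img, None)
        # Mean of all 128-dim descriptors -> one fixed-length vector.
        return des.mean(axis=0) if des is not None else np.zeros(128, np.float32)

    train = np.array([features(p) for p in ["good_01.jpg", "bad_01.jpg"]],
                     dtype=np.float32)
    labels = np.array([[1], [0]], dtype=np.int32)  # 1 = good device, 0 = bad

    svm = cv2.ml.SVM_create()
    svm.setType(cv2.ml.SVM_C_SVC)
    svm.setKernel(cv2.ml.SVM_RBF)
    svm.train(train, cv2.ml.ROW_SAMPLE, labels)

    _, result = svm.predict(np.array([features("unknown.jpg")], np.float32))
    print(result)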
I also have to identify the GPS coordinates of these images, which may be possible using FLIR software. Does anyone know if there is an open-source version of this package? I found libFLIR but don't have any examples of how to use it.
I am working on creating a real-time image processor for a small-scale self-driving car project for uni. It uses a Raspberry Pi to get various information to send to the program, which bases its decisions on it.
The only stage I have left is to create a neural network that will view the image from the camera (I already have the code to send the array of CV_32F values between 0 and 255, etc.).
I have been scouring the internet and cannot seem to find any example code related to my specific issue, or to this kind of task in general (how to implement a neural network of this kind). So my question is: is it possible to create a NN of this size in C++ without hand-coding it (i.e., using OpenCV's capabilities)? It will need 400 input nodes, one per value (from a 20x20 image), and must produce 4 outputs: left, right, forward, or backward, respectively.
How would one create a neural network in opencv?
Does OpenCV provide a backpropagation (training) interface/function, or would I have to write this myself?
Once it is trained, am I correct in assuming I can load the neural network using ANN_MLP's load function, etc.? Following this, I would pass the live-stream frame (as an array of values) to it, and it should be able to produce the correct output.
Edit: I have found this: OpenCV image recognition - setting up ANN MLP. It is very simple in comparison to what I want to do, and I am not sure how to adapt it to my problem.
OpenCV is not a neural network framework, so you won't find any advanced features in it. It is far more common to use a dedicated ANN library and combine it with OpenCV. Caffe is a great choice as a computer-vision-dedicated deep learning framework (with a C++ API), and it can be combined with OpenCV.
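That said, OpenCV does ship a basic multilayer perceptron (cv::ml::ANN_MLP) with backpropagation training, which covers the 400-input/4-output case in the question. A minimal sketch (shown in Python for brevity; the C++ API mirrors these calls, and the layer sizes and training data here are placeholders):

    import cv2
    import numpy as np

    # 400 inputs (20x20 pixels), one hidden layer, 4 outputs
    # (left / right / forward / backward as one-hot rows).
    ann = cv2.ml.ANN_MLP_create()
    ann.setLayerSizes(np.array([400, 32, 4], dtype=np.int32))
    ann.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
    ann.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)
    ann.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER | cv2.TERM_CRITERIA_EPS,
                         1000, 1e-4))

    # Placeholder training data: rows of 400 pixel values in [0, 1].
    samples = np.random.rand(8, 400).astype(np.float32)
    responses = np.eye(4, dtype=np.float32)[np.random.randint(0, 4, 8)]
    ann.train(samples, cv2.ml.ROW_SAMPLE, responses)

    ann.save("mlp.yml")                    # persist after training...
    ann2 = cv2.ml.ANN_MLP_load("mlp.yml")  # ...and load it back, as asked
    _, out = ann2.predict(samples[:1])
    print(out.argmax())  # index of the predicted direction

So yes, once trained you can save and reload the network and feed it each live-stream frame as a row of values, exactly as you assumed; for anything deeper than an MLP, though, a framework like Caffe remains the better route.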