I have trained a YOLOv4 CNN, and it is already pretty good. I want more images as training data, but there is no point in manually annotating most of them because the CNN can do it for me; I could then review and correct any mistakes. Is there an image annotation tool or service that can do that? I'm currently using Supervisely. I also tried CVAT and VoTT but couldn't find such a feature.
I created a Python tool that generates a Supervisely project using darknet. It's available on GitHub.
https://github.com/s1n7ax/partially-annotate
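To illustrate the general idea of pre-annotating with an existing model (this is not the code from the linked repository), here is a minimal sketch that turns detector output into darknet-style YOLO label lines for later review. The detection tuple layout, the `to_yolo_lines` name and the `min_conf` threshold are assumptions made for this example; an annotation tool such as Supervisely would need its own import format.

```python
from typing import List, Tuple

# Hypothetical detection layout (an assumption for this sketch):
# (class_id, confidence, left, top, right, bottom), with box corners in pixels.
Detection = Tuple[int, float, float, float, float, float]


def to_yolo_lines(detections: List[Detection], img_w: int, img_h: int,
                  min_conf: float = 0.5) -> List[str]:
    """Convert pixel-space detections to YOLO label lines.

    Each line is "class x_center y_center width height", with all four
    box values normalised by the image size.
    """
    lines = []
    for class_id, conf, left, top, right, bottom in detections:
        # Skip low-confidence boxes so the reviewer sees fewer false positives.
        if conf < min_conf:
            continue
        # Clamp the box to the image so normalised values stay in [0, 1].
        left = max(0.0, min(float(left), float(img_w)))
        right = max(0.0, min(float(right), float(img_w)))
        top = max(0.0, min(float(top), float(img_h)))
        bottom = max(0.0, min(float(bottom), float(img_h)))
        box_w = right - left
        box_h = bottom - top
        # Drop boxes that are empty after clamping.
        if box_w <= 0 or box_h <= 0:
            continue
        x_center = (left + right) / 2 / img_w
        y_center = (top + bottom) / 2 / img_h
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} "
            f"{box_w / img_w:.6f} {box_h / img_h:.6f}"
        )
    return lines
```

The resulting lines can be written to one .txt file per image and then corrected by hand wherever the model got it wrong.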
I'm trying to figure out the easiest way to run object detection from a Tensorflow model (Inception or mobilenet) in an iOS app.
I have iOS Tensorflow image classification working in my own app and network following this example
and have Tensorflow image classification and object detection working in Android for my own app and network following this example
but the iOS example contains only image classification, not object detection. How can I extend the iOS example code to support object detection, or is there a complete iOS example for this (preferably in Objective-C)?
I did find this and this, but both require recompiling Tensorflow from source, which seems complex.
I also found Tensorflow Lite, but again without object detection.
I also found the option of converting the Tensorflow model to Apple Core ML, but this seems very complex, and I could not find a complete example of object detection in Core ML.
You need to train your own ML model. On iOS it will be easier to just use Core ML, and Tensorflow models can be exported to Core ML format. You can play with this sample and try different models: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture
Or here:
https://github.com/ytakzk/CoreML-samples
So I ended up following this demo project,
https://github.com/csharpseattle/tensorflowiOS
It provided a working demo app/project, and was easy to switch its Tensorflow pb file for my own trained network file.
The instructions in the readme are pretty straightforward.
You do need to check out and recompile Tensorflow, which takes several hours and 10 GB of space. I did hit the thread issue and used the gsed instructions, which worked. You also need to install Homebrew.
I have not looked at Core ML yet, but from what I have read, converting from Tensorflow to Core ML is complicated, and you may lose parts of your model.
It ran quite fast on iPhone, even using an Inception model instead of Mobilenet.
I am working on an image classification problem. How can I manually identify specific features in an image that will help to build a DNN? For example, consider classifying an image of a man talking on the phone while driving as distracted.
You don't do this. Having a good feature extractor is why we use DNNs in the first place.
Also: you forgot to look at https://www.kaggle.com/c/state-farm-distracted-driver-detection
I use VNImageRequestHandler and VNDetectRectanglesRequest to find rectangles in an image. But Vision in iOS 11 only provides barcode, rectangle, and face detection, and I want to find cars in an image. How should I change my code to find a specific object in an image?
If you’re looking for Apple to create an API named VNDetectCarRequest you should probably file a feature request. (And if it happens, I’m sure the “Apple is making a car!” rumor mill will start up again...)
For general-purpose image recognition, the path to take with Vision is to use VNCoreMLRequest and supply a machine learning model trained for the image recognition task you have in mind.
On the native programming side, all image recognition/classification tasks are the same — you can start by reusing Apple’s Classifying Images with Vision and Core ML sample code, which sets up VNCoreMLRequest and handles the VNClassificationObservation results it produces. The special sauce that changes a general “what is this” classifier into a “hotdog or not a hotdog” classifier or a “what kind of vehicle is this (if it’s one at all)” classifier is all in the model.
There might be a machine learning model that already does the task you’re looking for out there — if you find one, you can wrap it in a Core ML Model file using the scripts Apple provides.
Otherwise, you’ll need to look at one of the general purpose image classifier models out there (again, there are several already conveniently gathered on developer.apple.com) and work on specializing / retraining it to your more specific task. That part of your work is outside Apple’s API ecosystem, and there are many possible options. Web searches for “train caffe image model” or “train keras image model” or similar should be helpful there.
Once you’ve trained your model, use the Core ML tools to get it into Core ML to use with Vision.
So I am working on a project for school in which we are trying to teach a neural network to distinguish buildings from non-buildings. The problem I am having right now is representing the data in a form that would be "readable" by the classifier function.
The training data is a set of pictures plus a .wkt file with the coordinates of the buildings in each picture. So far we have been able to rescale the polygons, but we got stuck there.
Can you give any hints or ideas of how to bring this all to an appropriate form?
Edit: I do not need the code written for me; a link to an article or a book on a similar subject is more what I am looking for.
You did not mention what framework you are using, but I will give an answer for Caffe.
Your problem is very close to detecting objects within an image. You have full images along with bounding boxes for the objects (buildings, in your case).
The easiest way of doing this is through a Python data layer that reads an image and a file with the stored coordinates for that image and feeds both into your network. A tutorial on how to use it can be found here: https://github.com/NVIDIA/DIGITS/tree/master/examples/python-layer
To accelerate the process, you may want to store the (image, coordinates) pairs in a custom LMDB database.
Finally, a good working example with a complete Caffe implementation can be found within the Faster-RCNN library here: https://github.com/rbgirshick/caffe-fast-rcnn/
You should check roi_pooling_layer.cpp in their custom Caffe branch and roi_data_layer to see how the data is fed into the network.
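As for getting from the .wkt polygons to the bounding boxes a detector expects, a minimal sketch might look like the following. It assumes each building is stored as a simple 2D `POLYGON ((x y, x y, ...))` string already in pixel coordinates; the function names are made up for this example, and a real pipeline would more likely use a proper WKT parser.

```python
import re
from typing import List, Tuple

Point = Tuple[float, float]


def parse_wkt_polygon(wkt: str) -> List[Point]:
    """Return the exterior ring of a simple 'POLYGON ((x y, ...))' string.

    Only the first (outer) ring is read; holes and other geometry types
    are ignored in this sketch.
    """
    match = re.match(r"\s*POLYGON\s*\(\(([^()]*)\)", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a simple WKT polygon: {wkt!r}")
    points = []
    for pair in match.group(1).split(","):
        # Keep only x and y, in case a third coordinate is present.
        x_str, y_str = pair.split()[:2]
        points.append((float(x_str), float(y_str)))
    return points


def bounding_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned box (min_x, min_y, max_x, max_y) around the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)
```

Each image then gets a list of such boxes, which is the kind of (image, coordinates) pair a data layer can read.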
I am struggling to create a custom Haar classifier. I have found a couple of tutorials on the web, but they do not specify which version of OpenCV they are using. What I need is a very concise and simplified example of the steps that are required, along with a simple dataset of images. I also need to know the OpenCV version and the OS platform so I can get it running. I have tried a matrix of OpenCV versions on both Windows and Linux and I have run into memory error after memory error. I would like to start with a known good set of data and simple commands before expanding it to fit my problem.
Thanks for your help,
Chris
OpenCV provides two utility commands, createsamples.exe and haartraining.exe, which can generate the XML files used by Haar classifiers. That is, with the XML file output by haartraining.exe, you can run the face detection sample with your own XML file to detect any custom object.
For the detailed procedure for using these commands, you may consult pages 513-516 of the book "Learning OpenCV", or this tutorial.
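One step that tends to trip people up is preparing the list of positive samples. As a rough sketch, and assuming the usual description-file layout read by the createsamples utility's -info option (image path, object count, then x y width height for each object), the lines could be generated like this. The helper name and the example paths are made up for illustration.

```python
from typing import List, Tuple

# Box layout assumed here: (x, y, width, height) in pixels, top-left origin.
Box = Tuple[int, int, int, int]


def info_line(image_path: str, boxes: List[Box]) -> str:
    """Build one line of a positive-sample description file.

    Assumed format: "<path> <count> <x> <y> <w> <h> [<x> <y> <w> <h> ...]".
    """
    parts = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        parts.extend(str(v) for v in (x, y, w, h))
    return " ".join(parts)


def build_info_file(samples: List[Tuple[str, List[Box]]]) -> str:
    """Join one line per image, skipping images with no annotated object."""
    return "\n".join(info_line(path, boxes) for path, boxes in samples if boxes)
```

Writing the returned text to a file gives something that can be passed to the sample-creation step; image paths in it are normally relative to the location of that file.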
For the internal mechanism of how the classifier works, you may consult the paper "Rapid Object Detection using a Boosted Cascade of Simple Features", which has been cited 5500+ times.