Is there any good open source image dataset generation software? - machine-learning

I'm looking for services or scripts that can help to generate images for computer vision machine learning tasks. Not something like this where they just put together several layers of objects over other objects but more a like a 3D generation of objects in different environments.

I'm not aware of an out-of-the-box solution where you can create arbitrary objects and environments but if you are fine using Blender, here is an example how you can use Blender to automatically create images as a training set for machine learning algorithms.

Related

On replacing the LJ-Speech dataset with your own

In most github repositories for machine learning based text to speech, the LJ-Speech dataset is being used and optimized for.
Having unsucessfully tried to use my own wave files for it, I am interested in the right approach to prepare your dataset for an optimized framework to likely convert.
With Mozilla TTS, you can have a look at the LJ-Speech script used to prepare the data to have an idea of what is needed for your own dataset:
https://github.com/erogol/TTS_recipes/blob/master/LJSpeech/DoubleDecoderConsistency/train_model.sh

Is there a way to get the functionality of ML Kit in the web browser?

I'm developing a cross-platform app (iOS/Android/web) and am loving the fast, cheap on-device image labeling feature of ML Kit on mobile. Is there a way to replicate the behavior on the web? Are the ML Kit models available for re-use with a different ML library so it can be repurposed?
Unfortunately, it does not seem like ML Kit allows you to export models created using it, only import models. However, tensorflow.js lets you run TensorFlow models on the web. If you are looking for an easy way to create models there are several web-based programs which allow you to easily create ML models and export as TensorFlow Lite (which can be run in tensorflow.js or even hosted on Firebase). A couple I have heard of are: lobe.ai and ml5.js. Hope this helps.

Data processing while using tensorflow serving (Docker/Kubernetes)

I am looking to host 5 deep learning models where data preprocessing/postprocessing is required.
It seems straightforward to host each model using TF serving (and Kubernetes to manage the containers), but if that is the case, where should the data pre and post-processing take place?
I'm not sure there's a single definitive answer to this question, but I've had good luck deploying models at scale bundling the data pre- and post-processing code into fairly vanilla Go or Python (e.g., Flask) applications that are connected to my persistent storage for other operations.
For instance, to take the movie recommendation example, on the predict route it's pretty performant to pull the 100 films a user has watched from the database, dump them into a NumPy array of the appropriate size and encoding, dispatch to the TensorFlow serving container, and then do the minimal post-processing (like pulling the movie name, description, cast from a different part of the persistent storage layer) before returning.
Additional options to josephkibe's answer, you can:
Implementing processing into model itself (see signatures for keras models and input receivers for estimators in SavedModel guide).
Install Seldon-core. It is a whole framework for serving that handles building images and networking. It builds service as a graph of pods with different API's, one of them are transformers that pre/post-process data.

How to improve VNDetectRectanglesRequest to VNDetectCarRequest?

I use VNImageRequestHandler and VNDetectRectanglesRequest to handle request to find rectangles in a image. But since Vision in iOS11 only provide barcode、rectangle、face finding,but I want to find cars in an image ,what should I change code to find specify object in an image?
If you’re looking for Apple to create an API named VNDetectCarRequest you should probably file a feature request. (And if it happens, I’m sure the “Apple is making a car!” rumor mill will start up again...)
For general-purpose image recognition, the path to take with Vision is to use VNCoreMLRequest and supply a machine learning model trained for the image recognition task you have in mind.
On the native programming side, all image recognition/classification tasks are the same — you can start by reusing Apple’s Classifying Images with Vision and Core ML sample code, which sets up VNCoreMLRequest and handles the VNClassificationObservation results it produces. The special sauce that changes a general “what is this” classifier into a “hotdog or not a hotdog” classifier or a “what kind of vehicle is this (if it’s one at all)” classifier is all in the model.
There might be a machine learning model that already does the task you’re looking for out there — if you find one, you can wrap it in a Core ML Model file using the scripts Apple provides.
Otherwise, you’ll need to look at one of the general purpose image classifier models out there (again, there are several already conveniently gathered on developer.apple.com) and work on specializing / retraining it to your more specific task. That part of your work is outside Apple’s API ecosystem, and there are many possible options. Web searches for “train caffe image model” or “train keras image model” or similar should be helpful there.
Once you’ve trained your model, use the Core ML tools to get it into Core ML to use with Vision.

How intensive is training a machine learning algorithm?

I'd like to make an app using iOS's new CoreML framework that does image recognition. To do so I'd probably have to train my own model, and I'm wondering exactly how much data and compute power it would require. Is it something I could feasibly accomplish on an dual core i5 Macbook Pro using Google Images for source data or would it be much more involved?
It depends on what sort of images you want to train your model to recognize.
What is often done is fine-tuning an existing model. You take a pretrained version of Inception-v3 (let's say) and then replace the final layer with your own. You train this last layer on your own images.
You still need a fair number of training images (a few 100 per category, but more is better) but you can do this on your MacBook Pro in anywhere between 30 minutes to a few hours.
TensorFlow comes with a script that makes it really easy to do this. Keras has a great blog post on how to do this. I used the TensorFlow script to re-train Inception-v3 to tell apart my two cats, from 50 or so images of each cat.
If you want to train from scratch you probably want to do this in the cloud using AWS, Google's Cloud ML Engine, or something easy like FloydHub.

Resources