I use VNImageRequestHandler and VNDetectRectanglesRequest to handle a request to find rectangles in an image. But since Vision in iOS 11 only provides barcode, rectangle, and face detection, and I want to find cars in an image, how should I change the code to find a specific object in an image?
If you’re looking for Apple to create an API named VNDetectCarRequest you should probably file a feature request. (And if it happens, I’m sure the “Apple is making a car!” rumor mill will start up again...)
For general-purpose image recognition, the path to take with Vision is to use VNCoreMLRequest and supply a machine learning model trained for the image recognition task you have in mind.
On the native programming side, all image recognition/classification tasks are the same — you can start by reusing Apple’s Classifying Images with Vision and Core ML sample code, which sets up VNCoreMLRequest and handles the VNClassificationObservation results it produces. The special sauce that changes a general “what is this” classifier into a “hotdog or not a hotdog” classifier or a “what kind of vehicle is this (if it’s one at all)” classifier is all in the model.
There might be a machine learning model out there that already does the task you're looking for; if you find one, you can wrap it in a Core ML model file using the conversion scripts Apple provides.
Otherwise, you’ll need to look at one of the general purpose image classifier models out there (again, there are several already conveniently gathered on developer.apple.com) and work on specializing / retraining it to your more specific task. That part of your work is outside Apple’s API ecosystem, and there are many possible options. Web searches for “train caffe image model” or “train keras image model” or similar should be helpful there.
Once you’ve trained your model, use the Core ML tools to get it into Core ML to use with Vision.
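A minimal sketch of that conversion step, assuming the model was trained in Keras; the file names and class labels here are hypothetical stand-ins for whatever you trained:

```python
# Sketch: converting a trained Keras model to Core ML with Apple's
# coremltools package (pip install coremltools).
import coremltools

coreml_model = coremltools.converters.keras.convert(
    "car_classifier.h5",                # hypothetical trained Keras model
    input_names="image",
    image_input_names="image",          # treat the input as an image
    class_labels=["car", "not_car"],    # hypothetical labels
)
coreml_model.save("CarClassifier.mlmodel")
# Drag the .mlmodel into Xcode; Vision loads it via VNCoreMLModel
# and feeds it to VNCoreMLRequest.
```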
Related
I want to make an image classification service that can detect whether a light is green or red.
I would like to send images to a Ruby on Rails application, which will then call the 'AI' magic that has the model to perform this recognition.
I have little experience with AI and am searching for a suitable, simple way to make this happen.
Experience and pointers on the creation of this image classification model would be greatly appreciated.
You can use tensorflow.rb to build simple image recognition software and, once you've implemented it, adapt it to your needs. You can find it here: https://github.com/somaticio/tensorflow.rb
tensorflow.rb is just a port of TensorFlow to Ruby; there's an introductory tutorial which you can find here. A second approach would be to build a microservice that receives an image (basically a file) and uses regular TensorFlow, which you can use with Python.
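A bare-bones sketch of that second approach: a Python microservice that accepts an image upload over HTTP, using only the standard library. The `classify` stub here is a placeholder standing in for a real TensorFlow model call.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def classify(image_bytes):
    # Placeholder: a real service would run the bytes through a
    # trained TensorFlow model here.
    label = "green" if len(image_bytes) % 2 == 0 else "red"
    return {"label": label, "confidence": 0.5}

class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw image bytes from the request body.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        body = json.dumps(classify(image_bytes)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), ClassifyHandler).serve_forever()
```

The Rails app would then POST the uploaded file to this service and store the JSON result.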
A third approach would be to use an external service, such as Microsoft's image recognition API, and store the results of the API calls yourself. The downside is that you'd most likely have to pay for this service in the long run. The upside is that you'd have a well-trained algorithm working, while also reducing development overhead by externalizing the service to a third party.
I'm new to machine learning and trying to figure out where to start and how to apply it to my app.
My app pulls a bunch of health metrics and, based on all of them, suggests a dose of medication (some abstract medication, it doesn't matter) to take. Taking a medication affects the health metrics, and I can see whether my suggestion was right or whether it needs adjustment to be more precise the next time. Medications are taken constantly, so I have a lot of results and data to work with.
Does that seem like a good case for machine learning and using some of neural networks to train and make better predictions? If so - could you recommend an example for Tensorflow or Keras?
So far I only found image recognition examples and not sure how to apply similar algorithms to my problem.
I'm also a beginner in machine learning, but based on my knowledge, one way would be to use supervised learning with Keras, which uses TensorFlow as a backend. Keras is a lot easier to program with than TensorFlow, but eventually TensorFlow might do the trick as well (depending on your familiarity with machine learning libraries).
You mentioned that your algorithm suggests medication based on data (from the patient).
One way to predict medication is to store all your preexisting data in a CSV file, and use the CSV module to read it. This tutorial covers the basics of reading CSV files (https://pythonprogramming.net/reading-csv-files-python-3/).
Next, you can store the data in a multi-dimensional array and run a neural network over it. Just make sure that you have enough data (the more the better) relative to the size of your neural network.
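A minimal sketch of those two steps, reading a CSV of metrics into feature and target arrays, using only Python's standard library (the column names here are made up for illustration):

```python
import csv
import io

def load_dataset(csv_text):
    """Parse rows of health metrics plus a final dose column into
    (features, targets) lists suitable for feeding a model."""
    features, targets = [], []
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    for row in reader:
        *metrics, dose = row
        features.append([float(m) for m in metrics])
        targets.append(float(dose))
    return features, targets
```

From there, `features` and `targets` can be converted to arrays and passed to a Keras `model.fit` call.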
Another way, as you mentioned, would be using Convolutional Neural Networks, which theoretically could and should work, but I have very little experience programming them, so I'm afraid I can't give you any advice for that (you can program CNNs in both Keras and Tensorflow).
I do wish you good luck in your project!
I've got a bunch of images (~3000) which have been manually classified (approved/rejected) based on some business criteria. I've processed these images with Google Cloud Platform obtaining annotations and SafeSearch results, for example (csv format):
file name; approved/rejected; adult; spoof; medical; violence; annotations
A.jpg;approved;VERY_UNLIKELY;VERY_UNLIKELY;VERY_UNLIKELY;UNLIKELY;boat|0.9,vehicle|0.8
B.jpg;rejected;VERY_UNLIKELY;VERY_UNLIKELY;VERY_UNLIKELY;UNLIKELY;text|0.9,font|0.8
I want to use machine learning to be able to predict if a new image should be approved or rejected (second column in the csv file).
Which algorithm should I use?
How should I format the data, especially the annotations column? Should I first obtain all the available annotation types and use each as a feature with its numerical value (0 if it doesn't apply)? Or would it be better to just process the annotation column as text?
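For the annotations column specifically, the first option you describe (one numeric feature per annotation type, 0 where absent) can be sketched in plain Python; the helper names here are made up:

```python
def parse_annotations(cell):
    """'boat|0.9,vehicle|0.8' -> {'boat': 0.9, 'vehicle': 0.8}"""
    out = {}
    if cell:
        for part in cell.split(","):
            label, score = part.split("|")
            out[label] = float(score)
    return out

def vectorize(rows):
    """Turn a list of annotation dicts into fixed-length feature
    vectors: one column per annotation type seen anywhere in the
    dataset, with the confidence score as the value and 0.0 where
    that label is absent."""
    vocab = sorted({label for row in rows for label in row})
    vectors = [[row.get(label, 0.0) for label in vocab] for row in rows]
    return vocab, vectors
```

The resulting fixed-length vectors can be concatenated with the SafeSearch columns (mapped to numbers) and fed to any standard classifier.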
I would suggest you try convolutional neural networks.
Maybe the fastest way to test whether your idea will work (a possible problem is the number of images you have, which is quite low) is to use transfer learning with TensorFlow. There are great tutorials made by Magnus Erik Hvass Pedersen, who published them on YouTube.
I suggest you go through all the videos, but the important ones are #7 and #8.
Using transfer learning allows you to use the models built at Google to classify images, while still training on your own data with your own labels.
Using this approach you will be able to see if this is suitable for your problem. Then you can dive into convolutional neural networks and create the pipeline that will work the best for your problem.
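The core move in transfer learning, keeping the pretrained feature extractor frozen and training only a small classifier "head" on its outputs, can be sketched in plain Python. The feature vectors below are hypothetical stand-ins for activations you would extract from a pretrained network:

```python
import math

def train_head(features, labels, lr=0.5, epochs=200):
    """Train a logistic-regression 'head' on fixed feature vectors.
    In transfer learning, the pretrained network's weights stay
    frozen; only this final layer learns from the new labels."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            g = p - y                         # gradient of log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """1 = approved, 0 = rejected (for this toy setup)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0
```

With only ~3000 images this is often more robust than training a full network from scratch, because only a few thousand parameters are being fit.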
I have a conceptual question regarding a software process/architecture setup for machine learning. I have a web app, and I am trying to incorporate some machine learning algorithms that work like Facebook's face recognition (except with objects in general), so the model gets better at classifying specific images uploaded to my service (like how FB can classify specific persons, etc.).
The rough outline is:
event: User uploads image; image attempts to be classified
if failure: draw a bounding box on object in image; return image
interaction: user tags object in box; send image back to server with tag
????: somehow this new image/label pair will fine tune the image classifier
I need help with the last step. Typically in transfer learning or training in general, a programmer has a large database full of images. In my case, I have a pretrained model (google's inception-v3) but my fine-tuning database is non-existent until a user starts uploading content.
So how could I use that tagging method to build a specialized database? I'm sure FB ran into this problem and solved it, but I can't find their solution. After some thought (and inconclusive research), the only strategies I can think of are to either:
A) stockpile tagged images and do a big batch train, or
B) incrementally feed in a few tagged images as they get uploaded, and slowly, over days/weeks, specialize the image classifier.
Ideally, I would like to avoid option A, but I'm not sure how realistic B is, nor whether there are other ways to accomplish this task. Thanks!
Yes, this sounds like a classic example of online learning.
For deep conv nets in particular, given some new data, one can just run a few iterations of stochastic gradient descent on it. It is probably a good idea to adjust the learning rate as needed, too (so that one can tune the importance of a given sample depending on, say, one's confidence in it).
You could also, as you mentioned, save up "mini-batches" with which to do this (depends on your setup).
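A minimal sketch of such an online update for a logistic model, with the learning rate scaled by a per-sample confidence; the names and the confidence mechanism here are illustrative, not any production system's actual approach:

```python
import math

def sgd_update(w, b, x, y, lr=0.1, confidence=1.0):
    """One online SGD step for a single newly tagged example.
    `confidence` scales the learning rate so an uncertain user tag
    moves the weights less than a trusted one."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))       # sigmoid
    g = (p - y) * lr * confidence        # scaled log-loss gradient
    w = [wi - g * xi for wi, xi in zip(w, x)]
    b -= g
    return w, b

def sgd_minibatch(w, b, batch, lr=0.1):
    """Apply a saved-up mini-batch of (x, y, confidence) triples."""
    for x, y, conf in batch:
        w, b = sgd_update(w, b, x, y, lr, conf)
    return w, b
```

In practice you would apply the same idea to the final layer(s) of the fine-tuned network rather than a standalone logistic model, but the update rule is the same shape.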
Also, if you want to allow a little more specialization with your learner (e.g. between users), look up domain adaptation.
I am a novice in the field of image processing, and I'm learning concepts common to machine learning and image processing.
Suppose there is a camera in a store that takes video of the people who come into the shop.
What we want from this video is:
output the number 1 if you see an affable person.
So is this related to machine learning, or is it just image processing of consecutive frames?
Extracting relevant information from images (video frames in your case) is image processing.
For example, in your case you could find a person's face in the image.
To accomplish that you probably need some filtering and image segmentation to extract the face from the image. That part is pure image processing.
Next you need to define some relevant descriptors, such as characteristic face points (lip corners, etc.), and perform classification based on the chosen descriptors, which is in the field of machine learning.
Let us look at the bigger picture. You are working in image recognition: the field that gives you a problem, based on some data and an aim. You use image processing as a set of tools and methods that make your raw data more comprehensible; you are transforming and simplifying. Finally you have a simplified description of the problem and the aim, and it is up to you how to solve it.

There are many approaches. One could, for example, find an exact solution; that would be algorithmics. Some might find that the only reasonable solution is an operator-based system, where one needs access to human experts and must implement the required infrastructure; that would be a software engineering solution. Finally, you can use existing data to create a statistical model, and this is nowadays called machine learning.

So machine learning is a way of building a solution to the problem based on statistical analysis; image processing is about preparing raw data into the format required for such analysis; and image recognition is the field giving you problems to solve. It is worth noting that nowadays more and more researchers try to skip the image processing part and apply machine learning directly to the raw data. This is one of the main ideas behind deep convolutional neural networks: we want approaches that do not need engineers in between, just data and a solution.