AI: Cutting out a tshirt from a picture - machine-learning

I know the basic theory of AI and deep learning techniques. However, most examples of things like neural nets are about understanding and classifying images.
What I need is to be able to say: "this is a photo containing a tshirt (on a person or just lying down); give me the cutout (alpha mask etc.) of the tshirt without anything else in the background."
What AI techniques would you suggest? I am open to either building the AI or using services. I have tried the Watson API visual recognition service, but again it seems to be for a different (classification) problem.
Thank you!

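What you need is object detection, not object recognition. Object detection is the task of localizing ('cropping' down to) a certain area of an image and classifying it. For a pixel-accurate cutout rather than a bounding box, the closely related task is image segmentation.
There are a few services out there; try the Google Vision API.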
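Whichever model or service ends up producing the mask, turning a per-pixel mask into a cutout is mechanical. The sketch below is only an illustration, assuming you already have a binary mask (1 for tshirt pixels, 0 elsewhere). The function name `apply_alpha_mask` and the tiny hand-written image are hypothetical, and a real pipeline would use an image library rather than nested lists.

```python
def apply_alpha_mask(rgb_image, mask):
    """Return an RGBA image where pixels outside the mask are fully transparent.

    rgb_image: rows of (r, g, b) tuples; mask: rows of 0/1 values, same shape.
    """
    if len(rgb_image) != len(mask):
        raise ValueError("image and mask must have the same height")
    rgba = []
    for pixel_row, mask_row in zip(rgb_image, mask):
        if len(pixel_row) != len(mask_row):
            raise ValueError("image and mask must have the same width")
        rgba.append([
            (r, g, b, 255 if keep else 0)
            for (r, g, b), keep in zip(pixel_row, mask_row)
        ])
    return rgba


# Hypothetical 2x2 image: the left column is "tshirt", the right is background.
image = [[(200, 30, 30), (10, 10, 10)],
         [(190, 25, 35), (12, 11, 9)]]
mask = [[1, 0],
        [1, 0]]
cutout = apply_alpha_mask(image, mask)
```

The alpha channel is the only thing that changes, so the same mask can be reused to composite the tshirt onto any other background.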

Related

Neural Network for Learning Cut VS Uncut Grass

I've got a script that takes pictures like the one provided, with colored loops encircling either uncut grass, cut grass, or other background details (for the purpose of rejecting non-grass regions), and generates training data in the form of many small images taken from inside the colored loops. I'm struggling to find which type of neural network would work best for learning from this training data and for telling me in real time, from a video feed mounted on a lawn mower, which sections of the image are uncut grass and which are cut grass as it mows through a field. Is there anyone here experienced with neural networks who can either suggest some I could use, or just point me in the right direction?
Try a segmentation network. There are many types of segmentation.
Mind that neural networks need training data. Your case (detecting cut and uncut grass) is quite specialized, which means existing models may not fit your purpose. If so, you'll need a dataset including images and annotations. There are also tools for labeling segmentation images.
Hope it helps.
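To make the output of a segmentation network concrete: such a network typically produces one score per class for every pixel, and the label map is the per-pixel argmax. The sketch below assumes a hypothetical class order and uses hand-written scores in place of a real model's output.

```python
CLASSES = ("uncut", "cut", "background")  # hypothetical class order


def label_map(scores):
    """Convert per-pixel class scores into per-pixel class labels (argmax)."""
    return [
        [CLASSES[max(range(len(CLASSES)), key=pixel.__getitem__)] for pixel in row]
        for row in scores
    ]


def class_fraction(labels, name):
    """Fraction of pixels in the label map assigned to class `name`."""
    total = sum(len(row) for row in labels)
    hits = sum(label == name for row in labels for label in row)
    return hits / total if total else 0.0


# Made-up scores for a 2x2 image; a trained network would supply these.
scores = [[(0.7, 0.2, 0.1), (0.1, 0.8, 0.1)],
          [(0.6, 0.3, 0.1), (0.2, 0.2, 0.6)]]
labels = label_map(scores)
uncut_share = class_fraction(labels, "uncut")
```

A per-region share like `uncut_share` is one simple way the mower could decide which part of the frame still needs cutting.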

iOS AVFoundation - is it possible to add text/image into the video via position/motion tracking?

I am aware of how to add text / image overlay into a video on iOS with AVFoundation.
Is there some way to do this using position/motion tracking of certain objects / areas in the video?
What exactly is this type of video editing feature called?
Let's say I have a video of a car moving from left to right. I want to place an image of another car at the position of the original car so that as the car in the video is moving from left to right, my image follows on top of that car. I would also want this to be properly skewed as the car moves from left to right.
Another example would be a video of a monitor, with me placing an image on the screen of that monitor.
Please let me know if I need to explain further.
Other than iOS, is there some other library which is able to do this? Like ffmpeg?
What you're broadly looking for is Object Recognition, which is a fairly complex topic in its own right and part of the field of Computer Vision.
AVFoundation includes support for Face Detection and does a fairly reasonable job of it (https://developer.apple.com/reference/avfoundation/avmetadatafaceobject), but that's about it.
To do what you're trying to do, I'd start with OpenCV (which includes support for iOS) and investigate from there: http://opencv.org/
You're not going to find a literal "find me a car" API. What you will find is many implemented algorithms that you can train to detect the objects you care about. One potential algorithm is Haar Cascades. There's more detail on working with those and training your own classifier here: https://github.com/andrewssobral/vehicle_detection_haarcascades

Wikitude/AR SDK's to Pick out objects in 3d space

I'm looking at integrating an AR kit into our iOS app so we can use the camera to scan a room or field of view for objects. Above is an example of what I mean: if you were to bring up the camera, it would highlight the separate objects in the room and allow them to be clicked and "added" into the system.
Does anyone know if this is achievable with the current AR kits or anything else out there? It seems that the objects you are looking for always have to be predefined and loaded into a database so the app can find them. I'm hoping it can pick out the objects in real time. It doesn't need to know any details about the actual object, just enough to separate it from the base scenery.
Any ideas?
The OpenCV library (available for iOS) contains many algorithms for comparing different image blobs. If you want to match a simple template to find objects, try the Viola-Jones algorithm and so-called Haar cascades. OpenCV ships a collection of trained cascades in XML files, for detecting faces for example. OpenCV also contains a training utility, so you can generate cascades for other kinds of objects.
Some example projects:
https://github.com/alexmac/alcexamples/blob/master/OpenCV-2.4.2/doc/user_guide/ug_traincascade.rst Cascade Classifier Training
https://github.com/lukagabric/iOS-OpenCV Example code for detecting Colors and Circle shapes
https://github.com/BloodAxe/OpenCV-Tutorial Feature Detection (SURF, ORB, FREAK)
https://github.com/foundry/OpenCVSquaresSL Square Detection using Pyramid scaling, Canny, contours, contour simplification

Logo detection using OpenCV

I'm attempting to implement an easter egg in a mobile app I'm working on. This easter egg will be triggered when a logo is detected in the camera view. The logo I'm trying to detect is this one: [logo image].
I'm not quite sure what the best way to approach this is as I'm pretty new to computer vision. I'm currently finding horizontal edges using the Canny algorithm. I then find line segments using the probabilistic Hough transform. The output of this looks as follows (blue lines represent the line segments detected by the probabilistic Hough transform):
The next step I was going to take would be to look for a group of around 24 lines fitting within a nearly square rectangle, with each line approximately the same length. I'd use these two signals to indicate the potential presence of the logo. I realise that this is probably a very naive approach, and I would welcome suggestions on how to detect this logo more reliably.
Thanks
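For reference, the line-grouping heuristic described in the question could be sketched roughly as follows. This is an illustration only: it assumes the segments arrive as `(x1, y1, x2, y2)` tuples (the shape a probabilistic Hough transform usually reports), and the function name `looks_like_logo` and every tolerance value are hypothetical guesses rather than tuned numbers.

```python
import math


def looks_like_logo(segments, expected=24, count_tol=4,
                    length_tol=0.2, aspect_tol=0.2):
    """Heuristic check on line segments given as (x1, y1, x2, y2) tuples.

    All thresholds are illustrative guesses, not tuned values.
    """
    if not segments or abs(len(segments) - expected) > count_tol:
        return False

    lengths = [math.hypot(x2 - x1, y2 - y1) for x1, y1, x2, y2 in segments]
    mean_len = sum(lengths) / len(lengths)
    if mean_len == 0:
        return False
    # Every segment must be within length_tol of the mean length.
    if any(abs(length - mean_len) / mean_len > length_tol for length in lengths):
        return False

    xs = [x for x1, _, x2, _ in segments for x in (x1, x2)]
    ys = [y for _, y1, _, y2 in segments for y in (y1, y2)]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    if width == 0 or height == 0:
        return False
    # The bounding box of all segments must be close to square.
    return abs(width / height - 1.0) <= aspect_tol


# Made-up input: 24 horizontal segments of length 100 stacked over a height of 100.
segments = [(0, i * 100 / 23, 100, i * 100 / 23) for i in range(24)]
detected = looks_like_logo(segments)
```

A check like this treats the whole frame as one group, so in practice the segments would first need to be clustered by position before the test is applied to each cluster.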
You may want to go with SIFT using Rob Hess' SIFT Library. It uses OpenCV and is also pretty fast. I guess that's easier than your current way of approaching the logo detection :)
Also try looking at SURF, which claims to be faster and more robust than SIFT. This Feature Detection tutorial will help you.
You may just want to use LogoGrab's technology. It's the best out there and offers all sorts of APIs (both mobile and HTTP). http://www.logograb.com/technologyteam/
I'm not sure the logo has enough distinctive features for a SIFT/SURF approach. As an alternative, you can try training a Haar-like feature classifier and use it to detect the logo, just like OpenCV does for face detection.
You could also try TensorFlow's Object Detection API here:
https://github.com/tensorflow/models/tree/master/research/object_detection
The good thing about this API is that it contains state-of-the-art models for object detection and classification. The models that TensorFlow provides are free to train, and some of them promise quite astonishing results. I have already trained a model for the company I work for, and it does quite an amazing job at logo detection in images and video streams. You can read more about my work here:
https://github.com/kochlisGit/LogoLens

fingertip detection and tracking

I am working on a project detecting and tracking fingers. Though I find there are quite a lot of resources on this task, I haven't found an effective one yet :(.
So far I have thought of the following methods to detect hands:
1. Haar training. But firstly, we don't have a trained set (XML) like the one for face detection. Secondly, if we do the training ourselves, we don't have enough samples (I am still a college student).
2. Skin color detection in HSV space. I have tried this one, but the result has a lot of noise, so it doesn't help me continue on to fingertip detection.
3. Use Handvu. But I have heard that this lib is hard to set up and use on Windows...
So in a word, can anyone give me any suggestions on how to detect hands effectively? (After that I may consider detecting fingertips.)
Thanks!!
Here is a pretty in-depth paper on finger segmentation using Zernike moments. Here is a good paper on using Zernike moments for image recognition as a basis for the first paper.
Can you explain more about your experimental setup? Are you trying to track fingers against a cluttered background, or a plain cardboard sheet?
Haar-like features perform very well for face detection (the Viola-Jones paper being a classic example); however, I would not recommend them for your task. Although they can be computed quickly using the integral image, they work well mainly within a cascaded AdaBoost classification framework, which needs a large training set.
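To show why the integral image makes these features cheap, here is a minimal pure-Python sketch. It is an illustration, not the Viola-Jones implementation: the function names and the 4x4 patch are made up. Once the summed-area table is built, any rectangle sum costs four lookups regardless of its size.

```python
def integral_image(gray):
    """Summed-area table with an extra zero row/column for easy indexing."""
    height, width = len(gray), len(gray[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Made-up 4x4 grayscale patch: bright left half, dark right half.
patch = [[9, 9, 1, 1]] * 4
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

A large positive `feature` value signals a bright-to-dark vertical edge inside the window, which is the kind of weak cue a cascade combines by the thousands.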
For skin colour detection, it depends on your setup. As a first step you could try doing background subtraction: simply learn the distribution (histogram) of pixels for foreground (ie. the hand) and the background and use these to do image segmentation.
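The histogram idea above can be sketched in a few lines. This is a rough illustration under several assumptions of mine: only the hue channel is used, the bin count is arbitrary, add-one smoothing is applied so unseen hues do not get zero probability, and the sample pixels are invented. `colorsys` is from the Python standard library.

```python
import colorsys

BINS = 16  # illustrative choice, not a tuned value


def hue_bin(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to a hue histogram bin."""
    r, g, b = (c / 255.0 for c in rgb)
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)
    return min(int(hue * BINS), BINS - 1)


def learn_histogram(pixels):
    """Normalized hue histogram with add-one smoothing (an assumption here)."""
    counts = [1] * BINS
    for rgb in pixels:
        counts[hue_bin(rgb)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def segment(image, fg_hist, bg_hist):
    """Label a pixel 1 (foreground) when its hue is likelier under fg_hist."""
    return [
        [1 if fg_hist[hue_bin(p)] > bg_hist[hue_bin(p)] else 0 for p in row]
        for row in image
    ]


# Made-up training samples: skin-like tones vs. a green background.
hand_samples = [(224, 172, 105), (198, 134, 66), (241, 194, 125)]
background_samples = [(30, 160, 40), (20, 140, 35), (45, 180, 60)]
fg_hist = learn_histogram(hand_samples)
bg_hist = learn_histogram(background_samples)
mask = segment([[(210, 150, 90), (25, 150, 38)]], fg_hist, bg_hist)
```

With real footage the two histograms would be learned from many labeled pixels, and the resulting mask usually needs some morphological cleanup before looking for fingertips.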
I don't know what Handvu is.
Zernike moments are also very good shape descriptors that are rotation invariant and can be made to be both scale and translation invariant.
I hope this helps!

Resources