I'm trying to track cars using video from a dash cam. Most of the time I see:
a slight shift of the vehicle in front of me
brake lights turning on and off
the vehicle appearing to zoom in when it brakes
the vehicle appearing to zoom out when it accelerates
Which algorithm would be best for this case? Of course, I can just run OpenCV, but I want to understand how it works.
Thank you!
I think you can use a Haar cascade classifier for this task. It is a machine-learning-based approach in which a cascade function is trained on many positive and negative images and is then used to detect objects in other images.
OpenCV has a good implementation, with both the trainer and the detector.
You can also find many pretrained .xml files on the web, which are the output of the training step, and use them directly for detection.
However, I'm not sure you will find one trained to detect the back of a car.
This link explains the basics of the method and how to use it in OpenCV: http://docs.opencv.org/master/d7/d8b/tutorial_py_face_detection.html#gsc.tab=0
In this case you don't need the four cues you listed, but you could feed them to another algorithm at the end of the Haar cascade pipeline as a double check.
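Since the question asks how the method works rather than how to call it, here is a minimal sketch of the building block behind Haar cascades: a Haar-like feature evaluated through an integral image (summed-area table). The function names and the toy 4x4 image are illustrative assumptions, not OpenCV's API, and a real cascade combines thousands of such features selected by boosting.

```python
def integral_image(gray):
    """Return a summed-area table with one extra row and column of zeros."""
    height, width = len(gray), len(gray[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


def two_rect_feature(table, x, y, w, h):
    """Haar-like edge feature: top half minus bottom half of the window (h even)."""
    half = h // 2
    top = rect_sum(table, x, y, w, half)
    bottom = rect_sum(table, x, y + half, w, half)
    return top - bottom


# Toy image (an assumption for illustration): bright upper half, dark lower half.
image = [
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
table = integral_image(image)
print(rect_sum(table, 0, 0, 4, 4))          # sum of all pixels
print(two_rect_feature(table, 0, 0, 4, 4))  # large value: strong horizontal edge
```

Each rectangle sum costs four table lookups regardless of its size, which is what makes it cheap to evaluate many features at many window positions and scales.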
I want to make an application for counting game statistics automatically. For that purpose, I need some sort of computer vision for handling screenshots of the game.
There are a bunch of regions showing different skills, always in the same place, that the app needs to recognize. I assume it would need a database of pictures or maybe some trained samples.
I've started to learn the OpenCV library, but I'm not sure what would work best for this purpose.
Could you please give me some hints or suggest algorithms that I could use?
Here is an example of a game screenshot.
You can convert it to grayscale and then use a Haar cascade classifier to read the words in the image, then save the results to a file (for example CSV). This way you can use your game screenshots to gather data for training your models.
I am new to computer vision, but I am trying to code an Android/iOS app that does the following:
Get the live camera preview and try to detect one flat image (a logo or painting) in it, in real time. Draw a rectangle around the logo if it is found. If there is no match, don't draw the rectangle.
I found the TensorFlow Object Detection API to be a good starting point, and support was just announced for importing TensorFlow models into Core ML.
I followed a lot of tutorials to train my own object detector. The training data is the key. I found a pretty good library for generating augmented images, and I created hundreds of variations of my source image (rotation, skew, etc.).
But it failed! This dataset is probably good for image classification (with my image filling the screen) but not for detecting it in context (the room).
I think transfer learning is the key. In my case, I used the ssd_mobilenet_v1_coco model as a base. I tried to fake the context of my augmented images with the Random Erasing Data Augmentation technique, without success.
What are my available solutions? Am I tackling the problem correctly? I need to make the model training as fast as possible.
Should I use some datasets for indoor/outdoor image classification and paste my image at random positions on top of them? How important are the perspectives?
Thank you!
"I created hundreds of variations of my source image (rotation, skew, etc.). But it failed!"
Does that mean your model did not converge, or that the final performance was bad? If your model did not converge, add more data. "Hundreds of samples" is very few. Use more images, generate more samples, and make your samples as diverse as possible.
"I think transfer learning is the key. In my case, I used the ssd_mobilenet_v1_coco model as a base. I tried to fake the context of my augmented images with the Random Erasing Data Augmentation technique, without success."
You mean fine-tuning. Did you reduce the labels to 2 (your image and background) before fine-tuning? If you didn't, it was bound to fail. You should at least show your model definition.
"What are my available solutions? Am I tackling the problem correctly? I need to make the model training as fast as possible."
To make training converge faster, add more GPUs and train on several of them. If you don't have the budget to buy them, rent a GPU cluster on Azure. Believe me, it is not that expensive.
Hope that helps.
The problem statement:
Given two images, such as the two images of Brad Pitt below, figure out whether they contain the same person or not. The difficulty is that we have only one reference image for each person and want to figure out whether any other incoming image contains the same person or not.
Some research:
There are a few different methods for solving this task:
Using color histograms
Keypoint oriented methods
Using deep convolutional neural networks or other ML techniques
The histogram methods involve calculating color histograms, defining some sort of metric between them, and then deciding upon a threshold. One metric that I have tried is the Earth Mover's Distance. However, this method lacks accuracy.
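To make the histogram approach concrete, here is a minimal sketch for the simplest case: two one-dimensional histograms (for example, a single color channel) with the same bins. In that case the Earth Mover's Distance reduces to the sum of absolute differences between the two cumulative distributions. The function names and the `threshold` value are assumptions for illustration; full color histograms are multi-dimensional and need a general EMD solver.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    if total == 0:
        raise ValueError("histogram is empty")
    return [value / total for value in hist]


def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms with identical bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    a, b = normalize(hist_a), normalize(hist_b)
    distance = 0.0
    carried = 0.0  # mass that still has to be moved to the next bin
    for pa, pb in zip(a, b):
        carried += pa - pb
        distance += abs(carried)
    return distance


def same_by_histogram(hist_a, hist_b, threshold=0.5):
    """Decide 'same' if the distance is under a threshold (value is a placeholder)."""
    return emd_1d(hist_a, hist_b) <= threshold


print(emd_1d([1, 0, 0], [0, 0, 1]))  # all mass moved across two bins
print(emd_1d([2, 2], [1, 1]))        # identical after normalization
```

The threshold has to be tuned on known same/different pairs, which is part of why this method is fragile with a single reference image.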
The best approach should therefore be some sort of mix of the 2nd and 3rd methods, plus some preprocessing.
For preprocessing, the obvious steps are:
Run a face detector such as Viola-Jones and separate the regions containing faces
Convert those faces to grayscale
Run eye, mouth, and nose detection algorithms, perhaps using OpenCV's Haar cascades
Align the face images according to the landmarks found
All of this is done using OpenCV.
Extracting features such as SIFT and MSER yields an accuracy of 73-76%. After some additional research I came across this paper using Fisherfaces. OpenCV can now create and train Fisherface recognizers, which is great and works fantastically, achieving the accuracy promised by the paper on the Yale datasets.
The complication is that in my case I don't have a database with several images of the same person to train the recognizer on. All I have is a single image corresponding to a single person, and given another image I want to determine whether it shows the same person or not.
So what I am interested in knowing is:
Has anyone tried anything of the sort? What are some papers/methods/libraries that I should look into?
Do you have any suggestions on how to tackle the problem?
Since you have only one image, you can give this method using dlib a try. I have used 3-4 images per person and it gives good results.
Detect the face (sample_face).
Get the face descriptor (a 128-D vector) using dlib's compute_face_descriptor (check link).
Get the new picture in which you want to recognise the face.
Detect the face and compute its descriptor (let's call it test_face).
Compute the Euclidean distance between the test_face descriptor and every sample_face descriptor.
Assign test_face the class (person name) with the smallest Euclidean distance.
Give this a whirl; you can play with face alignment if you start getting good results.
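The matching steps above can be sketched as follows, assuming the descriptors have already been computed with dlib and are available as plain lists of floats. The function names are hypothetical. The `max_distance` cutoff is an addition that the steps above do not include: 0.6 is the value suggested in dlib's face recognition example, but treat it as an assumption to tune. The toy vectors are 3-D instead of 128-D to keep the example short.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(test_descriptor, sample_descriptors, max_distance=0.6):
    """Return (name, distance) of the closest sample, or (None, distance) if too far.

    sample_descriptors maps a person name to a list of descriptors,
    so it works with one reference image per person or with several.
    """
    best_name, best_distance = None, float("inf")
    for name, descriptors in sample_descriptors.items():
        for descriptor in descriptors:
            distance = euclidean(test_descriptor, descriptor)
            if distance < best_distance:
                best_name, best_distance = name, distance
    if best_distance > max_distance:
        return None, best_distance  # nobody in the gallery is close enough
    return best_name, best_distance


# Toy 3-D descriptors standing in for dlib's 128-D vectors.
samples = {"alice": [[0.0, 0.0, 0.0]], "bob": [[1.0, 1.0, 1.0]]}
print(identify([0.1, 0.0, 0.0], samples))
print(identify([5.0, 5.0, 5.0], samples))
```

Without the cutoff, every test face is assigned to somebody, which matters when the incoming image may show a person who is not among the samples.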
This is one of the hot topics in computer vision, and many kinds of solutions are available for the problem you describe.
But I suggest looking at OpenFace, which has very high accuracy. There is an implementation of the project on GitHub.
Thanks
You need to understand that machine learning doesn't work that way: intensive training has to be carried out before your model can give good results.
With a single image of a person you simply cannot predict that it's the same person, because you need to train your model on different images of that person under different lighting, angles, and many other varying conditions.
Still, I would suggest trying this link:
http://hanzratech.in/2015/02/03/face-recognition-using-opencv.html
You may at least find some match for the image.
"So what I am interested in knowing is: Has anyone tried anything of the sort?"
Yes. This is 2017 and facial recognition has been researched for decades.
"What are some papers/methods/libraries that I should look into?"
Anything Google throws at you when searching for "single image/sample face recognition".
"Do you have any suggestions on how to tackle problem?"
See above.
"Extracting features such as SIFT and MSER generate accuracy of between 73-76%."
I doubt that humans, whose facial recognition is unmatched, perform much better with only one image as reference. I mean, I couldn't tell for sure whether that's Brad Pitt or just a look-alike, and I have seen him in hundreds of pictures and hours of movies...
Is there a way to do object detection by retraining the Inception model provided by Google in TensorFlow? The goal is to predict whether an image contains a defined category of objects (e.g. balls) or not. I can think of it as one-class classification, or as multi-class classification with only two categories (ball and not-ball images). However, for the latter I think it is very difficult to create a good training set (how many and which kind of not-ball images do I need?).
Yes, there is a way to tell if something is a ball. However, it is better to use Google's TensorFlow Object Detection API. Instead of saying "ball/no ball," it will tell you it thinks something is a ball with XX% confidence.
To answer your other questions: with object detection, you don't need non-ball images for training. You should gather about 400-500 ball images (more is almost always better), split them into a training group and an eval group, and label them with this. Then you should convert your labels and images into a .record file according to this. After that, you should set up TensorFlow and train.
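The train/eval split mentioned above can be sketched in a few lines. This is only an illustration: the function name, the 80/20 ratio, the seed, and the file names are assumptions, not something the Object Detection API prescribes.

```python
import random


def split_train_eval(image_names, eval_fraction=0.2, seed=42):
    """Shuffle reproducibly and return (train_names, eval_names)."""
    names = sorted(image_names)  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(names)
    eval_count = max(1, int(round(len(names) * eval_fraction)))
    return names[eval_count:], names[:eval_count]


# Hypothetical file names for 450 labeled ball images.
all_images = ["ball_%03d.jpg" % i for i in range(450)]
train_images, eval_images = split_train_eval(all_images)
print(len(train_images), len(eval_images))
```

Fixing the seed keeps the eval group stable between runs, so results from different training attempts stay comparable.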
This entire process is not easy. It took me a good couple of weeks with an iOS background to successfully train a single object detector. But it is worth it in the end, because now I can rapidly switch out images to train a different object detector whenever an app needs it.
Bonus: use this to convert your new TF model into a .mlmodel, the Core ML format used on iOS.
I'm new to Haar cascades and I would like to know whether it is possible to detect children's faces using adult faces as positive samples, or whether children's faces are needed as positive samples.
You can use an adult-face Haar cascade without problems unless the targets include very young babies (only a few days old :P). The basic features of children's and adults' faces are the same.
And if you are going to train your own Haar cascade, you can add children's faces yourself. It takes no additional effort. Be sure to cover a wide variety.
PS: You also don't need to train a Haar cascade yourself. You can use someone else's directly. OpenCV ships one as an example at
/path/to/opencv/data/haarcascades/haarcascade_frontalface_default.xml
You can download more sophisticated Haar cascades online. Here is a link:
alereimondo.no-ip.org/OpenCV/34
I assume you already have code to run these cascades, or you can easily find some online.
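In case it helps, here is a minimal sketch of running such a cascade with OpenCV's Python bindings. It assumes the `cv2` package is installed. The function names, the `scale_factor` and `min_neighbors` defaults, and the `kids.jpg` file name are illustrative assumptions to tune, not recommended settings.

```python
def detect_faces(image_path, cascade_path, scale_factor=1.1, min_neighbors=5):
    """Return a list of (x, y, w, h) boxes found by a Haar cascade."""
    import cv2  # imported lazily so largest_box() below works without OpenCV

    detector = cv2.CascadeClassifier(cascade_path)
    if detector.empty():
        raise IOError("could not load cascade: %s" % cascade_path)
    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: %s" % image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    return [tuple(int(v) for v in box) for box in boxes]


def largest_box(boxes):
    """Pick the box with the largest area, or None if nothing was detected."""
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])


# Example call (kids.jpg is a hypothetical file name):
# faces = detect_faces(
#     "kids.jpg",
#     "/path/to/opencv/data/haarcascades/haarcascade_frontalface_default.xml",
# )
# print(largest_box(faces))
```

If small children's faces are missed, lowering `min_neighbors` or `scale_factor` slightly is a reasonable first thing to try, at the cost of more false positives.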
Happy Coding :)