I'm building an app in Android Studio that uses OpenCV to identify an object. The detection works, but I don't understand how a simple XML file allows my program to identify my object.
All I know is that OpenCV somehow uses a convolutional neural network to do it, so I have to train the CNN to adjust its internal parameters. But what exactly does the XML do? How does this magic thing work?
I'm not completely sure if this is what you mean, but I'm going to make a guess.
If you use an XML file inside OpenCV to do some sort of detection, you're most likely working with a Haar feature-based cascade classifier or something very similar. You can learn more about it on this page!
There is also a very good blog post on the steps OpenCV takes to make the detection happen: here
Hopefully this can dispel a bit of this magic and clear things up!
In short, the XML holds a sort of 'pattern' that was learned using machine learning and a lot of examples. Once the classifier is trained, OpenCV can search images for the pattern stored in the XML.
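To make the 'pattern' idea a bit more concrete, here is a minimal sketch of a single Haar-like feature evaluated with an integral image, which is the building block described in the Viola-Jones approach. The toy image, the feature geometry and the threshold are made up for illustration; a real cascade XML stores many such rectangles, thresholds and weights grouped into stages, and this is not OpenCV's actual implementation.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in a rectangle, using four table lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like edge feature: top half minus bottom half of the window."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


def weak_classifier(ii, x, y, w, h, threshold):
    """One learned 'rule': does the feature response exceed a threshold?"""
    return two_rect_feature(ii, x, y, w, h) > threshold


# Toy 4x4 grayscale image: bright top half, dark bottom half.
image = [
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

ii = integral_image(image)
feature_value = two_rect_feature(ii, 0, 0, 4, 4)
matches = weak_classifier(ii, 0, 0, 4, 4, threshold=32)  # threshold is hypothetical
```

Training is what picks the rectangles and thresholds; detection just evaluates them on every window position and scale, rejecting a window as soon as one stage fails.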
I decided to take a dip into ML and, with a lot of trial and error, was able to create a model using TensorFlow's Inception.
To take this a step further, I want to use their Object Detection API. Their input preparation instructions reference the Pascal VOC 2012 dataset, but I want to train on my own dataset.
Does this mean I need to set up my datasets in either Pascal VOC or Oxford-IIIT format? If yes, how do I go about doing this?
If no (my instinct says this is the case), what are the alternatives for using TensorFlow object detection with my own datasets?
Side note: I know that my trained Inception model can't be used for localization because it's a classifier.
Edit:
For those still looking to achieve this, here is how I went about doing it.
The training jobs in the Tensorflow Object Detection API expect to get TF Record files with certain fields populated with groundtruth data.
You can either set up your data in the same format as the Pascal VOC or Oxford-IIIT examples, or you can just directly create the TFRecord files ignoring the XML formats.
In the latter case, the create_pet_tf_record.py or create_pascal_tf_record.py scripts are likely to still be useful as a reference for which fields the API expects to see and what format they should take. Currently we do not provide a tool that creates these TFRecord files generally, so you will have to write your own.
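As a rough sketch of what "writing your own" can look like, the snippet below parses one Pascal VOC-style annotation and collects values under the feature keys that, as far as I recall, create_pascal_tf_record.py populates (box coordinates normalized to the 0..1 range). The label map and the sample annotation are hypothetical, and the key names should be checked against the script itself before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical label map: class name -> integer id (ids start at 1).
LABEL_MAP = {"chair": 1, "door": 2}

# Hypothetical annotation in Pascal VOC style.
VOC_XML = """<annotation>
  <filename>room_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>chair</name>
    <bndbox><xmin>64</xmin><ymin>120</ymin><xmax>320</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_fields(xml_text, label_map):
    """Collect groundtruth values keyed by the assumed TFRecord feature names."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    fields = {
        "image/filename": root.findtext("filename"),
        "image/width": width,
        "image/height": height,
        "image/object/bbox/xmin": [],
        "image/object/bbox/xmax": [],
        "image/object/bbox/ymin": [],
        "image/object/bbox/ymax": [],
        "image/object/class/text": [],
        "image/object/class/label": [],
    }
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        # Pixel coordinates are divided by the image size to normalize them.
        fields["image/object/bbox/xmin"].append(int(box.findtext("xmin")) / width)
        fields["image/object/bbox/xmax"].append(int(box.findtext("xmax")) / width)
        fields["image/object/bbox/ymin"].append(int(box.findtext("ymin")) / height)
        fields["image/object/bbox/ymax"].append(int(box.findtext("ymax")) / height)
        fields["image/object/class/text"].append(name)
        fields["image/object/class/label"].append(label_map[name])
    return fields


fields = voc_to_fields(VOC_XML, LABEL_MAP)
```

The remaining step, not shown here, would be to add the encoded image bytes and format, wrap each value in a tf.train.Feature inside a tf.train.Example, and write the serialized examples with a TFRecord writer.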
Besides the TF Object Detection API, you may want to look at OpenCV Haar cascades. That is where I started with object detection, and if you provide a well-prepared dataset it works pretty well.
There are also many articles and tutorials about creating your own cascades, so it's easy to get started.
I used this blog; it helped me a lot.
I'm working on a project for visually impaired people that converts the visual world to audio.
We prefer to create a prototype that doesn't need an internet connection. So we chose to work with OpenCV. After reading (a lot of) tutorials and documentation we were able to train OpenCV in recognizing specific objects.
For example: we trained OpenCV to recognize a certain chair and a door. That works fine.
But, we also tried to train OpenCV on a "generic" level. It should be possible to recognize (almost) all chairs. We did that by training OpenCV with a lot of positive and negative images as explained here: http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
The actual result wasn't what we expected: it could not recognize any chair. I know there are a lot of different parameters to take into account (maybe we did something wrong there), and we experimented a lot. But our time (and unfortunately our knowledge of OpenCV) is limited.
We are looking for some advice on how to train OpenCV to recognize generic objects.
Where do we start?
Is OpenCV even suited to do that?
Thank you for your time!
OpenCV is the library to use. But object recognition is tricky. Often when people say they are doing "object recognition" they are not; they are processing one image, or at best a series of related images, to separate it into object and background.
To recognise a "chair" (everything from an armchair to a dining chair to a throne) would be almost impossible. I'd want at least stereo images to have a chance of detecting flat surfaces. I don't doubt that with a lot of work you can get quite a good result, maybe just recognising dining-style chairs, but it's skilled work; it's not just a case of feeding a few parameters to a hierarchical classifier.
I have seen multiple Haar cascade XML files in OpenCV for face detection, eye detection, ear detection, human body detection, etc., but I couldn't find proper documentation or an explanation for them.
For example, if I need to detect side faces in an application, which XML should I use, and what parameters should be passed to detectMultiScale?
In some cases, varying the parameters to detectMultiScale reduces the false detections, but I did all my tests by trial and error. I couldn't find any definitive articles explaining the use of each XML and parameter.
Can someone point me to documentation on this, if there is any? Otherwise, I would be grateful for an explanation.
OpenCV has a built-in profile face classifier xml under "..\data\haarcascades". If you want to create your own cascade classifier, you should follow this procedure. Here is another link regarding that.
To learn about the detectMultiScale method, check out the documentation. To understand how the classifier and its parameters work, check out the Viola-Jones (2001) article or its explanation.
Here is a paper by Vadim Pisarevsky, one of the OpenCV developers, which may be helpful in understanding some of the parameters.
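As a rough illustration of what the scaleFactor parameter of detectMultiScale controls, the sketch below lists the search-window sizes a multi-scale detector would try. This is a simplified model under my own assumptions (a 24x24 base window, sizes grown by a constant factor, made-up helper name), not OpenCV's actual code, but it shows why a smaller factor means more scales, slower detection and more candidate hits.

```python
def window_sizes(image_size, base_size=(24, 24), scale_factor=1.1,
                 min_size=(0, 0), max_size=None):
    """List the window sizes scanned when growing the base window by scale_factor."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    img_w, img_h = image_size
    base_w, base_h = base_size
    sizes = []
    factor = 1.0
    while True:
        w, h = round(base_w * factor), round(base_h * factor)
        # Stop once the window no longer fits in the image or exceeds max_size.
        if w > img_w or h > img_h:
            break
        if max_size is not None and (w > max_size[0] or h > max_size[1]):
            break
        # Windows smaller than min_size are skipped, not scanned.
        if w >= min_size[0] and h >= min_size[1]:
            sizes.append((w, h))
        factor *= scale_factor
    return sizes


fine = window_sizes((640, 480), scale_factor=1.1)    # many scales
coarse = window_sizes((640, 480), scale_factor=1.5)  # few scales
large_only = window_sizes((640, 480), scale_factor=1.1, min_size=(100, 100))
```

The other parameter that usually matters for false detections is minNeighbors: overlapping hits are grouped, and a group is kept only if it contains enough raw detections, so raising it trades missed objects for fewer false positives.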
On the other hand, if using OpenCV is not a hard requirement, please take a look at vision.CascadeObjectDetector in the Computer Vision System Toolbox for Matlab, which provides the same functionality. It also saves you the trouble of figuring out which xml file to use for profile faces.
I am struggling to create a custom Haar classifier. I have found a couple of tutorials on the web, but they do not specify which version of OpenCV they are using. What I need is a very concise and simplified example of the required steps, along with a simple dataset of images. I also need to know the OpenCV version and the OS platform so I can get it running. I have tried a matrix of OpenCV versions on both Windows and Linux and I have run into memory error after memory error. I would like to start with a known good set of data and simple commands before expanding it to fit my problem.
Thanks for your help,
Chris
OpenCV provides two utility commands, createsamples.exe and haartraining.exe, which can generate the XML files used by Haar classifiers. That is, you can plug the XML file output by haartraining.exe directly into the face detection sample to detect your own customized objects.
For the detailed procedure for using the commands, you may consult pages 513-516 of the book "Learning OpenCV", or this tutorial.
About the internal mechanism of how the classifier works, you may consult the paper "Rapid Object Detection using a Boosted Cascade of Simple Features", which has been cited 5500+ times.
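One step that often trips people up before running createsamples is the description file for the positive images. The sketch below builds lines in the format I recall from the OpenCV cascade training documentation: image path, number of objects, then x y width height for each object. The file names and boxes are made up, and this assumes you go the route of annotating real images rather than generating samples from a single image.

```python
def info_line(image_path, boxes):
    """Format one description-file line: path, object count, then x y w h per box."""
    parts = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        parts.extend(str(v) for v in (x, y, w, h))
    return " ".join(parts)


# Hypothetical annotations: image path -> list of (x, y, width, height) boxes.
annotations = {
    "img/chair_01.jpg": [(140, 100, 45, 45)],
    "img/chair_02.jpg": [(100, 200, 50, 50), (50, 30, 25, 25)],
}

lines = [info_line(path, boxes) for path, boxes in annotations.items()]

# The result would then be written out, one line per image, for example:
# with open("positives.txt", "w") as f:
#     f.write("\n".join(lines) + "\n")
```

The file for negative (background) images is simpler: just one image path per line, with no boxes.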
I have a set of image files that I can identify. Rather than using OCR, I'd like to search only for matches within the set. What's the ideal platform to quickly find matches?
OpenCV is an advanced computer vision library. It can recognize text blocks, colors, shapes, etc. so it might be of use.
Tesseract can be trained to handle languages, but I can't see a reason why you couldn't train it with shapes. Here's a really confusing training guide.
ImageMagick can also be useful. It's pretty hardcore endless parameter chaining, but you can get it to find images. It's not perfect for this application, but it's been done before. The documentation is insanely huge, but it's about as complete and illustrated as I could wish for (I'm a frequent user, as it's useful for quick image operations via CLI). Here's the image comparison documentation.
I would suggest OpenCV, but it's up to you. Good luck!