Computer Vision with Mathematica - opencv

Does anybody here do computer vision work on Mathematica? I would like to know what external libraries are available for doing that. The built in image processing functions are not enough. I am looking for things like SURF, stereo, camera calibration, multi-view geometry etc.
How difficult would it be to wrap OpenCV for use in Mathematica?

Apart from the extensive set of image processing tools that are now (version 8) natively present in Mathematica, and which include a number of CV algorithms like finding morphologic objects, image segmentation and feature detection (see figure below), there's the new LibraryLink functionality, which makes working with DLLs very easy. You wouldn't have to change OpenCV much to be able to call it from Mathematica. Just some wrappers for the functions to be called and you're basically done.

I don't think such a thing exists, but I'm getting started.
It has the advantage that you can perform some analytic methods... for example rather than hacking in openCV or even Matlab endlessly, you can compute analytically a quantity, and see that the method leading to this matrix is numerically unstable as a function of input variables. Thus you do not need to hack, as it would be pointless.
As for wrapping opencv, that doesn't seem to make sense. The correct procedure would be to fix bad implementations in opencv based on your analysis in Mathematica and on paper.

Agreeing with Peter, I don't believe that forcing Mathematica to use OpenCV is a great thing.
All of the computer vision people that I've talked to, read about, and seen examples are using Matlab and the Imaging toolkit. Its either that, or go with a OpenCV compatible language + OpenCV.

Mathematica has a rich set of tools for image processing, but I'm uncertain about the computer vision capabilities.

Related

Stereovision algorithms

For my project, supposed to segment closest hand region from camera, I initially try openCV's stereovision example. However, disparity map looks very bad and its useless for me.
Is there any other method which is better than openCV implementation and have some output(image-video). Because, my time is limited, I must choose one better algorithm and implement this.
Thank you.
OpenCV implements a number of stereo block matching algorithms some of them pretty cutting edge.
Disparity maps always look bad except in very simple circumstances - the first step is to try and improve the source images, the lighting and the background. I
If it was easy then everybody would eb doing it and there would be no market for expensive 3D laser scanners.
Try the different block matching algorithms provided by OpenCV. The little bit of experimentation I've done so far seems to indicate that cv::StereoSGBM gives better disparity maps than cv::StereoBM, but is slower.
The performance of the block matching algorithms will depend on what parameters they are initialized with. Have a look at the stereo examples again here, notice line 195-222 where the algorithms are initialized.
I also suggest you use some basic GUI (OpenCV:s highgui for example) to manipulate these parameters real-time when finetuning the algorithm.

OpenCV detect numbers

I'm using OpenCV on the iPhone and need to detect numbers in an image. I split the image into smaller images so each image has only one number (1-9). All numbers are printed, NOT handwritten.
What would be the best approach to figure out the numbers with OpenCV?
UPDATE:
I have successfully found the numbers and extracted them. They look like this:
http://img198.imageshack.us/img198/5671/101ht.jpg
http://img824.imageshack.us/img824/539/606yu.jpg
When they are extracted they are in the same size and so on. I have saved a bunch of images and put them in a OCR dir where they are categorized into numbers. Like: ocr/1/100.jpg 101.jpg.... and ocr/2/200.jpg 201.jpg....
Then I was going to use the same approach as in the Basic OCR tutorial:http://blog.damiles.com/?p=93
However, I'm programming for iPhone and can't use C++ code (error on compiling and so on) and I don't have access to highgui.
I tried using cvMatchTemplate() and match a bunch of images but it seems to work pretty bad...
Any other ideas I can try?
You could start by reading about Principal Component Analysis (PCA), Fisher's Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs). These are classification methods that are extremely useful for OCR, and there are libraries in any language including C++, Python, C# etc.
It turns out that OpenCV already includes excellent implementations on PCAs and SVMs[dead link]. I haven't seen any OpenCV code examples for OCR, but you can use some modified version of face classification to perform character classification. An excellent resource for face recognition code for OpenCV is this website[dead link].
If the numbers are printed, the job is quite simple, you just need to figure out a nice set of features to match. If the numbers are one font, you can get away with this approach:
Extract the number
Find the bounding box
Scale the image down to something like 10x8, try to match the aspect ratio
Do this for a small training set, take the 'average' image for each number
For new images, follow the steps above, but the last is just a absolute image difference with each of the number-templates. Then take the sum of the differences (pixels in the difference image). The one with the minimum is your number.
All above are basic OpenCV operations.
Basically your problem is just to classify a feature vector, which is the set of pixel intensities after some preprocessing steps. You can use any classifier for this task, like eg. neural networks, which should have a C implementation inside OpenCV. You might also try a C libsvm library for Support Vector Machines.
There is a good site related to this problem with a lot of papers and a training database.
Maybe the most simple and convinient way is to use svm as ml algorithm
http://opencv.willowgarage.com/documentation/cpp/support_vector_machines.html
and gray images as feature vectors.
Objective C++?
Try renaming your .m files to .mm and you can then use c++ in your iPhone project.
Convolution Neural Networks are by far the best algorithms for hand written digits. The are implemented in most systems like USPS etc. Here are few papers explaining the algorithms.
http://yann.lecun.com/exdb/lenet/
This is a nice open source ,It is a ORCDemo on iPhone.Hope it is useful to you
Simple Digit Recognition OCR in OpenCV-Python
This might help you out. Converting the code from Python to C++ is not a difficult task, since OpenCV API's are same for the both.
Tesseract is also a nice free OCR engine that is readily available for iPhone and allows you to use your own sets of training images:
http://tinsuke.wordpress.com/2011/11/01/how-to-compile-and-use-tesseract-3-01-on-ios-sdk-5/
HOG + SVM (Try to play with kernels)

How to check images for custom characters?

I have a set of image files that I can identify. Rather than an OCR, I'd like to search only for matches within the set. What's the ideal platform to quickly find matches?
OpenCV is an advanced computer vision library. It can recognize text blocks, colors, shapes, etc. so it might be of use.
Tesseract can be trained to handle languages, but I can't see a reason why you couldn't train it with shapes. Here's a really confusing training guide.
ImageMagick can also be useful. It's pretty hardcore endless parameter chaining, but you can get it to find images. It's not perfect for this application, but it's been done before. The documentation is insanely huge, but it's about as complete and illustrated as I could wish for (I'm a frequent user, as it's useful for quick image operations via CLI). Here's the image comparison documentation.
I would suggest OpenCV, but it's up to you. Good luck!

what are the steps in object detection?

I'm new to image processing and I want to do a project in object detection. So help me by suggesting a step-by-step procedure to this project. Thanx.
Object detection is a very complex problem that includes some real hardcore math and long tuning of parameters to the computation methods involved. Your best bet is to use some freely available library for that - Google will help.
There are lot of algorithms about the theme and no one is the best of all. It's usually a mixture of them what makes the best solution to the solution.
For example, for object movement detection you could look at frame differencing and misture of gaussians.
Also, it's very dependent of your application, the environment (i.e. noise, signal quality), the processing capacity you may have available, the allowable error margin...
Besides, for it to work, most of time it's first necessary to do some kind of image processing to the input data like median filter, sobel filter, contrast enhancement and a large so on.
I think you should start reading all you can: books, google and, very important, a lot of papers about the subjects (there are many free in internet) you are interested in.
And first of all, i think it's fundamental (at least it has been for me) having a good library for testing. The one i have used/use is OpenCV. It's very complete, implement many of the actual more advanced algorithms, is very active, has a big community and it's free.
Open Computer Vision Library (OpenCV)
Have luck ;)
Take a look at AForge.NET. It's nowhere near Project Natal's levels of accuracy or usefulness, but it does give you the tools to learn the algorithms easily. It's an image processing and AI library and there are several tutorials on colored object tracking and motion detection.
Another one to look at is OpenCV from Intel. I believe it's a bit more advanced, but it's written in C.
Take a look at this. It might get you started in this complex field. The algorithm pages that it links to are interesting reading.
http://sun-valley.stanford.edu/projects/helicopters/final.html
This lecture by Jeff Hawkins, will give you an idea about the state of the art in this super-difficult field.
Seems that video disappeared... but this vid should cover similar ground.

Image processing language/environment

I am interested in studying some image processing. I imagine matlab is the best way to go about that but right now I don't have access to matlab. I tried octave but for some reason it can't even load a png, bmp or anything other than 1 specific format. R doesn't seem to be the key here either.
What is the language of choice here? Perl?
Also can anyone point me to any other good tutorials that I may have missed on image processing?
Opencv is an excellent image processing library. Although written in C it comes with some high level tools to display images handle image files, mouse events etc so you can experiment without writing a lot of windows code.
It also works with python, although I haven't used it with the PIL.
If you are interested in how the algorithms work then implementing them yourself using python and numpy for the matrix ops is easy.
I guess it depends on what you want to do. Matlab certainly is a high end choice, but for a lot of things the image modules of general purpose programming languages do the trick.
I did some pixel mangling and image processing with PIL, the python image library. It is perfectly sufficient for processing single RGB images of reasonable size (say, what a consumer digital camera delivers). It can handle alpha channels, has some filters, more or less quick methods of accessing the pixel information - and it is python, a very straightforward and readable language.
The recommended language in my computer vision class was Ch with the OpenCV library. Ch is basically an interpreted version of C, the syntax is quite similar but has a few nice features, like treating arrays as matrices. OpenCV will house pretty much any image processing function you could need.
I think any free programming environments will do basic image processing well. If speed is not an issue, Processing will work fine and you can easily extend your code to Java in the future.
Have a look at Adobe Pixel Bender. It's really fun to play with.

Resources