OpenCV for iOS graphics app - ios

I want to create a photo and video manipulation app for the iPad. The app would affect the imagery in various ways (using Canny edge detection or bilateral blur, for instance).
I saw some very interesting examples of canny edge detection using OpenCV, but is OpenCV the right tool to be looking into if I want to create a graphics app like this?
If so, can anyone recommend some good reading material to get me started?
Thanks for reading!

Yes, you certainly can use OpenCV on iOS. You simply cross-compile the code and include it in your project. OpenCV can easily do what you describe, and much more.
O'Reilly has published a great book on OpenCV, which is probably the best way to get up to speed. It explains the methods and how to use them, with plenty of sample code and images.
Learning OpenCV, Gary Bradski, Adrian Kaehler, O'Reilly 2012
There are a few sample projects around:
Sample OpenCV on iOS project
There are also numerous build scripts around, but note that they probably do not target the latest version (2.4).
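To give a feel for what an edge detector computes, here is a minimal sketch in plain Python (not OpenCV code). It only covers the gradient stage: a Sobel filter followed by a single threshold. Full Canny edge detection additionally applies Gaussian smoothing, non-maximum suppression and hysteresis thresholding, and in a real app you would call OpenCV's own Canny function rather than anything like this. The function name `sobel_edges` and the list-of-rows image format are assumptions made for illustration.

```python
def sobel_edges(gray, threshold):
    """Return a binary edge map for a grayscale image given as a list of rows.

    Simplified stand-in for the gradient stage of Canny: marks a pixel as an
    edge when its Sobel gradient magnitude reaches `threshold`.
    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges
```

On an image whose left half is dark and right half is bright, the pixels on either side of the boundary are the ones that get marked.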

Related

Face Recognition using Kinect

I went through the Kinect SDK and Toolkit provided by Microsoft and tested the Face Detection sample, which worked. But how do I recognize the faces? I know the basics of OpenCV (VS2010). Are there any Kinect libraries for face recognition? If not, what are the possible solutions? Are there any tutorials available for face recognition using Kinect?
I've been working on this myself. At first I just used the Kinect as a webcam and passed the data into a recognizer modeled after this code (which uses Emgu CV to do PCA):
http://www.codeproject.com/Articles/239849/Multiple-face-detection-and-recognition-in-real-ti
While that worked OK, I thought I could do better since the Kinect has such awesome face tracking. I ended up using the Kinect to find the face boundaries, crop the face, and pass it into that library for recognition. I've cleaned up the code and put it on GitHub; hopefully it'll help someone else:
https://github.com/mrosack/Sacknet.KinectFacialRecognition
I've found a project that could be a good source for you - http://code.google.com/p/i-recognize-you/ - but unfortunately (for you) its homepage is not in English. The most important parts:
- The project (with source code) is at http://code.google.com/p/i-recognize-you/downloads/list
- In the bibliography the author mentions this site: http://www.shervinemami.info/faceRecognition.html. It seems to be a good starting point for you.
There is no built-in functionality for the Kinect that provides face recognition. I'm not aware of any tutorials that cover it, but I'm sure someone has tried. It is on my short list; hopefully time will allow soon.
I would try saving the face tracking information and doing a comparison with that for recognition. You would have a "setup" function that asks the user to stare at the Kinect and saves the points the face tracker returns to you. When you wish to recognize a face, the user would look at the screen and you would compare the face tracker points to a database of faces. This is roughly how the Xbox does it.
The big trick is confidence levels. Numbers will not come back exactly as they did previously, so you will need to include buffers of values for each feature -- the code would then come back with "I'm 93% sure this is Bob".
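A minimal sketch of the tolerance-and-confidence idea, in plain Python rather than the Kinect SDK. It assumes the face tracker hands back the same number of 3D points in the same order every time, and that they have already been normalized for head pose and distance. The names `match_confidence` and `recognize`, the tolerance, and the threshold are all hypothetical.

```python
import math


def match_confidence(probe, enrolled, tolerance):
    """Fraction (0..1) of probe points lying within `tolerance` of the enrolled points."""
    if len(probe) != len(enrolled):
        raise ValueError("point lists must have the same length")
    hits = sum(1 for p, e in zip(probe, enrolled) if math.dist(p, e) <= tolerance)
    return hits / len(enrolled)


def recognize(probe, database, tolerance=2.0, threshold=0.8):
    """Return (name, confidence) for the best match, or (None, confidence) if too weak."""
    best_name, best_score = None, 0.0
    for name, enrolled in database.items():
        score = match_confidence(probe, enrolled, tolerance)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

The "setup" step would fill `database` with one list of points per user; at recognition time the returned score is what lets you say "I'm 93% sure this is Bob" or decline to answer when no stored face is close enough.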

What libraries can I use to modify a video?

I'm new to video processing and I'm wondering what libraries I can use to do things like detecting letters, drawing boxes around them and so on. If you can name me a couple of good ones, I'd appreciate it very much!
OpenCV (Open Source Computer Vision) is a cross-platform library of programming functions for real-time computer vision.
It provides interfaces for both the C and C++ programming languages.
As for detecting the text region and drawing boxes around it, you can take a look at this article, which explains how to do this stuff using OpenCV. For better OCR capabilities I think that tesseract is the best open source tool available right now.
I've worked on a similar project some time ago and used OpenCV to detect the text region and then tesseract to do proper text recognition.
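As a rough illustration of the "find the region, then box it" step, here is a toy sketch in plain Python. It assumes the frame has already been thresholded into a binary grid (1 for ink, 0 for background) and returns one bounding box per connected blob. This is a simplified stand-in for what OpenCV's contour and bounding-rectangle functions would do on real frames; `find_boxes` is a hypothetical name.

```python
from collections import deque


def find_boxes(binary):
    """Return (left, top, right, bottom) for each 4-connected blob of 1s.

    Boxes are inclusive pixel coordinates, listed in row-major scan order
    of each blob's first pixel.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over this blob, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

Each box could then be drawn on the frame and its cropped pixels handed to an OCR engine such as tesseract.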

How to use Opencv for Document Recognition with OCR?

I'm a beginner in computer vision, but I know how to use some OpenCV functions. I'm trying to use OpenCV for document recognition, and I would like help working out the steps.
I'm thinking of using the OpenCV example find_obj.cpp, but the documents (a passport, for example) have variable parts: name, birthdate, picture. So I need help defining the steps and, if possible, knowing which functions to use at each step.
I'm not asking for complete code, but an example link or a short walkthrough would be of great help.
There are two very different steps involved here. One is detecting your object, and the other is analyzing it.
For object detection, you're just trying to figure out whether the object is in the frame, and approximately where it's located. The OpenCV features framework is great for this. For some tutorials and comprehensive sample code, see the OpenCV features2d tutorials and especially the feature matching tutorial.
For analysis, you need to dig into optical character recognition (OCR). OpenCV does not include OCR libraries, but I recommend checking out tesseract-ocr, which is a great OCR library. If your documents have a fixed structure (a consistent layout of text fields), then tesseract-ocr is all you need. For more advanced analysis, check out ocropus, which uses tesseract-ocr but adds layout analysis.
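For the fixed-layout case, the two steps can be sketched as follows in plain Python. The sketch assumes the detection step has already produced an axis-aligned pixel box for the document (a real passport photo would first need perspective correction), that each field is described as a rectangle relative to that box, and that `ocr` is any callable you supply that turns a cropped image into text, for example a wrapper around tesseract. All names here are hypothetical.

```python
def field_regions(doc_box, layout):
    """Map relative field rectangles onto the detected document's pixel box.

    doc_box: (left, top, right, bottom) in pixels.
    layout:  {field_name: (rel_x, rel_y, rel_width, rel_height)}, values in 0..1.
    Returns  {field_name: (x0, y0, x1, y1)} in pixels, end-exclusive.
    """
    left, top, right, bottom = doc_box
    width, height = right - left, bottom - top
    regions = {}
    for name, (rx, ry, rw, rh) in layout.items():
        x0 = left + round(rx * width)
        y0 = top + round(ry * height)
        x1 = left + round((rx + rw) * width)
        y1 = top + round((ry + rh) * height)
        regions[name] = (x0, y0, x1, y1)
    return regions


def crop(image, region):
    """Cut a region out of an image stored as a list of pixel rows."""
    x0, y0, x1, y1 = region
    return [row[x0:x1] for row in image[y0:y1]]


def read_fields(image, doc_box, layout, ocr):
    """Run the caller-supplied `ocr` callable on every field of the layout."""
    regions = field_regions(doc_box, layout)
    return {name: ocr(crop(image, region)) for name, region in regions.items()}
```

Keeping the layout relative to the document box means the same field table works whatever size the document appears at in the frame.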

Which SDK should I use to visualize medical images in 3D?

I need to process DICOM-formatted medical images and visualize them in 3D, and also do some image processing on these images in real time. Which SDK has better real-time characteristics for medical visualization and image processing?
The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization.
You can find details here.
Another solution would be to use or modify a 3D engine that supports volume rendering.
Moreover, for computer vision algorithms, OpenCV seems promising.
osgVolume is an add-in to the popular OpenSceneGraph library for doing this.
Just use GDCM+VTK. In 2D simply use gdcmviewer. In 3D you need to build gdcmorthoplanes.
Ref:
http://sourceforge.net/apps/mediawiki/gdcm/index.php?title=Gdcmviewer
http://sourceforge.net/apps/mediawiki/gdcm/index.php?title=Using_GDCM_API
You could check out MITK (http://mitk.org) which combines the already mentioned VTK with the Insight Toolkit (http://www.itk.org) for image processing. Another option to start from could be Slicer (http://www.slicer.org), but this depends on the license you need.
At university we were taught Matlab for DICOM file processing. I think it has pretty nice and easy-to-use plugins for that as well. In the end, using Matlab I was able to do all kinds of DICOM image processing, filtering and so forth.
As you probably know, Matlab is not an SDK but a complete environment. Nevertheless, you can write scripts to achieve normal application behavior: create windows, buttons, images, etc.

Computer Vision with Mathematica

Does anybody here do computer vision work in Mathematica? I would like to know what external libraries are available for doing that. The built-in image processing functions are not enough. I am looking for things like SURF, stereo, camera calibration, multi-view geometry, etc.
How difficult would it be to wrap OpenCV for use in Mathematica?
Apart from the extensive set of image processing tools that are now (version 8) natively present in Mathematica, and which include a number of CV algorithms like finding morphologic objects, image segmentation and feature detection (see figure below), there's the new LibraryLink functionality, which makes working with DLLs very easy. You wouldn't have to change OpenCV much to be able to call it from Mathematica. Just some wrappers for the functions to be called and you're basically done.
I don't think such a thing exists, but I'm getting started.
It has the advantage that you can apply analytic methods. For example, rather than hacking endlessly in OpenCV or even Matlab, you can compute a quantity analytically and see that the method leading to this matrix is numerically unstable as a function of the input variables. Thus you do not need to hack, as it would be pointless.
As for wrapping OpenCV, that doesn't seem to make sense. The correct procedure would be to fix bad implementations in OpenCV based on your analysis in Mathematica and on paper.
Agreeing with Peter, I don't believe that forcing Mathematica to use OpenCV is a great thing.
All of the computer vision people that I've talked to, read about, or seen examples from use Matlab and the Imaging toolkit. It's either that, or an OpenCV-compatible language plus OpenCV.
Mathematica has a rich set of tools for image processing, but I'm uncertain about the computer vision capabilities.

Resources