Active Shape Models vs Active Appearance Models - opencv

I am implementing ASM/AAM using OpenCV for segmentation of face images (to be further used in face recognition). I am pretty much done with the canonical implementation of ASM (as per T. Cootes' papers), but the result I get is not ideal: it does not always converge, and when it does, some boundaries are not captured. I believe the problem lies in the modeling of the local structure, i.e. the gradient profile matching.
Now I am a bit unsure what to do next. ASM is a simpler and computationally less intensive algorithm than AAM. Should I continue improving ASM (say, by using 2D profiles rather than 1D profiles, or different profile structures for different types of landmarks), or move straight on to AAM?
Edit: Also, which papers would you recommend that improve on the original work by T. Cootes? I know there are many of them, but maybe there are techniques that are considered canonical today?

You can find clarifications and an implemented AAM with 2D profiles in the book "Mastering OpenCV with Practical Computer Vision Projects" (Packt Publishing, 2012). Many of the projects described in this book are open source and can be downloaded from GitHub. They are more advanced than T. Cootes' implementation.
In my experience, AAM (for an existing implementation you can also look at VOSM) converges better than ASM only if you train it on the same person (very good results, for example, on the FRANCK (Talking Face Video) sequence); in other cases ASM works better.
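Since the question pinpoints the gradient-profile matching as the weak spot, here is a minimal sketch (Python/NumPy, with hypothetical helper names) of the canonical 1D profile search: sample intensities along each landmark's normal, normalize the gradient, then slide the trained window along a longer search profile and pick the offset minimizing the Mahalanobis distance. The mean profile and inverse covariance are assumed to come from the usual ASM training step.

```python
import numpy as np

def sample_profile(image, point, normal, k):
    """Sample 2k+1 intensities along the unit normal at a landmark."""
    offsets = np.arange(-k, k + 1)
    xs = np.clip((point[0] + offsets * normal[0]).round().astype(int),
                 0, image.shape[1] - 1)
    ys = np.clip((point[1] + offsets * normal[1]).round().astype(int),
                 0, image.shape[0] - 1)
    return image[ys, xs].astype(np.float64)

def gradient_profile(samples):
    """Differentiate and normalize, as in Cootes' original formulation."""
    g = np.diff(samples)
    return g / (np.sum(np.abs(g)) + 1e-8)

def best_shift(image, point, normal, mean_g, cov_inv, k, m):
    """Slide a length-2k window over a length-2m search profile (m > k);
    return the offset minimizing the Mahalanobis distance to mean_g."""
    search = gradient_profile(sample_profile(image, point, normal, m))
    best, best_cost = 0, np.inf
    for s in range(2 * (m - k) + 1):
        d = search[s:s + 2 * k] - mean_g
        cost = d @ cov_inv @ d          # Mahalanobis distance (squared)
        if cost < best_cost:
            best, best_cost = s - (m - k), cost
    return best                          # signed shift along the normal
```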

Related

feature matching/detection on brain images

This question is for those who have tried feature detection/matching methods on brain images - it is a broad one, and perhaps a bad one:
How could you tell if the method you used was "good enough?"
What does a successful matching/detection test look like for your data?
EDIT:
As of now, I am not trying to detect any distinct features in particular.
I'm using OpenCV's ORB, SIFT, SURF, etc. detection methods and seeing what features they identify.
Sometimes, however, the orientation of the brain changes entirely from one set of images to the next, so if I compare two images from these sets, the detection methods won't yield any effective results (i.e. the matching will be completely off). But if I compare images that look similar, but not identical, the detection seems to work all right. The point is, detection seems to work for frames that were taken around the same time, but not over a long interval. I wonder if others have come across this, and whether they have found the detection methods still useful despite that.
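For reference, a minimal sketch of the kind of detect-and-match experiment described above, using OpenCV's ORB with Lowe's ratio test to discard ambiguous matches (the file names are placeholders):

```python
import cv2

img1 = cv2.imread("brain_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("brain_b.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance for binary ORB descriptors; k=2 for the ratio test.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
good = [p[0] for p in matcher.knnMatch(des1, des2, k=2)
        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

print(f"{len(good)} matches survive the ratio test")
out = cv2.drawMatches(img1, kp1, img2, kp2, good, None)
cv2.imwrite("matches.png", out)
```

Counting how many matches survive the ratio test (and how consistent their geometry is) is one concrete way to quantify the "good enough" question asked above.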
First of all, you should specify what kind of features you need, or for which purpose the experiment is going to be performed.
Feature extraction is highly subjective in nature; it all depends on what type of problem you are trying to handle. There is no generic feature extraction scheme that works in all cases.
For example, if the features are meant for tumor classification or lesion detection, then of course there are different software packages you can use to extract and define your features.
There are different methods to detect the relevant features, depending on the application:
SURF algorithm (Speeded Up Robust Features)
PLOFS: a fast wrapper approach with a subset evaluation.
ICA or PCA
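For the ICA/PCA item, a minimal sketch of PCA-based feature reduction on flattened image patches, assuming scikit-learn is available (the data array here is a random placeholder standing in for real patches):

```python
import numpy as np
from sklearn.decomposition import PCA

# X: one flattened grayscale patch per row, e.g. 500 patches of 32x32.
X = np.random.rand(500, 32 * 32)

pca = PCA(n_components=50)           # keep the 50 strongest components
features = pca.fit_transform(X)      # (500, 50) low-dimensional features
print(pca.explained_variance_ratio_.sum())  # fraction of variance kept
```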
This paper is a very good review of brain MRI feature extraction for tissue classification:
https://pdfs.semanticscholar.org/fabf/a96897dcb59ad9f04b5ff92bd15e1bd159ef.pdf
I found this paper very good for understanding the differences between feature extraction techniques:
https://www.sciencedirect.com/science/article/pii/S1877050918301297

Can the Hough Transform be used in commercial software?

Can the Hough Transform be used in commercial software?
I mean, it is one of those things that seem research-only and unstable. You would not put it in commercial compositing software, for example, and have the user rely on it at all times.
Any opinions?
Thanks
The Hough transform has been in use in commercial and industrial applications all over the world for years, decades even. From the Wikipedia page you can see that it was first developed in 1972, based on earlier ideas from 1962. That means it is older than the CCD you use to capture the images you feed into your compositing software.
Given that it "seems research only and unstable" to you, I would suggest you spend some time learning various computer vision and image analysis algorithms and techniques, and get a good mathematical basis in the field in general before you implement the Hough transform in commercial compositing software.
And when you are done studying I'd suggest you use a well tested open source implementation.
Yes. In fact, I've written Hough transform code for a piece of commercial software that wasn't meant to be a research tool like MATLAB. I had to put a lot of time into its robustness for a specific application, but it worked great.
The Hough transform by itself can sometimes be unreliable in applications where you have some level of noise, such as in webcams, or where there are distortions in the shape you need to extract. This may be what you are seeing. In that case you may need to do a little more tuning for your application, or try some basic image preprocessing.
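A minimal sketch of that tuning/preprocessing advice with cv2.HoughCircles; the parameter values here are placeholders to be tuned per application, not universal defaults:

```python
import cv2
import numpy as np

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
gray = cv2.medianBlur(gray, 5)   # Hough is sensitive to speckle noise

circles = cv2.HoughCircles(
    gray, cv2.HOUGH_GRADIENT, dp=1, minDist=40,
    param1=100,    # Canny high threshold used internally
    param2=30,     # accumulator threshold: lower = more (false) circles
    minRadius=10, maxRadius=80)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print(f"circle at ({x}, {y}) radius {r}")
```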
I'm a bit annoyed by the condescending tone in both the comment on the question (by High Performance Mark) and the accepted answer here.
Firstly, the fact that programming libraries/frameworks provide an implementation of an algorithm does not mean it is used in, or rather suited for, commercial applications (i.e. getting the job done, robustly, under less pristine conditions). The Hough transform is a well-defined algorithm (with possible uses and limitations) which is simple enough to understand, and very commonly taught in introductory image processing courses. Not surprisingly, it has been implemented in general-purpose libraries such as those of Matlab, Octave and OpenCV. I don't believe the question was intended to discuss the robustness of an implementation or the possibility of inclusion in commercial image processing frameworks, but rather whether the algorithm itself is well suited for end-user software (an application that counts circles, or whatnot).
The accepted answer, as it stands, amounts to "the algorithm is very old; here is a book on image processing; here is a link to an image processing library that has implemented it". The other answer, with zero score, seems to be on topic (i.e. discussing possible applications), though it isn't very specific ("worked for me").
So, why do some people get the impression that the Hough transform is unreliable for shape detection? Here is a good example: Unreliable results with cv2.HoughCircles
The input seems to be very well defined circles. However, the more robust, suggested working solution doesn't use the Hough transform. I've had similar experiences in my own projects. Usually, the more robust way is some kind of object segmentation, distance transform, watershed, and peak localization. Have I ever used the Hough transform with good results? No. I think it could be useful in some cases, in particular if the shapes of the imaged objects are perfectly defined and partially occluded.
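For the record, a minimal sketch of that segmentation / distance-transform / watershed / peak-localization pipeline with OpenCV, following the standard watershed recipe for roughly circular blobs (the threshold fractions are placeholders to tune):

```python
import cv2
import numpy as np

gray = cv2.imread("blobs.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(gray, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Distance to the background peaks at blob centers.
dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.6 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)

# Sure background = dilated mask; the strip in between is "unknown".
sure_bg = cv2.dilate(mask, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

# Label each peak, reserve 0 for unknown, let watershed grow the labels.
n, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0
cv2.watershed(cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR), markers)
print(f"{n - 1} objects found")   # label 0 was the background
```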
In other words, I'm also curious about commercial applications that ended up benefiting from the Hough transform. That's how I came across this question, and I was subsequently disappointed by the "you wouldn't ask that question if you understood the subject better" responses.

Facial expression detection

I am currently working on a project where I have to extract the facial expression of a user (only one user at a time) like sad or happy.
There are a lot of programs/APIs for face detection, but I did not find any that do automatic expression recognition.
The best possibility I found so far:
Using Luxand FaceSDK, which gives me access to 66 different points within the face, so I would still have to map them to expressions manually.
I used OpenCV for face detection earlier, which was working great, so If anyone has some tips on how to do it with OpenCV, that would be great!
Any programming language is welcome (Java preferred).
Some user on an OpenCV board suggested looking into AAM (active appearance models) and ASM (active shape models), but all I found were papers.
You are looking for machine learning solutions. FaceSDK looks like a good feature extractor. I don't think there will be a ready-made library that solves your specific problem. Your best bet is to:
choose a machine learning framework (e.g. SVM for classification, PCA for dimensionality reduction) with a Java implementation
take a series of photos and label them yourself with the target expression (happy or sad)
compute your model and test it
This involves some knowledge about machine learning.
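A minimal sketch of that pipeline, shown in Python with scikit-learn for brevity (equivalent Java libraries exist); the landmark and label files are hypothetical placeholders for data you would collect and label yourself:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# X: (n_samples, 66 * 2) landmark coordinates, e.g. from FaceSDK;
# y: 0 = sad, 1 = happy. Both files are hypothetical placeholders.
X = np.load("landmarks.npy")
y = np.load("labels.npy")

# Hold out a quarter of the labeled photos to test the model.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```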

Natural feature tracking with openCV- evaluating the options

In brief, what are the available options for implementing tracking of a particular image (a photo/graphic/logo) in a webcam feed using OpenCV? In particular, I am trying to collate opinion about the following:
Would Haar training be overkill (considering that these are not 3D objects but simply images to be tracked), or is it the only way out?
I have tried template matching and color-based detection, but these don't offer reliable tracking under varying illumination/scale/orientation at all.
Would SIFT/SURF feature matching work as reliably in video as with static image comparison?
I am a relative beginner to OpenCV, as is evident from my previous queries on SO (very helpful replies). Any cues or links to good resources for starting an NFT implementation with OpenCV?
Can you talk a bit more about your requirements? Namely, what type of appearance variations do you expect, and how much control do you have over the environment? What constraints do you have in terms of speed/power/resource footprint?
Without those, I can only give some general assessment to the 3 paths you are talking about.
1.
Haar would work well and fast, particularly for instance recognition.
Note that Haar doesn't work all that well for 3D unless you train with a full spectrum of templates covering the various perspectives. The poster-child application of Haar cascades is Viola and Jones' face detection system, which is largely geared towards frontal faces (it can certainly be trained for many other things).
For a tutorial on doing Haar training using OpenCV, see here.
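Once trained, running a cascade is only a few lines; a minimal sketch with OpenCV's Python API, using the stock frontal-face cascade bundled with the opencv-python package as a stand-in for a custom-trained XML:

```python
import cv2

# A custom-trained cascade XML would be loaded the same way.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                             minNeighbors=5):
    print(f"detection at ({x}, {y}), size {w}x{h}")
```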
2.
Try NCC or, better yet, Lucas-Kanade tracking (cvCalcOpticalFlowPyrLK, which is pyramidal, i.e. coarse-to-fine, LK; a 4-level pyramid usually works well) for a template. It is usually good up to roughly 10% scale change or 10 degrees of rotation without template changes. Beyond that, you can have automatically evolving templates, which can drift over time.
For a quick Optical Flow/tracking tutorial, see this.
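A minimal sketch of pyramidal LK with OpenCV's Python API (the modern counterpart of cvCalcOpticalFlowPyrLK); maxLevel=3 gives the 4-level pyramid mentioned above, and the frame files are placeholders:

```python
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Pick trackable corners on the template/previous frame.
pts = cv2.goodFeaturesToTrack(prev, maxCorners=200,
                              qualityLevel=0.01, minDistance=7)

# Pyramidal LK: maxLevel=3 means a 4-level coarse-to-fine pyramid.
nxt, status, err = cv2.calcOpticalFlowPyrLK(
    prev, curr, pts, None, winSize=(21, 21), maxLevel=3)

tracked = nxt[status.ravel() == 1]
print(f"{len(tracked)} of {len(pts)} points tracked")
```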
3.
SIFT/SURF would indeed work very well. I'd suggest an additional geometric verification step to remove spurious matches.
I'd be a bit concerned about the amount of computation time involved. If there isn't significant illumination/scale/in-plane rotation variation, then SIFT is probably overkill. If you truly need it, check out Changchang Wu's excellent SiftGPU implementation. Note: third-party, not OpenCV.
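A minimal sketch of that geometric verification step: match SIFT descriptors, then keep only the matches consistent with a single homography estimated by RANSAC (SIFT has since moved into the main OpenCV package; file names are placeholders):

```python
import cv2
import numpy as np

img1 = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test on L2-distance matches.
good = [p[0] for p in cv2.BFMatcher().knnMatch(des1, des2, k=2)
        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

# RANSAC homography needs at least 4 tentative matches.
src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print(f"{int(inliers.sum())} of {len(good)} matches survive RANSAC")
```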
It seems that none of these methods, applied alone, could bring reliable results unless it is a hobby project. Probably some adaptive algorithm would be more or less acceptable. For example, see a famous open-source project where they use machine learning.

what are the steps in object detection?

I'm new to image processing and I want to do a project in object detection. So help me by suggesting a step-by-step procedure for this project. Thanks.
Object detection is a very complex problem that involves some real hardcore math and long tuning of the parameters of the computational methods involved. Your best bet is to use some freely available library for that - Google will help.
There are a lot of algorithms on this theme, and no single one is the best of all. It's usually a mixture of them that makes the best solution for a given problem.
For example, for object movement detection you could look at frame differencing and mixture of Gaussians (a sketch follows below).
Also, it's very dependent on your application, the environment (i.e. noise, signal quality), the processing capacity you may have available, the allowable error margin...
Besides, for it to work, most of the time it's first necessary to do some kind of image processing on the input data, like a median filter, Sobel filter, contrast enhancement and so on.
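A minimal sketch combining those two points - a median filter as preprocessing, then OpenCV's mixture-of-Gaussians background subtractor for movement detection (the video path and thresholds are placeholders to tune per scene):

```python
import cv2

cap = cv2.VideoCapture("input.avi")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                varThreshold=16)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.medianBlur(frame, 5)    # suppress salt-and-pepper noise
    fg = subtractor.apply(frame)        # white pixels = moving foreground
    if cv2.countNonZero(fg) > 500:      # placeholder motion threshold
        print("movement detected:", cv2.countNonZero(fg), "pixels")
cap.release()
```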
I think you should start reading all you can: books, Google and, very importantly, lots of papers on the subjects you are interested in (there are many available for free on the internet).
And first of all, I think it's fundamental (at least it has been for me) to have a good library for testing. The one I have used/use is OpenCV. It's very complete, implements many of the most advanced algorithms, is very active, has a big community, and it's free.
Open Computer Vision Library (OpenCV)
Good luck ;)
Take a look at AForge.NET. It's nowhere near Project Natal's levels of accuracy or usefulness, but it does give you the tools to learn the algorithms easily. It's an image processing and AI library and there are several tutorials on colored object tracking and motion detection.
Another one to look at is OpenCV from Intel. I believe it's a bit more advanced, but it's written in C.
Take a look at this. It might get you started in this complex field. The algorithm pages that it links to are interesting reading.
http://sun-valley.stanford.edu/projects/helicopters/final.html
This lecture by Jeff Hawkins, will give you an idea about the state of the art in this super-difficult field.
Seems that video disappeared... but this vid should cover similar ground.
