Recognise eyes in a scanned image of a person - iOS

I want to develop an iPhone app that should recognise the eyes, face, and skin color of a person in an image scanned by a QR reader.
How can eyes be detected in an image?

Although it may be possible, I'm warning you that, regardless of the programming, it will have a degree of inaccuracy. Any face/retina detection software can be tricked, and given the quality of the iPhone's camera, it can't capture enough detail to accurately evaluate the geometric relationship between two retinas. Recognising skin color may also be problematic because of varying lighting conditions: under fluorescent lights a person's skin tone will appear bluer than under incandescent or natural lighting. Maybe there is another way to go about this?

For localizing the eyes, I used the algorithm described in "Accurate Eye Center Location and Tracking Using Isophote Curvature" by Roberto Valenti and Theo Gevers in my master's thesis and achieved very good results with it:
http://www.science.uva.nl/research/publications/2008/ValentiCVPR2008/CVPR%2008.pdf
For face detection / localization, use the Viola-Jones algorithm; there is probably an Objective-C implementation out there somewhere (alternatively, OpenCV has one).
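As a rough illustration, here is a minimal Python/OpenCV sketch of the Viola-Jones approach using the Haar cascades that ship with OpenCV (the input image path is a placeholder; on iOS you would call the equivalent C++/Objective-C API):

```python
# Minimal sketch: Viola-Jones face detection, then eye detection inside each face.
# Assumes the opencv-python package; "scanned_person.jpg" is a placeholder.
import cv2

img = cv2.imread("scanned_person.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    # Restrict the eye search to the detected face region.
    face_roi = gray[y:y + h, x:x + w]
    eyes = eye_cascade.detectMultiScale(face_roi, scaleFactor=1.1, minNeighbors=5)
    for (ex, ey, ew, eh) in eyes:
        cv2.rectangle(img, (x + ex, y + ey), (x + ex + ew, y + ey + eh),
                      (0, 255, 0), 2)

cv2.imwrite("detected.png", img)
```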

Related

Why are shape-indexed features so effective for face alignment?

I have been implementing some face alignment algorithms recently. I have read the following papers:
Supervised descent method and its applications to face alignment
Face alignment by explicit shape regression
Face alignment at 3000 fps via regressing local binary features
All of these papers mention an important keyword: shape-indexed feature (or pose-indexed feature). This feature plays a key role in the face alignment process, but I did not get its key point. Why is it so important?
A shape-indexed feature is a feature whose index gives some clue about the hierarchical structure of the shape it came from. In face alignment, facial landmarks are extremely important, since they are the things that will be useful in successfully aligning the faces. But taking only the facial landmarks into account throws away some of the structure inherent to a face. You know that the pupil is inside the iris, which is inside the eye. So a shape-indexed feature would do more than tell you that you are looking at a facial landmark - it would tell you that you are looking at a facial landmark inside another landmark inside another landmark. Because there are only a few features that are nested three deep like that, you can be more confident about aligning them correctly.
Here is a much older paper that explains some of this with simpler language (especially in the introduction): http://www.cs.ubc.ca/~lowe/papers/cvpr97.pdf
If you want to get shape-indexed features, you should first apply a similarity transform to the face landmarks in each image. The aim is to transform the original landmarks to a reference location, which could be the mean landmark shape of all images, so that the landmarks of every image end up in the same positions.
Then you can extract local features at the relocated landmarks; these are shape-indexed features, because the landmarks of each image now form a fixed shape.
I searched for hours to get the answer above; I found it in a graduation thesis and translated it, but I am not sure whether it is the right answer. In my opinion, it makes sense.
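To make the procedure above more concrete, here is a hedged Python/OpenCV sketch: the offsets that index the features live in the mean-shape coordinate frame and are carried into each image via that image's similarity transform. The landmark format, mean shape, and offsets are assumptions, not the papers' exact recipes:

```python
# Sketch of shape-indexed feature extraction (assumed inputs: per-image
# landmarks, a mean landmark shape, and per-landmark sampling offsets).
import numpy as np
import cv2

def shape_indexed_features(gray, landmarks, mean_shape, offsets):
    # Similarity transform (rotation + scale + translation) from the mean
    # shape to this image's landmarks.
    M, _ = cv2.estimateAffinePartial2D(mean_shape.astype(np.float32),
                                       landmarks.astype(np.float32))
    A = M[:, :2]  # linear part; translation is irrelevant for relative offsets

    features = []
    for (lx, ly), local_offsets in zip(landmarks, offsets):
        for off in local_offsets:
            dx, dy = A @ np.asarray(off, dtype=np.float32)
            px = int(np.clip(round(lx + dx), 0, gray.shape[1] - 1))
            py = int(np.clip(round(ly + dy), 0, gray.shape[0] - 1))
            features.append(int(gray[py, px]))  # pixel intensity indexed by shape
    return np.array(features)
```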

Image Processing - Determine if someone is looking into the camera

With image processing libraries like OpenCV you can detect faces in an image, or even check whether those faces are smiling.
Would it be possible to somehow determine whether the person is looking directly into the camera? Since it is hard even for the human eye to tell whether someone is looking into the camera or at a nearby point, I think this will be very tricky.
Does anyone agree?
Thanks
You can try using an eye detection program. I remember doing this a few years ago, and it wasn't that strong: when we tilted our heads slightly away from the camera, or closed our eyes, the eyes couldn't be detected.
If that is not clear, what I really mean is that the face must be pointing straight at the camera with the eyes open before the eyes can be detected. You could try doing something similar with a few tweaks here and there.
Off the top of my head: split the image into different sections and use a different eye classifier for each ROI. For example, for the upper half of the image you can train a specific classifier for how eyes look when they gaze downwards, and for the lower half train a classifier for how eyes look when they gaze upwards. For the whole image, apply the normal eye detection in case the user moves their head while looking at the camera (see the sketch after this answer).
But of course, this relies on extremely strong classifiers and very clear image/video quality, depending on where the eye is looking, and it would make detection extremely slow even if my method is successful.
There may be other ideas you can explore too. It's slightly tricky, but not totally impossible. If OpenCV can't satisfy your needs, perhaps OpenGL? There are many libraries available. I wish you the best of luck!
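A rough sketch of the split-ROI idea, assuming you have trained dedicated cascades for upward- and downward-looking eyes (the two cascade file names below are hypothetical; only the generic eye cascade ships with OpenCV):

```python
# Sketch: different eye classifiers for different regions of the frame.
import cv2

frame = cv2.imread("webcam_frame.jpg")            # placeholder input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
h = gray.shape[0]

down_cascade = cv2.CascadeClassifier("eyes_looking_down.xml")  # hypothetical, self-trained
up_cascade = cv2.CascadeClassifier("eyes_looking_up.xml")      # hypothetical, self-trained
normal_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

# Upper half of the frame: eyes that appear to be looking downwards.
upper_hits = down_cascade.detectMultiScale(gray[:h // 2, :])
# Lower half of the frame: eyes that appear to be looking upwards.
lower_hits = up_cascade.detectMultiScale(gray[h // 2:, :])
# Whole frame: ordinary eye detection as a fallback.
normal_hits = normal_cascade.detectMultiScale(gray)

print(len(upper_hits), len(lower_hits), len(normal_hits))
```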

Red Eye detection

Project: Red eye detection
Description: I want to remove red-eye from images. I am not able to use a face detector because the faces in the images are not always frontal, and the images are of players wearing helmets. The images may also contain many red eyes, and the lighting is not ideal. I want to know how to detect the red eyes. I am searching for proven studies. Any help would be appreciated.
Update:
My images will be like the one below, with red eye.
Those algorithms belong to the edge and feature detection algorithms studied in computer vision.
Since you are looking for studies, I can suggest reading the ones by Microsoft and HP, another one by HP, and another good discussion of the algorithm.
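One common approach in the red-eye literature is to detect candidate regions from a per-pixel redness measure rather than from a face detector, which fits the non-frontal, helmeted case. A hedged sketch (the specific measure, threshold, and blob filters below are guesses, not values from the papers):

```python
# Sketch: redness map -> threshold -> keep small, roughly round blobs as
# red-eye candidates. "player_with_helmet.jpg" is a placeholder.
import cv2
import numpy as np

img = cv2.imread("player_with_helmet.jpg")
b, g, r = [c.astype(np.float32) for c in cv2.split(img)]

# Simple redness measure: red response relative to green and blue.
redness = (r * r) / (g * g + b * b + 1.0)
mask = (redness > 2.0).astype(np.uint8) * 255          # threshold is a guess

# Remove speckle, then look for pupil-sized, compact blobs.
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

candidates = []
for c in contours:
    area = cv2.contourArea(c)
    x, y, w, h = cv2.boundingRect(c)
    if 10 < area < 2000 and 0.5 < w / float(h) < 2.0:   # roughly round, small
        candidates.append((x, y, w, h))

print(candidates)
```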

Feature Detection in Noisy Images

I've built an imaging system with a webcam and feature matching such that, as I move the camera around, I can track the camera's motion. I am doing something similar to here, except with the webcam frames as the input.
It works really well for "good" images, but when taking images in really low light, lots of noise appears (high camera gain), and that messes with the feature detection and matching. Basically, it doesn't detect any good features, and when it does, it cannot match them correctly between frames.
Does anyone know a good solution for this? What other methods are used for finding and matching features?
Here are two example images with very few features:
I think phase correlation is going to be your best bet here. It is designed to tell you the phase shift (i.e., translation) between two images. It is much more resilient (but not immune) to noise than feature detection because it operates in frequency space, whereas feature detectors operate spatially. Another benefit is that it is very fast compared with feature detection methods. I have a sub-pixel-accurate implementation available in the OpenCV trunk, located here.
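A minimal sketch of phase correlation between two consecutive frames with OpenCV's phaseCorrelate (the frame file names are placeholders):

```python
# Sketch: estimate the translation between two frames via phase correlation.
import cv2
import numpy as np

prev = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
curr = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# A Hanning window suppresses edge effects in the frequency domain.
window = cv2.createHanningWindow(prev.shape[::-1], cv2.CV_32F)

(shift_x, shift_y), response = cv2.phaseCorrelate(prev, curr, window)
print("estimated translation:", shift_x, shift_y, "peak response:", response)
```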
However, your images are pretty much "featureless" with the exception of the crease in the middle, so even phase correlation may have some trouble with it. Think of it like trying to detect translation in a snow storm. If all you can see is white, you can't tell that you have translated at all, thus the term whiteout. In your case, the algorithm might suffer from "greenout" :)
Can you adjust the camera settings to work better in low-light conditions? Have you fully opened the iris? Can you live with lower frame rates? Setting a longer exposure time will allow the camera to gather more light, giving you more features at the cost of adding motion blur. Or, if low light is your default environment, you probably want something designed for it, like an IR camera, but those can be expensive. Other than that, a big lens and long exposures are your friend :)
Histogram equalization may be of interest for improving image contrast, but sometimes it can just enhance the noise. OpenCV has a global histogram equalization function called equalizeHist. For a more localized implementation, you'll want to look at Contrast Limited Adaptive Histogram Equalization, or CLAHE for short. Here is a good article on it. This page has some nice examples, and some code.
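For example (a short sketch; the CLAHE parameters are typical defaults, not tuned values):

```python
# Sketch: global histogram equalization vs. CLAHE on a low-light frame.
import cv2

gray = cv2.imread("low_light_frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder

global_eq = cv2.equalizeHist(gray)                 # global equalization

# CLAHE: local equalization with a clip limit to avoid over-amplifying noise.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
local_eq = clahe.apply(gray)

cv2.imwrite("global_eq.png", global_eq)
cv2.imwrite("clahe_eq.png", local_eq)
```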

Fiducial marker detection in the presence of camera shake

I'm trying to make my OpenCV-based fiducial marker detection more robust when the user moves the camera (phone) violently. The markers are ARTag-style, with a Hamming code embedded within a black border. Borders are detected by thresholding the image, then looking for quads based on the found contours, then checking the internals of the quads.
In general, decoding of the marker is fairly robust if the black border is recognized. I've tried the most obvious thing, which is downsampling the image twice and performing quad detection at those levels as well. This helps with camera defocus on markers very close to the camera, and also with very small amounts of image blur, but it doesn't hugely help the general case of camera motion blur.
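For concreteness, the threshold-then-quads pipeline described above looks roughly like this in OpenCV Python (a hedged sketch; the parameters are placeholders, not the actual values used):

```python
# Sketch: threshold -> contours -> keep convex 4-sided contours as candidate
# marker borders. "camera_frame.png" is a placeholder.
import cv2

gray = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)

# Adaptive thresholding copes better with uneven lighting than a fixed threshold.
binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 31, 7)

contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

quads = []
for c in contours:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.03 * peri, True)
    if (len(approx) == 4 and cv2.isContourConvex(approx)
            and cv2.contourArea(approx) > 100):
        quads.append(approx)          # candidate border; decode its interior next

print("candidate quads:", len(quads))
```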
Is there available research on ways to make detection more robust? Ideas I'm wondering about include:
Can you do some sort of optical flow tracking to "guess" the positions of the marker in the next frame, then some sort of corner detection in the region of those guesses, rather than treating the rectangle search as a full-frame thresholding?
On PCs, is it possible to derive blur coefficients (perhaps by registration with recent video frames where the marker was detected) and deblur the image prior to processing?
On smartphones, is it possible to use the gyroscope and/or accelerometers to get deblurring coefficients and pre-process the image? (I'm assuming not, simply because if it were, the market would be flooded with shake-correcting camera apps.)
Links to failed ideas would also be appreciated if it saves me trying them.
Yes, you can use optical flow to estimate where the marker might be and localise your search, but it's just relocalisation; your tracking will have broken for the blurred frames.
I don't know enough about deblurring except to say it's very computationally intensive, so real time might be difficult.
You can use the sensors to guess the sort of blur you're faced with, but I would guess deblurring is too computationally expensive for mobile devices in real time.
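A hedged sketch of the relocalisation idea from point 1: track the corners of the last successfully detected marker into the current frame with pyramidal Lucas-Kanade, then restrict the quad search to a padded box around the predictions (the corner coordinates and file names are placeholders):

```python
# Sketch: predict marker corner positions with sparse optical flow.
import cv2
import numpy as np

prev_gray = cv2.imread("frame_prev.png", cv2.IMREAD_GRAYSCALE)
curr_gray = cv2.imread("frame_curr.png", cv2.IMREAD_GRAYSCALE)

# Marker corners detected in the previous frame (placeholder values).
prev_corners = np.array([[120, 80], [220, 85], [215, 190], [115, 185]],
                        dtype=np.float32).reshape(-1, 1, 2)

next_corners, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, curr_gray, prev_corners, None, winSize=(21, 21), maxLevel=3)

if status.all():
    # Padded bounding box of the predicted corners becomes the new search region.
    x, y, w, h = cv2.boundingRect(next_corners.astype(np.int32))
    pad = 20
    roi = curr_gray[max(0, y - pad):y + h + pad, max(0, x - pad):x + w + pad]
    print("search ROI shape:", roi.shape)
else:
    print("track lost - fall back to full-frame detection")
```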
Then some other approaches:
There is some really smart stuff in here: http://www.robots.ox.ac.uk/~gk/publications/KleinDrummond2004IVC.pdf where they're doing edge detection (which could be used to find your marker borders, even though you're looking for quads right now), modelling the camera movements from the sensors, and using those values to estimate how an edge in the direction of blur should appear given the frame-rate, and searching for that. Very elegant.
Similarly here http://www.eecis.udel.edu/~jye/lab_research/11/BLUT_iccv_11.pdf they just pre-blur the tracking targets and try to match the blurred targets that are appropriate given the direction of blur. They use Gaussian filters to model blur, which are symmetrical, so you need half as many pre-blurred targets as you might initially expect.
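A small sketch of that pre-blurred-target idea, assuming a known marker template (the file names and kernel sizes are placeholders, not what the paper uses):

```python
# Sketch: precompute Gaussian-blurred versions of the target, then match
# whichever blurred version fits the current (blurry) frame best.
import cv2

template = cv2.imread("marker_template.png", cv2.IMREAD_GRAYSCALE)  # placeholder
frame = cv2.imread("blurry_frame.png", cv2.IMREAD_GRAYSCALE)        # placeholder

blurred_templates = [template] + [
    cv2.GaussianBlur(template, (k, k), 0) for k in (5, 9, 13)]

best_score, best_loc = -1.0, None
for t in blurred_templates:
    result = cv2.matchTemplate(frame, t, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    if max_val > best_score:
        best_score, best_loc = max_val, max_loc

print("best match score:", best_score, "at", best_loc)
```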
If you do try implementing any of these, I'd be really interested to hear how you get on!
From some related work (attempting to use sensors/gyroscope to predict the likely location of features from one frame to another in video), I'd say that 3 is likely to be difficult if not impossible. I think at best you could get an indication of the approximate direction and angle of motion, which may help you model blur using the approaches referenced by dabhaid, but I think it unlikely you'd get sufficient precision to be much more help.