Tips on building a program to detect the pupil in images

I am working on a project that aims to build a program which automatically gives a relatively accurate detection of the pupil region in eye pictures. I am currently using SimpleCV in Python, given that Python is easier to experiment with. Since I just started, the eye pictures I am working with are fairly standardized. However, the size of the iris and pupil, as well as the color of the iris, can vary, and the position of the eye can shift a little between pictures. Here's a picture from Wikipedia that is similar to the pictures I am using:
"MyStrangeIris.JPG" by Epicstessie is licensed under CC BY-SA 3.0
I have tried simple thresholding. Since different eyes have different iris colors, a fixed threshold would not work on all pictures.
In addition, I tried SimpleCV's built-in Sobel and Canny edge detection; it does not work well, especially for eyes with a darker iris. I also doubt that Sobel or Canny alone can solve the problem, given that there is sometimes noise on the edge of the pupil (e.g., reflections of eyelashes).
I have entry-level knowledge of image processing and machine learning. Right now, I am thinking about three possibilities:
Do a regression on the threshold value based on some variables
Make a specific mask only for edge detection of the pupil
Classify each pixel (this looks like a lot of work to build the training set)
Am I on the right track? I would like to hear from anyone with more experience with this type of problem. Any tips/suggestions are more than welcome. Thanks!

I think that for a start you should put machine learning aside. You have much more to try in "regular" computer vision.
You need to try to describe a model for your problem. A good way to do this is to sit and think about how you, as a person, detect an iris. For example, I can think of:
It is near the center of image.
It is a brown/green/blue circle, with a distinct black center, surrounded by a mostly white ellipse.
There is skin color around the white ellipse.
It can't be too small or too large (depends on your images).
After you build your model, try to find good ways to detect these features. It is hard to point to specific things, but you can start with: the HSV color space, correlation, the Hough transform, morphological operations.
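For illustration, here is a minimal OpenCV sketch of two of those tools combined, the Hough circle transform plus a "near the center of the image" check (my own sketch, not the answerer's code; the filename, blur size, and radius range are placeholder assumptions to tune):

import cv2
import numpy as np

# Load the eye image (filename is a placeholder) and convert to grayscale.
img = cv2.imread('eye.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Smooth to suppress eyelash/reflection noise before the circle search.
gray = cv2.medianBlur(gray, 7)

# Hough circle transform: the pupil is a dark disc, so a tuned radius
# range plus the gradient method is a reasonable first model.
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 2, 100,
                           param1=100, param2=30, minRadius=20, maxRadius=80)

if circles is not None:
    # Keep the circle closest to the image center, per the "near the
    # center" feature of the model above.
    h, w = gray.shape
    x, y, r = min(np.round(circles[0]).astype(int).tolist(),
                  key=lambda c: (c[0] - w / 2) ** 2 + (c[1] - h / 2) ** 2)
    cv2.circle(img, (x, y), r, (0, 255, 0), 2)
    cv2.imwrite('pupil_detected.jpg', img)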
Only after you feel you have exhausted all conventional tools should you start thinking about feature extraction and machine learning.
And by the way, because you are not the first person to try to detect the iris, you can look at other projects for ideas.

I have written a small MATLAB script for the image you have provided. The function I used is the Hough transform for circle detection, which is also implemented in OpenCV, so porting will not be a problem. I just want to know whether I am on the right track or not.
My result and code are as follows:
% Detect the pupil as a dark circle via the circular Hough transform.
clc
clear all
close all
im = imresize(imread('irisdet.JPG'), 0.5);   % downscale for speed
gray = rgb2gray(im);
Rmin = 50; Rmax = 100;                       % expected pupil radius range, in pixels
% imfindcircles with 'dark' polarity looks for dark circular objects.
[centersDark, radiiDark] = imfindcircles(gray, [Rmin Rmax], 'ObjectPolarity', 'dark');
figure, imshow(im, [])
viscircles(centersDark, radiiDark, 'EdgeColor', 'b');   % overlay detections in blue
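Since the same circle detection is available in OpenCV, a rough Python port might look like this (my sketch, not verified against the same image; HoughCircles' voting differs from imfindcircles' 'dark' polarity, and param1/param2 are placeholder values):

import cv2
import numpy as np

# Rough OpenCV equivalent of the MATLAB snippet above.
im = cv2.imread('irisdet.JPG')
im = cv2.resize(im, None, fx=0.5, fy=0.5)          # imresize(im, 0.5)
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)        # rgb2gray

# minRadius/maxRadius correspond to the [Rmin Rmax] range above.
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, 2, 50,
                           param1=100, param2=40, minRadius=50, maxRadius=100)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int).tolist():
        cv2.circle(im, (x, y), r, (255, 0, 0), 2)  # blue overlay, like viscircles
cv2.imshow('circles', im)
cv2.waitKey(0)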
Input Image:
Result of Algorithm:
Thank You

I'm not sure about iris classification, but I've done handwritten digit recognition from photos. I would recommend tuning up the contrast and saturation, then using a k-nearest-neighbour algorithm to classify your images. Depending on your training set, you can get as high as 90% accuracy.
I think you are on the right track. Do image preprocessing to make classification easier, then train an algorithm of your choice. You would want to treat each image as one input vector, though, instead of classifying each pixel!
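A minimal sketch of that pipeline with scikit-learn (my illustration; the random arrays are placeholders standing in for your preprocessed images and labels):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Placeholder data: each row is one whole image flattened to a vector
# (e.g. 32x32 grayscale -> 1024 features), NOT per-pixel samples.
X = np.random.rand(200, 32 * 32)   # stand-in for your preprocessed images
y = np.random.randint(0, 2, 200)   # stand-in class labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print('accuracy:', knn.score(X_test, y_test))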

I think you can try Active Shape Modelling, or, if you want a really feature-rich model and do not care about the time it takes to execute the algorithm, you can try Active Appearance Modelling. You might want to look at these papers for a better understanding:
Active Shape Models: Their Training and Application
Statistical Models of Appearance for Computer Vision - In Depth

Related

How to detect hand palm and its orientation (like facing outwards)?

I am working on a hand detection project. There are many good projects on the web that do this, but what I need is a specific hand pose detection: a totally open palm, with the whole palm facing outwards, like the image below:
The first hand faces inwards, so it should not be detected, while the right one faces outwards and should be detected. I can already detect hands with OpenCV, but how do I tell the hand's orientation?
The problem of matching the forehand belongs to texture classification; it's a classic pattern recognition problem. I suggest you try one of the following methods:
Gabor filters: they are good at detecting orientation and pixel intensities (as the forehand has different texture features). OpenCV has a getGaborKernel function, whose most important parameters are theta (orientation) and lambd (frequency); see the sketch after this list. To keep it simple you can apply this process to a cropped zone of the palm (as you have already detected it, it would be easy to crop, for example, the thumb, or a rectangular zone around the center of gravity, etc.). Then you can convolve it with a small database of images of the same zone to get a matching rate, or you can use an SVM classifier, where you train the SVM on a set of images by constructing the training matrix it needs (check this question, and this paper).
Local Binary Patterns (LBP): an important feature descriptor used for texture matching. You can apply it to the whole palm image or to a cropped zone or finger; it's easy to use in OpenCV, and a lot of tutorials with code are available for this method. I recommend you read the paper on Invariant Texture Classification with Local Binary Patterns; here is a good tutorial.
Haralick Texture: I've read that it works well when a single set of features quantifies the entire image (global feature descriptors). It's not implemented in OpenCV but is easy to implement; check this useful tutorial.
Training models: I've already suggested an SVM classifier; coupled with one of the descriptors above, it can work well.
OpenCV has an interesting FaceRecognizer class for face recognition; it could be an interesting idea to use it, replacing the face images with palm ones (do resizing and rotation to get a single canonical pose of the palm). This class has three models; one of them is Local Binary Patterns Histograms, which is recommended for texture recognition. And why not try the other models (Eigenfaces and Fisherfaces)? Check this tutorial.
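To make the Gabor suggestion from the first bullet concrete, here is a minimal sketch (mine; the crop filename, kernel size, and filter parameters are placeholders to tune). It computes the mean absolute filter response at four orientations as a crude texture feature you could feed to an SVM:

import cv2
import numpy as np

# Cropped palm zone (placeholder filename), loaded as grayscale.
img = cv2.imread('palm_crop.jpg', cv2.IMREAD_GRAYSCALE)

responses = []
for theta in np.arange(0, np.pi, np.pi / 4):   # 4 orientations
    # getGaborKernel(ksize, sigma, theta, lambd, gamma, psi)
    kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5, 0)
    filtered = cv2.filter2D(img, cv2.CV_32F, kernel)
    # Mean absolute response per orientation: a crude texture feature.
    responses.append(float(np.mean(np.abs(filtered))))

print('orientation responses:', responses)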
Well, if you go for a MacGyver approach, you can notice that the left hand has bones sticking out in a certain direction, while the right hand shows all the finger lines and a few lines in the palm.
These lines are always roughly the same, so you could try to detect them with OpenCV edge detection or Hough lines. Due to the dark color of the lines, you might even be able to threshold them out. Then gather information from those lines, like angles and regressions, see which features you can collect, and train a simple decision tree.
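A minimal sketch of that line-based idea (mine; the filename and all thresholds are placeholders to tune):

import cv2
import numpy as np

img = cv2.imread('palm.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Probabilistic Hough transform picks up the palm creases as segments.
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 40,
                        minLineLength=30, maxLineGap=5)

angles = []
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        angles.append(np.degrees(np.arctan2(y2 - y1, x2 - x1)))

# Features such as line count and angle spread could feed a decision tree.
print(len(angles), 'lines, angle std:', np.std(angles) if angles else None)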
That was assuming you do not have enough data; if you do, then go into deep learning: just take a basic InceptionV3 model and retrain the last dense layer, either to classify between the two classes with a softmax, or to predict the probability of the hand facing up/down with a sigmoid. Check this link; TensorFlow has you covered on the training of this one, with ready-to-run code.
Questions? Ask away
Take a look at what Leap Motion has done with the Oculus Rift. I'm not sure what they're using internally to segment hand poses, but there is another paper that produces hand poses effectively. If you have a stereo camera setup, you can use this paper's methods: https://arxiv.org/pdf/1610.07214.pdf.
The only promising solutions I've seen for mono cameras train on large datasets.
Use a Haar cascade classifier; you can get the classifier model file and then use it here.
Just search Google for 'Haar cascade detection of palm', or use the code below.
import cv2

# Open the default webcam and load a palm Haar cascade model file.
cam = cv2.VideoCapture(0)
ccfr2 = cv2.CascadeClassifier('haar-cascade-files-master/palm.xml')

while True:
    retval, image = cam.read()
    if not retval:
        break
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Detect palms; scaleFactor and minNeighbors trade off
    # speed against false positives.
    palm = ccfr2.detectMultiScale(grey, scaleFactor=1.05, minNeighbors=3)
    for x, y, w, h in palm:
        # Draw a white box around each detection.
        image = cv2.rectangle(image, (x, y), (x + w, y + h),
                              (255, 255, 255), 2)
    cv2.imshow("Window", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()
Best of luck with your experiments using Haar cascades.

Detect symbols written on a whiteboard using OpenCV

I'm trying to detect shapes written on a whiteboard with a black/blue/red/green marker. The shapes can be circles, rectangles or triangles. The image can be found at the bottom of this post.
I'm using OpenCV as the framework for the image recognition.
My first task is to research and list the different strategies that could be used for the detection. So far I have found the following:
1) Grayscale, blur, Canny edge detection, contour detection, and then some logic to determine whether the detected contours are shapes?
2) Haar training with different features for the shapes
3) SVM classification
4) Grayscale, blur, Canny edge detection, Hough transform, and some sort of color segmentation?
Are there any other strategies that I have missed? Any newer articles or tested approaches? How would you do it?
One of the test pictures: https://drive.google.com/file/d/0B6Fm7aj1SzBlZWJFZm04czlmWWc/view?usp=sharing
UPDATE:
The first strategy seems to work best, but it is far from perfect. Issues arise when boxes are not closed, or when the whiteboard has a lot of noise. Haar training does not seem very effective, because the shapes are simple and do not have many distinctive features. I have not tried a CNN yet, but it seems better suited to image classification than to detecting shapes within a larger image (but I'm not sure).
I think that the first option should work. You can use Fourier descriptors to classify the segmented shapes.
http://www.isy.liu.se/cvl/edu/TSBB08/lectures/DBgrkX1.pdf
Also, maybe you can find something useful here:
http://www.pyimagesearch.com/2016/02/08/opencv-shape-detection/
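Here is a minimal sketch of strategy 1 in the spirit of the pyimagesearch tutorial above (my own code, using the OpenCV 4 findContours signature; the filename and all thresholds are placeholders):

import cv2

img = cv2.imread('whiteboard.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
# Marker strokes are dark on a bright board; invert-threshold them.
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY_INV, 21, 10)

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    if cv2.contourArea(c) < 200:                 # skip specks and noise
        continue
    # The vertex count of the polygonal approximation suggests the shape.
    approx = cv2.approxPolyDP(c, 0.04 * cv2.arcLength(c, True), True)
    if len(approx) == 3:
        label = 'triangle'
    elif len(approx) == 4:
        label = 'rectangle'
    elif len(approx) > 6:
        label = 'circle'                         # smooth closed curve
    else:
        label = 'unknown'
    x, y, w, h = cv2.boundingRect(approx)
    cv2.putText(img, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX,
                0.6, (0, 0, 255), 2)
cv2.imwrite('labeled.jpg', img)

As the update notes, this breaks on open contours; morphological closing on the thresholded image before findContours is one common mitigation.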
If you want to try a more challenging but modern approach, consider deep learning (I would start with a CNN). There are many implementations available on the internet. Although it is probably overkill for this specific project, it might help you in the future...

Water Edge Detection

Is there a robust way to detect the water line, like the edge of a river in this image, in OpenCV?
(source: pequannockriver.org)
This task is challenging because a combination of techniques must be used. Furthermore, for each technique, the numerical parameters may only work correctly within a very narrow range. This means either that a human expert must tune them by trial and error for each image, or that the technique must be executed many times with many different parameters so that the correct result can be selected.
The following outline is highly-specific to this sample image. It might not work with any other images.
One bit of advice: as usual, any multi-step image analysis should always begin with the most reliable step and then proceed to the less reliable ones. Whenever possible, a less reliable step should make use of the results of more reliable steps to improve its own accuracy.
Detection of sky
Convert the image to the HSV color space, and find the cyan located in the upper half of the image.
Keep this HSV image, because it could be handy for the next few steps as well.
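A minimal sketch of that step (mine; 'river.jpg' and the cyan hue band are placeholder assumptions to tune):

import cv2

img = cv2.imread('river.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Rough cyan/blue band on OpenCV's 0-179 hue scale (placeholder range).
sky_mask = cv2.inRange(hsv, (85, 40, 120), (130, 255, 255))

# Restrict to the upper half of the image, per the observation above.
sky_mask[img.shape[0] // 2:, :] = 0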
Detection of shrubs
Run Canny edge detection on the grayscale version of the image, with suitably chosen sigma and thresholds. This will pick up the branches of the shrubs, which will look like a bunch of noise. Meanwhile, the water surface will be relatively smooth.
Grayscale is used in this technique to reduce the influence of reflections on the water surface (the green and yellow reflections of the shrubs). There might be other color spaces (or preprocessing techniques) more capable of removing that reflection.
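One way to turn the noisy-shrub/smooth-water observation into numbers (my sketch, reusing the hypothetical river.jpg; thresholds and window size are placeholders) is local edge density:

import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread('river.jpg'), cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 60, 180)                 # placeholder thresholds

# Edge density per local window: shrub texture yields many edge pixels,
# smooth water very few.
density = cv2.boxFilter(edges.astype(np.float32) / 255.0, -1, (31, 31))
shrub_mask = (density > 0.15).astype(np.uint8) * 255   # placeholder cutoff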
Detection of water ripples from a lower elevation angle viewpoint
First, mark off any image parts that are already classified as shrubs or sky. Since shrub detection is more reliable than water detection, its result should be used to inform the less reliable water detection.
Observation
Because of the low elevation angle of the viewpoint, the water ripples appear horizontally elongated. In fact, every image feature appears stretched horizontally. This is called anisotropy. We can make use of this tendency to detect them.
Note: I am not experienced in anisotropy detection. Perhaps you can get better ideas from other people.
Idea 1:
Use maximally-stable extremal regions (MSER) as a blob detector.
The Wikipedia introduction appears intimidating, but it is really just related to connected-component algorithms. A naive implementation can be done similarly to Dijkstra's algorithm.
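OpenCV ships an MSER implementation, so a naive reimplementation may not be needed; a minimal sketch (mine; the elongation ratio is a placeholder):

import cv2

gray = cv2.cvtColor(cv2.imread('river.jpg'), cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)

# Horizontally elongated regions (wide, flat bounding boxes) are
# candidate water ripples under the anisotropy observation.
ripples = [b for b in bboxes if b[2] > 3 * b[3]]   # w > 3*h, placeholder ratio
print(len(ripples), 'elongated regions')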
Idea 2:
Since the image features are horizontally stretched, a simpler approach is to just sum up the absolute values of the horizontal gradients and compare that to the sum of the absolute values of the vertical gradients.
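A sketch of Idea 2 (mine; the window size and ratio threshold are placeholders). Note that for horizontally elongated features, most of the intensity change occurs in the vertical direction, so the |dy| sum dominates the |dx| sum:

import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread('river.jpg'), cv2.COLOR_BGR2GRAY)

# Sobel derivatives: dx responds to vertical structure, dy to horizontal.
dx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
dy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)

# Sum |gradient| over local windows and compare the two directions.
sum_dx = cv2.boxFilter(np.abs(dx), -1, (31, 31))
sum_dy = cv2.boxFilter(np.abs(dy), -1, (31, 31))
anisotropy = sum_dy / (sum_dx + 1e-6)
water_mask = (anisotropy > 2.0).astype(np.uint8) * 255   # placeholder ratio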

Can Haar Cascade be too accurate to be useful in this situation?

I'm making a program to detect shapes from an R/C plane for a competition. I have no real images of the targets, but I do have computer-generated examples of them from the rules.
My question is, can I train my program to detect real world objects based on computer generated shapes or should I find a different method to complete this task?
I would like to know before I foolishly generate 5k samples and find them useless in the end.
EDIT: I also don't know the exact color of the objects. If I feed the program samples of varying colors, will that be a problem?
Thanks in advance!!
Edit2: Here's what groups from my school detected in previous years
As you can see, the detected images are not nearly as flawless as what would appear in real life. If you can suggest a better method, that would help.
If you think that the real images will have unique colors with simple geometric shapes, then you could probably try to create a normalized hue histogram and use it to train an SVM classifier. The benefit of using a hue histogram is that it is rotation and scale invariant.
Keep a few precautions in mind:
Don't forget to remove illumination effects.
Sometimes white and black pixels cause problems in the hue-histogram calculation, so try to remove them from the calculation by considering only those pixels with S > 0 and V > 0 in the S and V channels of the HSV image; see the sketch after this list.
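A minimal sketch of that masked hue histogram (mine; the filename and bin count are placeholders), producing a feature vector you could hand to an SVM:

import cv2
import numpy as np

img = cv2.imread('target.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

# Exclude near-black/near-white pixels, whose hue is unreliable,
# by requiring S > 0 and V > 0 as suggested above.
mask = ((s > 0) & (v > 0)).astype(np.uint8) * 255

hist = cv2.calcHist([hsv], [0], mask, [30], [0, 180])      # 30 hue bins
hist = cv2.normalize(hist, None, alpha=1.0,
                     norm_type=cv2.NORM_L1).flatten()       # normalized feature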
I would rather suggest you use real-world images, because performance largely depends on the training data (my personal experience). And why not try SIFT/SURF descriptors for training an SVM (support vector machine), since SIFT/SURF are scale as well as rotation invariant?
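For the SIFT part, a minimal descriptor-extraction sketch (mine; note SIFT is in the main opencv-python package only from version 4.4 on, earlier versions need the contrib build):

import cv2

gray = cv2.imread('target.jpg', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)

# descriptors is an N x 128 array; to train an SVM you would typically
# aggregate them per image first (e.g. a bag-of-visual-words histogram).
print(None if descriptors is None else descriptors.shape)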

Histogram of Oriented Gradients in multi-scale (mean-shift?)

I am working on HOG descriptors and I am pretty much done with most of the parts, except the fusion of the detection windows.
What I have done so far: I build a scale-space pyramid of the image, and for each image at each scale I slide the detection window (64x128) and detect humans. In each image, a person is detected by more than one window.
So the question is how to fuse all these windows (assume they belong to one person) into one window. Dalal suggests using a robust mode detection algorithm, such as mean shift. But I have multiple scales... Should I first estimate the true location of the detection windows found at lower levels of the scale space in order to do that?
Any help is appreciated.
Thanks in advance.
My interpretation is that mean shift would in effect give you what you are suggesting.
Essentially, you first estimate the probability distribution of the person's location at the coarsest scale, based on the strengths of the detector outputs. This gives you a robust estimate of the mode.
You can then iteratively refine using the finer scales around the maximum or the mode.
The idea is very similar to that used in pyramidal LK tracking, for example. You can also do ensemble processing and/or particle filters.
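Here is a small sketch of what mean-shift fusion could look like in (x, y, log-scale) space (my illustration of the idea, not Dalal's code; the bandwidths are placeholders). It assumes detections from lower pyramid levels have already been mapped back to original-image coordinates, which addresses the question above:

import numpy as np

def fuse_detections(centers_xy, scales, bandwidth=(32.0, 64.0, 0.4),
                    n_iters=20):
    # centers_xy: (N, 2) window centers in original-image coordinates.
    # scales:     (N,)  pyramid scale of each detection.
    # Map each detection into (x, y, log s); working in log-scale makes
    # the scale bandwidth multiplicative.
    pts = np.column_stack([centers_xy, np.log(scales)])
    bw = np.asarray(bandwidth)

    modes = pts.copy()
    for _ in range(n_iters):
        for i, m in enumerate(modes):
            # Gaussian-weighted mean of all detections around this mode;
            # detector scores could additionally weight w here.
            d2 = np.sum(((pts - m) / bw) ** 2, axis=1)
            w = np.exp(-0.5 * d2)
            modes[i] = w @ pts / w.sum()
    # Near-duplicate converged modes are then merged into final windows.
    return modes

# Toy usage: three overlapping detections of one person at two scales.
centers = np.array([[100.0, 200.0], [104.0, 205.0], [98.0, 198.0]])
scales = np.array([1.0, 1.0, 1.2])
print(fuse_detections(centers, scales))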
