I need to perform automatic segmentation of dark blue spots on light yellow paper. Here's a very simple example:
In this case, a simple threshold based on hue or brightness works well. But here are some more challenging real-world examples:
Clearly a simple hue/brightness threshold will fail when dealing with shadows, exposure issues, and small stains whose color blends into the background.
I've tried Otsu's method for adaptive thresholding, but it also fails to account for shadows - and produces erratic results when analyzing "blank" cards.
Is there a form of localized thresholding that might work better? I've also considered edge detection, but am unsure how to transform the results into the needed binary image. Any suggestions are appreciated.
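For concreteness, here is a minimal sketch of what a localized threshold looks like in OpenCV (cv::adaptiveThreshold); the file name, block size, and offset are placeholder assumptions that would need tuning to the scan resolution and spot size:

#include <opencv2/opencv.hpp>

int main() {
    // Hypothetical file name; replace with a grayscale scan of the card.
    cv::Mat gray = cv::imread("card.png", cv::IMREAD_GRAYSCALE);

    // Adaptive (localized) thresholding: each pixel is compared against the
    // mean of its local neighborhood, so gradual shadows and exposure drift
    // do not shift a single global threshold.
    cv::Mat spots;
    cv::adaptiveThreshold(gray, spots, 255,
                          cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                          cv::THRESH_BINARY_INV,   // spots darker than paper become white
                          51,                      // block size: must be odd; tune to spot size
                          10);                     // offset subtracted from the local mean
    cv::imwrite("spots_mask.png", spots);
    return 0;
}

Note that on a truly blank card this will still pick up noise, so filtering the resulting connected components by area afterwards would likely be necessary.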
My goal is to find known logos in static images and videos. I want to achieve that by using feature detection with KAZE or AKAZE and RANSAC.
I am aiming for a similar result to: https://www.youtube.com/watch?v=nzrqH...
While experimenting with the detection example from the docs (which is great, by the way), I ran into several issues:
Object resolution: Differences in size between the known object and the resolution of the scene where the object should be located sometimes break the detection algorithm - the object won't be recognized in low-resolution images even though the image quality still looks fine to a human eye.
Color contrast with the background: It seems that the detection can easily be thrown off by different background contrasts (e.g. the object is a black logo on a white background, while the logo in the scene is white on a black background). How can I make the detection more robust against different illuminations and background contrasts?
Preprocessing: Should any preprocessing be applied to the object/scene images? For example, enlarging the scene to a specific size? Is there any guideline on how to approach the feature detection in several steps to get the best results?
I think your issue is more complicated than the feature-descriptor-matching-homography process.
It is more likely a pattern recognition or classification problem.
You can check this extended paper review of shape matching:
http://www.staff.science.uu.nl/~kreve101/asci/vir2001.pdf
Firstly, the resolution of the images is very important, because the matching operation usually performs a pixel-intensity cross-correlation between your sample image (the logo) and your scene image, so you get the best cross-correlated area.
In the same way, the background colour intensity is very important, because background illumination can severely affect your final result.
Feature-based methods are widely researched:
http://docs.opencv.org/2.4/modules/features2d/doc/feature_detection_and_description.html
http://docs.opencv.org/2.4/modules/features2d/doc/common_interfaces_of_descriptor_extractors.html
So, for example, you can try alternative methods such as:
HOG descriptors (Histograms of Oriented Gradients):
https://en.wikipedia.org/wiki/Histogram_of_oriented_gradients
Pattern matching or template matching:
http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
I think the last one (template matching) is the easiest way to check your algorithm.
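For instance, a minimal sketch of template matching with OpenCV; the file names are placeholders, and working in grayscale with TM_CCOEFF_NORMED is my own assumption rather than something from the question:

#include <opencv2/opencv.hpp>

int main() {
    // Hypothetical file names; replace with your own logo and scene images.
    cv::Mat scene = cv::imread("scene.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat logo  = cv::imread("logo.jpg",  cv::IMREAD_GRAYSCALE);

    // Normalized cross-correlation; tolerant of uniform brightness changes.
    cv::Mat result;
    cv::matchTemplate(scene, logo, result, cv::TM_CCOEFF_NORMED);

    // The best match is the location of the maximum correlation score.
    double minVal, maxVal;
    cv::Point minLoc, maxLoc;
    cv::minMaxLoc(result, &minVal, &maxVal, &minLoc, &maxLoc);

    cv::rectangle(scene, maxLoc,
                  cv::Point(maxLoc.x + logo.cols, maxLoc.y + logo.rows),
                  cv::Scalar(255), 2);
    cv::imwrite("match.png", scene);
    return 0;
}

Keep in mind that plain template matching is neither scale- nor rotation-invariant, so treat it as a baseline check rather than a final detector.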
Hope these references help.
Cheers.
Unai.
I am using threshold in OpenCV to find the contours. My input is a hand image. Sometimes the threshold is not good, so I can't find the contours.
I have applied the preprocessing steps below.
1. GrabCut
cv::grabCut(image, result, rectangle, bgModel, fgModel, 3, cv::GC_INIT_WITH_RECT);
2. Grayscale conversion
cvtColor(handMat, handMat, CV_BGR2GRAY);
3. Median blur
medianBlur(handMat, handMat, MEDIAN_BLUR_K);
I used the code below to find the threshold (note the flags must be combined with a bitwise OR, not a logical OR):
threshold( handMat, handMat, 141, 255, THRESH_BINARY | CV_THRESH_OTSU );
Sometimes I get good output and sometimes the threshold output is not good. I have attached the two output images.
Is there any other way than threshold from which contours can be found?
Good threshold output:
Bad threshold output:
Have you tried an adaptive threshold? A single threshold value rarely works in real-life applications. Another truism: thresholding is a non-linear operation and hence unstable. The gradient, on the other hand, is linear, so you may want to find a contour by tracking the gradient if your background is smooth and a solid color. The gradient is also more reliable than thresholding under illumination changes or shadows.
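As one possible reading of the gradient idea, here is a sketch that uses Canny (a gradient-based edge detector) instead of an intensity threshold; the file name and the Canny/kernel parameters are assumptions to be tuned:

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Hypothetical file name; replace with the hand image.
    cv::Mat gray = cv::imread("hand.png", cv::IMREAD_GRAYSCALE);

    // Smooth first so noise does not produce spurious gradients.
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);

    // Canny works on the gradient; its output is already a binary edge map,
    // so no intensity threshold of the image itself is needed.
    cv::Mat edges;
    cv::Canny(gray, edges, 40, 120);

    // Close small gaps in the edge map, then extract the outer contours.
    cv::morphologyEx(edges, edges, cv::MORPH_CLOSE,
                     cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(7, 7)));
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(edges, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    cv::Mat vis = cv::Mat::zeros(gray.size(), CV_8UC3);
    cv::drawContours(vis, contours, -1, cv::Scalar(0, 255, 0), 2);
    cv::imwrite("contours.png", vis);
    return 0;
}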
Grab-cut, by the way, uses color information to improve segmentation on the boundary when you have already found 90% or so of the segment, so it is a post-processing step. Also, your initialization of grab-cut with a rectangle lets in a lot of contamination from background colors. Instead of a rectangle, use a mask: mark as GC_FGD deep inside your initial segment, where you are sure the hand is; mark as GC_BGD far outside your segment, where you are sure the background is; and mark everything else as GC_PR_FGD (probably foreground) - this is what will be refined by grab-cut. To sum up, your initialization of grab-cut will look like a Russian doll with three layers indicating foreground (gray), probably foreground (white) and background (black). You can use dilate and erode to create these layers, see below.
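A minimal sketch of that three-layer initialization; the 21x21 structuring element and the helper name refineWithGrabCut are my own choices, and roughMask is assumed to be a coarse binary hand mask (255 = hand, 0 = background) obtained by any rough method:

#include <opencv2/opencv.hpp>

// 'image' is assumed to be the original BGR frame.
cv::Mat refineWithGrabCut(const cv::Mat& image, const cv::Mat& roughMask)
{
    // Shrink the rough mask to get pixels we are sure are hand (GC_FGD),
    // and grow it to bound the pixels we are sure are background (GC_BGD).
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(21, 21));
    cv::Mat sureFg, dilated;
    cv::erode(roughMask, sureFg, kernel);
    cv::dilate(roughMask, dilated, kernel);

    // Build the "Russian doll": background / probably foreground / foreground.
    cv::Mat gcMask(image.size(), CV_8UC1, cv::Scalar(cv::GC_BGD));
    gcMask.setTo(cv::GC_PR_FGD, dilated);  // everywhere inside the dilated region
    gcMask.setTo(cv::GC_FGD, sureFg);      // deep inside the initial segment

    cv::Mat bgModel, fgModel;
    cv::grabCut(image, gcMask, cv::Rect(), bgModel, fgModel, 3, cv::GC_INIT_WITH_MASK);

    // Keep definite and probable foreground as the refined segmentation.
    cv::Mat refined = (gcMask == cv::GC_FGD) | (gcMask == cv::GC_PR_FGD);
    return refined;
}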
Overall, my suggestion is to define what you want to do first. Are you looking for contours of arbitrary objects on an arbitrary moving background? If you are looking for the contour of a hand to find fingers on a relatively uniform background, I would:
1. use connected components or MSER to segment out the hand, possibly improving the result with grab-cut initialized with the conservative mask rather than a rectangle;
2. use convexity defects to find the fingers, if that is your goal.
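For step 2, a sketch of extracting convexity defects from a binary hand mask; the helper name fingerDefects is hypothetical, and it assumes the largest contour is the hand:

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Vec4i> fingerDefects(const cv::Mat& handMask)
{
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(handMask.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return {};

    // Keep the largest contour, assumed to be the hand.
    size_t best = 0;
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;

    // Convexity defects need the hull expressed as point indices.
    std::vector<int> hullIdx;
    cv::convexHull(contours[best], hullIdx, false, false);

    std::vector<cv::Vec4i> defects;  // [start, end, farthest point, depth*256]
    cv::convexityDefects(contours[best], hullIdx, defects);
    return defects;                  // deep defects correspond to valleys between fingers
}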
One option is to try to find contours without binarizing the image.
If your input is in color, you can try changing the color space in order to enhance the difference between the hand and the background.
Otsu's method tries to find an optimal threshold automatically; you can also set the threshold manually, but Otsu is useful because if the illumination changes, the threshold adapts automatically.
There are also many other kinds of binarization: Sauvola, Bradley, Niblack, Kasar... but Otsu is simple and works well. I suggest preprocessing or postprocessing if you want to improve the binarization result.
So I have an image that has some dark spots, and they look pretty simple, so I think I can create a luminance map, invert it, and then apply it to my image to undo the dark spots. However, all I can find are two methods for equalizing: equalize the whole image (with a histogram), or segment the image into dark, medium, and light parts and equalize whichever you want. The first approach does not help my problem, and the second approach also makes the dark objects in the image lighter. I am sure there is an easy way to do this (long ago I saw somebody do it in a presentation), although I haven't been able to find it or come up with it yet.
So my question: how would I create a "luminance map" of an image like this:
So I get a map like this:
Which I can apply inversely to get a better image like this:
I understand I will have discretization errors at the corrected spots, but that's much better than the dark spots. I hope someone can help me do this, thanks!
I mainly use Matlab and have some limited Python and Mathematica knowledge, but a Matlab example would be most useful to me. One way I thought of myself was taking fft2 and nulling the low frequencies, but that would just destroy all contrast, not just the parts I want.
Similar but different SO questions that did not help me:
Image equalisation
thresholding an image based on gradient
Histogram of image
Matlab - Local Histogram Equalization
How to find out light, medium and dark color?
You will have to very precisely model the nature of the dark spots for this process to work. Can you characterize whether the dark gradient is linear, exponential, power, trigonometric, or some other predictable function? Is it always exactly circular?
Having straight-line elements in the photo helps, and they may provide a source of samples from which to estimate the nature of the dark spot. If you treat the dark spot as a quadratic or cubic function in three dimensions (X, Y, luminance), then you can solve for it from a certain number of known points.
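To make that concrete, here is a least-squares sketch of fitting a quadratic surface to sampled luminance values. It is in OpenCV/C++ rather than the requested Matlab (where the backslash least-squares operator would do the same job), and the sample points pts are assumed to come from regions you believe should be uniformly lit:

#include <opencv2/opencv.hpp>
#include <vector>

// Fit z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f to (x, y, luminance) samples.
cv::Mat fitQuadraticSurface(const std::vector<cv::Point>& pts, const cv::Mat& gray)
{
    cv::Mat A((int)pts.size(), 6, CV_64F), b((int)pts.size(), 1, CV_64F);
    for (int i = 0; i < (int)pts.size(); ++i) {
        double x = pts[i].x, y = pts[i].y;
        A.at<double>(i, 0) = x * x;
        A.at<double>(i, 1) = y * y;
        A.at<double>(i, 2) = x * y;
        A.at<double>(i, 3) = x;
        A.at<double>(i, 4) = y;
        A.at<double>(i, 5) = 1.0;
        b.at<double>(i, 0) = gray.at<uchar>(pts[i]);
    }
    cv::Mat coeffs;
    cv::solve(A, b, coeffs, cv::DECOMP_SVD);   // least-squares fit

    // Evaluate the fitted surface over the whole image: this is the luminance map.
    cv::Mat lumMap(gray.size(), CV_64F);
    for (int y = 0; y < gray.rows; ++y)
        for (int x = 0; x < gray.cols; ++x)
            lumMap.at<double>(y, x) =
                coeffs.at<double>(0) * x * x + coeffs.at<double>(1) * y * y +
                coeffs.at<double>(2) * x * y + coeffs.at<double>(3) * x +
                coeffs.at<double>(4) * y + coeffs.at<double>(5);
    return lumMap;
}

Dividing the original image by this map, normalized so that unaffected areas come out at 1.0, would then flatten the dark spots, which is the inverse application described in the question.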
I have the image of hand that was detected using this link. Its hand detection using HSV color space.
Now I face a problem: I need to get an enclosing area/bounding outline that is close enough to determine the hand area, then fill that enclosed area and subtract it from the original to remove the hand.
So far I have tried blurring the image to reduce noise, dilating the image, closing holes, etc., which seems like overkill. I have tried contours, and that seems to be the best approach so far. I was trying to get the (largest) convex hull, and I ended up with the following after testing with different thresholds.
The inaccuracies can be seen at the thumb, where the hull straightens; it should be curved. I am trying to figure out the location of the hand so as to identify the region covered by it. I am then going to subtract it to remove the hand from the original image. That is what I want to achieve.
Is there a better approach to this?
Any ideas or suggestions are greatly appreciated.
The original and detected images are as follows.
Instead of the convex hull, consider using the alpha hull, which can better follow the contours of a shape by allowing concavities.
This site has a nice summary of alpha shapes: "Everything You Always Wanted to Know About Alpha Shapes But Were Afraid to Ask" by François Bélair.
http://cgm.cs.mcgill.ca/~godfried/teaching/projects97/belair/alpha.html
As David mentioned in his post, consider thresholding using HSV (or HSI) color space rather than on RGB or grayscale. If you can allow for longer processing time, you can use an algorithm such as Mean Shift to segment trickier images like yours. OpenCV has an implementation of Mean Shift, and the book Learning OpenCV provides a concise description of the algorithm.
Image Segmentation using Mean Shift explained
In any case, a standard binarization threshold doesn't appear to be helping much. Consider using a dynamic threshold; at least local/dynamic threshold is implemented for contours in OpenCV, from what I recall.
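A sketch of what that could look like with OpenCV's mean-shift filtering followed by an HSV threshold; the file name, radii, and skin-tone bounds are rough assumptions, not tested values:

#include <opencv2/opencv.hpp>

int main() {
    // Hypothetical file name; replace with the hand image.
    cv::Mat image = cv::imread("hand.jpg");

    // Mean-shift filtering flattens color regions, which makes a subsequent
    // threshold or contour step much more stable than working on raw pixels.
    cv::Mat shifted;
    cv::pyrMeanShiftFiltering(image, shifted, 21, 30);  // spatial radius, color radius

    // Threshold in HSV rather than RGB, as suggested above; the bounds here
    // are a rough skin-tone guess and would need tuning.
    cv::Mat hsv, skinMask;
    cv::cvtColor(shifted, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 30, 60), cv::Scalar(25, 255, 255), skinMask);

    cv::imwrite("skin_mask.png", skinMask);
    return 0;
}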
Assuming you want to identify the hand area rather than the area the convex hull gives, and that the application's background is at least a consistent color, I would apply an HSV threshold to identify the background instead of the hand if possible, or maybe an adaptive threshold if the light distribution is not consistent. I believe this is what many applications do.
If the background can't be fixed, the segmentation is not an easy problem to solve, as you have to take care of shadows and palm lines.
The objective is to display the person on a different background (aka background removal).
I'm using the Kinect with Microsoft's Beta Kinect SDK to do so. With the help of the depth data, the background is filtered out and we get only the image of the person.
This is pretty simple to do, and we can find the code that does that everywhere on the Internet. However, the depth signal is noisy, and we get pixels which do not belong to the person that are displayed.
I applied an edge detector to see if it was useful, and I currently get this:
Here's another without edge detection:
My question is: Which way can I get rid of these noisy white pixels around the person?
I tried morphological operations, but they erase some parts of the body and still leave white pixels behind.
The algorithm doesn't need to be real-time, I can just apply it when I press a 'Save image' button.
Edit 1:
I just tried doing background subtraction with the closest frames on the shape border. The single pixels you see are flickering, which means they are noise and I can easily get rid of them.
Edit 2:
The project is now over, and here's what we did: manual calibration of the Kinect using the OpenNI driver, which directly provides the infrared image. The result is really good, but each calibration is specific to each Kinect.
Then, we applied a little transparency on the borders, and the result looks really nice! I can't provide pictures, however.
Your problem isn't just the noisy white pixels. You're missing significant parts of the person as well, e.g. part of his right hand. I'd recommend being more conservative with your thresholding of the depth data (allow more false positives). This would give you more noisy pixels, but at least you'd have the person in their entirety.
To get rid of the noisy pixels, I can think of a couple of things:
Feather the outer pixels (reduce their intensity / increase their transparency if you're using an alpha channel); see the sketch below.
Smooth the image, perform the edge detection on the smoothed image, then use these edges with your original sharp image.
Do some skin region detection to mark parts that definitely belong to a person. See skin detection in the YUV color space? and Skin Color Detection
For clothes, work with the hue and saturation image. If you know the color of the t-shirt (or at least that it's not a neutral color), then this will stand out easily. If you don't know this information, then it may be worth building up a model of the person using the other frames (if there's a big gray blob that's moving around in your video, chances are that your subject is wearing a gray shirt).
The approaches aren't mutually exclusive so it may be worth trying to do them in combination. If I think of anything else, I'll post back here.
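For the feathering bullet above, a sketch that uses a distance transform to build a soft alpha ramp at the mask border; the function name and the 5-pixel radius are my own assumptions:

#include <opencv2/opencv.hpp>

// Turn a hard person mask (255 = person, 0 = background) into a soft alpha
// channel that fades out over a few pixels at the border.
cv::Mat featherMask(const cv::Mat& personMask, int featherRadius = 5)
{
    // Distance from each foreground pixel to the nearest background pixel.
    cv::Mat dist;
    cv::distanceTransform(personMask, dist, cv::DIST_L2, 3);

    // Ramp the alpha from 0 at the border up to 1 at featherRadius pixels inside.
    cv::Mat alpha;
    dist.convertTo(alpha, CV_32F, 1.0 / featherRadius);      // radius maps to 1.0
    cv::threshold(alpha, alpha, 1.0, 0, cv::THRESH_TRUNC);   // clamp at 1.0
    return alpha;  // CV_32F in [0, 1]; multiply the person layer by this
}

Multiplying the extracted person by this alpha before compositing onto the new background would hide the single-pixel noise at the silhouette instead of hard-cutting it.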
If there is no other way of resolving the jitter on the edges, you could always try anti-aliasing as a post-process.