Image Segmentation for Color Analysis in OpenCV

I am working on a project that requires me to:
Look at images that contain relatively well-defined objects, and pick out the color of the n most prominent objects (n is generic: it could be 1, 2, 3, etc.) in some color space (RGB, HSV, whatever) and return it.
I am looking into ways to segment images like this into the independent objects. Once that's done, I'm under the impression that it won't be particularly difficult to find the contours of the segments and analyze them for average or centroid color, etc...
I looked briefly into the Watershed algorithm, which seems like it could work, but I was unsure of how to generate the marker image for an indeterminate number of blobs.
What's the best way to segment such an image, and if it's using Watershed, what's the best way to generate the corresponding marker image of integers?

Check out this possible approach:
Efficient Graph-Based Image Segmentation
Pedro F. Felzenszwalb and Daniel P. Huttenlocher
Here's what it looks like on your image:
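If you'd rather not implement the paper from scratch, scikit-image ships an implementation of this graph-based segmentation. A rough, untested sketch (the file name and the scale/sigma/min_size values are just placeholders to tune):

import cv2
import numpy as np
from skimage.segmentation import felzenszwalb

img = cv2.imread("objects.png")                     # hypothetical input image
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

labels = felzenszwalb(rgb, scale=300, sigma=0.8, min_size=200)

# Mean color of each segment, largest segments first.
segments = []
for lab in np.unique(labels):
    mask = labels == lab
    segments.append((mask.sum(), rgb[mask].mean(axis=0)))
segments.sort(key=lambda s: s[0], reverse=True)
print(segments[:3])    # the 3 most prominent segments and their mean RGB colors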

I'm not an expert but I really don't see how the Watershed algorithm can be very useful to your segmentation problem.
From my limited experience/exposure to this kind of problem, I would think that the way to go would be a sliding-window approach to segmentation. Basically this entails walking the image using a window of a set size and attempting to determine whether the window encompasses background or an object. You will want to try different window sizes and steps.
Doing this should allow you to detect the objects in the image, presuming that the images contain relatively well-defined objects. You might also attempt to perform segmentation after converting the image to black and white with a threshold that gives good separation of background vs. objects.
Once you've identified the object(s) via the sliding window you can attempt to determine the most prominent color using one of the methods you mentioned.
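To make the idea concrete, here is a minimal, untested sketch of the sliding-window part, assuming a simple global threshold (Otsu) is enough to separate object pixels from background; the window size, step and 50% fill ratio are arbitrary choices:

import cv2
import numpy as np

gray = cv2.imread("objects.png", cv2.IMREAD_GRAYSCALE)    # hypothetical input
# Assumes darker objects on a lighter background.
_, fg = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

win, step = 64, 32
hits = []
for y in range(0, fg.shape[0] - win, step):
    for x in range(0, fg.shape[1] - win, step):
        patch = fg[y:y + win, x:x + win]
        if np.count_nonzero(patch) > 0.5 * win * win:     # window is mostly "object"
            hits.append((x, y, win, win))
print(len(hits), "candidate windows")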
UPDATE
Based on your comment, here's another potential approach that might work for you:
If you believe the objects will have mostly uniform color you might attempt to process the image to:
remove noise;
map the original image to a reduced color space (e.g. 256 or even 16 colors)
detect connected components based on pixel color and determine which ones are large enough
You might also benefit from re-sampling the image to a lower resolution (e.g. if the image is 1024 x 768 you might reduce it to 256 x 192) to help speed up the algorithm.
The only thing left to do would be to determine which component is the background. This is where it might make sense to also attempt to do the background removal by converting to black/white with a certain threshold.
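A rough, untested OpenCV sketch of the above (denoise, downsample, quantize the colors with k-means, then keep large connected components of each quantized color); all file names and parameters are illustrative:

import cv2
import numpy as np

img = cv2.imread("objects.png")                           # hypothetical input
img = cv2.medianBlur(img, 5)                              # remove noise
img = cv2.resize(img, None, fx=0.25, fy=0.25)             # e.g. 1024x768 -> 256x192

K = 16                                                    # reduced color space
pixels = img.reshape(-1, 3).astype(np.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
_, labels, centers = cv2.kmeans(pixels, K, None, criteria, 3, cv2.KMEANS_PP_CENTERS)
quant = labels.reshape(img.shape[:2])

min_area = 0.01 * quant.size                              # "large enough" components
for k in range(K):
    mask = (quant == k).astype(np.uint8)
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            print("mean BGR color", centers[k].astype(int),
                  "area", stats[i, cv2.CC_STAT_AREA])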

Related

Ideas to process challenging image

I'm working with an infrared image that is the output of a 3D sensor. This sensor projects an infrared pattern in order to build a depth map, and because of this, the IR image has a lot of white spots that reduce its quality. So, I want to process this image to make it smoother, in order to make it possible to detect objects lying on the surface.
The original image looks like this:
My objective is to have something like this (which I obtained by blocking the IR projector with my hand):
An "open" morphological operation does remove some noise, but I think there should first be a noise-removal step that addresses the white dots.
Any ideas?
I should mention that the noise-reduction algorithm has to run in real time.
A median filter would be my first attempt, possibly followed by a Gaussian blur. It really depends on what you want to do with it afterwards.
For example, here's your original image after a 5x5 median filter and 5x5 Gaussian blur:
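For reference, the OpenCV calls for that two-step smoothing are straightforward (the file name is a placeholder):

import cv2

ir = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)     # hypothetical input
smoothed = cv2.GaussianBlur(cv2.medianBlur(ir, 5), (5, 5), 0)   # 5x5 median, then 5x5 Gaussian
cv2.imwrite("ir_smoothed.png", smoothed)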
The main difficulty in your images is the large radius of the white dots.
Median and morphologic filters should be of little help here.
Usually I'm not a big fan of these algorithms, but you seem to have a perfect use case for a decomposition of your images on a functional space with a sketch (cartoon) and an oscillatory component.
Basically, these algorithms aim at solving for a cartoon-like image X that approximates the observed image Y, and that differs from Y only through the removal of some oscillatory texture.
You can find a list of related papers and algorithms here.
(Disclaimer: I'm not Jérôme Gilles, but I know him, and I know that most of his algorithms were implemented in plain C, so I think most of them are practical to implement with OpenCV.)
What you can try otherwise, if you want to try simpler implementations first:
taking the difference between the input image and a blurred version to see if it emphasizes the dots, in which case you have an easy way to find and mark them. The output of this part may be enough, but you may also want to fill the previous place of the dots using inpainting,
or applying anisotropic diffusion (like the Rudin-Osher-Fatemi equation) to see if the dots disappear. Despite its apparent complexity, this diffusion can be implemented easily and efficiently in OpenCV by applying the algorithms in this paper. TV diffusion can also be used for the inpainting step of the previous item.
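A rough, untested sketch of the first suggestion (blur-difference to mark the dots, then OpenCV inpainting to fill them); the kernel size and threshold are guesses that would need tuning:

import cv2

ir = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)     # hypothetical input
background = cv2.GaussianBlur(ir, (21, 21), 0)
diff = cv2.subtract(ir, background)                       # bright dots stand out
_, dot_mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
dot_mask = cv2.dilate(dot_mask, None)                     # cover the full dot radius
cleaned = cv2.inpaint(ir, dot_mask, 3, cv2.INPAINT_TELEA)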
My main point on the noise removal was to have a cleaner image so it would be easier to detect objects. However, as I tried to find a solution to the problem, I realized that it was unrealistic to remove all noise from the image using on-the-fly noise removal algorithms, since most of the image is actually noise. So I had to find the objects despite those conditions. Here is my approach:
1 - Initial image
2 - Background subtraction followed by opening operation to smooth noise
3 - Binary threshold
4 - Morphological close operation to make sure the object has no edge discontinuities (necessary for thin objects)
5 - Fill holes + opening morphological operations to remove small noise blobs
6 - Detection
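An approximate, untested OpenCV rendering of steps 2-6 (the reference background frame, kernel sizes and threshold are assumptions; the hole-filling of step 5 is only approximated by taking external contours):

import cv2
import numpy as np

frame = cv2.imread("ir_frame.png", cv2.IMREAD_GRAYSCALE)        # hypothetical input
background = cv2.imread("ir_background.png", cv2.IMREAD_GRAYSCALE)

kernel = np.ones((5, 5), np.uint8)
diff = cv2.absdiff(frame, background)                           # 2 - background subtraction
diff = cv2.morphologyEx(diff, cv2.MORPH_OPEN, kernel)           #     + opening to smooth noise
_, binary = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)     # 3 - binary threshold
binary = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)      # 4 - close edge discontinuities
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)       # 5 - remove small noise blobs
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,       # 6 - detection (OpenCV 4 API)
                               cv2.CHAIN_APPROX_SIMPLE)
print(len(contours), "objects detected")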
Is the projected IR pattern fixed, or does it change over time?
In the second case, you could try to take advantage of the movement of the dots.
For instance, you could acquire a sequence of images and assign each pixel of the result image to the minimum (or a very low percentile) value of the sequence.
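Something as simple as this (untested; the frame file names are placeholders, and the frames are assumed to be aligned grayscale images):

import cv2
import glob
import numpy as np

frames = [cv2.imread(f, cv2.IMREAD_GRAYSCALE)
          for f in sorted(glob.glob("ir_seq_*.png"))]      # hypothetical frame sequence
stack = np.stack(frames, axis=0)                           # shape (n_frames, H, W)
result = np.percentile(stack, 5, axis=0).astype(np.uint8)  # low percentile per pixel
# or simply: result = stack.min(axis=0)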
Edit: here is a Python script you might want to try

Matlab image processing - replacing dark pixels with neighboring pixels

I am doing some image processing of retina images. I need to replace the blood vessels with background pixels so that I can focus on other aspects of the retina. I could not figure out a way to do this. I am using Matlab. Any suggestions?
Having worked extensively with retinal images, I can tell you that what you're proposing is a complex problem in itself. Sure, if you just want a crude method, you can use imdilate. But that will affect your entire image, and other structures in the image will change appearance, which is not desirable.
However, if you want to do it properly, you will first need to segment all the blood vessels and create a binary mask. Once you have a binary mask, it's up to you how to fill up the vessel regions. You can either interpolate from the boundaries or calculate a background image and replace the vessel regions with pixels from the background image, etc.
Segmentation of the blood vessels is a challenging problem and you will find a lot of literature concerning that on the internet. Ultimately, you will have to choose how accurate a segmentation you want and build your algorithm accordingly.
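The question is about Matlab, but just to illustrate the mask-and-fill idea, here is a rough Python/OpenCV sketch; the binary vessel mask is assumed to come from a separate vessel-segmentation step, and the file names are placeholders:

import cv2

retina = cv2.imread("retina.png")                                    # hypothetical input
vessel_mask = cv2.imread("vessel_mask.png", cv2.IMREAD_GRAYSCALE)    # white = vessel pixels
filled = cv2.inpaint(retina, vessel_mask, 5, cv2.INPAINT_NS)         # fill vessels from surroundings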
imdilate should do what you want, since it replaces each pixel with the maximum of its neighbors. For more detailed suggestions, I'd need to see images.

What is the correct method to auto-crop objects from light background?

I'm trying to extract objects from scanned images. There could be a few documents on a white background, and I need to crop and rotate them automatically. This seems like a rather simple task, but I've got stuck at some point and get bad results all the time.
I've tried to:
Binarise the image and get connected components by performing morphological operations.
Perform watershed segmentation by using dilated and eroded binary images as mask components.
Apply Canny detector and fill the contours.
None of this gets me good results. If the object doesn't have contrast edges (e.g. a piece of paper on a white background), it splits into a lot of separate components. If I connect these components by applying excessive dilation, background noise also expands and everything becomes a mess.
For example, I have an image:
After applying Canny detector and filling the contours I get something like this:
As you can see, the components are not connected. They are even too far from each other to be connected by a reasonable amount of dilation. And when I apply watershed to this mask combined with some background points, it yields very bad results.
Some images are noisy:
In this particular case I was able to obtain the contour of the whole passport with the Canny detector because of its contrast edges. But the threshold method doesn't work here.
If the images are always on a very light background, then you can binarize with a threshold close to the maximum possible value. After that, it is a matter of correcting the binary image to get the objects, but this step will vary depending on what your other images look like.
For instance, the following image at left is what we get with a threshold at 99% of the maximum value after Gaussian filtering of the input. After removing components connected to the border and other small components, and also combining with some basic morphological tools, we get the image at right.
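A rough, untested sketch of that pipeline (Gaussian filtering, threshold at 99% of the maximum, removal of border-connected and small components, then a morphological close); the minimum area and kernel size are arbitrary:

import cv2
import numpy as np

gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)        # hypothetical input
blur = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = int(0.99 * blur.max())
_, binary = cv2.threshold(blur, thresh, 255, cv2.THRESH_BINARY_INV)   # objects darker than paper

n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
h, w = binary.shape
clean = np.zeros_like(binary)
for i in range(1, n):
    x, y, bw, bh, area = stats[i]
    touches_border = x == 0 or y == 0 or x + bw == w or y + bh == h
    if not touches_border and area > 500:                  # drop border-connected / small blobs
        clean[labels == i] = 255
clean = cv2.morphologyEx(clean, cv2.MORPH_CLOSE, np.ones((15, 15), np.uint8))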
This may seem a bit wishy-washy but bear with me:
This looks like quite a challenging case for image processing recipes involving only edge detection, morphological operations and segmentation.
What you are not exploiting here is that you (I believe) know what your document should look like. You are currently looking at completely general solutions which do not take into account this prior knowledge. If you can get some training data then you can go all the way from simple template/patch-based matching (SSD, Normalized Cross-Correlation) to more sophisticated object detection techniques to find the position and rotation of your documents.
My guess is that if your objects are always more or less the same and at the same scale (e.g. passports scanned at a fixed resolution/similar machines) then you can get away with a fairly crude approach. There won't be any one correct method. It's also likely that the technique you end up using will not work until you have done a significant amount of parameter tweaking, so don't give up on anything too quickly.

Image Comparison

What is an efficient way to compare two images in Visual C?
Also, in which format do the images have to be stored (bmp, gif, jpeg, ...)?
Please provide some suggestions.
If the images you are trying to compare have distinctive characteristics that you are trying to differentiate, then PCA is an excellent way to go. The question of which file format you need is really irrelevant; you need to load the image into the program as an array of numbers and do the analysis.
Your question opens a can of worms in terms of complexity.
If you want to compare two images to check if they are the same, then you can perform an MD5 hash on each file (after removing any metadata that could distort your result).
If you want to compare whether they look the same, then it's a completely different story. "Look the same" is meant in a very loose sense (e.g. they are exactly the same image but stored in two different file formats). For this, you need advanced algorithms, which will give you a probability for two images to be the same. Not being an expert in the field, I would perform the following "invented out of my head" algorithm:
take an arbitrary set of pixel points from the image.
for each pixel "grow" a polygon out of the surrounding pixels which are near in color (according to HSV colorspace)
do the same for the other image
for each polygon of one image, check the geometrical similitude with all the other polygons in the other image, and pick the highest value. Divide this value by the area of the polygon (to normalize).
create a vector out of the highest values obtained
the higher the norm of this vector, the higher the chance that the two images are the same.
This algorithm should be insensitive to color drift and image rotation. Maybe also scaling (you normalize against the area). But I restate: not an expert, there's probably much better, and it could make kittens cry.
I did something similar to detect movement from a MJPEG stream and record images only when movement occurs.
For each decoded image, I compared to the previous using the following method.
Resize the image to effectively thumbnail size (I resized fairly hi-res images down by a factor of ten)
Compare the brightness of each pixel to the previous image and flag if it is much lighter or darker (threshold value 1)
Once you've done that for each pixel, you can use the count of different pixels to determine whether the image is the same or different (threshold value 2)
Then it was just a matter of tuning the two threshold values.
I did the comparisons using System.Drawing.Bitmap, but as my source images were JPEGs, there was some artifacting.
It's a nice simple way to compare images for differences if you're going to roll it yourself.
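For illustration, here is the same idea sketched in Python/OpenCV (untested); the thumbnail size and the two thresholds are exactly the values to tune:

import cv2
import numpy as np

def images_differ(prev_bgr, curr_bgr, pixel_thresh=25, count_thresh=50):
    small_prev = cv2.cvtColor(cv2.resize(prev_bgr, (64, 48)), cv2.COLOR_BGR2GRAY)
    small_curr = cv2.cvtColor(cv2.resize(curr_bgr, (64, 48)), cv2.COLOR_BGR2GRAY)
    changed = cv2.absdiff(small_prev, small_curr) > pixel_thresh   # threshold value 1
    return np.count_nonzero(changed) > count_thresh                # threshold value 2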
If you want to determine if 2 images are the same perceptually, I believe the best way to do it is using an Image Hashing algorithm. You'd compute the hash of both images and you'd be able to use the hashes to get a confidence rating of how much they match.
One that I've had some success with is pHash, though I don't know how easy it would be to use with Visual C. Searching for "Geometric Hashing" or "Image Hashing" might be helpful.
Testing for strict identity is simple: Just compare every pixel in source image A to the corresponding pixel value in image B. If all pixels are identical, the images are identical.
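For example (assuming both images decode successfully and NumPy/OpenCV are available):

import cv2
import numpy as np

a = cv2.imread("a.png")                                    # hypothetical inputs
b = cv2.imread("b.png")
identical = (a is not None and b is not None
             and a.shape == b.shape and np.array_equal(a, b))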
But I guess you don't want this kind of strict identity. You probably want images to be "identical" even if certain transformations have been applied to image B. Examples of these transformations might be:
changing image brightness globally (for every pixel)
changing image brightness locally (for every pixel in a certain area)
changing image saturation globally or locally
gamma correction
applying some kind of filter to the image (e.g. blurring, sharpening)
changing the size of the image
rotation
For example, printing an image and scanning it again would probably involve all of the above.
In a nutshell, you have to decide which transformations you want to treat as "identical" and then find image measures that are invariant to those transformations. (Alternatively, you could try to revert the transformations, but that's not possible if a transformation removes information from the image, such as blurring or clipping.)

Adaptive threshold Binarization's bad effects

I implemented some adaptive binarization methods; they use a small window, and at each pixel the threshold value is calculated. There are problems with these methods:
If we select a window size that is too small, we get this effect (I think the reason is that the window size is too small):
At the upper left corner there is the original image; upper right corner - global threshold result. Bottom left - an example of dividing the image into parts (but I am talking about analyzing a pixel's small surrounding, for example a window of size 10x10).
So you can see the result of such algorithms in the bottom-right picture: we got a black area, but it should be white.
Does anybody know how to improve an algorithm to solve this problem?
There should be quite a lot of research going on in this area, but unfortunately I have no good links to give.
An idea, which might work but I have not tested, is to try to estimate the lighting variations and then remove that before thresholding (which is a better term than "binarization").
The problem is then moved from adaptive thresholding to finding a good lighting model.
If you know anything about the light sources then you could of course build a model from that.
Otherwise a quick hack that might work is to apply a really heavy low pass filter to your image (blur it) and then use that as your lighting model. Then create a difference image between the original and the blurred version, and threshold that.
EDIT: After quick testing, it appears that my "quick hack" is not really going to work at all. After thinking about it I am not very surprised either :)
I = someImage
Ib = blur(I, 'a lot!')
Idiff = I - Ib
It = threshold(Idiff, 'some global threshold')
EDIT 2
Got one other idea which could work depending on how your images are generated.
Try estimating the lighting model from the first few rows in the image:
Take the first N rows in the image
Create a mean row from the N collected rows. You now have one row as your background model.
For each row in the image subtract the background model row (the mean row).
Threshold the resulting image.
Unfortunately I am at home without any good tools to test this.
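A rough, untested sketch of that row-based idea (N and the use of Otsu for the final global threshold are arbitrary choices):

import cv2
import numpy as np

gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)   # hypothetical input
N = 10
background_row = gray[:N].mean(axis=0)                     # mean of the first N rows
corrected = gray - background_row                          # subtract the model from every row
corrected = cv2.normalize(corrected, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, binary = cv2.threshold(corrected, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)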
It looks like you're doing adaptive thresholding wrong. Your images look as if you divided your image into small blocks, calculated a threshold for each block and applied that threshold to the whole block. That would explain the "box" artifacts. Usually, adaptive thresholding means finding a threshold for each pixel separately, with a separate window centered around the pixel.
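OpenCV's adaptiveThreshold does exactly this kind of per-pixel windowed thresholding; a minimal example (the block size and offset C need tuning for your images):

import cv2

gray = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)    # hypothetical input
# 31x31 window centered on each pixel, Gaussian-weighted local mean minus offset 10
binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 31, 10)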
Another suggestion would be to build a global model for your lighting: In your sample image, I'm pretty sure you could fit a plane (in X/Y/Brightness space) to the image using least-squares, then separate the pixels into pixels brighter (foreground) and darker than that plane (background). You can then fit separate planes to the background and foreground pixels, threshold using the mean between these planes again and improve the segmentation iteratively. How well that would work in practice depends on how well your lightning can be modeled with a linear model.
If the actual objects you are trying to segment are "thinner" (you said something about barcodes in a comment), you could try a simple opening/closing operation to get a lighting model (i.e. close the image to remove the foreground pixels, then use [closed image + X] as the threshold).
Or, you could try mean-shift filtering to get the foreground and background pixels to the same brightness. (Personally, I'd try that one first)
You have very non-uniform illumination and a fairly large object (thus, there is no universally easy way to extract the background and correct the non-uniformity). This basically means you cannot use global thresholding at all; you need adaptive thresholding.
You want to try Niblack binarization. Matlab code is available here
http://www.uio.no/studier/emner/matnat/ifi/INF3300/h06/undervisningsmateriale/week-36-2006-solution.pdf (page 4).
There are two parameters you'll have to tune by hand: window size (N in the above code) and weight.
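Since the linked code is Matlab, a rough NumPy/OpenCV equivalent of the Niblack threshold T = local_mean + k * local_std might look like this (N and k are the two parameters to tune; the sign of k depends on whether the foreground is darker or lighter than the background):

import cv2
import numpy as np

def niblack(gray, N=25, k=-0.2):
    img = gray.astype(np.float64)
    mean = cv2.boxFilter(img, -1, (N, N))                  # local mean over an N x N window
    sq_mean = cv2.boxFilter(img * img, -1, (N, N))
    std = np.sqrt(np.maximum(sq_mean - mean * mean, 0))    # local standard deviation
    threshold = mean + k * std
    return (img > threshold).astype(np.uint8) * 255        # white background, black foreground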
Try to apply a local adaptive threshold using this procedure:
convolve the image with a mean or median filter
subtract the original image from the convolved one
threshold the difference image
The local adaptive threshold method selects an individual threshold for each pixel.
I'm using this approach extensively and it works fine with images that have a non-uniform background.
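A rough, untested OpenCV sketch of the three steps above (the kernel size and final threshold are guesses):

import cv2

gray = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)        # hypothetical input
local_mean = cv2.blur(gray, (31, 31))                          # 1 - mean (box) filter
diff = cv2.subtract(local_mean, gray)                          # 2 - convolved minus original (dark text -> positive)
_, binary = cv2.threshold(diff, 15, 255, cv2.THRESH_BINARY)    # 3 - threshold the difference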
