I have some color images needing segmentation. They are images of slides that are stained with hematoxylin and eosin ("H&E").
I found this method for color deconvolution by Ruifrok
http://europepmc.org/abstract/med/11531144
that separates out the images by color.
However it seems that you can do something similar just by using K-means clustering:
http://www.mathworks.com/help/images/examples/color-based-segmentation-using-k-means-clustering.html
I am curious what the difference is. Any insight would be welcome. Thanks.
I can't seem to find a copy of the article (well without paying) but they are not exactly the same.
K means seeks to cluster data. So if you just want to find dominant colors in an image, or do some sorting based on colors, this is the way to go. As a side note: Kmeans can be used on any vector. Its not confined to color, so you can use it for many other applications.
Color Deconvolution is trying to remove the effects of chemical dyes commonly used for microscopy. (If I understood the abstract properly). Based on the specific dye used, the algorithm tries to reverse its effects and give you the original color image back (before the dye was added). I found this website that shows some output. This is deconvolving the dye contribution to the RGB spectrum. It doesn't do any clustering/grouping (other than finding the dye)
Hope that helps
EDIT
If you didn't know, convolution is most often associated with signals/image processing. Basically you take a filter and run it over a signal. The output is a modified version of the original input. In this case, the original image is filtered by a dye with known RGB values. IF we know the full characteristics of the dye/filter we can invert it. Then by running the convolution again using the inverse filter we can hopefully de -convolve the effect. In principle it sounds simple enough, but in many cases this isn't possible.
I don't think I'm going to get any replies but here goes: I'm developing an iOS app that performs image segmentation functions. I'm trying to implement the easiest way to crop out a subject from an image without the need of a greenscreen/keying. Most automated solutions like using OpenCV just aren't cutting it.
I've found the rotoscope brush tool in After Effects to be effective at giving hints on where the app should be cutting out. Anyone know what kind of algorithms the rotoscope brush tool is using?
Check out this page, which contains a couple of video presentations from SIGGRAPH (a computer graphics conference) about the Roto Brush tool. Also take a look at Jue Wang's paper on Video SnapCut. As Damien guessed, object extraction relies on some pretty intense image processing algorithms. You might be able to implement something similar in OpenCV depending on how clever/masochistic you're feeling.
The algorithm is a graph-cut based segmentation algorithm where Gaussian Mixture Models (GMM) are trained using color pixels in "local" regions as well as "globally", together with some sort of shape prior.
OpenCV has a "cheap hack" implementation of the "GrabCut" paper where the user specifies a bounding box around the object he wish to segment. Typically, using just the bounding box will not give good results. You will need the user to specify the "foreground" and "background" pixels (as is done in Adobe's Rotoscoping tool) to help the algorithm build foreground and background color models (in this case GMMs) so that it will know what are the typical colors in the foreground object you wish to segment, and those for the background that you want to leave out.
A basic graph-cut implementation can be found on this blog. You can probably start from there and experiment with different ways to compute the cost terms to get better results.
Lastly, the "soften" the edges, a cheap hack is to blur the binary mask to obtain a mask with values between 0 and 1. Then recomposite your image using the mask i.e. c[i][j] = mask[i][j] * fgd[i][j] + (1 - mask[i][j]) * bgd[i][j], where you are blending the foreground you segmented (fgd), with a new background image (bgd) using the mask values as blending weights.
Suppose I have gray-scale photographic pictures of texts sheets. Each sheet of paper is exactly white and text is exactly black.
Unfortunately, the light is not uniform, also perspective shading occurs, also sheets of papers may be curved. Of course, there are some small hi freq noise on an image.
I AM SURE that there should be nearly IDEAL solution to separate text and background in this situation.
So what is it? :)
I don't believe it is impossible or even hard to turn such gray-scale images into nearly perfect black and white pictures. I cant prove this but I judge on my own perception: I need no any intelligence to recognize such pictures by an eye. They can be in any language even unfamiliar, but I will SEE what is written exactly.
So, how to teach computer to do the same?
UPDATE
Consider original image
Any global thresolding will cause artefacts (1) and nonuniform text representation (2)
I need some thresolding, which looks for local statistics.
Switch to adaptive thresholding.
Here you will find some introduction - http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm
Adaptive thresholding is designed to deal with exactly this kind of problems.
Is HSL superior over HSI and HSV, because it takes human perception into account.?
For some image processing algorithms they say I can use either of these color spaces,
and I am not sure which one to pick. I mean, the algorithms just care that you provide
them with hue and saturation channel, you can pick which color space to use
Which one is best very much depends on what you're using it for. But in my experience HSL (HLS) has an unfortunate interaction between brightness and saturation.
Here's an example of reducing image brightness by 2. The leftmost image is the original; next comes the results using RGB, HLS, and HSV:
Notice the overly bright and saturated spots around the edge of the butterfly in HLS, particularly that red spot at the bottom. This is the saturation problem I was referring to.
This example was created in Python using the colorsys module for the conversions.
Since there is no accepted answer yet, and since I had to further research to fully understand this, I'll add my two cents.
Like others have said the answer as to which of HSL or HSV is better depends on what you're trying to model and manipulate.
tl;dr - HSV is only "better" than HSL for machine vision (with caveats, read below). "Lab" and other formal color models are far more accurate (but computationally expensive) and should really be used for more serious work. HSL is outright better for "paint" applications or any other where you need a human to "set", "enter" or otherwise understand/make sense of a color value.
For details, read below:
If you're trying to model how colours are GENERATED, the most intuitive model is HSL since it maps almost directly to how you'd mix paints to create colors. For example, to create "dark" yellow, you'd mix your base yellow paint with a bit of black. Whereas to create a lighter shade of yellow, you'd mix a bit of white.
Values between 50 and 0 in the "L" spectrum in HSL map to how much "black" has to be mixed in (black increasing from 0 to 100%, as L DECREASES from 50 to 0).
Values between 50 and 100 map to how much "white" has to be mixed in (white varying from 0 to 100% as L increases from 50 to 100%).
50% "L" gives you the "purest" form of the color without any "contamination" from white or black.
Insights from the below links:
1. http://forums.getpaint.net/index.php?/topic/22745-hsl-instead-of-hsv/
The last post there.
2. http://en.wikipedia.org/wiki/HSL_and_HSV
Inspect the color-space cylinder for HSL - it gives a very clear idea of the kind of distribution I've talked about.
Plus, if you've dealt with paints at any point, the above explanation will (hopefully) make sense. :)
Thus HSL is a very intuitive way of understanding how to "generate" a color - thus it's a great model for paint applications, or any other applications that are targeted to an audience used to thinking in "shade"/"tone" terms for color.
Now, onto HSV.
This is treacherous territory now as we get into a space based on a theory I HAVE FORMULATED to understand HSV and is not validated or corroborated by other sources.
In my view, the "V" in HSV maps to the quantity of light thrown at an object, with the assumption, that with zero light, the object would be completely dark, and with 100% light, it would be all white.
Thus, in this image of an apple, the point that is directly facing the light source is all white, and most likely has a "V" at 100% whereas the point at the bottom that is completely in shadow and untouched by light, has a value "0". (I haven't checked these values, just thought they'd be useful for explanation).
Thus HSV seems to model how objects are lit (and therefore account for any compensation you might have to perform for specular highlights or shadows in a machine vision application) BETTER than HSL.
But as you can see quite plainly from the examples in the "disadvantages" section in the Wikipedia article I linked to, neither of these methods are perfect. "Lab" and other more formal (and computationally expensive) color models do a far better job.
P.S: Hope this helps someone.
The only color space that has advantage and takes human perception into account is LAB, in the sense that the Euclidian metric in it is correlated with human color differentiation.
Taken directly from Wikipedia:
Unlike the RGB and CMYK color models, Lab color is designed to
approximate human vision. It aspires to perceptual uniformity, and its
L component closely matches human perception of lightness
That is the reason that many computer vision algorithms are taking advantage of LAB space
HSV, HSB and HSI don't have this property. So the answer is no, HSL is not "superior" over HSI and HSV in the sense of human perception.
If you want to be close to human perception, try LAB color space.
I would say that one is NO better than another, each is just a mathematical conversion of another. Differing representations CAN make manipulation of an image for the effect you wish a bit easier. Each person WILL perceive images a bit differently, and using HSI or HSV may provide a small difference in output image.
Even RGB when considered against a system (i.e. with pixel array) takes into account human perception. When an imager (with a bayer overlay) takes a picture, there are 2 green pixels for every 1 red and blue pixel. Monitors still output in RGB (although most only have a single green pixel for each red and blue). A new TV monitor made by Sharp now has a yellow output pixel. The reason they have done this is due to there being a yellow band in the actual frequency spectrum, so to better truly represent color, they have added a yellow band (or pixel).
All of these things are based on the human eye having a greater sensitivity to green over any other color in the spectrum.
Regardless, whatever scale you use, the image will be transformed back to RGB to be displayed on screen.
http://hyperphysics.phy-astr.gsu.edu/hbase/vision/colcon.html
http://www.physicsclassroom.com/class/light/u12l2b.cfm
In short, I dont think any one is better than another, just different representations.
http://en.wikipedia.org/wiki/Color
Imma throw my two cents in here being both a programmer and also a guy who aced Color Theory in art school before moving on to software engineering career wise.
HSL/HSV are great for easily writing programmatic functionality to handle color without dealing with a ton of edge cases. They are terrible at replicating human perception of color accurately.
CMYK is great for rendering print stuff, because it approximates the pigments that printers rely on. It is also terrible at replicating human perception of color accurately (although not because it's bad per se, but more because computers are really bad at displaying it on a screen. More on that in a minute).
RGB is the only color utility represented in tech that accurately reflects human vision effectively. LAB is essentially just resolving to RGB under the hood. It is also worth considering that the literal pixels on your screen are representations of RGB, which means that any other color space you work with is just going to get parsed back into RGB anyways when it actually displays. Really, it's best to just cut out the middleman and use that in almost every single case.
The problem with RGB in a programming sense, is that it is essentially cubic in representation, whereas HSL/HSV both resolve in a radius, which makes it much easier to create a "color wheel" programmatically. RGB is very difficult to do this with without writing huge piles of code to handle, because it resolves cubically in terms of its data representation. However, RGB accurately reflects human vision very well, and it's also the foundational basis of the actual hardware a monitor consists of.
TLDR; If you want dead on color and don't mind the extra work, use RGB all of the time. If you want to bang out a "good enough" color utility and probably field bug tickets later that you won't be able to really do anything about, use HSL/HSV. If you are doing print, use CMYK, not because it's good, but because the printer will choke if you don't use it, even though it otherwise sucks.
As an aside, if you were to approach Color Theory like an artist instead of a programmer, you are going to find a very different perception than any technical specifications about color really impart. Bear in mind that anyone working with a color utility you create is basically going to be thinking along these lines, at least if they have a solid foundational education in color theory. Here's basically how an artist approaches the notion of color:
Color from an artistic perspective is basically represented on a scale of five planes.
Pigment (or hue), which is the actual underlying color you are going after.
Tint, which is the pigment mixed with pure white.
Shade, which is the pigment mixed with pure black.
Tone (or "True Tone"), which is the pigment mixed with a varying degree of gray.
Rich Tone (or "Earth Tones"), which is the pigment mixed with its complementary color. Rich tones do not show up on the color wheel because they are inherently a mix of opposites, and visually reflect slightly differently than a "True Tone" due to minute discrepancies in physical media that you can't replicate effectively on a machine.
The typical problem with representing this paradigm programmatically is that there is not really any good way to represent rich tones. A material artist has basically no issue doing this with paint, because the subtle discrepancies of brush strokes allow the underlying variance between the complements to reflect in the composition. Likewise digital photography and video both suck at picking this up, but actual analog film does not suck nearly as bad at it. It is more reflected in photography and video than computer graphics because the texture of everything in the viewport of the camera picks up some of it, but is is still considerably less than actually viewing the same thing (which is why you can never take a really good picture of a sunset without a ton of post production to hack the literal look of it back in, for example). However, computers are not good at replicating those discrepancies, because a color is basically going to resolve to a consistent matrix of RGB pixel mapping which visually appears to be a flat regular tone. There is no computational color space that accurately reflects rich tones, because there is no computational way to make a color vary slightly in a diffuse, non-repeating random way over space and still have a single unique identifier, and you can't very well store it as data without a unique identifier.
The best approximation you can do of this with a computer is to create some kind of diffusion of one color overlapping another color, which does not resolve to a single value that you can represent as a hex code or stuff in a single database column. Even then, a computer is going to inherently reflect a uniform pattern, where a real rich tone relies on randomness and non-repeating texture and variance, which you can't do on a machine without considerable effort. All of the artwork that really makes color pop relies on this principle, and it is basically inaccessible to computational representation without a ton of side work to emulate it (which is why we have Photoshop and Corel Painter, because they can emulate this stuff pretty well with a bit of work, but at the cost of performing a lot of filtering that is not efficient for runtime).
RGB is a pretty good approximation of the other four characteristics from an artistic perspective. We pretty much get that it's not going to cover rich tones and that we're going to have to crack out a design utility and mash that part in by hand. However the underlying problem with programming in RGB is that it wants to resolve to a three dimensional space (because it is cubic), and you are trying to present it on a two dimensional display, which makes it very difficult to create UI that is reasonably intuitive because you lack the capacity to represent the depth of a 3rd axis on a computer monitor effectively in any way that is ever going to be intuitive to use for an end user.
You also need to consider the distinction between color represented as light, and color represented as pigment. RGB is a representation of color represented as light, and corresponds to the primary values used to mix lighting to represent color, and does so with a 1:1 mapping. CMYK represents the pigmentation spectrum. The distinction is that when you mix light in equal measure, you get white, and when you mix pigment in equal measure, you get black. If you are programming any utility that uses a computer, you are working with light, because pixels are inherently a single node on a monitor that emits RGB light waves. The reason I said that CMYK sucks, is not because it's not accurate, it's because it's not accurate when you try to represent it as light, which is the case on all computer monitors. If you are using actual paint, markers, colored pencils, etc, it works just fine. However representing CMYK on a screen still has to resolve to RGB, because that is how a computer monitor works, so it's always off a bit in terms of how it looks in display.
Not to go off on a gigantic side tangent, as this is a programming forum and you asked the question as a programmer. However if you are going for accuracy, there is a distinct "not technical" aspect to consider in terms of how effective your work will be at achieving its desired objective, which is to resolve well against visual perception, which is not particularly well represented in most computational color spaces. At the end of the day, the goal with any color utility is to make it look right in terms of human perception of color. HSL/HSV both fail miserably at that. They are prominent because they are easy to code with, and only for that reason. If you have a short deadline, they are acceptable answers. If you want something that is really going to work well, then you need to do the heavy legwork and consider this stuff, which is what your audience is considering when they decide if they want to use your tool or not.
Some reference points for you (I'm purposely avoiding any technical references, as they only refer to computational perspective, not the actual underlying perception of color, and you've probably read all of those already anyhow):
Color Theory Wiki
Basic breakdown of hue, tint, tone, and shade
Earth Tones (or rich tones if you prefer)
Basic fundamentals of color schemes
Actually, I'd have to argue that HSV accounts better for human visual perception as long as you understand that in HSV, saturation is the purity of the color and value is the intensity of that color, not brightness overall. Take this image, for example...
Here is a mapping of the HSL saturation (left) and HSL luminance (right)...
Note that the saturation is 100% until you hit the white at the very top where it drops suddenly. This mapping isn't perceived when looking at the original image. The same goes for the luminance mapping. While it's a clearer gradient, it only vaguely matches visually. Compare that to HSV saturation (left) and HSV value (right) below...
Here the saturation mapping can be seen dropping as the color becomes more white. Likewise, the value mapping can be very clearly seen in the original image. This is made more obvious when looking at the mappings for the individual color channels of the original image (the non-black areas almost perfectly match the value mapping, but are nowhere close to the luminance mapping)...Going by this information, I would have to say that HSV is better for working with actual images (especially photographs) whereas HSL is possibly better only for selecting colors in a color picker.
On a side note, the value in HSV is the inverse of the black in CMYK.
Another argument for the use of HSV over HSL is that HSV has much fewer combinations of different values that can result in the same color since HSL loses about half of its resolution to its top cone. Let's say you used bytes to represent the components--thereby giving each component 256 unique levels. The maximum number of unique RGB outputs this will yield in HSL is 4,372,984 colors (26% of the available RGB gamut). In HSV this goes up to 9,830,041 (59% of the RGB gamut)... over twice as many. And allowing a range of 0 to 359 for hue will yield 11,780,015 for HSV yet only 5,518,160 for HSL.
I started to work on a small hobbyist program recently in the area of image processing, and I'm kind of a noob with image processing but I'm trying to figure out at least some aspects of it.
What I want to be able to do is separate between objects in an image by their color (preferably in a real-time video feed), and then recognize their color.
I read a little about OpenCV and also about some of the different algorithms.
I even started to work a little with the canny algorithm, but I'm not sure this is the algorithm I should begin with for my need, as it detects edges of objects regardless of their color.
Even if it is the algorithm I should use, what would be the best method to recognize the colors of the objects it marked for me?
I hope I made myself clear enough.
Thanks a lot!
Understand colour spaces - RGB is almost always the worst source to do image processing on.
Start with HSL and HSV
To separate or to make a color transparent (for instance to remove it) is very simple with OpenCV... I posted an answer (see the link below) which should help you (or maybe solve your problem).
Here is the code I posted
Moreover, the answer from Martin Beckett is absolutely right, RGB is not a good color space to evaluate a color, you can use HSV, the hue value in degree tells you the proper color (something you could compare from a wavelength in a light spectrum) while S and V code a sort of intensity (what I say is to simplify in order to explain that in many cases to use Hue to segment color images is enough).
Even if it is the algorithm I should use, what would be the best
method to recognize the colors of the objects it marked for me?
The type of algorithm you are searching for is called color segmentation... Here is a tutorial which could help you as well.
Welcome to image processing community,
Julien,
For starter you should learn about image array operation, for example using OpenCV function inRange to filter colors by minimum to maximum color range. Another option is by splitting multi-channel array (in this case R, G and B) to 3 different single-channel for further examination. Hope its help