Is a change in colorspace needed when comparing colors only evaluated by the computer? - opencv

I am confused about the need to change color space for color comparison. I have read about delta E and the Lab format, and I do understand that comparisons in the RGB color space will not seem appropriate to the human eye. However, my program uses a linear color scale to calculate velocity from a color flow Doppler signal. It takes the mean color of a sample region and compares it to the colors of the scale to find its nearest neighbor using Euclidean distance. I do that entirely in the BGR (OpenCV) color space, as in the example image below:
Here, I obtain seemingly correct velocity values for each color circle, but is it only by chance, or is my assumption correct that since the color comparisons take place internally, it does not matter what color space I am in?
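For context, a minimal sketch of that kind of nearest-neighbour lookup (the scale colors and velocities below are made-up stand-ins for the values read off the reference scale):

    import numpy as np

    # Hypothetical reference scale: one BGR entry per known velocity (made-up values).
    scale_bgr = np.array([[40, 30, 200], [30, 90, 220], [20, 200, 230]], dtype=float)
    scale_velocities = np.array([-0.6, -0.2, 0.2])

    def nearest_velocity(sample_region_bgr):
        """Map the mean BGR color of a sample region to the velocity of its
        nearest scale entry, using plain Euclidean distance in BGR."""
        mean_color = sample_region_bgr.reshape(-1, 3).astype(float).mean(axis=0)
        distances = np.linalg.norm(scale_bgr - mean_color, axis=1)
        return scale_velocities[np.argmin(distances)]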

Since you are searching for the nearest neighbour and operating with 3D points (in color space), it does not matter which color space you choose; they will only be displayed in different ways.

Comparison of colour is not straightforward. You need to decide what defines a colour being close to another and then pick the most appropriate colour space to support that.
For example, working in HSL will give you an easy way to assess colours based upon the hue. This is fine if you are happy to disregard, or at least reduce the relevance of saturation and luminance.
If, on the other hand, you want a point change in saturation to be as relevant as a point change in hue, working in RGB or perhaps CMYK would be more appropriate. Measure the distance by plotting the channels as three axes and then computing the distance between the two colours. This has the downside that a 10 point shift in saturation has the same measured difference as a 10 point shift in hue, which visually will not make that much sense, as the perceived difference will not be equivalent to the mathematical one.
And that brings in another consideration. The human eye is more sensitive to colour variance around different colours. Green, for example, takes more variation to be noticeable than magenta. All down to evolution, but it may have a bearing on your representation.
Personally I tend to work with RGB as it is needed for visual display, but most commonly I will arrange colours by hue, so I keep a conversion to HSL/HSB handy.
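As a small illustration of that workflow, here is a sketch that keeps colors in RGB but orders them by hue via a round trip through HLS (Python's colorsys works on 0-1 channel values; the sample colors are arbitrary):

    import colorsys

    # Arbitrary example RGB colors (0-255 per channel).
    colors = [(200, 30, 40), (30, 200, 40), (40, 30, 200), (220, 220, 30)]

    def hue_of(rgb):
        r, g, b = (c / 255.0 for c in rgb)
        h, l, s = colorsys.rgb_to_hls(r, g, b)  # note colorsys uses H, L, S order
        return h

    colors_by_hue = sorted(colors, key=hue_of)
    print(colors_by_hue)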

Related

Most prevalent color on a background by changing color space

I have a sheet of paper on another surface. I want to figure out when the paper ends and the surface begins.
I'm using this approach:
1. Convert image to pixel array
2. Pick 3 random 20x20 squares and frequency count the colors
3. The highest frequency is the background
However, the problem is that I get over 100 colors every time I do this on an actual image taken by the camera.
I think I can fix it by putting the image in 16 colors palette. Is there a way to do this change on a UIImage or CGImage?
Thanks
Your colours are probably very close together. How about calculating the distance (the cumulative absolute difference between red, green and blue values) from each sampled colour to a reference colour - just use the first one you sample as reference. If the distance is large, you have a different colour. If the distance is small, you have the same colour with minor variations in lighting or other camera artefacts.
Basically this is applying a filter in a very simple manner. It is up to you to decide how big the difference has to be for the colours to be considered different, but you could decide that by looking at the median difference of all the colours and grouping them into over/under samples.
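A rough sketch of that filtering idea (the threshold of 60 is an arbitrary starting point you would tune as described above):

    def channel_distance(c1, c2):
        """Cumulative absolute difference between red, green and blue values."""
        return sum(abs(a - b) for a, b in zip(c1, c2))

    def is_same_color(sample, reference, threshold=60):
        return channel_distance(sample, reference) < threshold

    samples = [(250, 248, 245), (240, 250, 251), (30, 40, 35)]
    reference = samples[0]  # use the first sampled colour as the reference
    print([is_same_color(s, reference) for s in samples])  # [True, True, False]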
You might also get good results from applying a Core Image filter to the sample images, such as CIColorClamp (CISpotColor looks better but is OS X only). If you can find a suitable filter there is a good chance it will be simpler and faster than doing it yourself.

Robust color tracking vs exposure and white balance changes

Does anyone have experience with color matching and frame-to-frame tracking of colors in video when the exposure settings and white balance are constantly changing?
I'm working on a color tracking app that uses the iPad 2 front camera to capture video. I need to detect colored objects (of a predefined color we captured earlier) on each frame. My problem is that the camera software likes to adjust WB and exposure each frame. So if we remember one color at frame N, by frame N+10 the WB will be different, and this can lead to a big difference in color.
For calculating color distance I'm using the LAB color space and the CIE76 formula:
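For reference, CIE76 is simply the Euclidean distance between the two Lab triples:

$$\Delta E^{*}_{76} = \sqrt{(L^{*}_{2} - L^{*}_{1})^{2} + (a^{*}_{2} - a^{*}_{1})^{2} + (b^{*}_{2} - b^{*}_{1})^{2}}$$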
Yes, I know there is a much better CIEDE2000 distance function, but I'm working on an ARM processor and I'm afraid that formula will be too heavy even for the manually optimized ARM NEON assembly code that I already use.
CIE76 provides good results in general, but in poor or very bright lighting the camera either generates too much noise or over-saturates the image, so the colors become too far from their originals. In addition to simple thresholding using color distance, I implemented per-component thresholding of LAB pixel values based on the standard deviation of the calibrated color. This also increased the correctness of the detection; however, it doesn't solve the main issue.
The camera itself provides frames in RGB color space, but the API doesn't provide functions to get the white point or color temperature of the current frame. Currently I assume a D50 illuminant to perform the RGB -> LAB conversion.
And this is my main doubt. My idea is to compute the white point of the given RGB image, convert it to the XYZ color space, and then convert XYZ to LAB using the calculated white point. Is this possible?
From Wikipedia: White Point
Expressing color as tristimulus coordinates in the LMS color space, one can "translate" the object's color according to the von Kries transform simply by scaling the LMS coordinates by the ratio of the maximum of the tristimulus values at both white points. This provides a simple, but rough estimate.
http://en.wikipedia.org/wiki/White_point
Is this going to work? Or is there a better way to calculate the white point (even roughly)? By the way, I came across the Retinex algorithm, which demonstrates good color enhancement in shadows; has anyone used it? What are its pros and cons?
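For anyone trying the quoted von Kries approach, a rough sketch follows. The XYZ-to-LMS matrix below is the commonly quoted Hunt-Pointer-Estevez matrix; treat both the matrix values and the overall scheme as an approximation rather than a definitive answer to the question:

    import numpy as np

    # Hunt-Pointer-Estevez XYZ -> LMS matrix (commonly quoted values; verify before use).
    XYZ_TO_LMS = np.array([[ 0.38971, 0.68898, -0.07868],
                           [-0.22981, 1.18340,  0.04641],
                           [ 0.00000, 0.00000,  1.00000]])
    LMS_TO_XYZ = np.linalg.inv(XYZ_TO_LMS)

    def von_kries_adapt(xyz, src_white_xyz, dst_white_xyz):
        """Scale the LMS coordinates by the ratio of the two white points."""
        scale = (XYZ_TO_LMS @ dst_white_xyz) / (XYZ_TO_LMS @ src_white_xyz)
        return LMS_TO_XYZ @ ((XYZ_TO_LMS @ xyz) * scale)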

How to use CIELAB to obtain illumination invariance in image processing?

I found out that taking the Euclidean distance in RGB space to compare two colors in applications like image segmentation is not recommended because of its dependence on illumination and lighting conditions. Furthermore, because of the numerical instability of the HSV hue value at low intensity, the CIELAB color space is said to be a better alternative.
My problem is that I don't understand how to actually use it: Since CIELAB is device independent, you cannot simply convert to it from some RGB values without knowing anything about the sensor that was used to obtain these RGB values. As far as I know, you have to convert to CIEXYZ in an intermediate step first, but there are several different matrices available depending on the exact RGB working space of the source.
Or is it irrelevant which matrix you choose if you only want to use CIELAB to compare two colors (as I said, for example to perform image segmentation)?
If you don't know the exact color space that you're converting from, you may use sRGB - it was designed to be a generic space that corresponded to the average monitor of the time. It won't be exact of course, but it's likely to be acceptable. As you observe, perfect accuracy shouldn't be necessary for image segmentation, as the relative distances between colors won't be materially affected.
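In practice that can be as simple as letting a library do the assumed-sRGB conversion for you. A minimal OpenCV sketch (the filename and pixel coordinates are placeholders; note that OpenCV's 8-bit Lab output is rescaled to the 0-255 range, which is fine for relative comparisons):

    import cv2
    import numpy as np

    img = cv2.imread("input.png")                              # 8-bit BGR
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB).astype(np.float32)

    def delta_e76(p, q):
        """Euclidean distance between two Lab pixels (CIE76)."""
        return float(np.linalg.norm(p - q))

    print(delta_e76(lab[10, 10], lab[20, 20]))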

Is HSL Superior over HSI and HSV Color Spaces?

Is HSL superior to HSI and HSV because it takes human perception into account?
For some image processing algorithms they say I can use either of these color spaces, and I am not sure which one to pick. I mean, the algorithms just care that you provide them with the hue and saturation channels; you can pick which color space to use.
Which one is best very much depends on what you're using it for. But in my experience HSL (HLS) has an unfortunate interaction between brightness and saturation.
Here's an example of reducing image brightness by 2. The leftmost image is the original; next comes the results using RGB, HLS, and HSV:
Notice the overly bright and saturated spots around the edge of the butterfly in HLS, particularly that red spot at the bottom. This is the saturation problem I was referring to.
This example was created in Python using the colorsys module for the conversions.
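For anyone who wants to reproduce the comparison, here is a per-pixel sketch along the same lines (colorsys works on 0-1 channel values; real code would vectorize this over the whole image):

    import colorsys

    def darken_rgb(r, g, b):
        return r / 2, g / 2, b / 2

    def darken_hls(r, g, b):
        h, l, s = colorsys.rgb_to_hls(r, g, b)
        return colorsys.hls_to_rgb(h, l / 2, s)

    def darken_hsv(r, g, b):
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        return colorsys.hsv_to_rgb(h, s, v / 2)

    # Example pixel (channels in 0-1): a bright, saturated red.
    print(darken_rgb(1.0, 0.2, 0.2))
    print(darken_hls(1.0, 0.2, 0.2))
    print(darken_hsv(1.0, 0.2, 0.2))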
Since there is no accepted answer yet, and since I had to further research to fully understand this, I'll add my two cents.
Like others have said the answer as to which of HSL or HSV is better depends on what you're trying to model and manipulate.
tl;dr - HSV is only "better" than HSL for machine vision (with caveats, read below). "Lab" and other formal color models are far more accurate (but computationally expensive) and should really be used for more serious work. HSL is outright better for "paint" applications or any other application where you need a human to "set", "enter" or otherwise understand/make sense of a color value.
For details, read below:
If you're trying to model how colours are GENERATED, the most intuitive model is HSL since it maps almost directly to how you'd mix paints to create colors. For example, to create "dark" yellow, you'd mix your base yellow paint with a bit of black. Whereas to create a lighter shade of yellow, you'd mix a bit of white.
Values between 50 and 0 in the "L" spectrum in HSL map to how much "black" has to be mixed in (black increasing from 0 to 100%, as L DECREASES from 50 to 0).
Values between 50 and 100 map to how much "white" has to be mixed in (white varying from 0 to 100% as L increases from 50 to 100%).
50% "L" gives you the "purest" form of the color without any "contamination" from white or black.
Insights from the below links:
1. http://forums.getpaint.net/index.php?/topic/22745-hsl-instead-of-hsv/
The last post there.
2. http://en.wikipedia.org/wiki/HSL_and_HSV
Inspect the color-space cylinder for HSL - it gives a very clear idea of the kind of distribution I've talked about.
Plus, if you've dealt with paints at any point, the above explanation will (hopefully) make sense. :)
Thus HSL is a very intuitive way of understanding how to "generate" a color - thus it's a great model for paint applications, or any other applications that are targeted to an audience used to thinking in "shade"/"tone" terms for color.
Now, onto HSV.
This is treacherous territory now, as we get into a space based on a theory I HAVE FORMULATED to understand HSV, one that is not validated or corroborated by other sources.
In my view, the "V" in HSV maps to the quantity of light thrown at an object, with the assumption, that with zero light, the object would be completely dark, and with 100% light, it would be all white.
Thus, in this image of an apple, the point that is directly facing the light source is all white, and most likely has a "V" at 100% whereas the point at the bottom that is completely in shadow and untouched by light, has a value "0". (I haven't checked these values, just thought they'd be useful for explanation).
Thus HSV seems to model how objects are lit (and therefore account for any compensation you might have to perform for specular highlights or shadows in a machine vision application) BETTER than HSL.
But as you can see quite plainly from the examples in the "disadvantages" section in the Wikipedia article I linked to, neither of these methods are perfect. "Lab" and other more formal (and computationally expensive) color models do a far better job.
P.S: Hope this helps someone.
The only color space that has this advantage and takes human perception into account is LAB, in the sense that the Euclidean metric in it is correlated with human color differentiation.
Taken directly from Wikipedia:
Unlike the RGB and CMYK color models, Lab color is designed to approximate human vision. It aspires to perceptual uniformity, and its L component closely matches human perception of lightness.
That is the reason many computer vision algorithms take advantage of LAB space.
HSV, HSB and HSI don't have this property. So the answer is no, HSL is not "superior" to HSI and HSV in the sense of human perception.
If you want to be close to human perception, try LAB color space.
I would say that one is NO better than another, each is just a mathematical conversion of another. Differing representations CAN make manipulation of an image for the effect you wish a bit easier. Each person WILL perceive images a bit differently, and using HSI or HSV may provide a small difference in output image.
Even RGB, when considered against a system (i.e. with a pixel array), takes human perception into account. When an imager (with a Bayer overlay) takes a picture, there are 2 green pixels for every 1 red and blue pixel. Monitors still output in RGB (although most only have a single green pixel for each red and blue). A new TV monitor made by Sharp now has a yellow output pixel. The reason they have done this is that there is a yellow band in the actual frequency spectrum, so to better represent color they have added a yellow band (or pixel).
All of these things are based on the human eye having a greater sensitivity to green over any other color in the spectrum.
Regardless, whatever scale you use, the image will be transformed back to RGB to be displayed on screen.
http://hyperphysics.phy-astr.gsu.edu/hbase/vision/colcon.html
http://www.physicsclassroom.com/class/light/u12l2b.cfm
In short, I don't think any one is better than another, just different representations.
http://en.wikipedia.org/wiki/Color
Imma throw my two cents in here, being both a programmer and also a guy who aced Color Theory in art school before moving on to a software engineering career.
HSL/HSV are great for easily writing programmatic functionality to handle color without dealing with a ton of edge cases. They are terrible at replicating human perception of color accurately.
CMYK is great for rendering print stuff, because it approximates the pigments that printers rely on. It is also terrible at replicating human perception of color accurately (although not because it's bad per se, but more because computers are really bad at displaying it on a screen. More on that in a minute).
RGB is the only color utility represented in tech that accurately reflects human vision effectively. LAB is essentially just resolving to RGB under the hood. It is also worth considering that the literal pixels on your screen are representations of RGB, which means that any other color space you work with is just going to get parsed back into RGB anyways when it actually displays. Really, it's best to just cut out the middleman and use that in almost every single case.
The problem with RGB in a programming sense, is that it is essentially cubic in representation, whereas HSL/HSV both resolve in a radius, which makes it much easier to create a "color wheel" programmatically. RGB is very difficult to do this with without writing huge piles of code to handle, because it resolves cubically in terms of its data representation. However, RGB accurately reflects human vision very well, and it's also the foundational basis of the actual hardware a monitor consists of.
TLDR; If you want dead on color and don't mind the extra work, use RGB all of the time. If you want to bang out a "good enough" color utility and probably field bug tickets later that you won't be able to really do anything about, use HSL/HSV. If you are doing print, use CMYK, not because it's good, but because the printer will choke if you don't use it, even though it otherwise sucks.
As an aside, if you were to approach Color Theory like an artist instead of a programmer, you are going to find a very different perception than any technical specifications about color really impart. Bear in mind that anyone working with a color utility you create is basically going to be thinking along these lines, at least if they have a solid foundational education in color theory. Here's basically how an artist approaches the notion of color:
Color from an artistic perspective is basically represented on a scale of five planes.
Pigment (or hue), which is the actual underlying color you are going after.
Tint, which is the pigment mixed with pure white.
Shade, which is the pigment mixed with pure black.
Tone (or "True Tone"), which is the pigment mixed with a varying degree of gray.
Rich Tone (or "Earth Tones"), which is the pigment mixed with its complementary color. Rich tones do not show up on the color wheel because they are inherently a mix of opposites, and visually reflect slightly differently than a "True Tone" due to minute discrepancies in physical media that you can't replicate effectively on a machine.
The typical problem with representing this paradigm programmatically is that there is not really any good way to represent rich tones. A material artist has basically no issue doing this with paint, because the subtle discrepancies of brush strokes allow the underlying variance between the complements to reflect in the composition. Likewise digital photography and video both suck at picking this up, but actual analog film does not suck nearly as badly at it. It is more reflected in photography and video than computer graphics because the texture of everything in the viewport of the camera picks up some of it, but it is still considerably less than actually viewing the same thing (which is why you can never take a really good picture of a sunset without a ton of post production to hack the literal look of it back in, for example). However, computers are not good at replicating those discrepancies, because a color is basically going to resolve to a consistent matrix of RGB pixel mapping which visually appears to be a flat regular tone. There is no computational color space that accurately reflects rich tones, because there is no computational way to make a color vary slightly in a diffuse, non-repeating random way over space and still have a single unique identifier, and you can't very well store it as data without a unique identifier.
The best approximation you can do of this with a computer is to create some kind of diffusion of one color overlapping another color, which does not resolve to a single value that you can represent as a hex code or stuff in a single database column. Even then, a computer is going to inherently reflect a uniform pattern, where a real rich tone relies on randomness and non-repeating texture and variance, which you can't do on a machine without considerable effort. All of the artwork that really makes color pop relies on this principle, and it is basically inaccessible to computational representation without a ton of side work to emulate it (which is why we have Photoshop and Corel Painter, because they can emulate this stuff pretty well with a bit of work, but at the cost of performing a lot of filtering that is not efficient for runtime).
RGB is a pretty good approximation of the other four characteristics from an artistic perspective. We pretty much get that it's not going to cover rich tones and that we're going to have to crack out a design utility and mash that part in by hand. However the underlying problem with programming in RGB is that it wants to resolve to a three dimensional space (because it is cubic), and you are trying to present it on a two dimensional display, which makes it very difficult to create UI that is reasonably intuitive because you lack the capacity to represent the depth of a 3rd axis on a computer monitor effectively in any way that is ever going to be intuitive to use for an end user.
You also need to consider the distinction between color represented as light, and color represented as pigment. RGB is a representation of color represented as light, and corresponds to the primary values used to mix lighting to represent color, and does so with a 1:1 mapping. CMYK represents the pigmentation spectrum. The distinction is that when you mix light in equal measure, you get white, and when you mix pigment in equal measure, you get black. If you are programming any utility that uses a computer, you are working with light, because pixels are inherently a single node on a monitor that emits RGB light waves. The reason I said that CMYK sucks, is not because it's not accurate, it's because it's not accurate when you try to represent it as light, which is the case on all computer monitors. If you are using actual paint, markers, colored pencils, etc, it works just fine. However representing CMYK on a screen still has to resolve to RGB, because that is how a computer monitor works, so it's always off a bit in terms of how it looks in display.
Not to go off on a gigantic side tangent, as this is a programming forum and you asked the question as a programmer. However if you are going for accuracy, there is a distinct "not technical" aspect to consider in terms of how effective your work will be at achieving its desired objective, which is to resolve well against visual perception, which is not particularly well represented in most computational color spaces. At the end of the day, the goal with any color utility is to make it look right in terms of human perception of color. HSL/HSV both fail miserably at that. They are prominent because they are easy to code with, and only for that reason. If you have a short deadline, they are acceptable answers. If you want something that is really going to work well, then you need to do the heavy legwork and consider this stuff, which is what your audience is considering when they decide if they want to use your tool or not.
Some reference points for you (I'm purposely avoiding any technical references, as they only refer to computational perspective, not the actual underlying perception of color, and you've probably read all of those already anyhow):
Color Theory Wiki
Basic breakdown of hue, tint, tone, and shade
Earth Tones (or rich tones if you prefer)
Basic fundamentals of color schemes
Actually, I'd have to argue that HSV accounts better for human visual perception as long as you understand that in HSV, saturation is the purity of the color and value is the intensity of that color, not brightness overall. Take this image, for example...
Here is a mapping of the HSL saturation (left) and HSL luminance (right)...
Note that the saturation is 100% until you hit the white at the very top where it drops suddenly. This mapping isn't perceived when looking at the original image. The same goes for the luminance mapping. While it's a clearer gradient, it only vaguely matches visually. Compare that to HSV saturation (left) and HSV value (right) below...
Here the saturation mapping can be seen dropping as the color becomes more white. Likewise, the value mapping can be very clearly seen in the original image. This is made more obvious when looking at the mappings for the individual color channels of the original image (the non-black areas almost perfectly match the value mapping, but are nowhere close to the luminance mapping).
Going by this information, I would have to say that HSV is better for working with actual images (especially photographs), whereas HSL is possibly better only for selecting colors in a color picker.
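For reference, channel maps like these can be generated with OpenCV along the following lines (the filename is a placeholder; note the differing channel orders):

    import cv2

    img = cv2.imread("butterfly.png")                      # placeholder filename

    hls = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)             # channels are H, L, S
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)             # channels are H, S, V

    cv2.imwrite("hls_saturation.png", hls[:, :, 2])
    cv2.imwrite("hls_luminance.png",  hls[:, :, 1])
    cv2.imwrite("hsv_saturation.png", hsv[:, :, 1])
    cv2.imwrite("hsv_value.png",      hsv[:, :, 2])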
On a side note, the value in HSV is the inverse of the black in CMYK.
Another argument for the use of HSV over HSL is that HSV has much fewer combinations of different values that can result in the same color since HSL loses about half of its resolution to its top cone. Let's say you used bytes to represent the components--thereby giving each component 256 unique levels. The maximum number of unique RGB outputs this will yield in HSL is 4,372,984 colors (26% of the available RGB gamut). In HSV this goes up to 9,830,041 (59% of the RGB gamut)... over twice as many. And allowing a range of 0 to 359 for hue will yield 11,780,015 for HSV yet only 5,518,160 for HSL.

uneven illuminated images

How do I get rid of uneven illumination in images that contain text data, usually printed but possibly handwritten? The image can have some spots of light because light was reflected while the picture was taken.
I've seen the Halcon program's segment_characters function, which does this work perfectly, but it is not open source.
I wish to convert the image into one that has constant illumination in the background and darker colored regions of text, so that binarization will be easy and noise-free.
The text is assumed to be darker colored than its background.
Any ideas?
Strictly speaking, assuming you have access to the image's pixels (you can search online for how to accomplish this in your programming language, as the topic is abundantly covered), the exercise involves going over the pixels once to determine a "darkness threshold". To do this, convert each pixel from RGB to HSL to get its lightness component. During this pass you also calculate the average lightness of the whole image, which you can use as your "darkness threshold".
Once you have the image's average lightness level, you can go over the pixels once more: if a pixel's lightness is below the darkness threshold, set its color to full black RGB(0,0,0); otherwise, set it to full white RGB(255,255,255). This will give you a binary image in which the text should be black and the rest should be white.
Of course, the key is in finding the appropriate darkness threshold, so if the average method doesn't give you good results you may have to come up with a different method to augment that step. Such a method could involve separating the image into the primary Red, Green and Blue channels, computing a darkness threshold for each channel separately, and then using the most aggressive threshold of the three.
And lastly, a better approach may be to compute the distribution of lightness levels, as opposed to simply the average, and then keep the range around the maximum. Again, go over each pixel and if its lightness fits the band make it black; otherwise, make it white.
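A compact sketch of the average-lightness thresholding described above, using OpenCV's HLS conversion to get the lightness channel (the filename is a placeholder):

    import cv2
    import numpy as np

    img = cv2.imread("page.png")                             # placeholder filename
    lightness = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)[:, :, 1]

    darkness_threshold = lightness.mean()                    # average lightness

    # Pixels darker than the threshold become black (text), the rest white.
    binary = np.where(lightness < darkness_threshold, 0, 255).astype(np.uint8)
    cv2.imwrite("binary.png", binary)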
EDIT
For further reading about HSL I recommend starting with the Wikipedia entry on HSL and HSV color spaces.
Have you tried using morphological techniques? Closing-by-reconstruction (as presented in Gonzalez, Woods and Eddins) can be used to create a grayscale representation of background illumination levels. You can more-or-less standardize the effective illumination by:
1) Calculating the mean intensity of all the pixels in the image
2) Using closing-by-reconstruction to estimate background illumination levels
3) Subtracting the output of (2) from the original image
4) Adding the mean intensity from (1) to every pixel in the output of (3).
Basically what closing-by-reconstruction does is remove all image features that are smaller than a certain size, erasing the "foreground" (the text you want to capture) and leaving only the "background" (the illumination levels) behind. Subtracting the result from the original image leaves behind only the small-scale deviations (the text). Adding the original average intensity to those deviations simply makes the text readable, so that the resulting picture looks like a light-normalized version of the original image.
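A rough OpenCV sketch of those four steps. Note that it substitutes a plain morphological closing with a large structuring element for true closing-by-reconstruction, which is a coarser background estimate but illustrates the same idea (filename and kernel size are placeholders):

    import cv2
    import numpy as np

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)      # placeholder filename

    # 1) Mean intensity of the original image.
    mean_intensity = gray.mean()

    # 2) Background estimate: closing with an element larger than the text strokes.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (31, 31))
    background = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)

    # 3) Subtract the background, then 4) add the mean intensity back.
    normalized = gray.astype(np.float32) - background.astype(np.float32) + mean_intensity
    normalized = np.clip(normalized, 0, 255).astype(np.uint8)
    cv2.imwrite("normalized.png", normalized)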
Use local thresholding instead of a global thresholding algorithm.
Divide your (grayscale) image into a grid of smaller images (say 50x50 px) and apply the thresholding algorithm to each individual image.
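A sketch of that tiling approach; the 50 px tile size comes from the suggestion above, and Otsu's method is just one convenient choice of per-tile threshold (the filename is a placeholder):

    import cv2
    import numpy as np

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)      # placeholder filename
    tile = 50
    out = np.zeros_like(gray)

    for y in range(0, gray.shape[0], tile):
        for x in range(0, gray.shape[1], tile):
            block = gray[y:y + tile, x:x + tile]
            # Otsu picks a threshold per tile, so uneven illumination matters less.
            _, binary = cv2.threshold(block, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            out[y:y + tile, x:x + tile] = binary

    cv2.imwrite("local_threshold.png", out)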
If the background features are generally larger than the letters, you can try to estimate and subsequently remove the background.
There are many ways to do that; a very simple one would be to run a median filter on your image. You want the filter window to be large enough that text inside the window rarely makes up more than a third of the pixels, but small enough that several windows fit into the bright spots. This filter should result in an image without text, containing only the background. Subtract that from the original, and you should have an image that can be segmented with a global threshold.
Note that if the bright spots are much smaller than the text, you do the inverse: choose the filter window such that it removes the light only.
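A minimal sketch of the median-filter variant (the 51 px window is a guess that depends on your text size, and it must be odd; the filename is a placeholder):

    import cv2

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)      # placeholder filename

    # Window large enough that text rarely dominates it; this removes the text
    # and leaves an estimate of the background illumination.
    background = cv2.medianBlur(gray, 51)

    # Subtracting the image from its background makes dark text bright; then a
    # single global threshold (Otsu here) segments it.
    flattened = cv2.subtract(background, gray)
    _, binary = cv2.threshold(flattened, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite("binary.png", binary)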
The first thing you should try to do is change the lighting: use a dome light or some other light source that will give you more diffuse and even illumination.
If that's not possible, you can try some of the ideas in this question or this one. You want to implement some type of "adaptive threshold"; this will apply a local threshold to individual parts of the image so that the change in contrast won't be as noticeable.
There is also a simple but effective method explained here. The outline of the algorithm is the following:
1. Split the image up into NxN regions or neighbourhoods
2. Calculate the mean or median pixel value for the neighbourhood
3. Threshold the region based on the value calculated in (2), or that value minus C (where C is a chosen constant)
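OpenCV's adaptiveThreshold implements this mean-minus-C scheme directly; blockSize and C below are placeholder values to tune, and the filename is a placeholder:

    import cv2

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)      # placeholder filename

    binary = cv2.adaptiveThreshold(
        gray, 255,
        cv2.ADAPTIVE_THRESH_MEAN_C,   # threshold = local mean minus C
        cv2.THRESH_BINARY,
        blockSize=51,                 # neighbourhood size (must be odd)
        C=10)                         # constant subtracted from the mean

    cv2.imwrite("adaptive.png", binary)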
It seems like what you're trying to do is improve local contrast while attenuating larger scale lighting variations. I'll agree with other posters that optimizing the image through better lighting should always be the first move.
After that, here are two tricks.
1) Use the smooth_image() operator to convolve a Gaussian with your original image. Use a relatively large kernel, like 20-50 px. Then subtract this blurred image from your original image. Apply scale and offset within the sub_image() operator, or use equ_histo() to equalize the histogram.
This basically subtracts the low spatial frequency information from the original, leaving the higher frequency information intact.
2) You could try the highpass_image() operator, or one of the Laplacian operators, to extract a gradient image.
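For those without Halcon, the first trick maps roughly onto OpenCV like this (the sigma and filenames are placeholders; this is the same subtract-a-blur idea, not an exact equivalent of the operators above):

    import cv2

    gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)      # placeholder filename

    # A large Gaussian captures the low spatial frequency illumination.
    blurred = cv2.GaussianBlur(gray, (0, 0), sigmaX=20)

    # Subtract it, re-center around mid-gray, then equalize the histogram.
    highpass = cv2.addWeighted(gray, 1.0, blurred, -1.0, 128)
    result = cv2.equalizeHist(highpass)
    cv2.imwrite("highpass.png", result)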
