What exactly is the need for gamma correction? - image-processing

I have trouble fully understanding the need for gamma correction. I hope you guys can help me.
Let's assume we want to display 256 neighboring pixels. These pixels should form a smooth gradient from black to white. To denote their colors, we use linear gray values from 0..255. Due to the non-linearity of the human eye, the monitor must not simply turn these values into linear luminance values. If the neighboring pixels had the luminance values (1/256)*I_max, (2/256)*I_max, et cetera, we would perceive too large a difference in brightness between adjacent pixels in the darker area (the gradient would not be smooth).
Fortunately, a monitor has roughly the reciprocal non-linearity of the human eye. That means that if we put linear gray values 0..255 into the frame buffer, the monitor turns them into non-linear luminance values x^gamma. However, as our eye is non-linear the other way round, we perceive a smooth linear gradient. The non-linearity of the monitor and that of our eye cancel each other out.
So, why do we need gamma correction? I have read in books that we always want the monitor to produce linear luminance values. According to them, the non-linearity of the monitor must be compensated for before writing the gray values to the frame buffer. That is done by gamma correction. However, my problem here is that - as far as I understand it - we would not perceive linear brightness values (i.e. we would not perceive a smooth, steady gradient) when the monitor produces linear luminance values.
As far as I see it, it would be just perfect, if we put linear gray values into the frame buffer. The monitor turns these values into non-linear luminance values and our eye perceives linear brightness values again, because the eye is reciprocal non-linear. There would be no need to gamma correct the gray values in the frame buffer and no need to force the monitor to produce linear luminance values.
What is wrong with my way of looking at these things?
Thanks

Allow me to 'resurrect' this question, since I am struggling with similar questions right now and I think I have found the answer - it may be useful for someone else. Or I might be wrong and someone could tell me :)
I think there is nothing wrong with your way of thinking. The thing is, you don't need to gamma-correct all the time if you know what you are doing. It depends on what you want to achieve. Let's look at two different cases.
A) Light simulation (AKA rendering). You have a diffuse surface with a light pointing towards it. Then, the light's intensity is doubled.
Well. Let's see what happens in the real world in such a situation. Assuming a purely diffuse surface, the intensity of the reflected light is going to be the surface's albedo multiplied by the incoming light intensity and the cosine of the angle between the incoming light and the normal. Whatever. The thing is, when the incoming light intensity is doubled, the reflected light intensity will be doubled too. This is why light transport is said to be a linear process. Funnily enough, you will not perceive the surface as twice as bright, because our perception is nonlinear (this is modelled by the so-called Stevens' power law). Put again: in the real world the reflected light is doubled, but you do not perceive it as twice as bright.
Now, how would we simulate this? Well, if we have an sRGB texture with the surface's albedo, we would need to linearize it (by de-correcting it, which means applying the 2.2 gamma). Now that it is linear, and we have the light intensity, we can use the formula above to compute the reflected light intensity. Since we are in a linear space, doubling the intensity will double the output, as in the real world. Now we gamma-correct our results. Because of this, when the screen displays the rendered image, it will apply the gamma and so it will have a linear response, meaning that the intensity of the light emitted by the screen will be twice as much when we simulate the twice-as-powerful light as when we simulate the first one. So the light that arrives at your eyes from your screen will have double the intensity. Exactly as would happen if you were looking at the real surface with real lights affecting it. You will not perceive the second render as twice as bright, of course, but, again, as we said earlier, this is exactly what would happen in the real situation. The same behavior in the real world and in the simulation means that the simulation (the render) was correct :)
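A minimal sketch of that pipeline in Python (the 2.2 power law, the albedo sample and the cosine term are illustrative assumptions; real sRGB uses a piecewise transfer curve rather than a pure power law):

```python
GAMMA = 2.2  # common approximation of the sRGB transfer curve (assumption)

def srgb_to_linear(v):
    """Decode a gamma-encoded value in [0, 1] to linear light."""
    return v ** GAMMA

def linear_to_srgb(v):
    """Encode a linear-light value in [0, 1] for the framebuffer."""
    return v ** (1.0 / GAMMA)

albedo = srgb_to_linear(0.5)   # sample from an sRGB albedo texture (made-up value)
cos_theta = 0.8                # cosine of the light/normal angle (made-up value)
light = 1.0                    # incoming light intensity in linear units

reflected_1 = albedo * light * cos_theta          # diffuse shading in linear space
reflected_2 = albedo * (2 * light) * cos_theta    # doubled light -> doubled reflection

print(reflected_2 / reflected_1)                  # 2.0: linearity holds
print(linear_to_srgb(reflected_1), linear_to_srgb(reflected_2))  # values to write out
```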
B) A different case is precisely if you want a gradient that you want to 'look' (AKA being perceived) as linear.
Since you want the nonlinear response of the screen to cancel out our nonlinear visual perception, you can skip gamma correction altogether (as you suggest). Or, more accurately, keep operating in linear space and gamma-correcting, but create your gradient not with consecutive pixel values (1, 2, 3, ..., 255), which would be perceived nonlinearly (because of Stevens' law), but with values transformed by the inverse of our perceptual brightness response (that is, applying an exponent of 1/0.5 = 2 to the normalized values, i.e. the reciprocal of Stevens' exponent for brightness).
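As a rough sketch of that second option in Python (the 2.2 gamma and the 0.5 Stevens exponent are the approximations used above, not exact constants):

```python
import numpy as np

GAMMA = 2.2              # approximate display transfer curve (assumption)
STEVENS_EXPONENT = 0.5   # approximate Stevens exponent for brightness

x = np.linspace(0.0, 1.0, 256)

# Shape the linear-light values so that *perceived* brightness ramps evenly:
# apply the reciprocal of the Stevens exponent (1 / 0.5 = 2).
linear_light = x ** (1.0 / STEVENS_EXPONENT)

# Then gamma-correct for the framebuffer as usual.
framebuffer = np.round(255 * linear_light ** (1.0 / GAMMA)).astype(np.uint8)

print(framebuffer[:8], framebuffer[-8:])
```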
As a matter of fact, if you look at a gamma-corrected linear gradient such as the one at http://scanline.ca/gradients/, you do not perceive it as linear at all: you see far more variation in the lower intensities than in the higher ones (as expected).
Well, at least this is my current understanding of the topic. I hope it helps anyone. And again, please, please, if it is wrong I would be really grateful if someone could point it out...

The problem is really when doing color calculations. For example, if you are blending two colors, you need to use the linear intensities to do the calculations. To actually display the proper result, you then have to convert the linear intensities back to the gamma-corrected intensities.
How your eyes perceive the intensities isn't relevant. To do color calculations correctly, they have to be done based on the physical principles of optics, which relies on linear luminance values. Once you have calculated a color, you want those luminance values to be output by your monitor, regardless of how it is perceived, so you have to compensate for the fact that the monitor doesn't directly produce the colors that you want.
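As a minimal illustration of why this matters, here is a sketch of blending two gray values with and without going through linear light (a pure 2.2 power law stands in for the real, piecewise sRGB curve):

```python
GAMMA = 2.2  # power-law stand-in for the sRGB transfer curve (assumption)

def decode(c):
    """Gamma-encoded [0, 255] -> linear light [0, 1]."""
    return (c / 255.0) ** GAMMA

def encode(c):
    """Linear light [0, 1] -> gamma-encoded [0, 255]."""
    return round(255 * c ** (1.0 / GAMMA))

a, b = 0, 255  # blend 50/50 between black and white

naive = round((a + b) / 2)                     # blend the encoded values: 128
linear = encode((decode(a) + decode(b)) / 2)   # blend in linear light: ~186

print(naive, linear)  # only the second value is the physically correct mid-blend
```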

To actually answer the question of what is wrong with your way of looking at this - there is nothing really wrong with it. It WOULD be great to have a linear framebuffer, but as you say, it's definitely not great to have an 8-bit linear framebuffer.
The fact that 8 bits are so easy to handle is pretty much the only justification for gamma-compressed frame buffers and color notations (think HTML's #888 - wouldn't it be awkward to use #333 for middle gray instead of #888?).
About the monitor - you want to be able to predict its response to your input, and you know from sRGB what it should be. Normally that's all you need to know. Some people think it's "correct" or something if the monitor produces "linear" output, which can be simulated if you compensate for the monitor's gamma. I advise steering clear of such a setup, which breaks all the apps that (correctly and sanely) assume standard gamma in favour of un-breaking ill-conceived linearity-assuming apps. Don't do that. Instead, fix the apps or dump them.

Related

Light intensity as a function of exposure and image brightness

I want to measure the light intensity of microscope images with a BW camera attached to the microscope. My purpose is to compare particular images with each other concerning their brightness. I'm neither interested in measuring absolute light intensity nor in units.
I think the function should use the exposure and some brightness-related metric (e.g. thresholded histogram width or pixel-value mean).
My first attempt, 1/exposure * brightness, works for smaller exposure ranges.
The exposure is a real number in [0.001..0.6]; the brightness is a natural number in [0..255].
Is there a formula for calculating the light intensity received by camera having these two figures?
Many thanks for suggestions!
P.S.:
Currently I estimate the intensity using fuzzy-logic. It works, but the calibration is not flexible.
EDIT:
I've got additional information from the camera manufacturer: the camera's response to light is linear when the pixel values are within the range 50-200.
You say "I'm neither interested in measuring absolute light intensity nor in units.". So I guess you only want to answer questions like: "The light source in this image was shining N-times as bright as in this other image: what is N?".
Of course estimating an answer to such a question from images makes sense only if everything else stays (approximately) the same: microscope, camera, transmission (or reflection) of the imaged sample, etc. Is this the case?
If the content of the images is approximately the same, I'd just start by comparing image-wide statistics: the ratio of the median/average/n-th quantile intensities, and see if there is a common shift. Be careful if your images are 8-bit per channel: you will probably have to linearize them by removing whatever gamma compression was applied before computing the stats.
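A small sketch of that comparison in Python (the 2.2 gamma is an assumption; check what encoding your camera actually applies):

```python
import numpy as np

GAMMA = 2.2  # assumed encoding gamma of the 8-bit frames (verify for your camera)

def linearize(img8):
    """Undo an assumed gamma encoding on an 8-bit grayscale frame."""
    return (img8.astype(np.float64) / 255.0) ** GAMMA

def relative_intensity(img_a, img_b, exposure_a, exposure_b):
    """Estimate how much brighter the scene in img_a is than in img_b,
    using exposure-normalized median intensities."""
    med_a = np.median(linearize(img_a)) / exposure_a
    med_b = np.median(linearize(img_b)) / exposure_b
    return med_a / med_b
```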
As you notice, however, things get more complicated when the variation in exposure increases, probably because of nonlinear effects (cutoff at the lower end or saturation at the higher end).
The exposure might be in seconds or some unit of time. Then your first attempt should be right:
1/exposure * brightness
A possible problem might be the flash. If it flashes for 10 ms and your exposure is <10 ms there is no problem, but if your exposure is >10 ms, then the result will be similar to that at 10 ms (depending on how much light there is in the room apart from the flash).
I would take several pictures of the same object while changing the exposure. Plot the brightness vs. exposure; it should be a straight line. If not, the shape might give you some clues. If at some point the line flattens, the flash probably does not last long enough.
A flash can have more issues, like bad synchronization with the exposure or non-uniform brightness throughout the exposure (it might take some fraction of a millisecond to reach maximum brightness, for example).
If there is no flash, gamma correction could be the problem, as has been suggested. In any case, a plot of brightness vs. exposure may help.
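A minimal sketch of that calibration check (assuming you already have a list of 8-bit frames and the exposures they were taken with; matplotlib is used only for the plot):

```python
import numpy as np
import matplotlib.pyplot as plt

def check_linearity(frames, exposures):
    """Plot mean brightness vs. exposure for frames captured at known exposures.
    `frames` is a list of 8-bit grayscale images, `exposures` the matching times."""
    brightness = [np.mean(f) for f in frames]
    plt.plot(exposures, brightness, "o-")
    plt.xlabel("exposure")
    plt.ylabel("mean pixel value")
    plt.show()
    # A straight line over the usable range supports estimating the incoming
    # intensity as brightness / exposure; a flattening tail hints at saturation
    # or at the flash being shorter than the exposure.
```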

Best tracking algorithm for multiple colored objects (billiard balls)

Let me quickly explain what I have: I have written a custom detector that finds the regions of an image containing billiard balls. I did this using the HSV colorspace, and for most balls I could get away with thresholding only the hue channel. However, for orange (#5) and brown (#7) one must take the saturation into account, which adds another dimension to the problem.
From my research it seems like my best route would be to do some manner of mean-shift tracking but everything I've come across has described mean-shift in which only one channel is used (the hue channel).
Can anyone please explain, or offer a link explaining, how I can adapt mean-shift to work using hue and saturation?
Or can you tell me if you think a different tracking algorithm may be better suited to this problem?
In theory mean shift works well regardless of the dimensionality (in very high dimensions sparseness is a bit of an issue, but there is work that addresses that problem).
If you are trying to use an off-the-shelf mean-shift tracker that only takes a single-channel input, you can create your own problem-specific color channel. You need a single channel that maximizes the difference between the differently colored billiard balls.
The easiest way of doing that is to take the mean colors of all 15 balls, put them in a 15x3 matrix, and decompose it with SVD (subtract the mean first) to get the axis of maximal variance. This will give you the best linear transformation from RGB to a new one-dimensional color space that maximizes the difference between the billiard balls' colors. (If that isn't good enough you can do better with a local mapping, but it might not be necessary.)
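A sketch of that construction with NumPy (the mean_colors matrix is a random placeholder here; in practice you would fill it with the measured mean RGB of each of your 15 balls):

```python
import numpy as np

# One mean RGB color per ball (15x3). Placeholder values; replace with your measurements.
mean_colors = np.random.rand(15, 3)

center = mean_colors.mean(axis=0)
_, _, vt = np.linalg.svd(mean_colors - center, full_matrices=False)
axis = vt[0]  # direction of maximal variance among the ball colors

def to_custom_channel(rgb_image):
    """Project an HxWx3 float RGB image onto the learned axis -> single-channel image."""
    flat = rgb_image.reshape(-1, 3) - center
    return (flat @ axis).reshape(rgb_image.shape[:2])
```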

How to deal with noise in images captured with a camera

Assume there is a black box with no light in it and a camera inside. The camera starts capturing, and what it captures is nothing - everything in the box is pure black. But there will be differences in the sizes of the captured frames due to various types of noise (thermal noise, quantization noise, etc.). I want to decrease/eliminate the effects of this noise on the software side so that, in a completely isolated black box, all captured frames will be exactly the same. Resolution, depth, color, etc. - none of these properties matters after processing, and the accuracy/quality of the captured frames in the end doesn't matter. Any kind of filtering, downsampling, etc. - every solution is acceptable. The reference is the black box; the frames should be as identical as possible.
Any suggestions ?
There are many approaches to removing noise; the noise you are talking about is probably Gaussian noise.
The simplest thing you can do to remove it is to run a Gaussian blur on the image and then apply a threshold, setting to zero everything whose value is under a, where a is a parameter you should play with a little to find the most suitable value.
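For instance, with OpenCV in Python this might look like the following (the kernel size, the threshold a and the file names are just example values to tune):

```python
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # example input file

blurred = cv2.GaussianBlur(img, (5, 5), 0)   # 5x5 kernel; adjust to taste

a = 20                                       # threshold to experiment with
_, cleaned = cv2.threshold(blurred, a, 255, cv2.THRESH_TOZERO)  # zero out values <= a

cv2.imwrite("cleaned.png", cleaned)
```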
OK, this is a bit of a funny answer.
Since you wrote that the accuracy/quality of the captured frames in the end doesn't matter, I'd say simply return a software-generated black image. You could do this without a camera ;)
Hm, more seriously, I think you want to calculate an average frame that approaches black.
Each extra sample will have less effect, and the result builds up over time into a noise pattern for your camera.
Then subtract that image from new captures.
Or return a statistical result, i.e. each value together with the probability of that value.
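A small sketch of that dark-frame idea with NumPy (this assumes you can grab a stack of frames from the camera while the box is closed; the averaging and the subtraction are the whole trick):

```python
import numpy as np

def build_dark_frame(frames):
    """Average many frames captured in the dark to estimate the camera's noise pattern."""
    acc = np.zeros(frames[0].shape, dtype=np.float64)
    for f in frames:
        acc += f
    return acc / len(frames)

def correct(frame, dark_frame):
    """Subtract the dark frame; clip so values stay in [0, 255]."""
    return np.clip(frame.astype(np.float64) - dark_frame, 0, 255).astype(np.uint8)
```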

Is HSL Superior over HSI and HSV Color Spaces?

Is HSL superior to HSI and HSV because it takes human perception into account?
For some image processing algorithms they say I can use any of these color spaces, and I am not sure which one to pick. I mean, the algorithms just care that you provide them with hue and saturation channels; you can pick which color space to use.
Which one is best very much depends on what you're using it for. But in my experience HSL (HLS) has an unfortunate interaction between brightness and saturation.
Here's an example of reducing image brightness by 2. The leftmost image is the original; next come the results using RGB, HLS, and HSV:
Notice the overly bright and saturated spots around the edge of the butterfly in HLS, particularly that red spot at the bottom. This is the saturation problem I was referring to.
This example was created in Python using the colorsys module for the conversions.
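Not the exact script that produced those images, but a reconstruction of the idea with colorsys (all values are normalized to [0, 1]; the sample pixel is made up):

```python
import colorsys

def darken_rgb(r, g, b):
    return r / 2, g / 2, b / 2

def darken_hls(r, g, b):
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return colorsys.hls_to_rgb(h, l / 2, s)

def darken_hsv(r, g, b):
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return colorsys.hsv_to_rgb(h, s, v / 2)

pixel = (1.0, 0.2, 0.2)  # a saturated red shows the HLS saturation artifact well
print(darken_rgb(*pixel))
print(darken_hls(*pixel))
print(darken_hsv(*pixel))
```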
Since there is no accepted answer yet, and since I had to further research to fully understand this, I'll add my two cents.
Like others have said the answer as to which of HSL or HSV is better depends on what you're trying to model and manipulate.
tl;dr - HSV is only "better" than HSL for machine vision (with caveats, read below). "Lab" and other formal color models are far more accurate (but computationally expensive) and should really be used for more serious work. HSL is outright better for "paint" applications or any other where you need a human to "set", "enter" or otherwise understand/make sense of a color value.
For details, read below:
If you're trying to model how colours are GENERATED, the most intuitive model is HSL since it maps almost directly to how you'd mix paints to create colors. For example, to create "dark" yellow, you'd mix your base yellow paint with a bit of black. Whereas to create a lighter shade of yellow, you'd mix a bit of white.
Values between 50 and 0 in the "L" spectrum in HSL map to how much "black" has to be mixed in (black increasing from 0 to 100%, as L DECREASES from 50 to 0).
Values between 50 and 100 map to how much "white" has to be mixed in (white varying from 0 to 100% as L increases from 50 to 100%).
50% "L" gives you the "purest" form of the color without any "contamination" from white or black.
Insights from the below links:
1. http://forums.getpaint.net/index.php?/topic/22745-hsl-instead-of-hsv/
The last post there.
2. http://en.wikipedia.org/wiki/HSL_and_HSV
Inspect the color-space cylinder for HSL - it gives a very clear idea of the kind of distribution I've talked about.
Plus, if you've dealt with paints at any point, the above explanation will (hopefully) make sense. :)
Thus HSL is a very intuitive way of understanding how to "generate" a color - thus it's a great model for paint applications, or any other applications that are targeted to an audience used to thinking in "shade"/"tone" terms for color.
Now, onto HSV.
This is treacherous territory now, as we get into a space based on a theory I HAVE FORMULATED to understand HSV, and which is not validated or corroborated by other sources.
In my view, the "V" in HSV maps to the quantity of light thrown at an object, with the assumption, that with zero light, the object would be completely dark, and with 100% light, it would be all white.
Thus, in this image of an apple, the point that is directly facing the light source is all white, and most likely has a "V" at 100% whereas the point at the bottom that is completely in shadow and untouched by light, has a value "0". (I haven't checked these values, just thought they'd be useful for explanation).
Thus HSV seems to model how objects are lit (and therefore account for any compensation you might have to perform for specular highlights or shadows in a machine vision application) BETTER than HSL.
But as you can see quite plainly from the examples in the "disadvantages" section in the Wikipedia article I linked to, neither of these methods are perfect. "Lab" and other more formal (and computationally expensive) color models do a far better job.
P.S: Hope this helps someone.
The only color space that has an advantage here and takes human perception into account is LAB, in the sense that the Euclidean metric in it correlates with human color differentiation.
Taken directly from Wikipedia:
Unlike the RGB and CMYK color models, Lab color is designed to approximate human vision. It aspires to perceptual uniformity, and its L component closely matches human perception of lightness.
That is the reason many computer vision algorithms take advantage of the LAB space.
HSV, HSB and HSI don't have this property. So the answer is no, HSL is not "superior" to HSI and HSV in the sense of human perception.
If you want to be close to human perception, try LAB color space.
I would say that one is NO better than another; each is just a mathematical conversion of the others. Differing representations CAN make manipulating an image for the effect you wish a bit easier. Each person WILL perceive images a bit differently, and using HSI or HSV may produce a small difference in the output image.
Even RGB, when considered as part of a system (i.e. with a pixel array), takes human perception into account. When an imager (with a Bayer overlay) takes a picture, there are 2 green pixels for every 1 red and 1 blue pixel. Monitors still output in RGB (although most only have a single green pixel for each red and blue). A new TV monitor made by Sharp now has a yellow output pixel. The reason they have done this is that there is a yellow band in the actual frequency spectrum, so to represent color more faithfully, they have added a yellow band (or pixel).
All of these things are based on the human eye having a greater sensitivity to green over any other color in the spectrum.
Regardless, whatever scale you use, the image will be transformed back to RGB to be displayed on screen.
http://hyperphysics.phy-astr.gsu.edu/hbase/vision/colcon.html
http://www.physicsclassroom.com/class/light/u12l2b.cfm
In short, I don't think any one is better than another; they are just different representations.
http://en.wikipedia.org/wiki/Color
Imma throw my two cents in here, being both a programmer and also a guy who aced Color Theory in art school before moving on to a software engineering career.
HSL/HSV are great for easily writing programmatic functionality to handle color without dealing with a ton of edge cases. They are terrible at replicating human perception of color accurately.
CMYK is great for rendering print stuff, because it approximates the pigments that printers rely on. It is also terrible at replicating human perception of color accurately (although not because it's bad per se, but more because computers are really bad at displaying it on a screen. More on that in a minute).
RGB is the only color utility represented in tech that accurately reflects human vision effectively. LAB is essentially just resolving to RGB under the hood. It is also worth considering that the literal pixels on your screen are representations of RGB, which means that any other color space you work with is just going to get parsed back into RGB anyways when it actually displays. Really, it's best to just cut out the middleman and use that in almost every single case.
The problem with RGB in a programming sense, is that it is essentially cubic in representation, whereas HSL/HSV both resolve in a radius, which makes it much easier to create a "color wheel" programmatically. RGB is very difficult to do this with without writing huge piles of code to handle, because it resolves cubically in terms of its data representation. However, RGB accurately reflects human vision very well, and it's also the foundational basis of the actual hardware a monitor consists of.
TLDR; If you want dead on color and don't mind the extra work, use RGB all of the time. If you want to bang out a "good enough" color utility and probably field bug tickets later that you won't be able to really do anything about, use HSL/HSV. If you are doing print, use CMYK, not because it's good, but because the printer will choke if you don't use it, even though it otherwise sucks.
As an aside, if you were to approach Color Theory like an artist instead of a programmer, you are going to find a very different perception than any technical specifications about color really impart. Bear in mind that anyone working with a color utility you create is basically going to be thinking along these lines, at least if they have a solid foundational education in color theory. Here's basically how an artist approaches the notion of color:
Color from an artistic perspective is basically represented on a scale of five planes.
Pigment (or hue), which is the actual underlying color you are going after.
Tint, which is the pigment mixed with pure white.
Shade, which is the pigment mixed with pure black.
Tone (or "True Tone"), which is the pigment mixed with a varying degree of gray.
Rich Tone (or "Earth Tones"), which is the pigment mixed with its complementary color. Rich tones do not show up on the color wheel because they are inherently a mix of opposites, and visually reflect slightly differently than a "True Tone" due to minute discrepancies in physical media that you can't replicate effectively on a machine.
The typical problem with representing this paradigm programmatically is that there is not really any good way to represent rich tones. A material artist has basically no issue doing this with paint, because the subtle discrepancies of brush strokes allow the underlying variance between the complements to reflect in the composition. Likewise digital photography and video both suck at picking this up, but actual analog film does not suck nearly as badly at it. It is more reflected in photography and video than computer graphics because the texture of everything in the viewport of the camera picks up some of it, but it is still considerably less than actually viewing the same thing (which is why you can never take a really good picture of a sunset without a ton of post production to hack the literal look of it back in, for example). However, computers are not good at replicating those discrepancies, because a color is basically going to resolve to a consistent matrix of RGB pixel mapping which visually appears to be a flat regular tone. There is no computational color space that accurately reflects rich tones, because there is no computational way to make a color vary slightly in a diffuse, non-repeating random way over space and still have a single unique identifier, and you can't very well store it as data without a unique identifier.
The best approximation you can do of this with a computer is to create some kind of diffusion of one color overlapping another color, which does not resolve to a single value that you can represent as a hex code or stuff in a single database column. Even then, a computer is going to inherently reflect a uniform pattern, where a real rich tone relies on randomness and non-repeating texture and variance, which you can't do on a machine without considerable effort. All of the artwork that really makes color pop relies on this principle, and it is basically inaccessible to computational representation without a ton of side work to emulate it (which is why we have Photoshop and Corel Painter, because they can emulate this stuff pretty well with a bit of work, but at the cost of performing a lot of filtering that is not efficient for runtime).
RGB is a pretty good approximation of the other four characteristics from an artistic perspective. We pretty much get that it's not going to cover rich tones and that we're going to have to crack out a design utility and mash that part in by hand. However the underlying problem with programming in RGB is that it wants to resolve to a three dimensional space (because it is cubic), and you are trying to present it on a two dimensional display, which makes it very difficult to create UI that is reasonably intuitive because you lack the capacity to represent the depth of a 3rd axis on a computer monitor effectively in any way that is ever going to be intuitive to use for an end user.
You also need to consider the distinction between color represented as light, and color represented as pigment. RGB is a representation of color represented as light, and corresponds to the primary values used to mix lighting to represent color, and does so with a 1:1 mapping. CMYK represents the pigmentation spectrum. The distinction is that when you mix light in equal measure, you get white, and when you mix pigment in equal measure, you get black. If you are programming any utility that uses a computer, you are working with light, because pixels are inherently a single node on a monitor that emits RGB light waves. The reason I said that CMYK sucks, is not because it's not accurate, it's because it's not accurate when you try to represent it as light, which is the case on all computer monitors. If you are using actual paint, markers, colored pencils, etc, it works just fine. However representing CMYK on a screen still has to resolve to RGB, because that is how a computer monitor works, so it's always off a bit in terms of how it looks in display.
Not to go off on a gigantic side tangent, as this is a programming forum and you asked the question as a programmer. However if you are going for accuracy, there is a distinct "not technical" aspect to consider in terms of how effective your work will be at achieving its desired objective, which is to resolve well against visual perception, which is not particularly well represented in most computational color spaces. At the end of the day, the goal with any color utility is to make it look right in terms of human perception of color. HSL/HSV both fail miserably at that. They are prominent because they are easy to code with, and only for that reason. If you have a short deadline, they are acceptable answers. If you want something that is really going to work well, then you need to do the heavy legwork and consider this stuff, which is what your audience is considering when they decide if they want to use your tool or not.
Some reference points for you (I'm purposely avoiding any technical references, as they only refer to computational perspective, not the actual underlying perception of color, and you've probably read all of those already anyhow):
Color Theory Wiki
Basic breakdown of hue, tint, tone, and shade
Earth Tones (or rich tones if you prefer)
Basic fundamentals of color schemes
Actually, I'd have to argue that HSV accounts better for human visual perception as long as you understand that in HSV, saturation is the purity of the color and value is the intensity of that color, not brightness overall. Take this image, for example...
Here is a mapping of the HSL saturation (left) and HSL luminance (right)...
Note that the saturation is 100% until you hit the white at the very top where it drops suddenly. This mapping isn't perceived when looking at the original image. The same goes for the luminance mapping. While it's a clearer gradient, it only vaguely matches visually. Compare that to HSV saturation (left) and HSV value (right) below...
Here the saturation mapping can be seen dropping as the color becomes more white. Likewise, the value mapping can be very clearly seen in the original image. This is made more obvious when looking at the mappings for the individual color channels of the original image (the non-black areas almost perfectly match the value mapping, but are nowhere close to the luminance mapping)...Going by this information, I would have to say that HSV is better for working with actual images (especially photographs) whereas HSL is possibly better only for selecting colors in a color picker.
On a side note, the value in HSV is the inverse of the black in CMYK.
Another argument for the use of HSV over HSL is that HSV has far fewer combinations of different values that can result in the same color, since HSL loses about half of its resolution to its top cone. Let's say you used bytes to represent the components, thereby giving each component 256 unique levels. The maximum number of unique RGB outputs this will yield in HSL is 4,372,984 colors (26% of the available RGB gamut). In HSV this goes up to 9,830,041 (59% of the RGB gamut)... over twice as many. And allowing a range of 0 to 359 for hue will yield 11,780,015 for HSV yet only 5,518,160 for HSL.
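You can sanity-check the effect yourself with a brute-force count over a coarser grid (64 levels per component, so the counts are only illustrative and will not match the exact figures above; a full 256-level sweep with colorsys works the same way but takes longer):

```python
import colorsys

def unique_outputs(convert, levels=64):
    """Count distinct 8-bit RGB triples reachable from a levels^3 grid of inputs."""
    seen = set()
    for i in range(levels):
        for j in range(levels):
            for k in range(levels):
                r, g, b = convert(i / (levels - 1), j / (levels - 1), k / (levels - 1))
                seen.add((round(r * 255), round(g * 255), round(b * 255)))
    return len(seen)

print("HSV:", unique_outputs(colorsys.hsv_to_rgb))   # reaches noticeably more...
print("HSL:", unique_outputs(colorsys.hls_to_rgb))   # ...unique RGB triples than HLS
```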

OpenCV: Detect blinking lights in a video feed

I have a video feed. This video feed contains several lights blinking at different rates. All lights are the same color (they are all infrared LEDs). How can I detect the position and frequency of these blinking lights?
Disclaimer: I am extremely new to OpenCV. I do have a copy of Learning OpenCV, but I am finding it a bit overwhelming. If anyone could explain a solution in OpenCV terminology, it would be greatly appreciated. I am not expecting code to be written for me.
Threshold each image in the sequence with a threshold that makes the LEDs visible. If you can threshold with a value that keeps only the LEDs and removes the background, then you are more or less finished, since all you need to do now is keep track of each position that has seen an LED and count how often it occurs.
A middle step, if there is "background noise" in the thresholded image, would be to use erosion to remove small mistakes, and then maybe dilation to "close holes" in the blobs you are actually interested in.
If the scene is static, you could also build a simple background model by taking the median of a few frames, subtracting the resulting median image from each new frame, and thresholding that. Stuff that has changed (your LEDs) will stand out.
If the scene is moving, I see no other (easy) solution than making sure the LEDs are bright enough to be able to use the threshold approach given above.
As for OpenCV: if you know what you want to do, it is not very hard to find a function that does it. The hard part is coming up with a method to solve the problem, not the actual coding.
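In OpenCV terms, the static-scene variant might look roughly like this (the video file name, the number of background frames, the threshold of 50 and the default 3x3 erosion/dilation kernels are all guesses to tune for your footage):

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("leds.avi")  # example video path

# Background model: median of the first few frames (assumes a static scene).
frames = []
for _ in range(25):
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
background = np.median(np.stack(frames), axis=0).astype(np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.subtract(gray, background)                       # LEDs stand out here
    _, mask = cv2.threshold(diff, 50, 255, cv2.THRESH_BINARY)   # threshold is a guess
    mask = cv2.erode(mask, None)                                # drop isolated noise
    mask = cv2.dilate(mask, None)                               # close small holes
    # ...accumulate the per-pixel on/off history here to estimate blink frequency
```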
If the LEDs are stationary, the problem is far simpler than when they are moving. Assuming they are stationary, a solution for finding the frequency could simply be to keep a vector or an array for each pixel location in which you store the values of that pixel, preferably after the preprocessing described by kigurai, over some timeframe. You can then compute the 1D Fourier transform of those value vectors and find the fundamental frequency as the first significant component after the DC peak. If the DC peak is too low, it means there is no LED there.
Hope this problem is still somewhat relevant, and that my solution makes sense.
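A NumPy sketch of the per-pixel FFT idea (assuming the frames for stationary LEDs have been stacked into a T x H x W array and that you know the capture frame rate; the energy cutoff is an arbitrary placeholder):

```python
import numpy as np

def blink_frequency_map(stack, fps):
    """Estimate the dominant blink frequency at every pixel via a 1D FFT over time.
    `stack` is a T x H x W array of (preferably thresholded) frames."""
    t = stack.shape[0]
    spectrum = np.abs(np.fft.rfft(stack.astype(np.float64), axis=0))
    spectrum[0] = 0                                   # ignore the DC component
    freqs = np.fft.rfftfreq(t, d=1.0 / fps)           # frequencies in Hz
    freq_map = freqs[spectrum.argmax(axis=0)]         # strongest component per pixel
    freq_map[spectrum.max(axis=0) < 1e-3] = 0         # too little energy -> no LED here
    return freq_map
```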
