RGB, HSV concept - image-processing

I found an image illustrating the difference between the RGB and HSV color spaces. From the image, it looks like they are just different ways of representing the same color, yet when we display them on our screen (e.g. using OpenCV) they look different. I mean, even though HSV separates luminance from the color information, the actual color should remain the same.
Also, to display an HSV image on a laptop screen we need RGB values; what are those RGB values?

You said it yourself:
"they are just different ways of representing the same color"
Since they are simply different representations, you can convert between the formats. There are already SO posts on that: here, and this answer here.
RGB and HSV are useful in different applications. HSV separates the color information from the brightness information, which can be useful when comparing colors in image processing, for example. RGB is typically how screens display images.
As Cris said, you should not see these displayed differently.
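As a small sketch of the round trip (assuming a BGR image loaded with OpenCV, which is OpenCV's default channel order; the file name is just a placeholder):

import cv2
import numpy as np

bgr = cv2.imread("input.jpg")                 # OpenCV loads images as B,G,R

hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)    # same colors, different representation

# Passing the HSV array straight to imshow() looks wrong, because imshow()
# always interprets the three channels as B, G and R. Convert back first:
bgr_again = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

# The round trip reproduces the original pixels up to rounding.
print(np.abs(bgr.astype(int) - bgr_again.astype(int)).max())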

Related

Understanding NetPBM's PNM nonlinear RGB color space for converting to grayscale

I am trying to understand how to properly work with the RGB values found in the PNM formats in order to eventually convert them to grayscale.
Researching the subject, it appears that if the RGB values are nonlinear, then I would need to first convert them to a linear RGB color space, apply my weights, and then convert them back to the same nonlinear color space.
There appears to be an expected format, per http://netpbm.sourceforge.net/doc/ppm.html:
In the raster, the sample values are "nonlinear." They are proportional to the intensity of the ITU-R Recommendation BT.709 red, green, and blue in the pixel, adjusted by the BT.709 gamma transfer function.
So I take it these values are nonlinear, but not sRGB. I found some thread topics around ImageMagick that say they might save them as linear RGB values.
Am I correct that PNM specifies a standard, but various editors like Photoshop or GIMP may or may not follow it?
From http://netpbm.sourceforge.net/doc/pamrecolor.html
When you use this option, the input and output images are not true Netpbm images, because the Netpbm image format specifies a particular color space. Instead, you are using a variation on the format in which the sample values in the raster have different meaning. Many programs that ostensibly use Netpbm images actually use a variation with a different color space. For example, GIMP uses sRGB internally and if you have GIMP generate a Netpbm image file, it really generates a variation of the format that uses sRGB.
Elsewhere I see this (http://netpbm.sourceforge.net/doc/pgm.html):
Each gray value is a number proportional to the intensity of the pixel, adjusted by the ITU-R Recommendation BT.709 gamma transfer function. (That transfer function specifies a gamma number of 2.2 and has a linear section for small intensities). A value of zero is therefore black. A value of Maxval represents CIE D65 white and the most intense value in the image and any other image to which the image might be compared.
BT.709's range of channel values (16-240) is irrelevant to PGM.
Note that a common variation from the PGM format is to have the gray value be "linear," i.e. as specified above except without the gamma adjustment. pnmgamma takes such a PGM variant as input and produces a true PGM as output.
Most sources out there assume they are dealing with linear RGB and just apply their weights and save, possibly not preserving the luminance. I assume that any compliant renderer will assume that these RGB values are gamma compressed, and will therefore display different grayscale "colors" than what I had specified. Is this correct? Or, to ask it differently, does it matter? I know it is a loaded question, but if I can't really tell whether the values are linear or nonlinear, or how they have been compressed or are expected to be compressed, will image processing algorithms (e.g. binarization) be greatly affected if I just assume linear RGB values?
There may have been some confusion with my question, so I would like to answer it now that I have researched the situation much further.
To make a long story short... it appears that no one really bothers to re-encode an image's gamma when saving to the PNM format. Because of that, and since almost everything is sRGB, the data stays sRGB rather than the technically correct BT.709 called for by the spec.
I reached out to Bryan Henderson of NetPBM. He held the same belief and stated that the exact method of gamma compression is not as important as knowing whether it was applied, and that we should always assume it is applied when working with the PNM color formats.
To reaffirm the effect of that opinion with regard to image processing, please read "Color-to-Grayscale: Does the Method Matter in Image Recognition?" (Kanan and Cottrell, 2012). Basically, if you calculate the mean of the RGB values you will end up in one of three situations: Gleam, Intensity', or Intensity. After comparing the effects of different grayscale conversion formulas, taking into account when and how gamma correction was applied, they found that Gleam and Intensity' were the best performers. They differ only in when the gamma correction is applied (Gleam takes the mean of already gamma-corrected RGB values, while Intensity' takes the mean of linear RGB and applies the gamma afterwards). Sadly, you drop from 1st and 2nd place down to 8th when no gamma correction is applied at all, a.k.a. Intensity. It is interesting that the simple mean formula worked best, not one of the more popular weighted grayscale formulas most people tout.
All of that to say: if you use the mean formula for converting PNM color to grayscale for image processing applications, you should get good performance, since we can assume some gamma compression will have been applied. My comment about ImageMagick and linear values appears only to apply to their PGM format.
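A minimal sketch of the two best performers described above, assuming the raster holds 8-bit, already gamma-compressed values and using a simple power-law gamma of 2.2 as an approximation of the real transfer curve:

import numpy as np

def gleam(rgb):
    # Gleam: per-pixel mean of the already gamma-compressed channels.
    # rgb is an HxWx3 uint8 array.
    return rgb.astype(np.float32).mean(axis=2)

def intensity_prime(rgb, gamma=2.2):
    # Intensity': linearize, take the mean, then re-apply the gamma.
    linear = (rgb.astype(np.float32) / 255.0) ** gamma
    return 255.0 * linear.mean(axis=2) ** (1.0 / gamma)

Plain Intensity (the 8th-place method) would just be the mean of the linearized values, with no gamma re-applied.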
I hope that helps!
There is only one good way to convert a colour signal to greyscale: go to linear space and add up the light (and so the colour intensities). In this manner you have the effective light, so you can calculate the brightness, and then you can "gamma" correct the value. This is the way light behaves (linear space), and how brightness was measured by the CIE (per wavelength).
On television it is standard to build luma (and then black-and-white images) from the non-linear R, G, B. This is done for simplicity and because of the way analog colour television (NTSC and PAL) worked: the black-and-white signal (for BW televisions) was the main signal, and colour was added (as a subcarrier) on top of the BW image. For this reason, the calculations are done in non-linear space.
Video often uses such factors (in non-linear space) because they are much quicker to calculate, and you can do it easily with integers (there are special matrices to use with integers).
For edge detection algorithms it should not be important which method you use: we have difficulty detecting edges between regions with similar L or Y', so we do not care if computers have a similar problem.
Note: our eyes are non-linear in detecting light intensities, with a gamma similar to that of the phosphors in our old televisions. For this reason, using gamma-corrected values is useful: it compresses the information in a near-optimal way (or, in the analog-TV past, it reduced perceived noise).
So if you want Y', compute it from the non-linear R', G', B'. But if you need a real grey scale, you need to calculate it in linear space.
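A sketch of the two alternatives, assuming sRGB-encoded 8-bit input in R,G,B channel order and using the Rec.709 luminance weights (the transfer functions below are the sRGB ones, used here as a stand-in for the Rec.709 gamma discussion further down):

import numpy as np

def srgb_to_linear(c):
    # Decode the piecewise sRGB transfer function to linear light; c in 0..255.
    c = c / 255.0
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(y):
    # Re-encode linear light for display.
    return 255.0 * np.where(y <= 0.0031308, 12.92 * y, 1.055 * y ** (1 / 2.4) - 0.055)

def grey_linear(rgb):
    # "Real" greyscale: decode to linear light, sum the light with the
    # Rec.709 weights, then re-encode.
    lin = srgb_to_linear(rgb.astype(np.float64))
    y = 0.2126 * lin[..., 0] + 0.7152 * lin[..., 1] + 0.0722 * lin[..., 2]
    return linear_to_srgb(y)

def luma_y_prime(rgb):
    # Video-style luma Y': weighted sum of the non-linear (gamma-encoded) channels.
    return 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]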
You may see differences especially in mid-greys, and in purples or yellows, where two of R, G, B are nearly the same (and are the maximum values among the three).
But photography programs offer many different algorithms to convert RGB to greyscale: we do not see the world in greyscale, so different (possibly non-linear) weights can help bring out some parts of the image, which is the purpose of greyscale photos (removing distracting colours).
Note that Rec.709 never specified the gamma correction to apply (the OETF in the standard is not what we need; we need the EOTF, and often one is not the inverse of the other, for practical reasons). Only a later recommendation finally provided this missing information. But because many people talk about Rec.709, the inverse of its OETF is often used as the gamma, which is incorrect.
How to detect which you have: take the classic yellow sun on a blue sky, choosing a yellow and a blue with the same L. If you can see the sun in the grey image, you are transforming in non-linear space (the Y' values are not equal). If you cannot see the sun, you transformed linearly.

Impact of converting image to grayscale

I have seen many machine learning (CNN) tutorials that convert the input image to grayscale. I want to know how the model will understand the original color, or use color as an identification criterion, if the colors are discarded throughout model creation.
With respect to colours, there are two cases in an image processing problem:
Colours are not relevant in object-identification
In this case, converting a coloured image to a grayscale image will not matter, because eventually the model will be learning from the geometry present in the image. Image binarization will help sharpen the image by identifying the light and dark areas.
Colours are relevant in object-identification
As you might know, all colours can be represented as some combination of the three primary R, G and B colours. Each of these R, G and B values usually varies from 0 to 255 for each pixel. However, after gray-scaling, a pixel value is one-dimensional instead of three-dimensional, and it just varies from 0 to 255. So, yes, there will be some information loss in terms of actual colours, but that is a trade-off against image sharpness.
So, there can be a combined score of the R, G and B values at each pixel (for example their mean, (R+G+B)/3), which gives a number between 0 and 255 that can be used as their representative. That way, instead of carrying specific colour information, the pixel just carries intensity information.
Reference:
https://en.wikipedia.org/wiki/Grayscale
I would like to add to Shashank's answer.
A model, when fed with an image, does not perceive it as we do. Humans perceive images through variations in colour, the saturation of those colours and their brightness, and we are able to recognize objects and other shapes as well.
However, a model sees an image as a matrix filled with numbers (if it is a greyscale image). In the case of a colour image, it sees three matrices stacked on top of one another, filled with numbers (0-255).
So how does it learn colour? Well, it doesn't. What it does learn is the variation in the numbers within this matrix (in the case of a greyscale image). These variations are crucial for determining changes in the image. If the CNN is trained on them, it will be able to detect structure in the image and can also be used for object detection.
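As a small illustration of the matrix view (the file name and OpenCV usage are just an example):

import cv2

color = cv2.imread("photo.jpg")                        # three stacked matrices of 0-255 values (B, G, R)
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)   # a single matrix of 0-255 intensities

print(color.shape)   # e.g. (480, 640, 3)
print(gray.shape)    # e.g. (480, 640): the colour dimension is gone; only the intensity variation remains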

Gray scale image to color

Is there a way to convert a gray-scale image to a colour image?
Here are a few JPG examples:
Photo 1
Photo 2
Photo 3
ImageMagick is powerful but doesn't seem capable of converting to a colourful version.
You cannot accurately re-create the information that was lost when the three channels were aggregated together to make the greyscale image. You can, however, generate a false colour version, and you can also make some assumptions that may be fairly reasonable for some/many pictures.
Generally, when doing these types of things, you create a LUT (Look-up Table) and use the grey value at each pixel location to look-up a replacement colour. So, if we make a LUT that goes from Red through Green through Blue, the dark tones in your image will be mapped to Red, the midtones to Green and the highlights to Blue. Let's try it:
convert -size 1x1! xc:red xc:lime xc:blue +append -resize 255x1! rainbowCLUT.png
If we now apply that to your image:
convert yourImage.jpg rainbowCLUT.png -clut result.png
Ok, we have now got colour but that is not very realistic. So, to do it better, we need to start making some assumptions. One assumption might be that there is something pretty black in the picture (e.g. a tie, a jacket, some dark hair, or a deep shadow somewhere), another assumption might be that there is probably a white highlight somewhere in the image (e.g. a white background, the whites of an eye) and finally we assume that there is some flesh tone somewhere in the middle. So, let's make a CLUT that looks like that, i.e. it goes from solid black through a flesh tone in the middle to a white highlight:
convert -size 128x1! gradient:black-"rgb(210,160,140)" gradient:"rgb(210,160,140)"-white +append clut.png
(I have put a 1 pixel wide red border around it just so you can see it on StackOverflow's white background)
Now we can apply it to your images:
convert yourImage.jpg -normalize clut.png -clut result.png
Note how I also used -normalize to try and make the image fit my assumption that there was a solid black tone in the image and a solid white highlight.
This technique is only an attempt at re-creating something that is no longer in the image so it will not always work. Of course, if you know extra information about your subjects, the lighting and so on, you could build that into the LUT.
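For reference, the same LUT idea can be sketched in Python/NumPy (the flesh tone and the file names are just the assumptions from above, not fixed values):

import cv2
import numpy as np

gray = cv2.imread("yourImage.jpg", cv2.IMREAD_GRAYSCALE)
gray = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)   # rough equivalent of -normalize

# Build a 256-entry LUT: black -> flesh tone (210,160,140) -> white.
flesh = [210, 160, 140]
lut = np.empty((256, 3), dtype=np.uint8)
lut[:128] = np.linspace([0, 0, 0], flesh, 128)
lut[128:] = np.linspace(flesh, [255, 255, 255], 128)

# Replace each grey value with its colour from the LUT, then write as BGR.
result = lut[gray]                                            # HxWx3 in R,G,B order as built above
cv2.imwrite("result.png", cv2.cvtColor(result, cv2.COLOR_RGB2BGR))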
Unfortunately this is not possible. Grayscale images do not contain sufficient information to re-create a color image. In instances where you see B&W/grayscale images converted to color, this has been done manually in an application such as Photoshop.
You can use imagemagick to apply a filter but you cannot re-introduce color.
http://www.imagemagick.org/Usage/color_mods/#level-colors
To convert an image to grayscale you typically take the average of the R, G and B values in each pixel and set R, G and B to that value, so it is pretty much impossible to convert it back to color. The key words are "pretty much": someone may eventually invent a complex algorithm that looks at the surrounding pixels, takes their averages and infers roughly what color that area should be. But as of now I don't think it's possible to do such a thing, unfortunately. Sorry.
Take a look at siggraph2016_colorization.
I did not try it, but it seems interesting.
They present a novel technique to automatically colorize grayscale images that combines both global priors and local image features.
Colorization Architecture:
Their model consists of four main components: a low-level features network, a mid-level features network, a global features network, and a colorization network. The components are all tightly coupled and trained in an end-to-end fashion. The output of our model is the chrominance of the image which is fused with the luminance to form the output image.
Here is a sample.

Does OpenCV have functions to handle non-linearities in sRGB color space?

I am wondering whether OpenCV has functions to handle the non-linearities in the sRGB color space.
Say I want to convert a JPEG image from the sRGB color space into the XYZ color space. As specified on this Wiki page, one needs to first undo the nonlinearities to convert to linear RGB space, and then multiply by the 3x3 color transform matrix. However, I couldn't find any such discussion in the cvtColor documentation. Did I miss something?
Thanks a lot in advance!
It's not explicitly stated in the documentation, so you're not missing anything, but OpenCV does not perform gamma correction in its RGB2XYZ/BGR2XYZ color conversions. You can confirm this by looking at the source code for cvtColor in
<OpenCV_dir>/modules/imgproc/src/color.cpp
If you look at the RGB <-> XYZ section you'll see that the input RGB values are simply multiplied by the coefficient matrix.
I have also not found any existing method to perform gamma correction on an RGB image.
Interestingly, a custom RGB -> XYZ conversion is done as a preliminary step for converting to both L*a*b* and L*u*v*, and in both cases it performs gamma correction.
Unfortunately, this isn't accessible from RGB2XYZ code, but you might be able to reuse it in your own code. I've also seen several code samples on the web, mostly using look-up tables for CV_8U depth images.
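A minimal sketch of doing the linearization yourself before the matrix multiply (the sRGB decoding curve and the D65 sRGB-to-XYZ matrix below are the standard published values, not something pulled from OpenCV; the file name is a placeholder):

import cv2
import numpy as np

bgr = cv2.imread("image.jpg")
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).astype(np.float64) / 255.0

# Undo the sRGB non-linearity (piecewise curve from the sRGB spec).
linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)

# Linear sRGB (D65) to XYZ.
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.0950227]])
xyz = linear @ M.T   # applies the 3x3 matrix to every pixel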

comparison between RGB and Ycbcr and HSI color spaces

Can anyone explain the advantages and disadvantages of the HSI, YCbCr and RGB color spaces and give me a short comparison of them?
I know the relations between these models; I just need a comparison.
HSI and YCbCr, unlike RGB, separate the intensity (luma) from the color information (chroma). This is useful if you want to ignore one or the other. For example, face detection is usually done on intensity images. On the other hand, ignoring the intensity can help to get rid of shadows.
HSI contains hue and saturation, which are the terms people use to describe colors. On the other hand, hue and saturation are angles, which can be inconvenient for computing distances in the color space, not to mention that hue wraps around. YCbCr, on the other hand, is a Euclidean space. Also, YCbCr is what you typically get directly from a camera.
Also see this answer on DSP stackexchange.
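A tiny sketch of the hue wrap-around issue, using OpenCV's 8-bit convention where hue runs from 0 to 179 (so red sits near both ends of the range):

def hue_distance(h1, h2, hue_range=180):
    # The naive difference overstates the distance across the wrap-around point.
    d = abs(h1 - h2)
    return min(d, hue_range - d)

print(abs(5 - 175))          # 170: naive difference says "very different"
print(hue_distance(5, 175))  # 10: both hues are in fact shades of red

In YCbCr there is no such wrap-around, so a plain Euclidean distance on (Cb, Cr) works directly.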
