RGBA decimal notation to arithmetic notation - iOS

I have to customize an iOS app, and the guideline says:
Please don’t use RGBA values in 0 to 255 decimal notation, but use 0.0
to 1.0 arithmetic notation instead!
For example, the default app color #70C7C6 in the guideline is converted to (0.439, 0.780, 0.776, 1.000).
How can I convert other colors? I had never come across this arithmetic notation before.

Convert the string hex values into integers, then divide each value by 255 to get the arithmetic notation.
For example, "C7" -> 199 -> 199/255. -> 0.78.
The last value is opacity, which in your case sounds like it would always be 1.
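A minimal Python sketch of the same arithmetic (the helper name is just for illustration):

def hex_to_arithmetic(hex_color, alpha=1.0):
    # Strip an optional leading '#' and split into RR, GG, BB pairs
    hex_color = hex_color.lstrip('#')
    r, g, b = (int(hex_color[i:i + 2], 16) / 255.0 for i in (0, 2, 4))
    return (round(r, 3), round(g, 3), round(b, 3), alpha)

print(hex_to_arithmetic('70C7C6'))  # (0.439, 0.78, 0.776, 1.0)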

A color component is a number over a specified range. When working with hex or integer values you usually have a number in 0-255 (0x00-0xFF in hexadecimal), but you can express the same value by normalizing it to the range 0.0-1.0; you do so by dividing each component by the maximum allowed value, e.g.:
You have 0xC7 in hex, which is 199 in decimal; you divide it by 255.0f and you obtain 0.780f.
In practice UIColor already provides methods to obtain normalized values; you just need to convert a number from hex notation, which can be done easily by hand or by using a simple library:
UIColor *color = [UIColor colorWithCSS:@"70c7c6"];
CGFloat r, g, b, a;
[color getRed:&r green:&g blue:&b alpha:&a];

Related

Difference between absdiff and normal subtraction in OpenCV

I am currently planning on training a binary image classification model. The images I want to train on are the difference between two original pictures. In other words, for each data entry, I start out with 2 pictures, take their difference, and then label that difference as a 0 or 1. My question is: what is the best way to find this difference? I know about cv2.absdiff and normal subtraction of images - what is the most effective way to go about this?
About the data: The images I'm training on are screenshots that usually are the same but may have small differences. I found that normal subtraction seems to show the differences less than absdiff.
This is the code I use for absdiff:
import cv2
import numpy as np

diff = cv2.absdiff(img1, img2)
mask = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
th = 1
imask = mask > th
canvas = np.zeros_like(img2, np.uint8)
canvas[imask] = img2[imask]
And then this for normal subtraction:
def extract_diff(self, imageA, imageB, image_name, path):
    subtract = imageB.astype(np.float32) - imageA.astype(np.float32)
    mask = cv2.inRange(np.abs(subtract), (30, 30, 30), (255, 255, 255))
    th = 1
    imask = mask > th
    canvas = np.zeros_like(imageA, np.uint8)
    canvas[imask] = imageA[imask]
Thanks!
A difference can be negative or positive.
For some number types, such as uint8 (unsigned 8-bit int), which can't be negative (they have no sign bit), a negative value wraps around and no longer makes sense. Other types are signed (e.g. floats, signed ints), so a negative value can be represented correctly.
That's why cv.absdiff exists. It always gives you absolute differences, and those are okay to represent in an unsigned type.
Example with numbers: a = 4, b = 6. a - b should be -2, right?
That value, as a uint8, will wrap around to become 0xFE, or 254 in decimal. The 254 value has some relation to the true -2 difference, but it also incorporates the range of the data type (8 bits: 256 values), so it's really just "code".
cv.absdiff would give you the absolute value of the difference (-2), which is 2.
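A quick NumPy/OpenCV sketch of that wraparound (the array contents are arbitrary):

import cv2
import numpy as np

a = np.full((2, 2), 4, np.uint8)
b = np.full((2, 2), 6, np.uint8)

print(a - b)               # NumPy uint8 wraps around: every element is 254 (0xFE)
print(cv2.subtract(a, b))  # OpenCV saturates instead: every element is 0
print(cv2.absdiff(a, b))   # absolute difference: every element is 2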

How are floating-point pixel values converted to integer values?

How do image libraries (such as PIL, OpenCV, etc.) convert floating-point values to integer pixel values?
For example
import numpy as np
from PIL import Image

# Creates a random image and saves it in a file
def get_random_img(m=0, s=1, fname='temp.png'):
    im = m + s * np.random.randn(60, 60, 3)  # e.g. min: -3.8947058634971179, max: 3.6822041760496904
    print(im[0, 0])                          # e.g. array([ 0.36234732, 0.96987366, 0.08343])
    imp = Image.fromarray(im, 'RGB')         # (*)
    print(np.array(imp)[0, 0])               # [140, 74, 217]
    imp.save(fname)
    return im, imp
For the above method, an example is provided in the comments (randomly produced). My question is: how does (*) convert an ndarray (whose values can range from minus infinity to plus infinity) to pixel values between 0 and 255?
I tried to investigate the PIL.Image.fromarray method and eventually ended up at line #798, d.decode(data), within the PIL.Image.Image().frombytes method. I could not find the implementation of the decode method, and thus was unable to know what computation goes on behind the conversion.
My initial thought was that maybe the method uses the minimum (to 0) and maximum (to 255) values from the array and then maps all the other values accordingly between 0 and 255. But upon investigation, I found out that's not what is happening. Moreover, how does it handle arrays whose values range between 0 and 1, or any other range of values?
Some libraries assume that floating-point pixel values are between 0 and 1, and will linearly map that range to 0 and 255 when casting to 8-bit unsigned integer. Some others will find the minimum and maximum values and map those to 0 and 255. You should always explicitly do this conversion if you want to be sure of what happened to your data.
In general, a pixel does not need to be 8-bit unsigned integer. A pixel can have any numerical type. Usually a pixel intensity represents an amount of light, or a density of some sort, but this is not always the case. Any physical quantity can be sampled in 2 or more dimensions. The range of meaningful values thus depends on what is imaged. Negative values are often also meaningful.
Many cameras have 8-bit precision when converting light intensity to a digital number. Likewise, displays typically have an 8-bit intensity range. This is the reason many image file formats store only 8-bit unsigned integer data. However, some cameras have 12 bits or more, and some processes derive pixel data with a higher precision that one does not want to quantize. Therefore formats such as TIFF and ICS will allow you to save images in just about any numeric format you can think of.
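A minimal sketch of doing the conversion explicitly (the helper and its defaults just illustrate the two conventions described above):

import numpy as np

def to_uint8(im, lo=0.0, hi=1.0):
    # Map [lo, hi] linearly onto [0, 255]; pass lo=im.min(), hi=im.max()
    # to emulate the libraries that normalize by minimum/maximum instead.
    scaled = (np.asarray(im, np.float64) - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)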
I'm afraid it has done nothing anywhere near as clever as you hoped! It has merely interpreted the first byte of the first float as a uint8, then the second byte as another uint8...
import numpy as np
from PIL import Image

# Generate repeatable random data, so other folks get the same results
np.random.seed(42)

# Make a single RGB pixel
im = np.random.randn(1, 1, 3)
# Print the floating point values - not that we are interested in them
print(im)
# OUTPUT: [[[ 0.49671415 -0.1382643 0.64768854]]]
# Save that pixel to a file so we can dump it
im.tofile('array.bin')
# Now make a PIL Image from it and print the uint8 RGB values
imp = Image.fromarray(im, 'RGB')
print(imp.getpixel((0,0)))
# OUTPUT: (124, 48, 169)
So, PIL has interpreted our data as RGB=124/48/169
Now look at the hex we dumped. It is 24 bytes long, i.e. 3 float64 (8-byte) values, one for red, one for green and one for blue for the 1 pixel in our image:
xxd array.bin
Output
00000000: 7c30 a928 2aca df3f 2a05 de05 a5b2 c1bf |0.(*..?*.......
00000010: 685e 2450 ddb9 e43f h^$P...?
And the first byte (7c) has become 124, the second byte (30) has become 48 and the third byte (a9) has become 169.
TLDR; PIL has merely taken the first byte of the first float as the Red uint8 channel of the first pixel, then the second byte of the first float as the Green uint8 channel of the first pixel and the third byte of the first float as the Blue uint8 channel of the first pixel.
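That reinterpretation is easy to confirm from Python itself; a sketch using the same seeded array as above:

import numpy as np

np.random.seed(42)
im = np.random.randn(1, 1, 3)

# View the raw float64 bytes as uint8: the first three bytes are
# exactly the "RGB" values PIL reported above.
print(np.frombuffer(im.tobytes(), np.uint8)[:3])  # [124  48 169]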

Why does Metal constant with h suffix produce bad byte range results?

I have run into a very strange problem in my Metal shader that has to do with a byte value in the range (0, 255). This byte value is represented as a ushort that is converted to half precision by code like (x / 255.0h). What is strange is that this literal constant divide seems to be optimized incorrectly when run on an A7 device (A10 does not do this). Has anyone else run into this? Is there some way I can write Metal code that is used only on this GPU family 1 device?
I found a workaround: leave the h suffix off of the inline constant:
// This method accepts 4 byte-range input values and encodes them as a half4 vector
// that works properly on A7 class hardware. The issue with A7 devices is that
// there seems to be a compiler bug or range issue with an operation like (x / 255.0h).
// What should be the same operation (x / 255.0) does not show the range problem on A7.
half4
encodeBytesAsHalf4(const ushort4 b4) {
  return half4(b4.x/255.0, b4.y/255.0, b4.z/255.0, b4.a/255.0);
}

Why is a float value rounded in a playground but not in a project in Swift?

I'm using a Float value in my project. When I access it in the Xcode project, it prints extra digits down to the billionths place, but in a playground it works perfectly.
In xcodeproj:
let sampleFloat: Float = 0.025
print(sampleFloat) // It prints 0.0250000004
In Playground:
let sampleFloat: Float = 0.025
print(sampleFloat) // It prints 0.025
Any clue what's happening here? How can I avoid the expansion in the Xcode project?
Lots of comments, but nobody's posted all the info as an answer yet.
The answer is that, internally, floating-point numbers are represented in binary, using powers of 2.
In base 10, the tenths digit represents how many 1/10ths are in the value. The hundredths digit represents how many 1/100ths are in the value, the thousandths digit represents how many 1/1000ths are in the value, and so on. In base 10, you can't represent 1/3 exactly. That is 0.33333333333333333...
In binary floating point, the first fractional binary digit represents how many 1/2s are in the value. The second digit represents how many 1/4ths are in the value, the next digit represents how many 1/8ths are in the value, and so on. There are some (lots of) decimal values that can't be represented exactly in binary floating point. The value 0.1 (1/10) is one such value. It is approximated by something like 1/16 + 1/32 + 1/256 + 1/512 + 1/4096 + 1/8192 + ...
The value 0.025 is another value that can't be represented exactly in binary floating point.
There is an alternate number format, NSDecimalNumber (Decimal in Swift 3) that uses decimal digits to represent numbers, so it CAN express any decimal value exactly. (Note that it still can't express a fraction like 1/3 exactly.)
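The question is about Swift, but the effect is IEEE-754 itself, so any language shows it; a quick Python check (np.float32 standing in for Swift's Float):

import numpy as np
from decimal import Decimal

# The exact value a 32-bit float stores for the literal 0.025:
print(Decimal(float(np.float32(0.025))))
# 0.02500000037252902984619140625
print(f"{np.float32(0.025):.10f}")  # 0.0250000004 - the digits the Xcode project printed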

Converting RGB to grayscale/intensity

When converting from RGB to grayscale, it is said that specific weights ought to be applied to channels R, G, and B. These weights are: 0.2989, 0.5870, 0.1140.
It is said that the reason for this is the different human sensitivity to these three colors. Sometimes it is also said these are the values used to compute the NTSC signal.
However, I didn't find a good reference for this on the web. What is the source of these values?
The specific numbers in the question are from CCIR 601 (see Wikipedia article).
If you convert RGB -> grayscale with slightly different numbers / different methods, you won't see much difference at all on a normal computer screen under normal lighting conditions -- try it; a quick sketch follows.
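For instance, a rough comparison of the two common weightings on random pixels (nonlinear RGB used directly, as most quick conversions do):

import numpy as np

rgb = np.random.randint(0, 256, (100, 100, 3))
gray601 = rgb @ [0.2989, 0.5870, 0.1140]  # CCIR 601 weights
gray709 = rgb @ [0.2126, 0.7152, 0.0722]  # Rec709 weights
d = np.abs(gray601 - gray709)
print(d.mean(), d.max())  # typical and worst-case difference, in gray levels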
Here are some more links on color in general:
Wikipedia Luma
Bruce Lindbloom's outstanding web site
chapter 4 on Color in the book by Colin Ware, "Information Visualization", ISBN 1-55860-819-2; this long link to Ware in books.google.com may or may not work
cambridgeincolor: excellent, well-written "tutorials on how to acquire, interpret and process digital photographs using a visually-oriented approach that emphasizes concept over procedure"
Should you run into "linear" vs "nonlinear" RGB, here's part of an old note to myself on this. Repeat: in practice you won't see much difference.
### RGB -> ^gamma -> Y -> L*
In color science, the common RGB values, as in html rgb( 10%, 20%, 30% ), are called "nonlinear" or gamma corrected. "Linear" values are defined as
Rlin = R^gamma, Glin = G^gamma, Blin = B^gamma
where gamma is 2.2 for many PCs. The usual R G B are sometimes written as R' G' B' (R' = Rlin ^ (1/gamma)) (purists tongue-click) but here I'll drop the '.
Brightness on a CRT display is proportional to RGBlin = RGB ^ gamma, so 50% gray on a CRT is quite dark: .5 ^ 2.2 = 22% of maximum brightness. (LCD displays are more complex; furthermore, some graphics cards compensate for gamma.)
To get the measure of lightness called L* from RGB, first divide R G B by 255, and compute
Y = .2126 * R^gamma + .7152 * G^gamma + .0722 * B^gamma
This is Y in XYZ color space; it is a measure of color "luminance". (The real formulas are not exactly x^gamma, but close; stick with x^gamma for a first pass.) Finally,
L* = 116 * Y^(1/3) - 16
"... aspires to perceptual uniformity [and] closely matches human perception of lightness." -- Wikipedia Lab color space
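A minimal sketch of that pipeline, using the simple x^gamma approximation from the note above (not the exact piecewise sRGB curve):

def rgb_to_Lstar(r, g, b, gamma=2.2):
    # 0-255 R,G,B -> normalize -> linearize -> Y -> L*
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    y = 0.2126 * r**gamma + 0.7152 * g**gamma + 0.0722 * b**gamma
    return 116 * y ** (1 / 3.0) - 16

print(rgb_to_Lstar(128, 128, 128))  # ~54: 50% RGB comes out near mid-lightness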
I found this publication referenced in an answer to a previous similar question. It is very helpful, and the page has several sample images:
Perceptual Evaluation of Color-to-Grayscale Image Conversions by Martin Čadík, Computer Graphics Forum, Vol 27, 2008
The publication explores several other methods to generate grayscale images with different outcomes:
CIE Y
Color2Gray
Decolorize
Smith08
Rasche05
Bala04
Neumann07
Interestingly, it concludes that there is no universally best conversion method, as each performed better or worse than others depending on input.
Here's some code in C to convert RGB to grayscale.
The real weighting used for RGB to grayscale conversion is approximately 0.3R + 0.59G + 0.11B.
These weights aren't absolutely critical, so you can play with them. I have made them 0.25R + 0.5G + 0.25B. It produces a slightly darker image.
NOTE: The following code assumes xRGB 32bit pixel format
unsigned int *pntrBWImage = (unsigned int*)..data pointer..;  // assumes 4*width*height bytes, i.e. 32 bits / 4 bytes per pixel
unsigned int fourBytes;
unsigned char r, g, b;
for (int index = 0; index < width*height; index++)
{
    fourBytes = pntrBWImage[index];  // caches 4 bytes at a time
    r = (fourBytes >> 16);           // xRGB: red in bits 16-23 (assignment to char truncates the rest)
    g = (fourBytes >> 8);            // green in bits 8-15
    b = fourBytes;                   // blue in bits 0-7
    I_Out[index] = (r >> 2) + (g >> 1) + (b >> 2);  // 0.25R + 0.5G + 0.25B; runs in 0.00065s on my pc, slightly darker
    //I_Out[index] = ((unsigned int)(r+g+b))/3;     // pure average; runs in 0.0011s on my pc
}
Check out the Color FAQ for information on this. These values come from the standardization of the RGB values that we use in our displays. Actually, according to the Color FAQ, the values you are using are outdated: they are the values used for the original NTSC standard, not for modern monitors.
What is the source of these values?
The "source" of the coefficients posted are the NTSC specifications which can be seen in Rec601 and Characteristics of Television.
The "ultimate source" are the CIE circa 1931 experiments on human color perception. The spectral response of human vision is not uniform. Experiments led to weighting of tristimulus values based on perception. Our L, M, and S cones1 are sensitive to the light wavelengths we identify as "Red", "Green", and "Blue" (respectively), which is where the tristimulus primary colors are derived.2
The linear light3 spectral weightings for sRGB (and Rec709) are:
Rlin * 0.2126 + Glin * 0.7152 + Blin * 0.0722 = Y
These are specific to the sRGB and Rec709 colorspaces, which are intended to represent computer monitors (sRGB) or HDTV monitors (Rec709), and are detailed in the ITU documents for Rec709 and also BT.2380-2 (10/2018)
FOOTNOTES
(1) Cones are the color detecting cells of the eye's retina.
(2) However, the chosen tristimulus wavelengths are NOT at the "peak" of each cone type - instead tristimulus values are chosen such that they stimulate one particular cone type substantially more than another, i.e. separation of stimulus.
(3) You need to linearize your sRGB values before applying the coefficients. I discuss this in another answer here.
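Footnote 3's linearization is the piecewise sRGB transfer function; a minimal sketch of applying it before the coefficients:

def srgb_to_linear(c):
    # Exact sRGB decoding for one nonlinear component in 0.0-1.0 (not plain c**2.2)
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def rel_luminance(r, g, b):
    # Relative luminance Y from nonlinear sRGB components in 0.0-1.0
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))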
Starting a list to enumerate how different software packages do it. Here is a good CVPR paper to read as well.
FreeImage
#define LUMA_REC709(r, g, b) (0.2126F * r + 0.7152F * g + 0.0722F * b)
#define GREY(r, g, b) (BYTE)(LUMA_REC709(r, g, b) + 0.5F)
OpenCV
nVidia Performance Primitives
Intel Performance Primitives
Matlab
nGray = 0.299F * R + 0.587F * G + 0.114F * B;
These values vary from person to person, especially for people who are colorblind.
Is all this really necessary? Human perception and CRT vs. LCD will vary, but the R G B intensity does not. Why not L = (R + G + B)/3 and set the new RGB to L, L, L?
