Can anyone tell me how many pixels are present in an RGB image: is it height × width or height × width × channels?
I want to calculate the bits per pixel (bpp) of an image, so I need this information.
The number of pixels is simply:
height × width
It's independent of whether the color of each pixel is composed of a single channel or of several channels.
If your image has three channels, e.g. a separate one for red, green and blue, each using an 8-bit value per pixel, then you have to add them up to get the bits per pixel (bpp) value. In the example, it would be:
bpp = 3 × 8 bit = 24 bit
But it does not affect the number of pixels.
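As a quick illustration, here is a minimal Python sketch of both calculations, assuming the image is loaded into a NumPy array with 8 bits per channel (the 100 × 200 size is made up):
import numpy as np

# Hypothetical 100 x 200 RGB image, 8 bits per channel.
image = np.zeros((100, 200, 3), dtype=np.uint8)

height, width, channels = image.shape
num_pixels = height * width                   # 20000 -- the channels play no role here
bpp = channels * image.dtype.itemsize * 8     # 3 channels x 8 bit = 24 bpp

print(num_pixels, bpp)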
I am analyzing a very big number of images and extracting the dominant color codes.
I want to group them into ranges of generic color names, like Green, Dark Green, Light Green, Blue, Dark Blue, Light Blue and so on.
I am looking for a language-agnostic approach so I can implement something myself; if there are examples I can look into to achieve this, I would be more than grateful.
In the machine learning field, what you want to do is called classification, in which the goal is to assign one of the class labels (colors) to each of the observations (images).
To do this, classes must be pre-defined. Suppose these are the colors we want to assign to images:
To determine the dominant color of an image, the distance between each of its pixels and all the colors in the table must be calculated. Note that this distance is calculated in RGB color space. To calculate the distance between the ij-th pixel of the image and the k-th color of the table, the following equation can be used:
d_ijk = sqrt((r_ij-r_k)^2+(g_ij-g_k)^2+(b_ij-b_k)^2)
In the next step, for each pixel, the closest color in the table is selected. This is the same concept used to compress an image with indexed colors (except that here the palette is the same for all images and is not calculated per image to minimize the difference between the original and the indexed image). Now, as #jairoar pointed out, we can get the histogram of the image (not to be confused with an RGB or intensity histogram) and determine the color that is repeated the most.
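As a rough sketch of these steps in Python with NumPy (the palette below is a made-up stand-in for the color table above, not the actual table):
import numpy as np

# Hypothetical palette standing in for the color table (one RGB triple per row).
palette = np.array([
    [255, 0, 0],     # red
    [0, 128, 0],     # green
    [0, 0, 255],     # blue
    [255, 165, 0],   # orange
], dtype=np.float32)

def dominant_color_index(image_rgb):
    """Index the image against the palette and return the most repeated palette index."""
    pixels = image_rgb.reshape(-1, 3).astype(np.float32)
    # Euclidean distance from every pixel to every palette color, shape (num_pixels, num_colors).
    distances = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)
    # Nearest palette color for each pixel (this is the indexing step).
    nearest = np.argmin(distances, axis=1)
    # Histogram of the indexed image; the most repeated entry is the dominant color.
    return np.argmax(np.bincount(nearest, minlength=len(palette)))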
To show the result of these steps, I used random crops of this work of art! of mine:
This is how images look, before and after indexing (left: original, right: indexed):
And these are most repeated colors (left: indexed, right: dominant color):
But since you said the number of images is large, you should know that these calculations are relatively time-consuming. The good news is that there are ways to increase the performance. For example, instead of using the Euclidean distance (formula above), you can use the City Block or Chebyshev distance. You can also calculate the distance only for a fraction of the pixels instead of all the pixels in an image. For this purpose, you can first scale the image down to a much smaller size (for example, 32 by 32) and perform the calculations on the pixels of this reduced image. If you decide to resize images, don't bother with bilinear or bicubic interpolation; it isn't worth the extra computation. Instead, go for nearest neighbor, which effectively performs a rectangular lattice sampling of the original image.
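For example, a hedged sketch of those speed-ups in Python (the 32 × 32 size and the use of OpenCV's resize are just illustrative choices):
import cv2
import numpy as np

def dominant_color_index_fast(image_rgb, palette):
    # Downscale with nearest-neighbor interpolation, i.e. a rectangular lattice sampling.
    small = cv2.resize(image_rgb, (32, 32), interpolation=cv2.INTER_NEAREST)
    pixels = small.reshape(-1, 3).astype(np.float32)
    # City Block (L1) distance instead of Euclidean: no squares or square roots needed.
    distances = np.abs(pixels[:, None, :] - palette[None, :, :].astype(np.float32)).sum(axis=2)
    nearest = np.argmin(distances, axis=1)
    return np.argmax(np.bincount(nearest, minlength=len(palette)))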
Although the mentioned changes will greatly increase the speed of the calculations, nothing good comes for free: this is a trade-off between performance and accuracy. For example, in the previous two pictures we see that the image, which was initially recognized as orange (code 20), is recognized as pink (code 26) after resizing.
To determine the parameters of the algorithm (distance measure, reduced image size and scaling algorithm), you should first perform the classification on a number of images with the highest possible accuracy and keep the results as ground truth. Then, with multiple experiments, find a combination of parameters that does not push the classification error above a maximum tolerable value.
#saastn's fantastic answer assumes you have a set of pre-defined colors that you want to sort your images into. The implementation is easier if you just want to classify the images into one color out of some set of X equidistant colors, à la histogram.
To summarize: round the color of each pixel in the image to the nearest color out of some set of equidistant color bins. This reduces the precision of your colors down to however many colors you desire. Then count all of the colors in the image and select the most frequent one as your classification for that image.
Here is my implementation of this in Python:
import cv2
import numpy as np
#Set this to the number of colors that you want to classify the images to
number_of_colors = 8
#Verify that the number of colors chosen is between the minimum possible and maximum possible for an RGB image.
assert 8 <= number_of_colors <= 16777216
#Get the cube root of the number of colors to determine how many bins to split each channel into.
number_of_values_per_channel = number_of_colors ** ( 1 / 3 )
#We will divide each pixel value by this divisor: the maximum channel value (255) divided by the number of bins per channel minus one (minus one to account for the zero bin).
divisor = 255 / (number_of_values_per_channel - 1)
#load the image and convert it to float32 for greater precision. cv2 loads the image in BGR (as opposed to RGB) format.
image = cv2.imread("image.png", cv2.IMREAD_COLOR).astype(np.float32)
#Divide each pixel by the divisor defined above, round to the nearest bin, then convert float32 back to uint8.
image = np.round(image / divisor).astype(np.uint8)
#Flatten the columns and rows into just one column per channel so that it will be easier to compare the columns across the channels.
image = image.reshape(-1, image.shape[2])
#Find and count matching rows (pixels), where each row consists of three values spread across three channels (Blue column, Green column, Red column).
uniques = np.unique(image, axis=0, return_counts=True)
#The first of the two arrays returned by np.unique is an array comprising all of the unique colors.
colors = uniques[0]
#The second of the two arrays returned by np.unique is an array comprising the counts of all of the unique colors.
color_counts = uniques[1]
#Get the index of the color with the greatest frequency
most_common_color_index = np.argmax(color_counts)
#Get the color that was the most common
most_common_color = colors[most_common_color_index]
#Multiply the channel values by the divisor to return the values to a range between 0 and 255
most_common_color = most_common_color * divisor
#If you want to name each color, you could also provide a list sorted from lowest to highest BGR values comprising
#the name of each possible color, and then use most_common_color_index to retrieve the name.
print(most_common_color)
I need to iterate over the pixels of a YUV NV12 buffer and set color. I think the conversion for NV12 format should be easy but I can't figure it out. If I could set the top 50x50 pixels at 0,0 to white, I'd be set. Thank you in advance.
Have you tried setting the first 12 bits (1.5 bytes) × number of pixels to all 0x00 or all 0xFF?
Since you really don't seem to care about the conversion, simply overwriting the buffer should suffice. If that works, you can tackle the other problems, like finding the right color and producing a rect instead of a line.
For the first, you need to understand the YUV coding: https://wiki.videolan.org/YUV#NV12. According to this document you will most likely need to overwrite values in the Y range and in the UV range, i.e. write at two different locations. That's very different from an RGB buffer, where all the color components of a pixel are stored close together. So you can start by overwriting the first 8 bits (one byte) in the Y range and the corresponding 2 bytes (U and V) in the UV range. That should set one pixel to a different color than before.
Finally you can tackle the display of the 50x50 rectangle. You'll need to know the image dimensions, because you'll need to offset after each row (if the buffer is transmitted by rows!). E.g., this graph:
.------.
|xx |
|xx |
| |
'------'
In an RGB color space with row-major transmitted values, the buffer would look like this: xx0000xx0000000000. So you would need to overwrite bytes 0-5 and bytes 18-23 (RGB), because the first range covers 2 pixels × 3 bytes (RGB), and the next range starts at row number (1) × image width (6) × 3 bytes (RGB), and so on. You have to apply the same thinking to the YUV color space, as sketched below.
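To make that concrete, here is a minimal Python sketch that fills the top-left 50 × 50 pixels of an NV12 frame with white, assuming the buffer is a flat bytearray of size width × height × 3/2 with the Y plane followed by the interleaved UV plane (white is roughly Y = 255 with U = V = 128):
def fill_white_rect_nv12(buf, width, height, rect_w=50, rect_h=50):
    # Y plane: one byte per pixel, row-major, starting at offset 0.
    for row in range(rect_h):
        start = row * width
        buf[start:start + rect_w] = b"\xff" * rect_w       # full luma
    # UV plane: starts after the Y plane; one interleaved (U, V) byte pair per 2x2 pixel block,
    # so each UV row is 'width' bytes and covers two image rows.
    uv_start = width * height
    for row in range(rect_h // 2):
        start = uv_start + row * width
        buf[start:start + rect_w] = b"\x80" * rect_w       # U = V = 128, i.e. no chroma
    return buf

# Example usage with a hypothetical 640x480 frame:
# frame = bytearray(640 * 480 * 3 // 2)
# fill_white_rect_nv12(frame, 640, 480)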
I have a 48-bit (16 bits per channel) image I've loaded with FreeImage. I'm trying to generate a histogram from this image without having to convert it to a 24-bit image.
This is how I understand histograms are calculated:
for (pixel in pixels)
{
red_histo[pixel.red]++;
}
Where pixel.red can be between 0 and 255, so there is a range from 0 to 255 on my histogram. But with 16 bits per channel it could be between 0 and 65535, which is too large to display on a histogram.
Is there a standard way to calculate histograms with 48-bit (or higher) images?
You have to decide how many bins you need in the histogram. For example, the MATLAB imhist function takes these forms:
imhist(I)
imhist(I, n)
imhist(X, map)
In the first case, the number of bins defaults to 256. So if you have 16-bit input, the values will be scaled down to 8 bit and split into a 256-bin histogram.
In the second one, you can specify the number of bins n. Let's say you specify n = 2 for your 16-bit data. Then this will essentially split the histogram into the ranges [0, 2^15) and [2^15, 2^16 - 1].
The third case is where you specify the map for each bin, i.e. you have to specify the ranges of pixel values for each bin.
http://www.mathworks.com/help/images/ref/imhist.html
How you want to choose the number of bins depends on your requirement.
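Outside MATLAB the same idea is easy to reproduce; here is a small NumPy sketch (the 256-bin default is just an example choice):
import numpy as np

def channel_histogram(channel_16bit, num_bins=256):
    # Each bin covers 65536 / num_bins consecutive 16-bit values.
    counts, _ = np.histogram(channel_16bit, bins=num_bins, range=(0, 65536))
    return counts

# Example: a 256-bin histogram of random 16-bit "red" channel data.
red = np.random.randint(0, 65536, size=(480, 640), dtype=np.uint16)
hist = channel_histogram(red)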
This Stack Overflow question may have the answer you are looking for.
I do not know if there is a "standard" way.
If this is for display purposes you can scale back the pixels to keep the range from 0-255 for instance:
double scalingFactor = 255.0 / 65535.0;
for (pixel in pixels)
{
red_histo[(int)(scalingFactor * pixel.red)]++;
}
This will bring the upper end of the 16-bit pixel range in at 255 and the lower end in at 0.
I need to mask the green pixels in an image.
I have an example of masking the red pixels.
Here is the example:
Image<Hsv, Byte> hsv = image.Convert<Hsv, Byte>();
Image<Gray, Byte>[] channels = hsv.Split();
//channels[0] is the mask for hue less than 20 or larger than 160
CvInvoke.cvInRangeS(channels[0], new MCvScalar(20), new MCvScalar(160), channels[0]);
channels[0]._Not();
but I can't understand where those parameters were taken from:
new MCvScalar(20), new MCvScalar(160)
Any idea which parameters I have to take to mask the green pixels?
Thank you in advance.
The code masks pixels with Hue outside the range 20 - 160 (or rather, it masks pixels inside the range and then inverts the mask).
First, understand HSV (Hue, Saturation, Value): http://en.wikipedia.org/wiki/HSL_and_HSV
The actual Hue is in degrees and goes from 0 to 360 like:
Then see OpenCV documentation on 8-bit HSV format:
Hue is first calculated in 0 - 360, then divided by 2 to fit into 8-bit integer.
This means that in the original example the masked pixels have actual Hue under 40 or above 320 degrees. Apparently that's 0 degrees plus / minus 40.
For a similar range of greens you'd want 120 +/- 40, i.e. from 80 to 160. Finally converting that to 8-bit representation - from 40 to 80.
The actual code will differ from your sample though: for red they had to mask 20 - 160 and then invert the mask. For green, just masking from 40 to 80 is enough (i.e. you'll have to omit the channels[0]._Not(); part).
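For reference, a minimal OpenCV/Python sketch of the same idea, using the 40 - 80 hue range derived above (the filename and the full saturation/value bounds are placeholders):
import cv2

image = cv2.imread("image.png")                  # loaded as BGR
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)     # OpenCV stores Hue as 0-179 (degrees / 2)
# Hue between 40 and 80 (80-160 degrees) selects the greens; S and V are left unrestricted.
green_mask = cv2.inRange(hsv, (40, 0, 0), (80, 255, 255))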
What is "rescale intercept" and "rescale slope" in DICOM image (CT)?
How to calculate window width and window center with that?
The rescale intercept and slope are applied to transform the pixel values of the image into values that are meaningful to the application.
For instance, the original pixel values could store a device-specific value that has a meaning only when used by the device that generated it: applying the rescale slope/intercept to the pixel values converts the original values into optical density or other known measurement units (e.g. Hounsfield units).
When the transformation is not linear, then a LUT (lookup table) is applied.
After the modality transform has been applied (rescale slope/intercept or LUT) then the window width/center specify which pixels should be visible: all the pixels outside the values specified by the window are displayed as black or white.
For instance, if the window center is 100 and the window width is 20 then all the pixels with a value smaller than 90 are displayed as black and all the pixels with a value bigger than 110 are displayed as white.
This allows displaying only portions of the image (for instance just the bones or just the tissues).
Hounsfield scale: http://en.wikipedia.org/wiki/Hounsfield_scale
How to apply the rescale slope/intercept:
final_value = original_value * rescale_slope + rescale_intercept
How to calculate the pixels to display using the window center/width:
lowest_visible_value = window_center - window_width / 2
highest_visible_value = window_center + window_width / 2
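Putting the two steps together, a small NumPy sketch (the slope, intercept and window numbers are placeholders, and the mapping to 0-255 is just one possible display convention):
import numpy as np

def apply_rescale_and_window(raw_pixels, slope, intercept, center, width):
    values = raw_pixels * slope + intercept      # modality transform, e.g. to Hounsfield units
    low = center - width / 2                     # values below this are shown as black
    high = center + width / 2                    # values above this are shown as white
    clipped = np.clip(values, low, high)
    return ((clipped - low) / (high - low) * 255).astype(np.uint8)

# Example: slope 1, intercept -1024, window center 100, window width 20.
display = apply_rescale_and_window(np.array([900, 1124, 1200]), 1.0, -1024.0, 100.0, 20.0)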
Rescale intercept and slope are a simple linear transform applied to the raw pixel data before applying the window width/center. The basic formula is:
NewValue = (RawPixelValue * RescaleSlope) + RescaleIntercept