How to check whether a TIFF file truly has 16 bit depth - image-processing

Assume I convert an 8-bit TIFF image to 16-bit with the following ImageMagick command:
$ convert 8bit-image.tif -depth 16 16bit-image.tif
The result is a file that is detected by other programs as a file with 16-bit depth:
$ identify 16bit-image.tif
16bit-image.tif TIFF 740x573 740x573+0+0 16-bit sRGB 376950B 0.000u 0:00.000
Naturally, this file does not have "true" 16 bit, since it's an 8 bit file which has simply been marked as 16 bit. It hasn't got the subtle nuances one would expect from true 16 bit. How can I distinguish a true 16 bit image from one that just "pretends"?
Best,
Bela

When you have an 8-bit image, the pixel values range from 0 to 255. For a 16-bit image, the pixel range is from 0 to 65535. So you can express more nuances in 16 bit than you can in 8 bit.
Usually, when you have a 16-bit imager in a camera, it is able to capture these nuances and map them to the full 16 bit range. An 8 bit imager will be limited to a smaller range, so when taking the picture, some information is lost compared to the 16 bit imager.
Now when you start out with an 8 bit image, that information is already lost, so converting to 16 bit will not give you greater nuance, because ImageMagick cannot invent information where there is none.
What image processing tools usually do is simply copy the pixel values of your 8-bit image into the 16-bit image, so your 16-bit image will still contain only values in the range [0,255]. If that is what happened in your example, you can check whether the brightest pixel of your 16-bit image is greater than 255. If it is, you can assume it is a native 16-bit image. If it isn't, it was likely converted from 8 bit.
However, this is not a guarantee: the image could simply be a very dark native 16-bit image that, by chance, only uses values within the 8-bit range.
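A minimal sketch of that check in Python (assuming the tifffile package is available; the filename is just a placeholder):

import tifffile

img = tifffile.imread("16bit-image.tif")   # placeholder filename
print("dtype:", img.dtype, "max value:", img.max())
if img.max() > 255:
    print("values beyond the 8-bit range -> likely native 16 bit")
else:
    print("all values fit in 8 bit -> possibly converted, or just a very dark image")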
Edit: It is also possible that someone converts the 8-bit image to 16 bit using the full 16-bit range. In that case a pixel of value 0 would remain at 0, a pixel at 255 would become 65535, and all values in between would be spread evenly over the 16-bit range.
However, since no new information can be invented, there will be gaps in the pixel values used: with a simple scaling you would only see values such as 0, 257, 514 and so on (multiples of 65535/255), and the values in between would never occur.
Depending on the algorithm used to stretch the pixel range, the exact values may differ, but you can spot such a conversion by looking at the image's histogram:
It will have a distinctive comb-like structure (an example histogram showing this can be found at http://www.northlight-images.co.uk/digital-black-and-white-working-in-16-bit/).
So depending on how the conversion from 8 to 16 bit was done, finding out whether an image is native 16 bit may be more complicated, and even then you cannot robustly guarantee whether the image was actually converted or not.
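Still, counting the distinct pixel values gives a quick first hint. A hedged Python sketch (again assuming tifffile and NumPy; the filename is a placeholder): an image stretched up from 8 bit can use at most 256 distinct values, spaced in regular steps, whereas a native 16-bit photograph typically uses far more.

import numpy as np
import tifffile

img = tifffile.imread("16bit-image.tif")            # placeholder filename
values = np.unique(img)
print("distinct pixel values:", len(values))

if len(values) <= 256:
    # regular spacing between the used values is the "comb" signature
    print("value spacing:", np.unique(np.diff(values)))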

Related

What does ImageJ do when converting an 8-bit color image to a 16-bit grayscale image?

I use ImageJ to convert 8-bit black-and-white RGB images to 16-bit grayscale images, but the conversion seems to change the dynamics of the gray levels. Blacks seem blacker and whites seem whiter.
Here are captures of an image before and after conversion:
8-bit color (before ImageJ conversion)
16-bit grayscale (after ImageJ conversion)
In addition, the ImageJ documentation explains that this conversion (8-bit RGB to 16-bit grayscale) is not supported, but I still want to understand what's going on.
Does anyone have the same issue ?
Many thanks in advance :)

Image Processing: Determining a trapezoid from a list of points

The problem is fairly simple: I have the following image.
My list of points is the white pixels, I have them stored in a texture. What would be the best and possibly most efficient method to determine the trapezoid they define? (Convex shape with 4 corners, doesn't necessarily have 90 degree angles).
The texture is fairly small (800x600) so going for CUDA/CL is definitely not worth it (I'd rather iterate over the pixels if possible).
You should be able to do what you want, i.e. detect lines from incomplete information, using the Hough Transform.
There is a cool demo of it in the examples accompanying CImg, which itself is a rather nice, simple, header-only C++ image processing library. I have made a video of it here, showing how the accumulator space on the right is updated as I move the mouse first along a horizontal bar of the cage and then down a vertical bar. You can see the votes being cast in the accumulator, and how the corresponding point gradually builds up to a peak of bright white.
You can also experiment with ImageMagick on the command-line without needing to write or compile any code, see example here. ImageMagick is installed on most Linux distros and is available for macOS and Windows.
So, using your image:
magick trapezoid.png -background black -fill red -hough-lines 9x9+10 result.png
Or, if you want the underlying information that identifies the 4 lines:
magick trapezoid.png -threshold 50% -hough-lines 9x9+10 mvg:
# Hough line transform: 9x9+10
viewbox 0 0 784 561
# x1,y1 x2,y2 # count angle distance
line 208.393,0 78.8759,561 # 14 13 312
line 0,101.078 784,267.722 # 28 102 460
line 0,355.907 784,551.38 # 14 104 722
line 680.493,0 550.976,561 # 12 13 772
If you look at the numbers immediately following the hash (#), i.e. 14, 28, 14, 12, they are the votes, which correspond to the number of points/dots in your original image along that line. That is why I set the threshold to 10, in the 9x9+10 part - rather than using the 40 in the ImageMagick example I linked to: you have relatively few points on each line, so you need a lower threshold.
Note that the Hough Transform is also available in other packages, such as OpenCV.
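For completeness, here is a small sketch of the same idea using OpenCV's Python binding (assuming opencv-python is installed and trapezoid.png is the dotted input image; the accumulator threshold of 10 mirrors the 9x9+10 setting above):

import cv2
import numpy as np

img = cv2.imread("trapezoid.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# low accumulator threshold (10) because each line is defined by only a few points
lines = cv2.HoughLines(binary, 1, np.pi / 180, 10)

if lines is not None:
    for rho, theta in lines[:4, 0]:
        print("rho = %.1f, theta = %.1f degrees" % (rho, np.degrees(theta)))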

Mat_<uchar> for Image. Why?

I'm reading some code, and I cannot understand why it uses Mat_<uchar> for the image (in OpenCV) when calling:
threshold
What is the advantage of using this matrix type?
The OpenCV threshold function accepts as source image a single-channel (i.e. grayscale) matrix, either 8-bit integer or 32-bit floating point.
So, in your case, you're passing a single channel 8 bit matrix. Its OpenCV type is CV_8UC1.
A Mat_<uchar> is also typedef'd as Mat1b, and the values of the pixels are in the range [0, 255], since the underlying type (uchar, aka unsigned char) is 8 bit, with possible values from 0 to 2^8 - 1.
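A minimal sketch of the same constraint, shown with the Python binding for brevity (in the C++ code from the question the source would be a Mat_<uchar>, i.e. CV_8UC1; the filename here is a placeholder):

import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)     # single channel, dtype uint8
_, mask = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)

# a single-channel 32-bit float image also works; a 3-channel color image does not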

How to choose the number of bins when creating HSV histogram?

I was reading some documentation about HSV histogram, and in several refs the Saturation channel was quantized into 256 values. Why is that? Is there any reason behind choosing this number?
I have the same questions for the Hue channel, often it is quantized into 180 values.
Disclaimer: Off-hand answers (i.e., not backed up by any documentation):
"256" is a popular number for a bin size because Programmers Like Round Numbers -- it fits in a single byte. And "180" because the HSB circle is "360 [degrees]", but "360" does not fit into a single byte.
For many image formats, the range of RGB values is limited to 0..255 per channel -- 3 bytes in total. To store the same amount of data (ignoring any artifacts of converting to another color model), Saturation and Brightness are often expressed in single bytes as well. The same could be done for Hue, by scaling the original range of 0..359 (as Hue is usually expressed as a value in degrees on the HSB color wheel) into the byte range 0..255. However, probably because it's easier to do calculations with a number close to the original 360° full circle, the range is instead halved to 0..179. That way the value still fits into a single byte (so "HSB" uses as much memory as "RGB") and can be converted trivially back to (close to) its original value -- just multiply by 2. Obviously, staying within the storage space wins over fidelity.
Given 256 values for both S and B, and 180 for H, you end up with a color space of 256*256*180 = 11,796,480 colors. To inspect how those colors are distributed, you build a histogram: an array from which you can read off the total number of pixels in a certain color or color range. Using a color range here, instead of individual values, significantly cuts down the memory requirements.
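As a rough sketch of that trade-off with OpenCV's Python binding (opencv-python assumed; the filename is a placeholder), note that the bin counts are simply parameters you choose:

import cv2

img = cv2.imread("photo.jpg")                       # placeholder filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)          # H in [0,179], S and V in [0,255]

# full resolution: 180 x 256 bins for H and S
full = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

# coarser ranges: 30 x 32 bins cut memory use and fill up more evenly
coarse = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
print(full.shape, coarse.shape)                     # (180, 256) (30, 32)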
For an RGB color image, with the colors fairly evenly distributed, you could shift down each channel a certain number of bits. This is how a straightforward conversion from 24-bit "true-color" RGB down to 15-bit RGB "high-color" space works: each channel gets divided by 8, reducing 256 values down to 32 (5 bits per channel). Conversion to a 16-bit high-color RGB space works the same; the bit that got left over in the 15-bit conversion is assigned to green. Thus, the range of colors for green is doubled, which is useful since the human eye is more perceptive for shades of green than for the other two primaries.
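A tiny illustration of that bit-shift reduction (plain Python; the function name is made up for the example):

def rgb888_to_rgb565(r, g, b):
    # drop 3 bits from red and blue (256 -> 32 levels) and 2 bits from green
    # (256 -> 64 levels); green keeps the extra bit of the 16-bit layout
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

print(hex(rgb888_to_rgb565(255, 255, 255)))   # 0xffff
print(hex(rgb888_to_rgb565(255, 0, 0)))       # 0xf800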
It gets more complicated when the colors in the input image are not evenly distributed. A naive solution is to create an array of [256][256][256], initialize all to zero, then fill the array with the colors of the image, and finally sort them. There are better alternatives -- let me consult my old Computer Graphics [1] here. Hold on.
13.4 Reproducing Color mentions the names of two different approaches from Heckbert (Color Image Quantization for Frame Buffer Display, SIGGRAPH 82): the popularity and the median-cut algorithms. (Unfortunately, that's all they say about this topic. I assume efficient code for both can be googled for.)
A rough guess:
The size for each bin (H, S, B) should reflect what you are trying to use the histogram for. This older SO question, for example, uses a large number of bins for hue -- color is considered the most important -- and only 3 different values for both saturation and brightness. Thus, bright images with some subdued areas (say, a comic book) will give a good spread in this histogram, but a real-color photograph will not spread nearly as well.
The main limit is that the bin sizes, multiplied with each other, should use a reasonably small amount of memory, yet cover enough of each component to get filled evenly. Perhaps some trial-and-error comes into play here. You could initially distribute the H, S, and B components evenly over the available memory in your histogram and process a small part of the image; say, 1 out of 4 pixels, horizontally and vertically. If you notice one of the component bins fills up too fast while others stay untouched, adjust the ranges and restart.
If you need to do an analysis of multiple pictures, make sure they are all alike in their color gamut. You cannot expect one bin size to work on all sorts of images; you would end up with an even distribution where all matches are only so-so.
[1] J.D. Foley, A. van Dam, S.K. Feiner, and J.F. Hughes, Computer Graphics: Principles and Practice, 2nd ed., Reading, MA: Addison-Wesley, 1997.

OpenCV data types

depth Pixel depth in bits. The supported depths are:
IPL_DEPTH_8U Unsigned 8-bit integer
IPL_DEPTH_8S Signed 8-bit integer
IPL_DEPTH_16U Unsigned 16-bit integer
IPL_DEPTH_16S Signed 16-bit integer
IPL_DEPTH_32S Signed 32-bit integer
IPL_DEPTH_32F Single-precision floating point
IPL_DEPTH_64F Double-precision floating point
What do these values actually stand for?
How many bits does each one use?
What is the difference between:
Unsigned 8-bit integer and Signed 8-bit integer?
Unsigned 16-bit integer and Signed 16-bit integer?
if each pair occupies 8 and 16 bits respectively?
What is the point of using floating-point data types?
An unsigned 8-bit value ranges from 0 to 255, while a signed 8-bit value ranges from -128 to 127. Most digital cameras produce unsigned data. Signed data is mainly the result of an operation on an image, such as the gradient step of a Canny edge detection.
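A small sketch of why the signed types exist, using the Python binding (opencv-python assumed; the filename is a placeholder): a derivative filter produces negative values, which an unsigned destination type would clip away.

import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # unsigned 8-bit input (CV_8U)

grad_signed  = cv2.Sobel(gray, cv2.CV_16S, 1, 0)        # signed 16-bit keeps negative gradients
grad_clipped = cv2.Sobel(gray, cv2.CV_8U, 1, 0)         # negatives are clipped to 0

print(grad_signed.min(), grad_clipped.min())            # e.g. -1020 vs 0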
The reason for higher bit-depth images, such as 16 bit, is more detail in the image. This allows more operations, such as white balancing or brightening, without creating artifacts. For example, a dark image that has been brightened too much shows distinct banding. A 16-bit image can be brightened further than an 8-bit image because there is more information to start with.
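A toy NumPy illustration of that banding point (both ramps are synthetic; the 16-bit one stands in for a native 16-bit capture of the same dark scene):

import numpy as np

dark8  = np.arange(0, 64, dtype=np.uint8)          # dark scene captured in 8 bit: 64 levels
dark16 = np.arange(0, 64 * 256, dtype=np.uint16)   # the same brightness span captured in 16 bit

bright8  = np.clip(dark8.astype(np.int32) * 4, 0, 255).astype(np.uint8)       # gaps of 4 -> banding
bright16 = np.clip(dark16.astype(np.int64) * 4, 0, 65535).astype(np.uint16)   # still a smooth ramp

print(len(np.unique(bright8)), "levels after brightening the 8-bit data")     # 64
print(len(np.unique(bright16)), "levels after brightening the 16-bit data")   # 16384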
Some operations work better with floating-point data, for example an FFT (Fast Fourier Transform). If too many operations are done on an image, the error from rounding the pixel values to integers at every step starts to accumulate. Using floating-point numbers mitigates this, but doesn't eliminate it.
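A toy NumPy illustration of that accumulation (synthetic values, not a real filter): a small fractional brightening applied many times is lost entirely when the result is rounded back to integers at every step.

import numpy as np

img_u8  = np.full((4, 4), 100, dtype=np.uint8)
img_f32 = img_u8.astype(np.float32)

for _ in range(50):
    img_u8  = np.clip(np.round(img_u8 + 0.4), 0, 255).astype(np.uint8)   # rounds back to 100 every time
    img_f32 = img_f32 + 0.4                                              # keeps the fractional change

print(img_u8[0, 0], img_f32[0, 0])   # 100 versus ~120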
