OpenCV data types

depth Pixel depth in bits. The supported depths are:
IPL_DEPTH_8U Unsigned 8-bit integer
IPL_DEPTH_8S Signed 8-bit integer
IPL_DEPTH_16U Unsigned 16-bit integer
IPL_DEPTH_16S Signed 16-bit integer
IPL_DEPTH_32S Signed 32-bit integer
IPL_DEPTH_32F Single-precision floating point
IPL_DEPTH_64F Double-precision floating point
What do these values actually stand for?
How many bits does each one use?
What is the difference between:
an unsigned 8-bit integer and a signed 8-bit integer?
an unsigned 16-bit integer and a signed 16-bit integer?
given that they take 8 and 16 bits respectively?
What is the point of using data types with floating point?

An unsigned 8-bit integer has values from 0 to 255, while a signed 8-bit integer has values from -128 to 127. Most digital cameras produce unsigned data. Signed data is mainly the result of an operation on an image, such as a derivative-based edge filter (e.g. Sobel), whose output can be negative.
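As a rough sketch of those ranges: cv::saturate_cast clamps out-of-range values to a type's limits, which is also how OpenCV's integer arithmetic typically handles overflow.

```cpp
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    // CV_8U elements live in [0, 255]; CV_8S elements live in [-128, 127].
    // saturate_cast<> clamps out-of-range results instead of wrapping around.
    std::cout << int(cv::saturate_cast<uchar>(300))  << "\n";  // 255
    std::cout << int(cv::saturate_cast<uchar>(-20))  << "\n";  // 0
    std::cout << int(cv::saturate_cast<schar>(200))  << "\n";  // 127
    std::cout << int(cv::saturate_cast<schar>(-200)) << "\n";  // -128
    return 0;
}
```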
The reason for higher bit depths, such as 16 bit, is more detail in the image. This allows more operations, such as white balancing or brightening, without creating artifacts. For example, a dark image that has been brightened too much shows distinct banding. A 16-bit image can be brightened further than an 8-bit image, because there is more information to start with.
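A rough sketch of that headroom difference, using a synthetic dark gradient rather than a real photo: quantize the same scene to 8 and to 16 bits, brighten both by the same factor, and count how many distinct levels survive.

```cpp
#include <opencv2/core.hpp>
#include <iostream>
#include <set>

int main() {
    // Simulate a very dark scene: the true values occupy only ~5% of the range.
    cv::Mat scene(1, 1000, CV_32F);
    for (int i = 0; i < scene.cols; ++i)
        scene.at<float>(0, i) = 0.05f * i / scene.cols;

    cv::Mat as8u, as16u;
    scene.convertTo(as8u,  CV_8U,  255.0);    // 8-bit "capture"
    scene.convertTo(as16u, CV_16U, 65535.0);  // 16-bit "capture"

    // Brighten both by the same factor. The 8-bit version only has a dozen or so
    // input levels to work with, so the output shows coarse banding.
    cv::Mat bright8, bright16;
    as8u.convertTo(bright8,   -1, 18.0);
    as16u.convertTo(bright16, -1, 18.0);

    std::set<uchar>  levels8(bright8.begin<uchar>(),    bright8.end<uchar>());
    std::set<ushort> levels16(bright16.begin<ushort>(), bright16.end<ushort>());
    std::cout << "distinct levels after brightening, 8 bit:  " << levels8.size()  << "\n";
    std::cout << "distinct levels after brightening, 16 bit: " << levels16.size() << "\n";
    return 0;
}
```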
Some operations work better with floating-point data, for example an FFT (Fast Fourier Transform). If too many operations are done on an integer image, the error from rounding the pixel values to an integer at every step starts to accumulate. Using a floating-point type mitigates this, but doesn't eliminate it.
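A rough sketch of that accumulation: apply a gain and its inverse repeatedly to an 8-bit copy and a 32-bit float copy, then compare both against the original. Mathematically nothing should change, but the 8-bit path rounds at every step.

```cpp
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    cv::Mat ref(100, 100, CV_32F);
    cv::randu(ref, cv::Scalar::all(0), cv::Scalar::all(255));

    cv::Mat as8u, asf;
    ref.convertTo(as8u, CV_8U);
    ref.convertTo(asf, CV_32F);

    // Repeatedly apply a gain and its inverse. This is a no-op in exact math,
    // but the 8-bit version rounds to an integer after every convertTo.
    for (int i = 0; i < 50; ++i) {
        cv::Mat tmp;
        as8u.convertTo(tmp, CV_8U, 0.7);
        tmp.convertTo(as8u, CV_8U, 1.0 / 0.7);
        asf.convertTo(tmp, CV_32F, 0.7);
        tmp.convertTo(asf, CV_32F, 1.0 / 0.7);
    }

    cv::Mat ref8u;
    ref.convertTo(ref8u, CV_8U);
    std::cout << "8-bit drift (max abs): " << cv::norm(as8u, ref8u, cv::NORM_INF) << "\n";
    std::cout << "float drift (max abs): " << cv::norm(asf, ref, cv::NORM_INF) << "\n";
    return 0;
}
```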

Related

How does the "compressed form" of `cv::convertMaps` work?

The documentation for convertMaps says that it supports the following transformation:
(CV_32FC1, CV_32FC1)→(CV_16SC2, CV_16UC1) This is the most frequently used conversion operation, in which the original floating-point maps (see remap) are converted to a more compact and much faster fixed-point representation. The first output array contains the rounded coordinates and the second array (created only when nninterpolation=false) contains indices in the interpolation tables.
I understand that (CV_32FC1, CV_32FC1) is encoding (x, y) coordinates as floats. How does the fixed point format work? What is encoded in each 2-channel entry of the CV_16SC2 matrix? What interpolation tables does the CV_16UC1 matrix index into?
I'm going by what I remember from the last time I investigated this. Grain of salt and all that.
the fixed point format splits the integer and fractional parts of your (x,y)-coordinates into different maps.
it's "compact" in that CV_32FC2 or 2x CV_32FC1 uses 8 bytes per pixel, while CV_16SC2 + CV_16UC1 uses 6 bytes per pixel. also it's integer-only, so using it can free up floating point compute resources for other work.
the integer parts go into the first map, which is 2-channel. no surprises there.
the fractional parts are converted to 5-bit integers, i.e. they're multiplied by 32 and rounded. Then they're packed into a single value: the lowest 5 bits come from one coordinate and the next 5 bits from the other.
the resulting funny number has a range of 0 .. 1023, or 0b00000_00000 .. 0b11111_11111, which encodes fractional parts (0.0, 0.0) and (0.96875, 0.96875) respectively (that's 31/32).
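A small sketch to poke at that packing (the exact bit order is an OpenCV internal, so this just unpacks both 5-bit fields and checks that the 0.25 and 0.75 fractions reappear):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    // One float map per coordinate, as remap expects.
    cv::Mat mapx(1, 1, CV_32FC1, cv::Scalar(10.25f));
    cv::Mat mapy(1, 1, CV_32FC1, cv::Scalar(3.75f));

    cv::Mat fixed, frac;
    cv::convertMaps(mapx, mapy, fixed, frac, CV_16SC2, false);

    cv::Vec2s xy = fixed.at<cv::Vec2s>(0, 0);   // integer parts of (x, y)
    ushort idx   = frac.at<ushort>(0, 0);       // packed 5+5-bit fractional index

    // Unpack as described above: 5 bits per coordinate, each a fraction times 32.
    std::cout << "int parts:   " << xy[0] << ", " << xy[1] << "\n";
    std::cout << "low 5 bits:  " << (idx & 31) / 32.0 << "\n";
    std::cout << "high 5 bits: " << ((idx >> 5) & 31) / 32.0 << "\n";
    return 0;
}
```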
during remap...
the integer map is used to look up, for every resulting pixel, several pixels in the source image required for interpolation.
the fractional map is taken as an index into an "interpolation table", which is internal to OpenCV. It contains whatever factors and shifts are required to correctly blend the several sampled pixels into one resulting pixel, all using integer math. I guess there are multiple tables, one for each interpolation method (linear, cubic, ...).
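And a short end-to-end sketch: remap through the converted fixed-point maps should match the float-map result up to the 1/32 quantization of the fractions.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat src(200, 200, CV_8UC1);
    cv::randu(src, cv::Scalar::all(0), cv::Scalar::all(255));

    // Float maps: shift the image by a fractional amount.
    cv::Mat mapx(src.size(), CV_32FC1), mapy(src.size(), CV_32FC1);
    for (int y = 0; y < src.rows; ++y)
        for (int x = 0; x < src.cols; ++x) {
            mapx.at<float>(y, x) = x + 0.3f;
            mapy.at<float>(y, x) = y + 0.6f;
        }

    cv::Mat dstFloat, dstFixed, fixed, frac;
    cv::remap(src, dstFloat, mapx, mapy, cv::INTER_LINEAR);

    // Same remap, but through the compact fixed-point representation.
    cv::convertMaps(mapx, mapy, fixed, frac, CV_16SC2, false);
    cv::remap(src, dstFixed, fixed, frac, cv::INTER_LINEAR);

    // The two results should agree up to the 1/32 quantization of the fractions.
    std::cout << "max abs difference: "
              << cv::norm(dstFloat, dstFixed, cv::NORM_INF) << "\n";
    return 0;
}
```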

How to check whether a TIFF file truly has 16 bit depth

Assume I convert an 8-bit TIFF image to 16-bit with the following ImageMagick command:
$ convert 8bit-image.tif -depth 16 16bit-image.tif
The result is a file that is detected by other programs as a file with 16-bit depth:
$ identify 16bit-image.tif
16bit-image.tif TIFF 740x573 740x573+0+0 16-bit sRGB 376950B 0.000u 0:00.000
Naturally, this file does not have "true" 16 bit, since it's an 8 bit file which has simply been marked as 16 bit. It hasn't got the subtle nuances one would expect from true 16 bit. How can I distinguish a true 16 bit image from one that just "pretends"?
When you have an 8-bit image, the pixel values range from 0 to 255. For a 16-bit image, the pixel range is from 0 to 65535. So you can express more nuances in 16 bit than you can in 8 bit.
Usually, when you have a 16-bit imager in a camera, it is able to capture these nuances and map them to the full 16 bit range. An 8 bit imager will be limited to a smaller range, so when taking the picture, some information is lost compared to the 16 bit imager.
Now when you start out with an 8 bit image, that information is already lost, so converting to 16 bit will not give you greater nuance, because ImageMagick cannot invent information where there is none.
What image processing tools usually do is copy the pixel values of your 8-bit image into the 16-bit image, so your 16-bit image will still contain only values in the range [0, 255]. If this is the case in your example, you can check whether the brightest pixel of your 16-bit image is greater than 255. If it is, you can assume that it is a native 16-bit image. If it isn't, it was likely converted from 8 bit.
However, there is not a guarantee that the 16 bit image was really converted from 8 bit, as it could simply be a very dark native 16 bit image that only uses the darkest pixels from the 8 bit range by chance.
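A sketch of that brightest-pixel check in OpenCV, reusing the file name from the question and assuming a plain value-copy conversion:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>

int main() {
    // Load without any automatic depth conversion.
    cv::Mat img = cv::imread("16bit-image.tif", cv::IMREAD_UNCHANGED);
    if (img.empty() || img.depth() != CV_16U) {
        std::cerr << "not a 16-bit unsigned image (or failed to load)\n";
        return 1;
    }

    double minVal, maxVal;
    cv::minMaxLoc(img.reshape(1), &minVal, &maxVal);  // treat all channels as one plane
    std::cout << "max pixel value: " << maxVal << "\n";
    if (maxVal <= 255)
        std::cout << "looks like 8-bit data stored in a 16-bit container\n";
    return 0;
}
```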
Edit: It is possible that someone converts the 8-bit image to 16 bit using the full 16-bit range. This could mean that a pixel of value 0 remains at 0, a pixel at 255 is now at 65535, and all values in between are evenly distributed across the 16-bit range.
However, since no new information can be invented, there will be gaps in the pixel values used; e.g. you might have pixels of value 0, 257, 514 and so on, but the values in between never occur.
Depending on the algorithm used for stretching the pixel range, these specific values may differ, but you would be able to spot a conversion like that by looking at the image's histogram:
It will have a distinctive comb-like structure (see, for example, the histograms at http://www.northlight-images.co.uk/digital-black-and-white-working-in-16-bit/).
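A sketch of the comb check, again reusing the question's file name: count how many distinct 16-bit values actually occur; a stretched 8-to-16-bit conversion can use at most 256 of them.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    cv::Mat img = cv::imread("16bit-image.tif", cv::IMREAD_UNCHANGED);
    if (img.empty() || img.depth() != CV_16U) return 1;

    cv::Mat gray = img.reshape(1);            // treat all channels as one plane
    int histSize = 65536;                      // one bin per possible 16-bit value
    float range[] = {0, 65536};
    const float* ranges[] = {range};
    int channels[] = {0};
    cv::Mat hist;
    cv::calcHist(&gray, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);

    // An 8-bit original, however it was stretched, fills at most 256 bins,
    // leaving the rest empty: the comb pattern.
    int usedBins = cv::countNonZero(hist);
    std::cout << "distinct 16-bit values used: " << usedBins << "\n";
    if (usedBins <= 256)
        std::cout << "histogram is comb-like; probably converted from 8 bit\n";
    return 0;
}
```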
So depending on how the conversion from 8 to 16 bit was done, finding out whether an image is native 16 bit may be more complicated, and even then there is no guarantee that you can robustly tell whether it was converted.

How are the bits allocated?

In OpenCV, the type of the elements in a cv::Mat object can be, for instance, CV_32FC1 or CV_32FC3, which represent 32-bit floating point with one channel and with three channels, respectively.
I assumed the CV_32FC3 type could be used to represent color images that have blue, green and red channels plus an alpha channel for transparency, with each channel getting 8 bits.
I'm wondering how the bits being allocated in CV_32FC1 type, when there's only one channel?
32F means float. The number after the C is the number of channels. So CV_32FC3 means 3 floats per pixel, while CV_32FC1 is one float only. What these floats mean is up to you and not explicitly stored in the Mat.
The float is stored in memory just as it would be in a regular C program (typically in little endian).
A classical BGR image (default channel ordering in OpenCV) would be a CV_8UC3: 8 bit unsigned integer per channel in three channels.
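You can confirm the per-pixel layout with Mat::elemSize and Mat::elemSize1; a minimal sketch:

```cpp
#include <opencv2/core.hpp>
#include <iostream>

int main() {
    cv::Mat f1(2, 2, CV_32FC1);   // one 32-bit float per pixel
    cv::Mat f3(2, 2, CV_32FC3);   // three 32-bit floats per pixel
    cv::Mat u3(2, 2, CV_8UC3);    // classic BGR: three 8-bit values per pixel

    std::cout << f1.channels() << " x " << f1.elemSize1() << " bytes = "
              << f1.elemSize() << " bytes/pixel\n";   // 1 x 4 = 4
    std::cout << f3.channels() << " x " << f3.elemSize1() << " bytes = "
              << f3.elemSize() << " bytes/pixel\n";   // 3 x 4 = 12
    std::cout << u3.channels() << " x " << u3.elemSize1() << " bytes = "
              << u3.elemSize() << " bytes/pixel\n";   // 3 x 1 = 3
    return 0;
}
```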

opencv threshold operation limited to 8-bit types and 32-bit floats

I tried to threshold a CV_32S grayscale image. Lo and behold, the operation fails. After a more careful reading of the docs I noticed:
src = input array (single-channel, 8-bit or 32-bit floating point) ...
Why the limitation to these types only?
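The documentation doesn't say why the other depths are excluded, but a common workaround, sketched here under the assumption that you only need a binary mask, is to route the CV_32S data through CV_32F:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

int main() {
    // A CV_32S single-channel image (label images, e.g. from
    // cv::connectedComponents, are a typical source of this depth).
    cv::Mat labels(100, 100, CV_32S, cv::Scalar(0));
    labels(cv::Rect(20, 20, 30, 30)).setTo(cv::Scalar(7));

    // Workaround for the depth restriction: go through CV_32F, threshold,
    // then convert to the depth you actually need.
    cv::Mat asFloat, maskF, mask;
    labels.convertTo(asFloat, CV_32F);
    cv::threshold(asFloat, maskF, 0.5, 255, cv::THRESH_BINARY);
    maskF.convertTo(mask, CV_8U);   // mask is now 0/255, CV_8U
    return 0;
}
```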

Mat_<uchar> for Image. Why?

I'm reading some code, and I can't understand why it uses Mat_<uchar> for the image (in OpenCV) when calling:
threshold
What is the advantage of using this matrix type?
OpenCV threshold function accepts as source image a 1 channel (i.e. grayscale) matrix, either 8 bit or 32 bit floating point.
So, in your case, you're passing a single channel 8 bit matrix. Its OpenCV type is CV_8UC1.
A Mat_<uchar> also has the typedef Mat1b, and the values of the pixels are in the range [0, 255], since the underlying type (uchar, a.k.a. unsigned char) is 8 bit, with possible values from 0 to 2^8 - 1.
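A minimal sketch of why Mat_<uchar> is convenient here: its element type is fixed at compile time, so access needs no template argument, and it already satisfies threshold's 8-bit single-channel requirement.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

int main() {
    // Mat_<uchar> (a.k.a. Mat1b) is just a Mat whose type is fixed to CV_8UC1.
    cv::Mat1b gray(4, 4, uchar(0));
    gray(1, 1) = 200;                 // typed access: no gray.at<uchar>(...) needed

    cv::Mat1b mask;
    cv::threshold(gray, mask, 128, 255, cv::THRESH_BINARY);
    std::cout << cv::countNonZero(mask) << " pixel(s) above the threshold\n";
    return 0;
}
```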
