What is 'Depth' in Image Processing - opencv

When I develop image processing programs with OpenCV, I often see 'IPL_DEPTH_8U' or 'IPL_DEPTH_16U',
but I don't know what they mean.
What is the meaning of 'Depth' in the context of Image Processing?

Depth is the "precision" of each pixel. For display it is typically 8, 24 or 32 bits, but computations can use any precision.
Instead of precision you can also call it the data type of the pixel. The more bits per element, the more distinct colors or intensities can be represented.
Your examples mean:
8U : 8 bits per element (8 bits per channel if there are multiple channels) of unsigned integer type. So you can probably access the elements as unsigned char values, because that is the 8 bit unsigned type.
16U : 16 bits per element => unsigned short is typically the 16 bit unsigned integer type on your system.
In OpenCV you typically have these types:
8UC3 : 8 bit unsigned and 3 channels => 24 bit per pixel in total.
8UC1 : 8 bit unsigned with a single channel
32S: 32 bit signed integer type => int
32F: 32 bit floating point => float
64F: 64 bit floating point => double
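As a rough illustration of the list above, the following Python sketch maps each depth name to its bits per element and derives the bits per pixel and the raw size of an image. The dictionary and the helper names (DEPTHS, bits_per_pixel, image_bytes) are made up for this example and are not OpenCV identifiers; row padding is ignored.

```python
# Illustrative table: depth name -> (bits per element, usual C type).
# These names are hypothetical helpers, not part of the OpenCV API.
DEPTHS = {
    "8U": (8, "unsigned char"),
    "16U": (16, "unsigned short"),
    "32S": (32, "int"),
    "32F": (32, "float"),
    "64F": (64, "double"),
}

def bits_per_pixel(depth, channels=1):
    """Total bits for one pixel: bits per element times number of channels."""
    bits, _ctype = DEPTHS[depth]
    return bits * channels

def image_bytes(width, height, depth, channels=1):
    """Size of the raw pixel data in bytes, ignoring any row padding."""
    return width * height * bits_per_pixel(depth, channels) // 8

print(bits_per_pixel("8U", 3))         # 8UC3 -> 24 bits per pixel
print(image_bytes(640, 480, "8U", 3))  # 921600 bytes for a 640x480 8UC3 image
print(2 ** DEPTHS["16U"][0])           # 65536 distinct values per 16U element
```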
hope this helps

Related

Difference between bit and byte, and exact meaning of byte

This is just a basic theoretical question. I read that a bit is either 0 or 1, and that a byte consists of 8 bits, so in 8 bits we can store 2^8 numbers.
Similarly, in 10 bits we can store 2^10 (1024) numbers. But then why do we say that 1024 is 1 kilobyte? It is actually 10 bits, which is just 1.25 bytes to be exact.
Please share some knowledge on it,
just a concrete explanation.
A bit is the smallest unit of storage in a system, and 8 bits make up 1 byte.
A bit, short for binary digit, is the smallest unit of measurement used in computers for information storage. A bit is represented by a 1 or a 0 with the value true or false, also known as on or off. A single byte of information, also known as an octet, is made up of eight bits. The size, or amount of information stored, distinguishes a bit from a byte.
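Strictly speaking, the prefix kilo means 1,000, so a kilobit is 1,000 bits and a kilobyte is 1,000 bytes. In computing, however, "kilo" is often used for 1,024 (2^10), because memory sizes are naturally powers of two; the unambiguous names for the binary units are kibibit and kibibyte. Note that the 1,024 in a kilobyte counts bytes: it is 1,024 separate bytes (8,192 bits). Ten bits is only what you need to tell 1,024 different values apart, for example to give each of those 1,024 bytes its own address.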
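The two quantities that the question mixes up, the number of values n bits can distinguish and the amount of storage those bits occupy, can be compared directly. This is a minimal sketch; the helper names are made up for the example.

```python
BITS_PER_BYTE = 8

def distinct_values(n_bits):
    """How many different values n_bits can represent."""
    return 2 ** n_bits

def bits_to_bytes(n_bits):
    """Storage occupied by n_bits, expressed in bytes."""
    return n_bits / BITS_PER_BYTE

print(distinct_values(8))    # 256 values fit in one byte
print(distinct_values(10))   # 1024 values fit in ten bits
print(bits_to_bytes(10))     # 1.25 bytes of storage for those ten bits

# A binary kilobyte is 1024 bytes of storage, which is far more than 10 bits.
kilobyte_in_bits = 1024 * BITS_PER_BYTE
print(kilobyte_in_bits)      # 8192 bits
```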

How does an 8 bit processor interpret the 2 bytes of a 16 bit number as a single piece of information?

Assume the 16 bit number is 256.
So,
byte 1 = some binary number
byte 2 = some binary number
But byte 1 also represents an 8 bit number (which could be an independent decimal number), and so does byte 2.
So how does the processor know that bytes 1 and 2 represent the single number 256 and not two separate numbers?
The processor would need to have another long type for that. I guess you could implement a software equivalent, but for the processor, these two bytes would still have individual values.
The processor could also have a special integer representation and machine instructions that handle these numbers. For example, most modern machines use two's complement integers to represent negative numbers. In two's complement, the most significant bit (MSB) is used to differentiate negative numbers. So a two's complement 8-bit integer has a range of -128 (1000 0000) to 127 (0111 1111).
You could just as easily have the most significant bit mean something else. For example, when the MSB is 0 we have integers from 0 (0000 0000) to 127 (0111 1111); when the MSB is 1 we have integers from 256 (1000 0000) to 256 + 127 (1111 1111). Whether this is efficient or good architecture is another story.
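One common software approach, which the answer above does not spell out, is that the bytes carry no marker at all: the program simply processes the low byte and the high byte in sequence and passes a carry between them. The Python sketch below imitates this with 8-bit steps; the function names are hypothetical and stand in for what would be an add followed by an add-with-carry on a real 8-bit CPU.

```python
def split16(value):
    """Split a 16-bit unsigned value into (high byte, low byte)."""
    return (value >> 8) & 0xFF, value & 0xFF

def join16(high, low):
    """Interpret two bytes as one 16-bit unsigned value."""
    return (high << 8) | low

def add16_with_8bit_steps(a, b):
    """Add two 16-bit numbers using only 8-bit additions and a carry bit."""
    a_hi, a_lo = split16(a)
    b_hi, b_lo = split16(b)
    lo_sum = a_lo + b_lo            # first 8-bit add
    carry = lo_sum >> 8             # 1 if the low bytes overflowed
    hi_sum = a_hi + b_hi + carry    # second 8-bit add, including the carry
    return join16(hi_sum & 0xFF, lo_sum & 0xFF)

print(split16(256))                   # (1, 0): high byte 1, low byte 0
print(add16_with_8bit_steps(255, 1))  # 256
```

Whether two adjacent bytes are one number or two is therefore decided by the instructions that operate on them, not by the bytes themselves.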

Why does the pixel shader return float4 when the back buffer format is DXGI_FORMAT_B8G8R8A8_UNORM?

Alright, so this has been bugging me for a while now, and I could not find anything on MSDN that goes into the specifics I need.
This is more of a 3 part question, so here it goes:
1-) When creating the swapchain, applications specify the backbuffer pixel format, and most often it is either B8G8R8A8 or R8G8B8A8. This gives 8 bits per color channel, so a total of 4 bytes is used per pixel... so why does the pixel shader have to return a color as a float4 when a float4 is actually 16 bytes?
2-) When binding textures to the pixel shader, my textures are in DXGI_FORMAT_B8G8R8A8_UNORM format, but why does the sampler need a float4 per pixel to work?
3-) Am I missing something here? Am I overthinking this or what?
Please provide links to support your claim, preferably from MSDN!
GPUs are designed to perform calculations on 32bit floating point data, at least if they want to support D3D11. As of D3D10 you can also perform 32bit signed and unsigned integer operations. There's no requirement or language support for types smaller than 4 bytes in HLSL, so there's no "byte/char" or "short" for 1 and 2 byte integers or lower precision floating point.
Any DXGI formats that use the "FLOAT", "UNORM" or "SNORM" suffix are non-integer formats, while "UINT" and "SINT" are unsigned and signed integer. Any reads performed by the shader on the first three types will be provided to the shader as 32 bit floating point irrespective of whether the original format was 8 bit UNORM/SNORM or 10/11/16/32 bit floating point. Data in vertices is usually stored at a lower precision than full-fat 32bit floating point to save memory, but by the time it reaches the shader it has already been converted to 32bit float.
On output (to UAVs or Render Targets) the GPU converts the "float" or "uint" data to whatever format the target was created with. If you try outputting float4(4.4, 5.5, 6.6, 10.1) to a target that is 8-bit normalised, each component is simply clamped to the representable range, giving (1.0, 1.0, 1.0, 1.0), and the pixel only consumes 4 bytes.
So to answer your questions:
1) Because shaders only operate on 32 bit types, but the GPU will compress/truncate your output as necessary to be stored in the resource you currently have bound according to its type. It would be madness to have special keywords and types for every format that the GPU supported.
2) The "sampler" doesn't "need a float4 per pixel to work". I think you're mixing your terminology. The declaration that the texture is a Texture2D<float4> is really just stating that this texture has four components and is of a format that is not an integer format. "float" doesn't necessarily mean the source data is 32 bit float (or actually even floating point) but merely that the data has a fractional component to it (eg 0.54, 1.32). Equally, declaring a texture as Texture2D<uint4> doesn't mean that the source data is 32 bit unsigned necessarily, but more that it contains four components of unsigned integer data. However, the data will be returned to you and converted to 32 bit float or 32 bit integer for use inside the shader.
3) You're missing the fact that the GPU decompresses textures / vertex data on reads and compresses it again on writes. The amount of storage used for your vertices/texture data is only as much as the format that you create the resource in, and has nothing to do with the fact that the shader is operating on 32 bit floats / integers.
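The conversion between shader floats and an 8-bit UNORM target can be sketched in a few lines. This is only an approximation of what the hardware does, written in Python for illustration: it assumes the usual clamp, scale by 255 and round-to-nearest rule, and the function names are invented for the example.

```python
def float_to_unorm8(x):
    """Clamp to [0, 1], scale to 0..255 and round to the nearest integer."""
    clamped = min(max(x, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)

def unorm8_to_float(b):
    """What a shader sees when it reads an 8-bit UNORM component."""
    return b / 255.0

shader_output = (4.4, 5.5, 6.6, 10.1)   # the float4 from the example above
stored = tuple(float_to_unorm8(c) for c in shader_output)

print(stored)                  # (255, 255, 255, 255): every component clamped to 1.0
print(len(bytes(stored)))      # 4 bytes per pixel in the render target
print(float_to_unorm8(0.5))    # 128
print(unorm8_to_float(255))    # 1.0
```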

Hardware implementation for integer data processing

I am currently trying to implement a data path which processes image data expressed in grayscale as unsigned integers from 0 to 255. (Just for your information, my goal is to implement a Discrete Wavelet Transform on an FPGA.)
During the data processing, intermediate values will include negative numbers as well. As an example, one of the calculations is
result = 48 - floor((66+39)/2)
The floor function is used to guarantee integer data processing. For the above case, the result is -4, which is outside the range 0 to 255.
Given the above case, I have a series of basic questions.
To deal with the negative intermediate numbers, do I need to represent all the data as the 'equivalent unsigned number' in 2's complement for the hardware design? e.g. -4 d = 1111 1100 b.
If I represent the data in 2's complement for the signed numbers, will I need 9 bits as opposed to 8 bits? Or, how many bits will I need to process the data properly? (With 8 bits, I cannot represent any number above 127 in 2's complement.)
How does division of a negative number work if I use bitwise shifting? If I want to divide the result, -4, by 4 by shifting it right by 2 bits, the result becomes 63 in decimal, 0011 1111 in binary, instead of -1. How can I resolve this problem?
Any help would be appreciated!
If you can choose to use VHDL, then you can use the fixed point library to represent your numbers and choose your rounding mode, as well as allowing bit extensions etc.
In Verilog, well, I'd think twice. I'm not a Verilogger, but the arithmetic rules for mixing signed and unsigned datatypes seem fraught with foot-shooting opportunities.
Another option to consider might be MyHDL as that gives you a very powerful verification environment and allows you to spit out VHDL or Verilog at the back end as you choose.
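The arithmetic in the question can be checked with a small bit-level model. The Python sketch below is not HDL and its helper names are invented; it only shows that -4 is 1111 1100 in 8-bit two's complement, that a logical right shift gives the 63 seen in the question, that a sign-replicating (arithmetic) right shift gives -1, and that this particular formula on 0 to 255 inputs stays within the 9-bit signed range.

```python
def to_twos_complement(value, bits):
    """Bit pattern (as a non-negative int) of a signed value in `bits` bits."""
    return value & ((1 << bits) - 1)

def from_twos_complement(pattern, bits):
    """Signed value represented by a `bits`-wide two's complement pattern."""
    sign_bit = 1 << (bits - 1)
    return (pattern ^ sign_bit) - sign_bit

def logical_shift_right(pattern, shift, bits):
    """Shift right and fill with zeros from the left."""
    return (pattern & ((1 << bits) - 1)) >> shift

def arithmetic_shift_right(pattern, shift, bits):
    """Shift right while replicating the sign bit from the left."""
    signed = from_twos_complement(pattern, bits)
    return to_twos_complement(signed >> shift, bits)

result = 48 - (66 + 39) // 2                  # -4, as in the question
pattern = to_twos_complement(result, 8)
print(format(pattern, "08b"))                 # 11111100

print(logical_shift_right(pattern, 2, 8))     # 63, the wrong answer from the question
fixed = arithmetic_shift_right(pattern, 2, 8)
print(from_twos_complement(fixed, 8))         # -1

# Extremes of a - floor((b + c) / 2) for a, b, c in 0..255
lowest = 0 - (255 + 255) // 2                 # -255
highest = 255 - (0 + 0) // 2                  # 255
print(lowest, highest)                        # both fit in 9-bit signed (-256..255)
```

Note that a sign-replicating shift rounds toward negative infinity, which matches the floor used in the question.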

How do you use GL_UNSIGNED_BYTE for texture coordinates?

In the "OpenGL ES Programming Guide for iOS" documentation included with Xcode, in the "Best Practices for Working with Vertex Data" section, under the heading "Use the Smallest Acceptable Types for Attributes", it says
Specify texture coordinates using 2 or 4 unsigned bytes
(GL_UNSIGNED_BYTE) or unsigned short (GL_UNSIGNED_SHORT).
I'm a bit puzzled. I thought that texture coordinates would be < 1 most of the time, and so would require a float to represent fractional values. How do you use unsigned bytes or unsigned shorts? Divide it by 255 in the shader?
If you use GL_UNSIGNED_BYTE you'll pass GL_TRUE for the normalized parameter of (for example) glVertexAttribPointer, indicating that the values you're passing are not already between 0 and 1, but should be normalized from their full range (e.g. 0 to 255) to the range 0 to 1 before being passed to your shader. (See Section 2.1.2 of the OpenGL ES 2.0 spec for more details.)
In other words, when you're passing integer types like unsigned byte, use a "normalized" value of GL_TRUE and use the full range of that type (such as 0 to 255), so a value of 127 would be approximately equivalent to floating point 0.5.
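The mapping that normalization applies can be written out as plain arithmetic. The sketch below is Python rather than GL code and assumes the conversion is a division by the largest value of the type (255 for unsigned bytes, 65535 for unsigned shorts); the function names are hypothetical.

```python
def normalize_unsigned(value, bits):
    """Map an unsigned integer of the given width onto the range 0.0 to 1.0."""
    return value / ((1 << bits) - 1)

def quantize_unsigned(coord, bits):
    """Pick the integer to store for a texture coordinate in 0.0 to 1.0."""
    max_value = (1 << bits) - 1
    clamped = min(max(coord, 0.0), 1.0)
    return round(clamped * max_value)

print(normalize_unsigned(127, 8))    # about 0.498, close to 0.5
print(normalize_unsigned(255, 8))    # 1.0
print(quantize_unsigned(0.25, 8))    # 64, which reads back as about 0.251

# Unsigned shorts give a much finer grid for the same coordinate.
u = quantize_unsigned(0.3, 16)
print(abs(normalize_unsigned(u, 16) - 0.3) < 1 / 65535)   # True
```

The trade-off is precision: with unsigned bytes a coordinate can only land on one of 256 steps, so unsigned shorts may be the better choice for large textures.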
