How to accommodate large quantized DCT values in a single byte - image-processing

In JPEG compression, we first perform color conversion, divide the whole image into 8*8 blocks, and then apply the DCT to each block. The DCT values are very large, but after dividing by the quantization table values they become smaller. As far as I understand, some entropy coding is then applied to the quantized DCT values. My question is: do we really send these quantized DCT values to the receiver? If so, then at a high quality factor the quantized DCT values will be very large and will not fit in 1 byte. My main problem is: are the quantized DCT values passed to the receiver, or is it the byte image that can be regenerated from the quantized DCT values? I have found in some sources that the quantized DCT values are passed to the receiver in JPEG compression. If so, how can we accommodate large quantized DCT values in 1 byte?

In theory you would store quantized DCT values in 1 byte. Unquantized values require 2 bytes.
The problem is that quantized values may still require 2 bytes. The quantization tables for "high" quality JPEGs are likely to have small quantization values (including 1s).
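To make the size problem concrete, here is a minimal sketch (plain Python, not taken from any JPEG library) that computes the 2D DCT of an all-white 8x8 block after the usual level shift by 128 and quantizes the DC coefficient with two assumed quantizer values, 1 and 16. The function name dct2 and the chosen quantizers are illustrative assumptions. Note that the entropy coder in JPEG emits variable-length codes rather than one fixed byte per coefficient, so a value like this does not have to fit in 8 bits.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (direct formula, no fast algorithm)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# All-white 8-bit block, level-shifted so samples lie in -128..127.
block = [[255 - 128] * 8 for _ in range(8)]
coeffs = dct2(block)

dc_q1 = round(coeffs[0][0] / 1)    # quantizer 1, as in very high quality tables
dc_q16 = round(coeffs[0][0] / 16)  # a coarser, assumed quantizer

print("DC with quantizer 1: ", dc_q1)
print("DC with quantizer 16:", dc_q16)
print("bits needed for magnitude at quantizer 1:", dc_q1.bit_length())
```

With a quantizer of 1 the DC value is far outside the range of a signed or unsigned byte, while the coarser quantizer brings it back under 128.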

Related

How to get color depth of DICOM pixel data in reliable way?

A DICOM file may have either uncompressed pixel data or compressed pixel data. Its PhotometricInterpretation (0028,0004) can be MONOCHROME1/MONOCHROME2/RGB/PALETTE COLOR/YBR etc. There is also a Pydicom page about color space.
But from these pages or any other DICOM websites, it is not clear to me how to get the color depth.
Does either the BitsAllocated (0028,0100) or the BitsStored (0028,0101) tag refer to color depth? Can the color depth be different from these two tag values?
How to get color depth of DICOM pixel data in reliable way?
Bits Stored is the number of bits that is used for the actual color or grayscale data, so it is at least related to the color depth. Bits Allocated is always a multiple of 8, as the data is always organized in bytes, where some of the upper bits may not be used for data (with the exception of Bit Data, where it is 1).
Getting the bit depth is not as straightforward as it may seem. While the number of bits used for the data can mostly be defined, the resolution of the data (i.e. the distance between adjacent values) may also depend on the Photometric Interpretation, and of course on the resolution provided by the modality itself.
The easiest case is monochrome data (Photometric Interpretation is MONOCHROME1 or MONOCHROME2), where the color depth is directly defined by Bits Stored, with typical values being 12, 14 or 16. The same is mostly true for RGB data (i.e. data originally recorded as RGB), and while it is true that Bits Stored can have different values for JPEG2000 encoded images, as correctly mentioned by @kritzel_sw, I have yet to see any RGB data with Bits Stored different from 8. Update: I still haven't seen this, but found that RTDOSE images can have 32 Bits Stored.
For color data in the YBR color space (Photometric Interpretation is YBR_xxx) this is less clear. It somewhat depends on your definition of color depth. The color space used is YBR instead of RGB, and the number of bits used for each component may be different (for example in YBR_FULL_422, which is used for some JPEG compressed images, 2 channels are downsampled). The resulting image, if converted into RGB (which is what is mostly done), uses 8 bits for each color component, but the actual number of possible values is less than 256 for that reason. So if your definition of color depth depends on the number of bits used per RGB channel, the answer would probably be 8 in this case, but if you define the color depth per YBR channel, the answer could be different and depends both on the Photometric Interpretation and Bits Stored.
A special case is the PhotometricInterpretation of PALETTE COLOR, where the possible colors are defined in the color table. In this case, the number of colors per color component is defined in the first value of the Palette Color Lookup Table Descriptor (0028,1101-1104), which is equal for all 3 tables (i.e. for the Red, Green and Blue components). The actual color depth has to be derived from that value.
Given all that, the answer is probably: it depends. I'll also add the note by @kritzel_sw that many of the IODs significantly limit the degrees of freedom of how pixel data is encoded, which will narrow down the possibilities for the color depth for any concrete type of image.
I'm interested if anybody has a more straightforward answer.
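As a rough illustration of the cases above, here is a small sketch of a hypothetical helper, summarize_depth, that takes the header values as plain arguments and reports what can be said from them alone. The function name, the returned keys and the wording of the notes are all assumptions made for this example. With pydicom the inputs would typically come from a dataset read with pydicom.dcmread, via ds.PhotometricInterpretation, ds.BitsAllocated, ds.BitsStored and ds.SamplesPerPixel.

```python
def summarize_depth(photometric, bits_allocated, bits_stored, samples_per_pixel=1):
    """Summarize what the header values say about bit depth (hypothetical helper)."""
    if bits_stored > bits_allocated:
        raise ValueError("Bits Stored cannot exceed Bits Allocated")

    if photometric in ("MONOCHROME1", "MONOCHROME2"):
        note = "depth follows Bits Stored"
    elif photometric == "PALETTE COLOR":
        # Pixel values are indices; the colors live in the lookup tables.
        note = "depth must be derived from the palette lookup tables"
    elif photometric.startswith("YBR"):
        note = "depth is per YBR channel; the RGB result may differ"
    else:
        note = "depth per channel follows Bits Stored"

    return {
        "bits_allocated": bits_allocated,
        "bits_stored": bits_stored,
        "unused_bits": bits_allocated - bits_stored,
        "levels_per_sample": 2 ** bits_stored,
        "samples_per_pixel": samples_per_pixel,
        "note": note,
    }

ct_like = summarize_depth("MONOCHROME2", 16, 12)
rgb_like = summarize_depth("RGB", 8, 8, samples_per_pixel=3)
print(ct_like)
print(rgb_like)
```

This only restates the header; it does not settle the question of effective resolution discussed above.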

where is the DCT matrix in Libjpeg?

In libjpeg I am unable to locate the 8x8 DCT matrix. If I am not wrong, this matrix is always constant for an 8x8 block. It must contain 1/sqrt(8) in the first row, but where is this matrix?
In an actual JPEG implementation, the DCT matrix is usually factored down to its Gaussian Normal Form. That gives a series of matrix multiplications. However, in the normal form, these only involve operations on the diagonal and values adjacent to the diagonal. Most of the values in the normalized matrices are zero so you can omit them.
That transforms the DCT into a series of 8 parallel operations.
This book describes a couple of ways the matrix operations can be transformed:
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=pd_bxgy_b_img_y
This book describes a tensor approach that is theoretically more efficient but tends not to be so in implementation:
http://www.amazon.com/JPEG-Compression-Standard-Multimedia-Standards/dp/0442012721/ref=pd_bxgy_b_img_y
It doesn't. Or maybe it's somewhere in a sneaky place, but it doesn't really matter. Real implementations of DCT don't work that way, they're very specialized pieces of code that have all the constants hardcoded into them, and they look nothing like a matrix multiplication. It is occasionally useful to view the transform as a matrix multiplication from a theoretical standpoint, but it can be implemented much more efficiently.
For the DCT in libjpeg, see for example the file jfdctflt.c (or one of its friends).
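For the theoretical view, the matrix the question is looking for can be written down directly. The following is only a sketch of the textbook orthonormal DCT-II matrix in plain Python, not code from libjpeg; the names dct_matrix and dot are made up for this example. It checks that the first row is 1/sqrt(8) everywhere and that the rows are orthonormal.

```python
import math

def dct_matrix(n=8):
    """Textbook orthonormal DCT-II matrix: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

C = dct_matrix()

# First row: every entry equals 1/sqrt(8).
print(C[0])

# Rows are orthonormal, so C times its transpose is the identity.
gram = [[dot(C[i], C[j]) for j in range(8)] for i in range(8)]
print(all(abs(gram[i][j] - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(8) for j in range(8)))
```

A fast implementation produces the same results as multiplying by this matrix, but without ever storing it in this form.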

Is "color quantization" the right name for color quantization in OpenCV?

I'm looking for a way to get a complete list of all the RGB values for each pixel in a given image using OpenCV. For now I call this "color quantization".
The problem is that, according to what I have found online so far, "color quantization" is about histograms, "color reduction" or similar discrete computation solutions.
Since I know what I want and the "internet" seems to have a different opinion about what these words mean, I was wondering: maybe there is no real solution for this? Is there a workable way or a working algorithm in the OpenCV library?
Generally speaking, quantization is an operation that maps an input signal with real (mathematical) values to a set of discrete values. A possible algorithm to implement this process is to compute the histogram of the data, then retain the n values that correspond to the n bins of the histogram with the highest population.
What you are trying to do might instead be called color listing.
If you are working with 8-bit quantized images (type CV_8UC3), my guess is that you can do what you want by taking the histogram of the input image (bin width equal to 1) and then searching the result for non-empty bins.
Color quantization is the conversion of the infinite range of natural colors into a finite digital color space. Anyway, to create a full color 'histogram' you can use OpenCV's sparse matrix implementation and write your own function to compute it. Of course you have to access the pixels one by one if you have no other structural or continuity information about the image.
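The "access the pixels one by one" approach can be sketched without any OpenCV-specific API. The example below is an assumption-laden illustration in plain Python: it treats the image as rows of (B, G, R) triples, which is roughly what calling tolist() on the array returned by cv2.imread would give, and counts each distinct color with a Counter. The name list_colors is invented for this example.

```python
from collections import Counter

def list_colors(pixels):
    """Count every distinct color in an image given as rows of (B, G, R) triples."""
    # tuple() makes each pixel hashable, so nested lists work as well as tuples.
    return Counter(tuple(px) for row in pixels for px in row)

# Tiny 2x2 stand-in for a CV_8UC3 image: three red pixels and one blue pixel (BGR order).
image = [
    [[0, 0, 255], [0, 0, 255]],
    [[255, 0, 0], [0, 0, 255]],
]

colors = list_colors(image)
for color, count in sorted(colors.items()):
    print(color, count)
```

The keys of the result are the complete color list; the counts are the non-empty bins of the full color histogram mentioned above.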

jpeg lossless transformations - memory consumption?

i have just downloaded the latest win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:
i have prepared a 24 BPP jpeg test image with 14500 x 10000 pixels.
compressed size in file system is around 7.5 MB.
decompressing into memory (with some image viewer) inflates to around 450 MB.
monitoring the jpegtran.exe command line tool's memory consumption during lossless rotation (180) i can see the process consuming up to 900 MB memory!
i would have assumed that such jpeg lossless transformations don't require decoding the image file into memory and instead would just perform some mathematical transformations on the encoded file itself - keeping the memory footprint very low.
so which of the following is true?
some bug in this particular tool's implementation
some configuration switch i have missed
some misunderstanding at my end (i.e. jpeg lossless transformations also need to decode the image into memory?)
the "mathematical operations" consuming even more memory than "decoding the image into memory"
edit:
according to the answer by JasonD the reason seems to be the latter one. so i'll extend my question:
are there any implementations that can do those operations in small chunks (to avoid high memory usage)? or does it always need to be done on the whole and there's no way around it?
PS:
i'm not planning to implement my own codec / algorithm. instead i'm asking if there are any implementations out there that meet my requirements. or if there could be in theory, at least.
I don't know about the library in question, but in order to perform a lossless rotation on a jpeg image, you would at least have to decompress the DCT coefficients in order to rotate them, and then re-compress.
The DCT coefficients, fully expanded, will be the same size or larger than the original image data, as they have more bits of information.
It's lossless, because the loss in a jpeg is caused by quantization of the DCT coefficients. So long as you don't decode/re-encode/re-quantize these, no loss will be incurred.
But it will be memory intensive.
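A back-of-the-envelope estimate for the image in the question shows the scale involved. This sketch assumes that every coefficient is held as a 16-bit value, that there is no chroma subsampling, and that each component is padded to whole 8x8 blocks; those are assumptions for the arithmetic, not statements about how the tool in question manages its memory.

```python
import math

def coefficient_bytes(width, height, components=3, bytes_per_coeff=2, block=8):
    """Bytes needed to hold all DCT coefficients, padded to whole blocks."""
    blocks_w = math.ceil(width / block)
    blocks_h = math.ceil(height / block)
    return blocks_w * blocks_h * block * block * components * bytes_per_coeff

pixel_bytes = 14500 * 10000 * 3          # 8 bits per sample, 3 components
coeff_bytes = coefficient_bytes(14500, 10000)

print("decoded pixels:", pixel_bytes, "bytes")
print("coefficients:  ", coeff_bytes, "bytes")
print("ratio:", coeff_bytes / pixel_bytes)
```

Under these assumptions the coefficient data is about twice the size of the decoded pixels, which is in the same ballpark as the memory usage observed in the question.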
jpeg compression works very roughly as follows:
Convert image into YCbCr colour space.
Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to 2x downsample the chroma channels). This is obviously lossy, but very predictably/stably so.
Transform 8x8 blocks of the image by a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients are also in an 8x8 block, and use more bits for storage than the 8-bit image data did.
Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small and especially zero coefficients as possible. This is the main "lossy" aspect of jpeg compression.
Zig-zag through the 2D data to turn it into a 1D stream of coefficients which is roughly in frequency order. High frequencies are more likely to be zeroed out, so many packets will ideally end in a stream of zeros which can be truncated.
Compress (non-lossily) the (now quite compressible) data using Huffman encoding.
So a 'non-lossy' transformation would want to avoid doing as much as possible of that - especially anything beyond the DCT quantization, but that does not avoid expanding the data.
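The zig-zag step from the list above can be sketched in a few lines. This is an illustrative plain-Python version, not code from any JPEG library: zigzag_order and scan_and_truncate are made-up names, and the sample block is invented. A real encoder signals the run of trailing zeros with an end-of-block code rather than literally cutting the list, so the truncation here only stands in for that idea.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        s = r + c
        # Odd anti-diagonals run top to bottom, even ones bottom to top.
        return (s, r if s % 2 else -r)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def scan_and_truncate(block):
    """Zig-zag scan a quantized block and drop the trailing zeros."""
    stream = [block[r][c] for r, c in zigzag_order(len(block))]
    while stream and stream[-1] == 0:
        stream.pop()
    return stream

# Invented quantized block: only a few low-frequency coefficients survive.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 64, -3, 2, 1

stream = scan_and_truncate(block)
print(zigzag_order()[:6])
print(stream)
```

Of the 64 coefficients only the first five positions of the scan remain, which is what makes the later Huffman stage effective.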

Displaying Eigenfaces with negative values

After implementing the Eigenfaces algorithm in Python using numpy, I noticed that the normalized eigenvectors contained negative values. How are these negative values represented when the eigenface is displayed as an image, like this? I thought that images consisted of positive intensity values. Are these eigenface images generated by histogram equalization on the eigenvector?
The plotting of negative values depends on the implementation of the plotting function. Matlab's imagesc, for example, scales image data to the full range of the current colormap and displays the image. This is simpler than histogram equalization.
Yes, for visualization purposes, just map min(eigenface) to 0 and max(eigenface) to 255. Your linked image appears to be doing that. (Note how each eigenface occupies the full dynamic range.)
Eigenfaces (or eigenvectors, in general) will likely have positive and negative elements.
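The min/max mapping described above is a one-liner in spirit. Here is a minimal sketch in plain Python, with an invented function name to_display_range and a tiny made-up vector in place of a real eigenface; with numpy the same arithmetic would be applied to the whole array at once.

```python
def to_display_range(values):
    """Linearly map min(values) to 0 and max(values) to 255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant vector has no contrast to stretch.
        return [0 for _ in values]
    scale = 255.0 / (hi - lo)
    return [round((v - lo) * scale) for v in values]

eigenface = [-1.0, 0.0, 3.0]   # stand-in for a flattened eigenvector
pixels = to_display_range(eigenface)
print(pixels)
```

Negative entries end up in the dark part of the range and positive ones in the bright part, so the sign information is still visible in the displayed image.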
