Max size of Jpeg2000 image - image-processing

According to the JPEG 2000 spec, Xsiz and Ysiz can take values from 1 to 2^32 - 1. That means the maximum size of a JPEG 2000 image would be (2^32 - 1) x (2^32 - 1), which is huge.
Am I missing anything here? Or is there some other limitation on Xsiz, Ysiz, or the image size?

The maximum resolution of a JPEG 2000 compliant codestream is, as you point out, 2^32-1 x 2^32-1 pixels. However:
It is the decompressed image that has a maximum size of 2^32-1 x 2^32-1 pixels. To obtain the actual decompressed size in bytes, you need to multiply that by the number of components and the number of bytes per sample.
As Piglet points out, the compressed file size will (hopefully) be smaller, that's the whole point of image compression: producing compressed files smaller than the uncompressed images.
Even though compliant codestreams may have up to that resolution, it doesn't mean that your encoder/decoder implementation necessarily supports images that big. JPEG 2000 introduced the concept of a "compliance class", which is a system of guarantees of the minimum dimensions (among other things) supported by a given implementation. In practice, you're probably better off testing what the maximum supported size is.
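To put a number on the multiplication described above, here is a minimal Python sketch. The function name and the 3-component, 8-bit example are assumptions chosen for illustration, not something taken from the spec.

```python
def decompressed_size_bytes(width, height, components, bits_per_sample):
    """Raw size in bytes, assuming each sample is stored in whole bytes."""
    bytes_per_sample = (bits_per_sample + 7) // 8  # round up to whole bytes
    return width * height * components * bytes_per_sample


MAX_DIM = 2**32 - 1  # largest value Xsiz or Ysiz can take

# Hypothetical worst case: 3 components, 8 bits per sample, maximum dimensions.
max_size = decompressed_size_bytes(MAX_DIM, MAX_DIM, 3, 8)
print(max_size, "bytes")
print(max_size / 2**60, "EiB")
```

Even this modest pixel format gives a raw size far beyond what any current machine can hold in memory, which is why the practical limit comes from the implementation rather than from the codestream syntax.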

Related

When streaming JPEG over RTP, how should I adjust for images whose dimensions are not multiples of 8?

As stated in RFC 2435 section 3.1.5, the dimensions of a JPEG image are stored as 8-pixel multiples in RTP packets; however, this is not the case for the actual JPEG file.
How should I accommodate images whose dimensions are not a multiple of 8? Should the image be resized to the nearest multiple, or should I increase the stored value by one?
The latter option seems most reasonable in terms of accurately streaming the original image, but would this lead to segment dislocation or a stretched image? Or even just a strip of nonsense pixels?
Edit:
Here is the original image:
Here is a screenshot from VLC without increasing the dimension values by one:
Here is a screenshot after increasing the dimension values to account for excess pixels:
As you can see, there is segment dislocation on each of them, but manifesting in a different manner! On top of this, there is clearly an issue with quantization tables - although that may be a side-effect of the segment dislocation.
What is a proper solution for dealing with the excess pixels? What is the cause of the segment dislocation?
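The two options from the question can at least be compared numerically. The sketch below only computes what the 8-bit Width/Height field would hold when rounding down or up, and what pixel size a receiver would then reconstruct. It does not settle which option is correct, and the function name and the width of 643 are assumptions for illustration.

```python
def rtp_jpeg_dimension_field(pixels, round_up):
    """Value for the 8-bit Width/Height field, expressed in 8-pixel units."""
    units = (pixels + 7) // 8 if round_up else pixels // 8
    if not 0 < units <= 255:
        raise ValueError("dimension does not fit the 8-bit field")
    return units


width = 643  # hypothetical width that is not a multiple of 8
for round_up in (False, True):
    field = rtp_jpeg_dimension_field(width, round_up)
    # field * 8 is the width the receiver will assume when rebuilding headers.
    print("round up" if round_up else "round down", field, field * 8)
```

Either way the receiver sees a width that differs from the original file, so whatever is sent has to be consistent with the dimension the receiver reconstructs.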

Do PVR textures consume less RAM?

It is my understanding that PVR textures, made with texturetool, are simply compressed images. Therefore the difference lies in the file size.
Frankly, the file size doesn't interest me. What I want to know is, can a PVR texture consume less RAM than a normal .PNG texture? Or does this depend entirely on the texture format (like RGBA8888 etc)?
The essential question would be:
Given X.png and X.pvr, if I display both with texture format RGBA8888, will one consume less RAM than the other?
Yes, the PVR will consume less RAM at all stages — it's unpacked live by the GPU as it's accessed. There's no intermediate decompression.
A similar approach is used in digital video: instead of storing RGB at every pixel, convert to YUV, then store Y at every pixel and U and V only twice per four-pixel block. So you go from 128 bits for the block to 64 bits. To get back to RGB, the decoder reads the exact Y and interpolates or picks the nearest U and V as necessary.
Schemes like PVR do a similar thing: they don't store the full value at every pixel but infer parts of it from nearby context. What counts as nearby is chosen to match however the caching is arranged on that GPU. It's usually more involved than just scaling down the sampling resolution of some of the channels; e.g. specifying a base offset for a group of samples and then storing each with only a tiny precision is also common.
So the GPU can always get a value for pixel X by reading only values in a very small, local region of the data.
This contrasts with traditional schemes like PNG where having to know every pixel in the stream prior to X is acceptable if it improves the compression. Processing such things live would flood the GPU's memory bandwidth and hence be completely impractical, so such textures are decompressed from disk and then uploaded.
So schemes like PVR tend to lead to poorer compression and lower per-pixel quality but the win is that they can sit in VRAM compressed. A game will often increase the resolution of its textures if using PVR to try to find a comfortable balance.
Uncompressed textures are:
16 bit per pixel (RGB565, RGBA4444),
24 bit per pixel (RGB888)
PVRTC textures are either 4bpp or even 2bpp. So yes, they do use less memory.
They also perform better because they need less memory bandwidth to fetch textures.
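As a rough illustration of those bit rates, the following Python sketch computes the memory for a single texture level. The 1024x1024 size is an assumption for the example, and mipmaps and any driver padding are ignored.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Memory for one texture level at a flat bit rate (no mipmaps)."""
    return width * height * bits_per_pixel // 8


FORMATS = [
    ("RGBA8888", 32),
    ("RGB888", 24),
    ("RGB565 / RGBA4444", 16),
    ("PVRTC 4bpp", 4),
    ("PVRTC 2bpp", 2),
]

for name, bpp in FORMATS:
    size = texture_bytes(1024, 1024, bpp)
    print(f"{name:18s} {size // 1024:5d} KiB")
```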

jpeg lossless transformations - memory consumption?

I have just downloaded the latest win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:
I have prepared a 24 bpp JPEG test image with 14500 x 10000 pixels.
Its compressed size in the file system is around 7.5 MB.
Decompressing it into memory (with some image viewer) inflates it to around 450 MB.
Monitoring the jpegtran.exe command-line tool's memory consumption during a lossless rotation (180), I can see the process consuming up to 900 MB of memory!
I would have assumed that such JPEG lossless transformations don't require decoding the image file into memory and would instead just perform some mathematical transformations on the encoded file itself, keeping the memory footprint very low.
So which of the following is true?
some bug in this particular tool's implementation
some configuration switch I have missed
some misunderstanding on my end (i.e. JPEG lossless transformations also need to decode the image into memory?)
the "mathematical operations" consuming even more memory than "decoding the image into memory"
Edit:
According to the answer by JasonD, the reason seems to be the last one. So I'll extend my question:
Are there any implementations that can do those operations in small chunks (to avoid high memory usage)? Or does it always need to be done on the whole image, with no way around it?
PS:
I'm not planning to implement my own codec/algorithm. Instead, I'm asking whether there are any implementations out there that meet my requirements, or whether there could be in theory, at least.
I don't know about the library in question, but in order to perform a lossless rotation on a JPEG image, you would at least have to decompress the DCT coefficients in order to rotate them, and then re-compress.
The DCT coefficients, fully expanded, will be the same size as or larger than the original image data, as they have more bits of information.
It's lossless because the loss in a JPEG is caused by quantization of the DCT coefficients. So long as you don't decode/re-encode/re-quantize these, no loss will be incurred.
But it will be memory intensive.
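A back-of-the-envelope estimate in Python shows how the figures from the question could come about. It assumes each coefficient is held in 16 bits, that there is no chroma subsampling, and that dimensions are padded to whole 8x8 blocks; the actual tool may differ on all three points.

```python
def padded(n, block=8):
    """Round n up to a whole number of 8-sample blocks."""
    return -(-n // block) * block


def raw_pixel_bytes(width, height, components=3):
    """Decoded image: one byte per sample."""
    return width * height * components


def coefficient_bytes(width, height, components=3, bytes_per_coefficient=2):
    """All DCT coefficients held in memory (16-bit storage is an assumption)."""
    return padded(width) * padded(height) * components * bytes_per_coefficient


w, h = 14500, 10000  # the test image from the question
print(raw_pixel_bytes(w, h) / 1e6, "MB decoded pixels")
print(coefficient_bytes(w, h) / 1e6, "MB expanded coefficients")
```

Under these assumptions the coefficients need roughly twice the memory of the decoded pixels, which is in the same range as the 450 MB and 900 MB observed.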
JPEG compression works very roughly as follows:
Convert the image into the YCbCr colour space.
Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to 2x downsample the chroma channels). This is obviously lossy, but very predictably/stably so.
Transform 8x8 blocks of the image by a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients are also in an 8x8 block, and use more bits for storage than the 8-bit image data did.
Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small and especially zero coefficients as possible. This is the main "lossy" aspect of JPEG compression.
Zig-zag through the 2D data to turn it into a 1D stream of coefficients which is roughly in frequency order. High frequencies are more likely to be zeroed out, so many blocks will ideally end in a run of zeros which can be truncated.
Compress (non-lossily) the (now quite compressible) data using Huffman encoding.
So a 'non-lossy' transformation would want to avoid undoing as much of that as possible, in particular anything back past the DCT quantization, but that does not avoid expanding the data.
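The DCT, quantization and zig-zag steps above can be sketched for a single 8x8 block in plain Python. This is a simplified illustration, not a codec: it uses one uniform quantization step instead of a per-frequency table, and the colour conversion, downsampling and Huffman stages are left out.

```python
import math

N = 8


def dct_2d(block):
    """Naive 2D DCT-II of an 8x8 block given as 8 rows of 8 numbers."""
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Divide by one uniform step and round; this is where information is lost."""
    return [[round(value / step) for value in row] for row in coeffs]


def zigzag(block):
    """Flatten an 8x8 block along anti-diagonals, low frequencies first."""
    order = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]


# A smooth gradient block, level-shifted by 128 as JPEG does for 8-bit data.
gradient = [[(100 + 2 * x + y) - 128 for y in range(N)] for x in range(N)]
stream = zigzag(quantize(dct_2d(gradient), 16))
print(stream)
```

For smooth input like this, most of the non-zero values should sit near the start of the printed stream, which is what makes the final entropy-coding stage effective. A lossless transformation works on the quantized values directly and never repeats the rounding in `quantize`.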

How can I optimize UIImage resizes for a specific target filesize?

For any given file data size, I want to be able to resize (or compress) a UIImage to fit within that data limit. This question is NOT about how to resize, or how to check file sizes... it is about an algorithm for doing this in a performant way.
Searching here already, I found this thread, which talks about stepping down the image JPEG quality using a linear or binary search. This isn't very performant, taking dozens of seconds at best.
I am working on iOS so images can be close to 10MB (from iPhone 4S). My target, although variable, is currently 3145728 bytes.
I am currently using UIImageJPEGRepresentation to compress a little, but to get to my low target it appears I would have to lose much quality for such a large photo. Is there a relation between UIImage size and NSData size? Is there some function where I can say something like:
area * X = dataSize
...and solve for a scale factor so I can resize in one shot?
One idea I just had after looking at the thread you linked to: compressing a 10MB image is going to be relatively slow. How about resizing the image to be much smaller (so that compression is much faster), then running the compression search (from the link) on that? The result can then be used as a guide for compressing the 10MB image. The idea is that the compression ratio should be similar for the same image, independent of size.
Let's say the 1000x1000 pixel image compressed is 10MB, and the target size is 3MB.
Then say a smaller 100x100 pixel version (for example), compressed with the same quality, is C MB. Perform the binary search algorithm on the 100x100 image until its size is C * (3/10). Then use that compression quality for the 1000x1000 image to get a ~3MB image.
Note: I have no idea how well this will work; it's just a suggestion. The size to pick for the smaller image (I've used 100x100) is also just a guess and would need some experimentation.
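The suggestion above could be sketched as follows in Python. The encoder is replaced by a toy size model (`toy_size` is entirely hypothetical) so the example runs on its own; with real images the small and large sizes will not scale this cleanly, which is exactly the uncertainty the note mentions.

```python
def find_quality(encoded_size, target_bytes, lo=0.0, hi=1.0, steps=8):
    """Binary-search the highest quality whose encoded size fits target_bytes.

    encoded_size is a caller-supplied function quality -> size in bytes and is
    assumed to be non-decreasing in quality. Returns None if nothing fits.
    """
    best = None
    for _ in range(steps):
        mid = (lo + hi) / 2
        if encoded_size(mid) <= target_bytes:
            best, lo = mid, mid  # fits: try a higher quality
        else:
            hi = mid             # too big: try a lower quality
    return best


def toy_size(pixels):
    """Stand-in for a real encoder: size grows with quality and pixel count."""
    return lambda quality: int(pixels * (0.5 + 9.5 * quality))


small = toy_size(100 * 100)    # cheap proxy image
full = toy_size(1000 * 1000)   # the expensive full-size image

target_full = 3_000_000
# Scale the target down by the ratio observed at the same quality setting.
target_small = small(1.0) * target_full / full(1.0)

quality = find_quality(small, target_small)
print(quality, full(quality))
```

All of the search iterations run on the small image, and the full-size image is encoded only once at the end with the quality that was found.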

PVRTC compression increasing the file sizes of PNG

For iPhone game development, I switched from PNG format to PVRTC format for the sake of performance. But PVRTC compression is creating files that are much bigger than the PNG files. A PNG of 140 KB (1024x1024) gets bloated to 512 KB or more in the PVRTC format. I read somewhere that a PNG file of 50 KB got compressed to some 10 KB, but in my case it's the other way around.
Is there any reason why it happens this way, and how can I avoid it? If PVRTC compression is blindly doing a 4bpp conversion (1024x1024x0.5) irrespective of the transparency in the PNG, then what compression are we achieving here?
I have hundreds of these 1024x1024 images in my game, as there are numerous characters each doing some complex animations. At this rate of 512 KB per image, my app would be more than 50 MB, which is unacceptable for my customer (with PNG, I could have got my app down to 10 MB).
In general, uncompressed image data is either 24bpp (RGB) or 32bpp (RGBA) at a flat rate. PVRTC is 4bpp (or 2bpp) at a flat rate, so the compression is 6x or 8x (12x or 16x for 2bpp) compared to this.
A requirement for graphics hardware to use textures natively is that the texture format must be randomly accessible by the hardware. PVRTC is this kind of format; PNG is not, and this is why PNG can achieve greater compression ratios. PVRTC is a runtime, deployment format; PNG is a storage format.
PVRTC compression is carried out on 4x4 blocks of pixels at a time and at a flat bit rate so it is easy to calculate where in memory to retrieve the data required to derive a particular texel's value from and there is only one access to memory required. There is dedicated circuitry in the graphics core which will decode this 4x4 block and give the texel value to your shader/texture combiner etc.
PNG compression does not work at a flat bitrate and is more complicated to retrieve specific values from; memory needs to be accessed from multiple locations in order to retrieve a single colour value and far more memory and processing would be required every single time a texture read occurs. So it's not suitable for use as a native texture format and this is why your textures must be decompressed before the graphics hardware will use them. This increases bandwidth use when compared to PVRTC, which requires no decompression for use.
So for offline storage (the size of your application on disk), PNG is smaller than PVRTC which is smaller than completely uncompressed. For runtime memory footprint and performance, PVRTC is smaller and faster than PNG which, because it must be decompressed, is just as large and slow as uncompressed textures. You might gain some advantage with PNG at initialisation for disk access, but then you'd lose time for decompression.
If you want to reduce the storage footprint of PVRTC you could try zip-style compression on the texture files and expand these when you load from disk.
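The zip-style idea could look roughly like this in Python, using the standard `zlib` module. The payload here is a synthetic stand-in with the same byte count as a 1024x1024 4bpp texture. How well real PVRTC data compresses will depend on the artwork, so no ratio is implied.

```python
import zlib


def pack_texture(raw: bytes) -> bytes:
    """Compress texture bytes for storage on disk."""
    return zlib.compress(raw, 9)


def unpack_texture(packed: bytes) -> bytes:
    """Restore the original bytes at load time, ready to hand to the GPU."""
    return zlib.decompress(packed)


# Synthetic stand-in payload: 1024 * 1024 pixels at 4 bits each = 512 KiB.
texture = bytes(range(256)) * (1024 * 1024 // 2 // 256)

packed = pack_texture(texture)
restored = unpack_texture(packed)
print(len(texture), len(packed), restored == texture)
```

The trade-off is the same as the one described for PNG: a smaller file on disk in exchange for a decompression step at load time, while the texture uploaded to the GPU is still PVRTC.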
PVRTC (PowerVR Texture Compression) is a texture compression format. On devices using PowerVR (e.g. most higher-end mobile phones, including the iPhone, and other ARM-based gadgets like the iPod), it is very fast to draw, since drawing it is hardware accelerated. It also uses much less memory, since images are kept in their compressed form and decoded on each draw, whereas a PNG needs to be decompressed before being drawn.
PNG is lossless compression.
PVRTC is lossy compression, meaning it approximates the image. It has completely different design criteria.
PVRTC will 'compress' (by approximating) any type of artwork, including photographic images, giving a fixed number of bits per texel.
PNG does not approximate the image, so if the image contains little redundancy it will hardly compress at all. On the other hand, a uniform image, e.g. an illustration, will compress best with PNG.
It's apples and oranges.
Place more than one frame tiled onto a single image and blit the subrectangles of the texture. This will dramatically reduce your memory consumption.
If your images are, say, 64x64, then you can place 256 of them on a 1024x1024 texture in a 16x16 arrangement.
With a little effort, images do not need to be all the same size, just so long as you keep track in the code of the rectangle in the texture that each image is at.
This is how iPhone game developers do it.
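For the equal-sized case, the bookkeeping amounts to mapping a frame index to a sub-rectangle. A minimal Python sketch, assuming row-major placement (the function name and the layout order are illustrative choices):

```python
def atlas_rect(index, tile_w=64, tile_h=64, atlas_w=1024, atlas_h=1024):
    """Return (x, y, width, height) of frame `index` in a row-major atlas."""
    cols = atlas_w // tile_w
    rows = atlas_h // tile_h
    if not 0 <= index < cols * rows:
        raise IndexError("frame index outside the atlas")
    row, col = divmod(index, cols)
    return (col * tile_w, row * tile_h, tile_w, tile_h)


print(atlas_rect(0))
print(atlas_rect(17))
print(atlas_rect(255))
```

For frames of different sizes, the same idea applies with a lookup table of rectangles in place of the arithmetic.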
I agree with Will. There is no point to the question. I read the question 3 times, but I still don't know what Sankar wants to know. It's just a complaint, not a question.
The only advice I can give: don't use PVRTC if you mind using it. It offers a performance gain and saves VRAM, but it won't help you in this case, because what you want is just to reduce the size of the game, not to weigh a trade-off between performance and quality.

Resources