For iPhone game development, I switched from PNG format to PVRTC format for the sake of performance. But PVRTC compression is creating files that are much bigger than the PNG files.. So a PNG of 140 KB (1024x1024) gets bloated to 512 KB or more in the PVRTC format.. I read somewhere that a PNG file of 50KB got compressed to some 10KB and all, in my case, its the other way around..
Any reason why it happens this way and how I can avoid this.. If PVRTC compression is blindly doing 4bpp conversion (1024x1024x0.5) irrespective of the transparencies in the PNG, then whats the compression we are achieving here..
I have 100s of these 1024x1024 images in my game as there are numerous characters each doing some complex animations.. so in this rate of 512KB per image, my app would get more than 50MB.. which is unacceptable for my customer.. ( with PNG, I could have got my app to 10MB)..
In general, uncompressed image data is either 24bpp (RGB) or 32bpp (RGBA) flatrate. PVRTC is 4bpp (or 2bpp) flatrate so there is a compression of 6 or 8 (12 or 16) times compared to this.
A requirement for graphics hardware to use textures natively is that the format of the texture must be random accessible for the hardware. PVRTC is this kind of format, PNG is not and this is why PNG can achieve greater compression ratios. PVRTC is a runtime, deployment format; PNG is a storage format.
PVRTC compression is carried out on 4x4 blocks of pixels at a time and at a flat bit rate so it is easy to calculate where in memory to retrieve the data required to derive a particular texel's value from and there is only one access to memory required. There is dedicated circuitry in the graphics core which will decode this 4x4 block and give the texel value to your shader/texture combiner etc.
PNG compression does not work at a flat bitrate and is more complicated to retrieve specific values from; memory needs to be accessed from multiple locations in order to retrieve a single colour value and far more memory and processing would be required every single time a texture read occurs. So it's not suitable for use as a native texture format and this is why your textures must be decompressed before the graphics hardware will use them. This increases bandwidth use when compared to PVRTC, which requires no decompression for use.
So for offline storage (the size of your application on disk), PNG is smaller than PVRTC which is smaller than completely uncompressed. For runtime memory footprint and performance, PVRTC is smaller and faster than PNG which, because it must be decompressed, is just as large and slow as uncompressed textures. You might gain some advantage with PNG at initialisation for disk access, but then you'd lose time for decompression.
If you want to reduce the storage footprint of PVRTC you could try zip-style compression on the texture files and expand these when you load from disk.
PVRTC (PowerVR Texture Compression) is a texture compression format. On devices using PowerVR e.g. most higher end mobile phones including the iPhone and other ARM-based gadgets like the iPod it is very fast to draw since drawing it is hardware accelerated. It also uses much less memory since images are represented in their compressed form and decoded each draw, whereas a PNG needs to be decompressed before being drawn.
PNG is lossless compression.
PVRTC is lossy compression meaning it approximates the image. It has a completely different design criteria.
PVRTC will 'compress' (by approximating) any type of artwork, giving a fixed bits per texel, including photographic images.
PNG does not approximate the image, so if the image contains little redundancy it will hardly compress at all. On the other hand, a uniform image e.g. an illustration will compress best with PNG.
Its apples and oranges.
Place more than one frame tiled onto a single image and blit the subrectangles of the texture. This will dramatically reduce your memory consumption.
If you images are, say, 64x64, then you can place 256 of them on a 1024x1024 texture in a 16x16 arrangement.
With a little effort, images do not need to be all the same size, just so long as you keep track in the code of the rectangle in the texture that each image is at.
This is how iPhone game developers do it.
I agree with Will. There is no point in the question. I read the question 3 times, but I still don't know what Sankar want to know. It's just a complain, no question.
The only thing I can advice, don't use PVRTC if you mind to use it. It offers performance gain and saves VRAM, but it won't help you in this case. Because what you want is just reducing game volume, not a consideration about trade-off between performance and quality.
Related
Is there any difference in terms of precision and speed in using SIFT with JPEG, PNG or PGM images? Obviously supposing the same image size.
This question does not make sense.
SIFT is an algorithm, it operates on raw pixel data in memory, not on files.
You would need some framework that loads image files into some data structure in memroy.
SIFT itself doesn't care if you load a jpeg or a pgm. It will never know where the pixels came from.
Having the same size (to 1 bit) for the same image in three different formats would be pretty impossible in my opinion. As PGM is uncompressed this would also mean that JPEG and PNG would have to be uncompressed to have roughly the same size. If you have no compression you have no losses. So there would be no difference in SIFT performance.
It is my understanding that PVR textures, made with texturetool, are simply compressed images. Therefore the difference lies in the file size.
Frankly, the file size doesn't interest me. What I want to know is, can a PVR texture consume less RAM than a normal .PNG texture? Or does this depend entirely on the texture format (like RGBA8888 etc)?
The essential question would be:
Given X.png and X.pvr, if I display both with texture format RGBA8888, will one consume less RAM than the other?
Yes, the PVR will consume less RAM at all stages — it's unpacked live by the GPU as it's accessed. There's no intermediate decompression.
A PVR-like approach used in digital video is that instead of storing RGB at every pixel, convert to YUV, then store Y at every pixel and U and V only twice per four-pixel block. So you go from 128 bits for the block to 64 bits. To get back to RGB the outputter reads the exactly correct Y and interpolates or accesses the most nearby U and V as necessary.
Schemes like PVR do a similar thing of not storing the full value at every pixel but inferring parts of it from nearby context. What counts as nearby is picked directly corresponding to however the caching is arranged on that GPU. It's usually more in-depth that just scaling down the sampling resolution of some of the channels, e.g. specifying a base offset for samples and then using a tiny precision for each is also common.
So the GPU can always get a value for pixel X by reading only values in a very small, local region of the data.
This contrasts with traditional schemes like PNG where having to know every pixel in the stream prior to X is acceptable if it improves the compression. Processing such things live would flood the GPU's memory bandwidth and hence be completely impractical, so such textures are decompressed from disk and then uploaded.
So schemes like PVR tend to lead to poorer compression and lower per-pixel quality but the win is that they can sit in VRAM compressed. A game will often increase the resolution of its textures if using PVR to try to find a comfortable balance.
Uncompressed textures are:
16 bit per pixel (RGB565, RGBA4444),
24 bit per pixel (RGB888)
PVRTC textures are either 4bpp or even 2bpp. So yes they do use less memory.
Also they perform better because need less memory bandwidth to fetch textures.
i have just downloaded the latest win32 jpegtran.exe from http://jpegclub.org/jpegtran/ and observed the following:
i have prepared a 24 BPP jpeg test image with 14500 x 10000 pixels.
compressed size in file system is around 7.5 MB.
decompressing into memory (with some image viewer) inflates to around 450 MB.
monitoring the jpegtran.exe command line tool's memory consumption during lossless rotation (180) i can see the process consuming up to 900 MB memory!
i would have assumed that such jpeg lossless transformations don't require decoding the image file into memory and instead would just perform some mathematical transformations on the encoded file itself - keeping the memory footprint very low.
so which of the following is true?
some bug in this particular tool's implementation
some configuration switch i have missed
some misunderstanding at my end (i.e. jpeg lossless transformations also need to decode the image into memory?)
the "mathematical operations" consuming even more memory than "decoding the image into memory"
edit:
according to the answer by JasonD the reason seems to be the latter one. so i'll extend my question:
are there any implementations that can do those operations in small chunks (to avoid high memory usage)? or does it always need to be done on the whole and there's no way around it?
PS:
i'm not planning to implement my own codec / algorithm. instead i'm asking if there are any implementations out there that meet my requirements. or if there could be in theory, at least.
I don't know about the library in question, but in order to perform a lossless rotation on a jpeg image, you would at least have to decompress the DCT coefficients in order to rotate them, and then re-compress.
The DCT coefficients, fully expanded, will be the same size or larger than the original image data, as they have more bits of information.
It's lossless, because the loss in a jpeg is caused by quantization of the DCT coefficients. So long as you don't decode/re-encode/re-quantize these, no loss will be incurred.
But it will be memory intensive.
jpeg compression works very roughly as follows:
Convert image into YCbCr colour space.
Optionally downsample some of the channels (colour error is less perceptible than luminance error, so it is typical to 2x downsample the chroma channels). This is obviously lossy, but very predictably/stably so.
Transform 8x8 blocks of the image by a discrete cosine transform (DCT), moving the image into frequency space. The DCT coefficients are also in an 8x8 block, and use more bits for storage than the 8-bit image data did.
Quantize the DCT coefficients by a variable amount (this is the quality setting in most packages). The aim is to produce as many small and especially zero coefficients as possible. The is the main "lossy" aspect of jpeg compression.
Zig-zag through the 2D data to turn it into a 1D stream of coefficients which is roughly in frequency order. High frequencies are more likely to be zero'd out, so many packets will ideally end in a stream of zeros which can be truncated.
Compress (non-lossily) the (now quite compressible) data using huffman encoding.
So a 'non-lossy' transformation would want to avoid doing as much as possible of that - especially anything beyond the DCT quantization, but that does not avoid expanding the data.
I'm working on an XNA project and running into issues involving memory usage. I was curious if using indexed color PNG 8s vs PNG 24s or PNG 32s for certain textures would free up any memory. It definitely makes the app smaller, but I'm curious if when XNA's content manager loads them they somehow become uncompressed or act as loss-less images.
I answered a similar question just recently.
If you want to save space when distributing your game, you could distribute your PNG files (you can also use jpeg) and load them with Texture2D.FromStream. Note that you'll have to handle things like premultiplied alpha yourself.
If you want to save space on the GPU (and this also improves texture-fetch because the amount of data transferred is smaller) you can use a smaller texture format. The list of supported texture formats is available in this table.
While there is no support for palletised images (like PNG8) or 8-bit colour images at all, there are several 16-bit formats (Bgr565, Bgra5551, Bgra4444). DXT compressed formats are also available. They compress at 4:1 ratio, except opaque DXT1, which is 6:1.
You can use both techniques at once.
Someone just pointed to me to this link on Facebook, immediately after I posted this question.
http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh855082%28v=vs.92%29.aspx
Apparently, in XNA, PNGs of any type become uncompressed in memory and I should look at using DXTs.
At this moment I use this scenario to load OpenGL texture from PNG:
load PNG via UIImage
get pixels data via bitmap context
repack pixels to new format (currently RGBA8 -> RGBA4, RGB8 -> RGB565, using ARM NEON instructions)
create OpenGL texture with data
(this approach is commonly used in Cocos2d engine)
It takes much time and seems to do extra work that may be done once per build. So I want to save repacked pixels data back into file and load it directly to OpenGL on second time.
I would know the practical advantages. Does anyone tried it? Is it worth to compress data via zip (as I know, current iDevices have bottleneck in file access)? Would be very thankful for real experience sharing.
Even better, if these are pre-existing images, compress them using PowerVR Texture Compression (PVRTC). PVRTC textures can be loaded directly, and are stored on the GPU in their compressed form, so they can be much smaller than the various raw pixel formats.
I provide an example of how to compress and use PVRTC textures in this sample code (the texture coordinates are a little messed up there, because I haven't corrected them yet). In that example, I just reuse Apple's PVRTexture sample class for handling this type of texture. The PVRTC textures are compressed via a script that's part of one of the build phases, so this can be automated for your various source images.
So, I have made some successful experiment:
I compress texture data by zlib (max compression ratio) and save it to file (via NSData methods). The size of file is much smaller then PNG in some cases.
As for loading time, I can't say exact timestamps because in my project there are 2 parallel threads - one is loading textures on background while another is still rendering scene. It is approximately twice faster - IMHO the main reason is that we copy image data directly to OpengGL without repacking, and input data amount smaller).
PS: Build optimization level plays very high role in loading time: about 4 seconds in debug configuration vs. 1 second in release.
Ignore any advice about PVRTC, that stuff is only useful for 3D textures that have limited color usage. It is better to just use 24 or 32 BPP textures from real images. If you would like to see a real working example of the process you describe then take a look at load-opengl-textures-with-alpha-channel-on-ios. The example shows how texture data can be compressed with 7zip (much better than zip) when attached to the app resource, but then the results are decompressed and saved to disk in an optimal format that can be directly sent to the video card without further pixel component rearranging. This example uses a POT texture, but it would not be too hard to adapt to non-POT and to use the Apple optimizations so that the texture data need not be explicitly copied into the graphics card. Those optimizations are already implemented when sending video data to CoreGraphics.