Do indexed PNG-8s use less memory when used by XNA?

I'm working on an XNA project and running into issues involving memory usage. I was curious whether using indexed color PNG-8s vs PNG-24s or PNG-32s for certain textures would free up any memory. It definitely makes the app smaller, but I'm curious whether, when XNA's content manager loads them, they somehow get decompressed and behave like lossless images.

I answered a similar question just recently.
If you want to save space when distributing your game, you could distribute your PNG files (you can also use jpeg) and load them with Texture2D.FromStream. Note that you'll have to handle things like premultiplied alpha yourself.
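The premultiply step itself is just per-pixel arithmetic: each colour channel is scaled by the pixel's alpha before the data is handed to the GPU. A minimal sketch of that math, shown here in C over a raw RGBA byte buffer (in XNA you would apply the equivalent operation to the pixel array before passing it to SetData):

    #include <stdint.h>
    #include <stddef.h>

    /* Premultiply an RGBA8 buffer in place: rgb := rgb * a / 255. */
    static void premultiply_alpha(uint8_t *pixels, size_t pixel_count)
    {
        for (size_t i = 0; i < pixel_count; ++i) {
            uint8_t *p = pixels + i * 4;
            unsigned a = p[3];
            p[0] = (uint8_t)((p[0] * a + 127) / 255);
            p[1] = (uint8_t)((p[1] * a + 127) / 255);
            p[2] = (uint8_t)((p[2] * a + 127) / 255);
        }
    }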
If you want to save space on the GPU (this also improves texture fetches, because less data is transferred) you can use a smaller texture format. The list of supported texture formats is available in this table.
While there is no support for palettised images (like PNG-8) or 8-bit colour images at all, there are several 16-bit formats (Bgr565, Bgra5551, Bgra4444). DXT compressed formats are also available. They compress at a 4:1 ratio, except opaque DXT1, which is 6:1.
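To make the savings concrete, here is a back-of-the-envelope footprint calculation for a single 1024x1024 texture without mipmaps (a C sketch; only the byte counts matter):

    #include <stdio.h>

    /* DXT formats store 4x4 texel blocks: 8 bytes per block for DXT1,
       16 bytes per block for DXT3/DXT5. */
    int main(void)
    {
        const int w = 1024, h = 1024;
        printf("Color (32bpp): %d KB\n", w * h * 4 / 1024);              /* 4096 KB */
        printf("Bgra4444     : %d KB\n", w * h * 2 / 1024);              /* 2048 KB */
        printf("Dxt5         : %d KB\n", (w / 4) * (h / 4) * 16 / 1024); /* 1024 KB */
        printf("Dxt1         : %d KB\n", (w / 4) * (h / 4) * 8 / 1024);  /*  512 KB */
        return 0;
    }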
You can use both techniques at once.

Someone just pointed me to this link on Facebook, immediately after I posted this question.
http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh855082%28v=vs.92%29.aspx
Apparently, in XNA, PNGs of any type are decompressed once loaded into memory, so I should look at using DXT formats instead.

Related

FMX TBitmap fastest saving of transparent image

I am scaling icons (using a software scaler algorithm) and want to cache the newly resized icons by saving them to disk while preserving transparency (the alpha channel).
Calling "Bitmap.SaveToDisk('filename.bmp')" followed by "Bitmap.LoadFromDisk('filename.bmp')" strips the alpha channel.
Calling "Bitmap.SaveToDisk('filename.png')" followed by "Bitmap.LoadFromDisk('filename.png')" maintains the alpha channel but has a much higher CPU overhead due to the encoding/decoding required by the PNG format.
I'm aware I can go behind the scenes, get the scanlines and simply dump the scanline data to a file, but I was wondering if there was a more straightforward method with lower CPU utilization?
Edit #1:
I am still interested in an answer, but in the meanwhile I wrote a work-around unit that saves/loads raw ARGB data from a firemonkey TBitmap:
https://github.com/bLightZP/Save-and-Load-FMX-ARGB-Bitmap

How to handle image backgrounds for retina display

I am designing a game that makes use of large backgrounds. These are illustrated backgrounds that currently weigh in at around 4.5 MB each and, being backgrounds, stay in the scene for the entirety of the game.
First, I am not sure if this would cause memory usage to amp up, but I imagine it would, given that there are also other overlaid textures on the screen. That is my first question: can it cause memory issues?
Second, if I have a background that is 2048 x 1536 at 300 dpi and I compress/optimise this image, would it reduce memory/CPU usage? Is there documentation on how best to optimise these kinds of images?
There are several techniques to do that. It depends on how you're going to use the images.
If the background scrolls or moves, you can split it into tiles, so you only render smaller images at a time.
The format also matters. Most people only know PNG and JPEG, but there are other projects/formats you can use. Some of them give smaller files but slower reads/writes, so it's up to you how to use them, e.g. https://github.com/onevcat/APNGKit
If your background doesn't need an alpha channel, use JPEG instead of PNG and you'll save some space.
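On the memory question itself: the compressed file size (the 4.5 MB of PNG/JPEG) says little about the runtime footprint, because the image is decoded to raw pixels before the GPU can sample it, and dpi has no effect on screen rendering. A quick sketch of the arithmetic, assuming the background is decoded to RGBA8888:

    #include <stdio.h>

    /* Decoded footprint = width * height * bytes per pixel,
       regardless of how small the PNG/JPEG file was on disk. */
    int main(void)
    {
        const long w = 2048, h = 1536;
        long bytes = w * h * 4; /* RGBA8888 */
        printf("Decoded size: %ld bytes (~%ld MB)\n", bytes, bytes / (1024 * 1024));
        return 0;
    }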

High-performance copying of RGB pixel data to the screen in iOS

Our product contains a kind of software image decoder that essentially produces full-frame pixel data that needs to be rapidly copied to the screen (we're running on iOS).
Currently we're using CGBitmapContextCreate and we access the memory buffer directly, then for each frame we call CGBitmapContextCreateImage and draw that bitmap to the screen. This is WAY too slow for full-screen refreshes on the iPad's retina display at a decent framerate (but it was okay for non-Retina devices).
We've tried all kinds of OpenGL ES-based approaches, including the use of glTexImage2D and glTexSubImage2D (essentially rendering to a texture), but CPU usage is still high and we can't get more than ~30 FPS for full-screen refreshes on the iPad 3. The problem is that at 30 FPS, CPU usage is nearly at 100% just for copying the pixels to the screen, which means we don't have much headroom left for our own rendering on the CPU.
We are open to using OpenGL or any iOS API that would give us maximum performance. The pixel data is 32-bit-per-pixel RGBA, but we have some flexibility there...
Any suggestions?
So, the bad news is that you have run into a really hard problem. I have been doing quite a lot of research in this specific area, and currently the only way you can actually blit a framebuffer that is the size of the full screen at 2x is to use the h.264 decoder. There are quite a few nice tricks that can be done with OpenGL once you have image data already decoded into memory (take a look at GPUImage).

The big problem is not how to move the pixels from live memory onto the screen; the real issue is how to move the pixels from their encoded form on disk into live memory. One can use file-mapped memory to hold the pixels on disk, but the IO subsystem is not fast enough to swap in enough pages to stream 2x full-screen-size images from mapped memory. This used to work great with 1x full-screen sizes, but the 2x screens hold 4x the amount of data and the hardware just cannot keep up. You could also try to store frames on disk in a more compressed format, like PNG, but then decoding the compressed format turns the problem from IO bound into CPU bound and you are still stuck.

Please have a look at my blog post opengl_write_texture_cache for the full source code and the timing results I found with that approach. If you have a very specific format that you can limit the input image data to (like an 8-bit table), then you could use the GPU to blit 8-bit data as 32BPP pixels via a shader, as shown in the example Xcode project opengl_color_cycle.

But my advice would be to look at how you could make use of the h.264 decoder, since it is actually able to decode that much data in hardware, and no other approach is likely to give you the kind of results you are looking for.
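The "4x the amount of data" point is easy to quantify; a quick sketch of the raw bandwidth needed just to touch every pixel of an iPad 3 (2048 x 1536) framebuffer in 32-bit RGBA:

    #include <stdio.h>

    int main(void)
    {
        const long w = 2048, h = 1536, bpp = 4;
        long frame = w * h * bpp;                        /* ~12 MB per frame */
        printf("Per frame : %ld MB\n", frame / (1024 * 1024));
        printf("At 30 fps : %ld MB/s\n", frame * 30 / (1024 * 1024));
        printf("At 60 fps : %ld MB/s\n", frame * 60 / (1024 * 1024));
        return 0;
    }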
After several years, and several different situations where I ran into this need, I've decided to implement a basic "pixel viewer" view for iOS. It supports highly optimized display of a pixel buffer in a wide variety of formats, including 32-bpp RGBA, 24-bpp RGB, and several YpCbCr formats.
It also supports all of the UIViewContentMode* for smart scaling, scale to fit/fill, etc.
The code is highly optimized (using OpenGL), and achieves excellent performance even on older iOS devices such as the iPhone 5 or the original iPad Air. On those devices it achieves 60 FPS for all pixel formats except the 24bpp formats, where it achieves around 30-50 FPS (I usually benchmark by showing a pixel buffer at the device's native resolution, so obviously an iPad has to push far more pixels than an iPhone 5).
Please check out EEPixelViewer.
CoreVideo is most likely the framework you should be looking at. With the OpenGL and CoreGraphics approaches, you're being hit hard by the cost of moving bitmap data from main memory onto GPU memory. This cost exists on desktops as well, but is especially painful on iPhones.
In this case, OpenGL won't net you much of a speed boost over CoreGraphics because the bottleneck is the texture data copy. OpenGL will get you a more efficient rendering pipeline, but the damage will have already been done by the texture copy.
So CoreVideo is the way to go. As I understand the framework, it exists to solve the very problem you're encountering.
The pbuffer or FBO can then be used as a texture map for further rendering by OpenGL ES. This is called render-to-texture (RTT) and is much quicker; search for "pbuffer" or "FBO" in the EGL documentation.
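For reference, a minimal render-to-texture setup with an FBO in OpenGL ES 2.0 might look roughly like this (a sketch only; error checking and texture wrap/format choices omitted):

    #include <OpenGLES/ES2/gl.h>

    /* Create a texture and an FBO that renders into it. */
    GLuint create_render_target(GLsizei width, GLsizei height, GLuint *out_tex)
    {
        GLuint tex, fbo;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, NULL);

        glGenFramebuffers(1, &fbo);
        glBindFramebuffer(GL_FRAMEBUFFER, fbo);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, tex, 0);
        /* Draw into 'fbo', then bind 'tex' as a texture for later passes. */
        *out_tex = tex;
        return fbo;
    }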

(iPhone, OpenGL) direct texture data storage in files

At this moment I use this scenario to load OpenGL texture from PNG:
load PNG via UIImage
get pixels data via bitmap context
repack pixels to new format (currently RGBA8 -> RGBA4, RGB8 -> RGB565, using ARM NEON instructions)
create OpenGL texture with data
(this approach is commonly used in Cocos2d engine)
It takes much time and seems to do extra work that may be done once per build. So I want to save repacked pixels data back into file and load it directly to OpenGL on second time.
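For context, the repack step above (RGBA8 -> RGBA4) boils down to a simple per-pixel operation; here is a plain-C scalar sketch (the NEON path does the same thing several pixels at a time), with the output laid out for GL_RGBA / GL_UNSIGNED_SHORT_4_4_4_4:

    #include <stdint.h>
    #include <stddef.h>

    /* Keep the top 4 bits of each channel: R in the high nibble, A in the low. */
    static void rgba8_to_rgba4(const uint8_t *src, uint16_t *dst, size_t pixel_count)
    {
        for (size_t i = 0; i < pixel_count; ++i) {
            uint16_t r = src[i * 4 + 0] >> 4;
            uint16_t g = src[i * 4 + 1] >> 4;
            uint16_t b = src[i * 4 + 2] >> 4;
            uint16_t a = src[i * 4 + 3] >> 4;
            dst[i] = (uint16_t)((r << 12) | (g << 8) | (b << 4) | a);
        }
    }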
I'd like to know the practical advantages. Has anyone tried this? Is it worth compressing the data via zip (as far as I know, current iDevices have a bottleneck in file access)? I'd be very thankful for any real-world experience.
Even better, if these are pre-existing images, compress them using PowerVR Texture Compression (PVRTC). PVRTC textures can be loaded directly, and are stored on the GPU in their compressed form, so they can be much smaller than the various raw pixel formats.
I provide an example of how to compress and use PVRTC textures in this sample code (the texture coordinates are a little messed up there, because I haven't corrected them yet). In that example, I just reuse Apple's PVRTexture sample class for handling this type of texture. The PVRTC textures are compressed via a script that's part of one of the build phases, so this can be automated for your various source images.
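For reference, once you have raw PVRTC data (header stripped), uploading it is a single call; a hedged sketch assuming a 4bpp PVRTC1 payload:

    #include <OpenGLES/ES2/gl.h>
    #include <OpenGLES/ES2/glext.h>

    /* Upload pre-compressed 4bpp PVRTC data directly to the GPU.
       On iOS, PVRTC1 textures must be square and power-of-two. */
    static void upload_pvrtc4(const void *data, GLsizei width, GLsizei height)
    {
        /* Documented size formula for 4bpp PVRTC (width/height >= 8). */
        GLsizei size = (width > 8 ? width : 8) * (height > 8 ? height : 8) * 4 / 8;
        glCompressedTexImage2D(GL_TEXTURE_2D, 0,
                               GL_COMPRESSED_RGBA_PVRTC_4BPPV1_IMG,
                               width, height, 0, size, data);
    }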
So, I have done a successful experiment:
I compress the texture data with zlib (maximum compression level) and save it to a file (via NSData methods). In some cases the resulting file is much smaller than the PNG.
As for loading time, I can't give exact timings because my project runs two parallel threads: one loads textures in the background while the other keeps rendering the scene. It is approximately twice as fast; IMHO the main reason is that we copy image data directly to OpenGL without repacking, and the amount of input data is smaller.
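A minimal sketch of that compression step using zlib directly (the resulting buffer can then be wrapped in NSData and written to disk, as described above):

    #include <zlib.h>
    #include <stdlib.h>

    /* Compress a raw texture buffer at maximum ratio.
       Returns a malloc'd buffer; *out_len receives the compressed size. */
    static unsigned char *compress_texture(const unsigned char *src, uLong src_len,
                                           uLong *out_len)
    {
        uLong bound = compressBound(src_len);
        unsigned char *dst = malloc(bound);
        if (dst && compress2(dst, &bound, src, src_len, Z_BEST_COMPRESSION) == Z_OK) {
            *out_len = bound;
            return dst;
        }
        free(dst);
        return NULL;
    }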
PS: The build optimization level has a big impact on loading time: about 4 seconds in the debug configuration vs. 1 second in release.
Ignore any advice about PVRTC; that stuff is only useful for 3D textures with limited colour usage. It is better to just use 24 or 32 BPP textures from real images. If you would like to see a real working example of the process you describe, take a look at load-opengl-textures-with-alpha-channel-on-ios. The example shows how texture data can be compressed with 7zip (much better than zip) when attached as an app resource; the result is then decompressed and saved to disk in an optimal format that can be sent directly to the video card without further pixel-component rearranging. The example uses a POT texture, but it would not be too hard to adapt it to non-POT textures and to use the Apple optimizations so that the texture data need not be explicitly copied into the graphics card. Those optimizations are already implemented when sending video data to CoreGraphics.

PVRTC compression increasing the file sizes of PNG

For iPhone game development, I switched from the PNG format to the PVRTC format for the sake of performance. But PVRTC compression is creating files that are much bigger than the PNG files. A 140 KB PNG (1024x1024) balloons to 512 KB or more in PVRTC format. I read somewhere that a 50 KB PNG file got compressed down to around 10 KB; in my case it's the other way around.
Is there any reason why it happens this way, and how can I avoid it? If PVRTC compression blindly does a 4bpp conversion (1024 x 1024 x 0.5) irrespective of the transparency in the PNG, what compression are we actually achieving here?
I have hundreds of these 1024x1024 images in my game, as there are numerous characters each doing complex animations. At 512 KB per image, my app would grow to more than 50 MB, which is unacceptable for my customer (with PNG, I could have kept the app around 10 MB).
In general, uncompressed image data is either 24bpp (RGB) or 32bpp (RGBA) flatrate. PVRTC is 4bpp (or 2bpp) flatrate so there is a compression of 6 or 8 (12 or 16) times compared to this.
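Because the rate is flat, the questioner's numbers are easy to verify; a quick sketch of the arithmetic for a 1024x1024 image:

    #include <stdio.h>

    int main(void)
    {
        const long w = 1024, h = 1024;
        printf("RGBA 32bpp : %ld KB\n", w * h * 32 / 8 / 1024); /* 4096 KB */
        printf("RGB  24bpp : %ld KB\n", w * h * 24 / 8 / 1024); /* 3072 KB */
        printf("PVRTC 4bpp : %ld KB\n", w * h * 4 / 8 / 1024);  /*  512 KB */
        printf("PVRTC 2bpp : %ld KB\n", w * h * 2 / 8 / 1024);  /*  256 KB */
        return 0;
    }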
A requirement for graphics hardware to use textures natively is that the format of the texture must be random accessible for the hardware. PVRTC is this kind of format, PNG is not and this is why PNG can achieve greater compression ratios. PVRTC is a runtime, deployment format; PNG is a storage format.
PVRTC compression is carried out on 4x4 blocks of pixels at a time and at a flat bit rate so it is easy to calculate where in memory to retrieve the data required to derive a particular texel's value from and there is only one access to memory required. There is dedicated circuitry in the graphics core which will decode this 4x4 block and give the texel value to your shader/texture combiner etc.
PNG compression does not work at a flat bitrate and is more complicated to retrieve specific values from; memory needs to be accessed from multiple locations in order to retrieve a single colour value and far more memory and processing would be required every single time a texture read occurs. So it's not suitable for use as a native texture format and this is why your textures must be decompressed before the graphics hardware will use them. This increases bandwidth use when compared to PVRTC, which requires no decompression for use.
So for offline storage (the size of your application on disk), PNG is smaller than PVRTC which is smaller than completely uncompressed. For runtime memory footprint and performance, PVRTC is smaller and faster than PNG which, because it must be decompressed, is just as large and slow as uncompressed textures. You might gain some advantage with PNG at initialisation for disk access, but then you'd lose time for decompression.
If you want to reduce the storage footprint of PVRTC you could try zip-style compression on the texture files and expand these when you load from disk.
PVRTC (PowerVR Texture Compression) is a texture compression format. On devices using PowerVR GPUs, e.g. most higher-end mobile phones including the iPhone and other ARM-based gadgets like the iPod, it is very fast to draw since decoding is hardware accelerated. It also uses much less memory, since images are kept in their compressed form and decoded on each draw, whereas a PNG needs to be decompressed before being drawn.
PNG is lossless compression.
PVRTC is lossy compression, meaning it approximates the image. It has completely different design criteria.
PVRTC will 'compress' (by approximating) any type of artwork, giving a fixed bits per texel, including photographic images.
PNG does not approximate the image, so if the image contains little redundancy it will hardly compress at all. On the other hand, a uniform image, e.g. an illustration, will compress best with PNG.
It's apples and oranges.
Place more than one frame tiled onto a single image and blit the subrectangles of the texture. This will dramatically reduce your memory consumption.
If your images are, say, 64x64, then you can place 256 of them on a 1024x1024 texture in a 16x16 arrangement.
With a little effort, the images do not all need to be the same size, as long as you keep track in code of the rectangle each image occupies within the texture.
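A sketch of that bookkeeping, assuming uniform 64x64 frames in a 16x16 grid as in the example above (non-uniform frames simply mean storing one rectangle per image instead of computing it):

    /* Texture coordinates of frame 'index' in a 16x16 grid of 64x64 frames
       packed into a 1024x1024 atlas. */
    typedef struct { float u0, v0, u1, v1; } UVRect;

    static UVRect atlas_uv(int index)
    {
        const float frame = 64.0f, atlas = 1024.0f;
        int col = index % 16;
        int row = index / 16;
        UVRect r;
        r.u0 = (col * frame) / atlas;
        r.v0 = (row * frame) / atlas;
        r.u1 = r.u0 + frame / atlas;
        r.v1 = r.v0 + frame / atlas;
        return r;
    }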
This is how iPhone game developers do it.
I agree with Will. There is no real question here. I read the question three times, but I still don't know what Sankar wants to know. It's just a complaint, not a question.
The only thing I can advise: don't use PVRTC if you're reluctant to use it. It offers a performance gain and saves VRAM, but it won't help you in this case, because what you want is simply to reduce the game's size, not to trade off performance against quality.
