FMX TBitmap fastest saving of transparent image - delphi

I am scaling icons (using a software scaler algorithm) and want to cache the newly resized icons by saving them to disk while preserving transparency (the alpha channel).
Calling "Bitmap.SaveToFile('filename.bmp')" followed by "Bitmap.LoadFromFile('filename.bmp')" strips the alpha channel.
Calling "Bitmap.SaveToFile('filename.png')" followed by "Bitmap.LoadFromFile('filename.png')" maintains the alpha channel but has a much higher CPU overhead due to the encoding/decoding required by the PNG format.
I'm aware I can go behind the scenes, get the scanlines and simply dump the scanline data to a file, but I was wondering if there was a more straightforward method with lower CPU utilization?
Edit #1:
I am still interested in an answer, but in the meantime I wrote a workaround unit that saves and loads raw ARGB data from a FireMonkey TBitmap:
https://github.com/bLightZP/Save-and-Load-FMX-ARGB-Bitmap
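For readers who only want the idea behind a raw dump, here is a minimal sketch in Python rather than Delphi. It is not the format used by the linked unit: the "RAWA" tag, the 12-byte header layout and the function names are all assumptions made for illustration. The point is that the pixel bytes are written and read unchanged, so there is no encode or decode step.

```python
import struct

MAGIC = b"RAWA"  # hypothetical 4-byte tag, not a standard file format


def save_raw_argb(path, width, height, pixels):
    """Write a small header followed by the unmodified 32-bit pixel bytes."""
    if len(pixels) != width * height * 4:
        raise ValueError("pixel buffer does not match width * height * 4")
    with open(path, "wb") as f:
        f.write(MAGIC + struct.pack("<II", width, height))
        f.write(pixels)


def load_raw_argb(path):
    """Read the header back and return (width, height, pixel bytes)."""
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) != 12 or header[:4] != MAGIC:
            raise ValueError("not a raw ARGB cache file")
        width, height = struct.unpack("<II", header[4:])
        pixels = f.read(width * height * 4)
    if len(pixels) != width * height * 4:
        raise ValueError("truncated pixel data")
    return width, height, pixels
```

In Delphi the same layout could presumably be written from the mapped bitmap data one scanline at a time, since the row pitch reported by the bitmap may be larger than width * 4.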

Related

Direct2D – Drawing rectangles and circles to large images and saving to disk

My task is to draw many simple geometric figures, such as rectangles and circles, onto large black-and-white images (about 4000x6000 pixels) and to save the result both as bitmap files and as a binary array that represents each pixel as 1 if drawn and 0 otherwise. I was using GDI+ (System.Drawing), but it took too long, so I started looking at Direct2D. I quickly learned how to draw to a Win32 window and thought I could use the same approach to draw to a bitmap instead.
I learned how to load an image and display it here: https://msdn.microsoft.com/de-de/library/windows/desktop/ee719658(v=vs.85).aspx
But I could not find information on how to create a large ID2D1Bitmap and render to it.
How can I create a render target (must it be an ID2D1HwndRenderTarget?) associated with such a newly created big bitmap (and how do I create that bitmap?), draw rectangles and circles to it, and save it to a file afterwards?
Thank you very much for pointing me in the right direction,
Jürgen
If I were to do it, I would roll my own code instead of using GDI or DirectX calls. The structure of a binary bitmap is very simple (a packed array of bits), and once you have implemented a function to set a single pixel and one to draw a single run (a horizontal line segment), drawing rectangles and circles comes easily.
If you don't feel comfortable with bit packing, you can work with a byte array instead (one pixel per byte), and convert the whole image in the end.
Writing the bitmap to a file is also not a big deal once you know about the binary file I/O operations (and you will find many ready-made functions on the Web).
Actually, when you know the specs of the layout of the bitmap file data, you don't need Windows at all.
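To make the "set a pixel, draw a run, build everything else from runs" idea concrete, here is a minimal sketch in Python. The class and method names are invented for illustration, and the MSB-first bit order is an assumption (it happens to match the row layout of 1-bpp BMP data, apart from the row padding that a BMP file requires).

```python
import math


class BitImage:
    """Monochrome image stored as packed bits, 8 pixels per byte, MSB first."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.stride = (width + 7) // 8  # bytes per row
        self.data = bytearray(self.stride * height)

    def set_pixel(self, x, y):
        """Set one pixel to 1; coordinates outside the image are ignored."""
        if 0 <= x < self.width and 0 <= y < self.height:
            self.data[y * self.stride + (x >> 3)] |= 0x80 >> (x & 7)

    def get_pixel(self, x, y):
        """Return 1 if the pixel is drawn, 0 otherwise."""
        return (self.data[y * self.stride + (x >> 3)] >> (7 - (x & 7))) & 1

    def hline(self, x0, x1, y):
        """Draw the run x0..x1 (inclusive) on row y, clipped to the image."""
        if not 0 <= y < self.height:
            return
        for x in range(max(x0, 0), min(x1, self.width - 1) + 1):
            self.set_pixel(x, y)

    def fill_rect(self, x, y, w, h):
        """A filled rectangle is h runs of length w."""
        for row in range(y, y + h):
            self.hline(x, x + w - 1, row)

    def fill_circle(self, cx, cy, r):
        """A filled circle is one run per row, half-width sqrt(r^2 - dy^2)."""
        for dy in range(-r, r + 1):
            dx = math.isqrt(r * r - dy * dy)
            self.hline(cx - dx, cx + dx, cy + dy)
```

The run drawing here goes pixel by pixel to stay short. In a real implementation the interior of each run could be filled a whole byte at a time, which is where most of the speed of this approach would come from.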

Can I use gdi+ image as directx9 texture without copy overhead

I have a stream of JPEG pictures (25 fps, about 1000x700) and I want to render it to the screen with as little CPU usage as possible.
So far, the fastest way I have found to decompress JPEG images is the GDI+ API. On my machine it takes about 50ns per frame. I don't know how they manage it, but it's true; libjpeg8, for example, was much slower as far as I remember.
I tried to use GDI+ to output a stretched picture, but it uses too much CPU for such a simple job, so I switched to DirectX 9. That works well for me, but I can't find a good way to convert a GDI+ picture to a DirectX 9 texture.
There are many ways to do it, but all of them are slow and have high CPU usage.
One of them:
get surface from texture
get hdc from surface
create gdi+ graphics from hdc
draw without stretching (DrawI of flat API).
Another way:
lock bits of image
lock bits of surface
copy bits
By the way, D3DXCreateTextureFromFileInMemory is slow.
The question is: how can I use an image as a texture without copy overhead? Or what is the best way to convert an image to a texture?

Do indexed PNG 8s use less memory when utilized by XNA

I'm working on an XNA project and running into memory usage issues. I was wondering whether using indexed-color PNG 8s instead of PNG 24s or PNG 32s for certain textures would free up any memory. It definitely makes the app smaller, but I'm curious whether they somehow become uncompressed, or act as lossless images, when XNA's content manager loads them.
I answered a similar question just recently.
If you want to save space when distributing your game, you could distribute your PNG files (you can also use jpeg) and load them with Texture2D.FromStream. Note that you'll have to handle things like premultiplied alpha yourself.
If you want to save space on the GPU (which also improves texture fetching, because less data is transferred), you can use a smaller texture format. The list of supported texture formats is available in this table.
While there is no support for palettised images (like PNG8) or 8-bit colour images at all, there are several 16-bit formats (Bgr565, Bgra5551, Bgra4444). DXT compressed formats are also available. They compress at a 4:1 ratio, except opaque DXT1, which is 6:1.
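As a rough check on those ratios, here is a small Python sketch that estimates the in-memory size of a single texture level. It ignores mipmaps and driver padding, and the helper name is made up for illustration. The bits-per-pixel values follow from the formats themselves: the DXT formats store each 4x4 block in 8 bytes (DXT1) or 16 bytes (DXT3, DXT5).

```python
# Bits per pixel for a few XNA SurfaceFormat values.
BITS_PER_PIXEL = {
    "Color": 32,     # 8 bits each for R, G, B, A
    "Bgr565": 16,
    "Bgra5551": 16,
    "Bgra4444": 16,
    "Dxt1": 4,       # 8 bytes per 4x4 block
    "Dxt3": 8,       # 16 bytes per 4x4 block
    "Dxt5": 8,       # 16 bytes per 4x4 block
}


def texture_bytes(width, height, fmt):
    """Estimate the size of one mip level (no mipmap chain, no padding)."""
    if fmt.startswith("Dxt"):
        # Block-compressed formats round each dimension up to a multiple of 4.
        width = (width + 3) // 4 * 4
        height = (height + 3) // 4 * 4
    return width * height * BITS_PER_PIXEL[fmt] // 8


for name in BITS_PER_PIXEL:
    print(f"{name:9s} {texture_bytes(1024, 1024, name) / 2**20:.2f} MiB")
```

For a 1024x1024 texture this gives 4 MiB for Color, 2 MiB for the 16-bit formats, 1 MiB for Dxt3/Dxt5 (4:1 against 32-bit Color) and 0.5 MiB for Dxt1. The 6:1 figure for opaque DXT1 is measured against a 24-bit RGB source; against 32-bit Color it works out to 8:1.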
You can use both techniques at once.
Someone pointed me to this link on Facebook immediately after I posted this question.
http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh855082%28v=vs.92%29.aspx
Apparently, in XNA, PNGs of any type become uncompressed in memory and I should look at using DXTs.

High-performance copying of RGB pixel data to the screen in iOS

Our product contains a kind of software image decoder that essentially produces full-frame pixel data that needs to be rapidly copied to the screen (we're running on iOS).
Currently we're using CGBitmapContextCreate and we access the memory buffer directly, then for each frame we call CGBitmapContextCreateImage, and then draw that bitmap to the screen. This is WAY too slow for full-screen refreshes on the iPad's retina display at a decent framerate (but it was okay for non-Retina-devices).
We've tried all kinds of OpenGL ES-based approaches, including the use of glTexImage2D and glTexSubImage2D (essentially rendering to a texture), but CPU usage is still high and we can't get more than ~30 FPS for full-screen refreshes on the iPad 3. The problem is that at 30 FPS, CPU usage is nearly at 100% just for copying the pixels to the screen, which means we don't have much left for our own rendering on the CPU.
We are open to using OpenGL or any iOS API that would give us maximum performance. The pixel data is formatted as a 32-bit-per-pixel RGBA data but we have some flexibility there...
Any suggestions?
So, the bad news is that you have run into a really hard problem. I have been doing quite a lot of research in this specific area and currently the only way that you can actually blit a framebuffer that is the size of the full screen at 2x is to use the h.264 decoder. There are quite a few nice tricks that can be done with OpenGL once you have image data already decoded into actual memory (take a look at GPUImage).
But, the big problem is not how to move the pixels from live memory onto the screen. The real issue is how to move the pixels from the encoded form on disk into live memory. One can use file mapped memory to hold the pixels on disk, but the IO subsystem is not fast enough to be able to swap out enough pages to make it possible to stream 2x full screen size images from mapped memory. This used to work great with 1x full screen sizes, but now the 2x size screens are actually 4x the amount of memory and the hardware just cannot keep up. You could also try to store frames on disk in a more compressed format, like PNG. But, then decoding the compressed format changes the problem from IO bound to CPU bound and you are still stuck. Please have a look at my blog post opengl_write_texture_cache for the full source code and timing results I found with that approach.
If you have a very specific format that you can limit the input image data to (like an 8 bit table), then you could use the GPU to blit 8 bit data as 32BPP pixels via a shader, as shown in this example xcode project opengl_color_cycle. But, my advice would be to look at how you could make use of the h.264 decoder since it is actually able to decode that much data in hardware and no other approaches are likely to give you the kind of results you are looking for.
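To put numbers on the "2x is actually 4x the memory" point, here is a back-of-the-envelope calculation in Python. It assumes the 32-bit RGBA buffers from the question and the iPad screen sizes of 1024x768 (non-Retina) and 2048x1536 (Retina); the function names are only for illustration.

```python
def frame_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed full-screen frame in bytes."""
    return width * height * bytes_per_pixel


def bandwidth_mib_per_s(width, height, fps, bytes_per_pixel=4):
    """Data that must be moved per second to refresh every frame."""
    return frame_bytes(width, height, bytes_per_pixel) * fps / 2**20


non_retina = frame_bytes(1024, 768)   # 3 MiB per frame
retina = frame_bytes(2048, 1536)      # 12 MiB per frame, 4x the non-Retina size

print(retina // non_retina)                  # 4
print(bandwidth_mib_per_s(1024, 768, 30))    # 90.0
print(bandwidth_mib_per_s(2048, 1536, 30))   # 360.0
```

So a Retina full-screen stream at 30 FPS means moving roughly 360 MiB of pixel data per second, before any decoding work is counted.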
After several years, and several different situations where I ran into this need, I've decided to implement a basic "pixel viewer" view for iOS. It supports highly optimized display of a pixel buffer in a wide variety of formats, including 32-bpp RGBA, 24-bpp RGB, and several YpCbCr formats.
It also supports all of the UIViewContentMode* for smart scaling, scale to fit/fill, etc.
The code is highly optimized (using OpenGL), and achieves excellent performance on even older iOS devices such as iPhone 5 or the original iPad Air. On those devices it achieves 60FPS on all pixel formats except for 24bpp formats, where it achieves around 30-50fps (I usually benchmark by showing a pixel buffer at the device's native resolution, so obviously an iPad has to push far more pixels than the iPhone 5).
Please check out EEPixelViewer.
CoreVideo is most likely the framework you should be looking at. With the OpenGL and CoreGraphics approaches, you're being hit hard by the cost of moving bitmap data from main memory onto GPU memory. This cost exists on desktops as well, but is especially painful on iPhones.
In this case, OpenGL won't net you much of a speed boost over CoreGraphics because the bottleneck is the texture data copy. OpenGL will get you a more efficient rendering pipeline, but the damage will have already been done by the texture copy.
So CoreVideo is the way to go. As I understand the framework, it exists to solve the very problem you're encountering.
The pbuffer or FBO can then be used as a texture map for further rendering by OpenGL ES. This is called render to texture (RTT), and it is much quicker. Search for pbuffer or FBO in EGL.

(iPhone, OpenGL) direct texture data storage in files

At the moment I use the following steps to load an OpenGL texture from a PNG:
load PNG via UIImage
get pixel data via bitmap context
repack pixels to new format (currently RGBA8 -> RGBA4, RGB8 -> RGB565, using ARM NEON instructions)
create OpenGL texture with data
(this approach is commonly used in Cocos2d engine)
This takes a long time and seems to do extra work that could be done once per build. So I want to save the repacked pixel data back to a file and load it directly into OpenGL from the second run onwards.
I would like to know the practical advantages. Has anyone tried it? Is it worth compressing the data with zip (as far as I know, file access is a bottleneck on current iDevices)? I would be very thankful if you shared real experience.
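For reference, the repacking step in the list above can be sketched as follows. This is a plain Python illustration of the bit layout only, not the NEON code the question refers to. It assumes the input bytes are in R, G, B(, A) order and that the output is little-endian 16-bit values with red in the high bits, which is what GL_UNSIGNED_SHORT_5_6_5 and GL_UNSIGNED_SHORT_4_4_4_4 expect on a little-endian device. The function names are made up.

```python
import struct


def rgb888_to_rgb565(pixels):
    """Pack R, G, B byte triples into 16-bit 5-6-5 values (little-endian)."""
    out = bytearray()
    for i in range(0, len(pixels), 3):
        r, g, b = pixels[i], pixels[i + 1], pixels[i + 2]
        out += struct.pack("<H", ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3))
    return bytes(out)


def rgba8888_to_rgba4444(pixels):
    """Pack R, G, B, A byte quads into 16-bit 4-4-4-4 values (little-endian)."""
    out = bytearray()
    for i in range(0, len(pixels), 4):
        r, g, b, a = pixels[i], pixels[i + 1], pixels[i + 2], pixels[i + 3]
        value = ((r >> 4) << 12) | ((g >> 4) << 8) | ((b >> 4) << 4) | (a >> 4)
        out += struct.pack("<H", value)
    return bytes(out)
```

Running this once at build time and shipping the resulting bytes is the "save repacked data" idea: at load time the file contents can go to the texture upload call without any per-pixel work.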
Even better, if these are pre-existing images, compress them using PowerVR Texture Compression (PVRTC). PVRTC textures can be loaded directly, and are stored on the GPU in their compressed form, so they can be much smaller than the various raw pixel formats.
I provide an example of how to compress and use PVRTC textures in this sample code (the texture coordinates are a little messed up there, because I haven't corrected them yet). In that example, I just reuse Apple's PVRTexture sample class for handling this type of texture. The PVRTC textures are compressed via a script that's part of one of the build phases, so this can be automated for your various source images.
So, I ran a successful experiment:
I compress the texture data with zlib (maximum compression level) and save it to a file (via NSData methods). In some cases the file is much smaller than the PNG.
As for loading time, I can't give exact timings because my project has two parallel threads: one loads textures in the background while the other is still rendering the scene. Loading is approximately twice as fast. IMHO the main reason is that we copy the image data directly to OpenGL without repacking, and the amount of input data is smaller.
PS: The build optimization level plays a very big role in loading time: about 4 seconds in the debug configuration vs. 1 second in release.
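The compress-once, inflate-at-load idea from this experiment looks roughly like this in Python, using the standard zlib module. The function names and the size check are assumptions for illustration; the original was done on iOS with NSData.

```python
import zlib


def pack_texture(raw, level=9):
    """Compress already-repacked texture bytes; level 9 is zlib's maximum."""
    return zlib.compress(raw, level)


def unpack_texture(blob, expected_size):
    """Inflate the bytes and check the size before handing them to OpenGL."""
    raw = zlib.decompress(blob)
    if len(raw) != expected_size:
        raise ValueError("decompressed size does not match the texture size")
    return raw


# Example: a 64 x 64 RGB565 texture filled with a single colour.
raw = b"\x00\xf8" * (64 * 64)
blob = pack_texture(raw)
assert unpack_texture(blob, len(raw)) == raw
print(len(raw), "->", len(blob), "bytes")
```

How much this saves depends entirely on the image content: a flat colour like the one above shrinks to almost nothing, while noisy photographic data may barely compress.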
Ignore any advice about PVRTC, that stuff is only useful for 3D textures that have limited color usage. It is better to just use 24 or 32 BPP textures from real images. If you would like to see a real working example of the process you describe then take a look at load-opengl-textures-with-alpha-channel-on-ios. The example shows how texture data can be compressed with 7zip (much better than zip) when attached to the app resource, but then the results are decompressed and saved to disk in an optimal format that can be directly sent to the video card without further pixel component rearranging. This example uses a POT texture, but it would not be too hard to adapt to non-POT and to use the Apple optimizations so that the texture data need not be explicitly copied into the graphics card. Those optimizations are already implemented when sending video data to CoreGraphics.
