Large images with Direct2D - image-processing

I am currently developing an application for the Windows Store that does real-time image processing using Direct2D. It must support images of various sizes. The first problem I faced is how to handle an image that is larger than the maximum supported texture size. After some research and documentation reading I found VirtualSurfaceImageSource as a solution. The idea was to load the image as an IWICBitmap and then create a render target with CreateWicBitmapRenderTarget (which, as far as I know, is not hardware accelerated). After some drawing operations I wanted to display the result on the screen, either by invalidating the corresponding region of the VirtualSurfaceImageSource or when the UpdatesNeeded callback fires. I assumed I could do this by creating an ID2D1Bitmap (hardware accelerated) and calling CopyFromRenderTarget with the render target created by CreateWicBitmapRenderTarget and the invalidated region as bounds, but the method returns D2DERR_WRONG_RESOURCE_DOMAIN. Another reason for using IWICBitmap is that one of the algorithms in the application needs direct access to update the pixels of the image.
The question is: why doesn't this logic work? Is this the right way to achieve my goal using Direct2D? Also, since the render target created with CreateWicBitmapRenderTarget is not hardware accelerated, what is the best solution if I want to do my image processing on the GPU with images larger than the maximum allowed texture size?
Thank you in advance.

You are correct that images larger than the texture limit must be handled in software.
However, the question to ask is whether or not you need that entire image every time you render.
You can use hardware acceleration to render a portion of the large image that is loaded in a software target.
For example:
Use ID2D1RenderTarget::CreateSharedBitmap to make a bitmap that can be used by different resources.
Then create an ID2D1BitmapRenderTarget and render the large bitmap into it (making sure to call BeginDraw, Clear, DrawBitmap, and EndDraw). Both the bitmap and the render target can be cached for use by successive calls.
Then copy the portion that fits into texture memory from that render target into a regular ID2D1Bitmap using the ID2D1Bitmap::CopyFromRenderTarget method.
Finally, draw that bitmap to the real render target with pRT->DrawBitmap.
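To pick "the portion that will fit" in the steps above, it can help to split the large image into tile rectangles that never exceed the device limit. The following is only a sketch of that bookkeeping in plain C++: TileRect and SplitIntoTiles are hypothetical names, and maxTileSize is assumed to come from something like ID2D1RenderTarget::GetMaximumBitmapSize(). The field layout mirrors D2D1_RECT_U, so each rectangle could serve as the source rectangle for CopyFromRenderTarget.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a pixel rectangle inside the large source image.
// right and bottom are exclusive, matching the D2D1_RECT_U convention.
struct TileRect {
    std::uint32_t left, top, right, bottom;
};

// Split an imageWidth x imageHeight image into tiles that are at most
// maxTileSize pixels on either axis. Tiles are returned row by row.
// maxTileSize is an assumed input (for example the device's bitmap limit).
std::vector<TileRect> SplitIntoTiles(std::uint32_t imageWidth,
                                     std::uint32_t imageHeight,
                                     std::uint32_t maxTileSize) {
    assert(maxTileSize > 0);
    std::vector<TileRect> tiles;
    for (std::uint32_t y = 0; y < imageHeight; y += maxTileSize) {
        for (std::uint32_t x = 0; x < imageWidth; x += maxTileSize) {
            // Edge tiles are clipped to the image bounds.
            tiles.push_back({x, y,
                             std::min(x + maxTileSize, imageWidth),
                             std::min(y + maxTileSize, imageHeight)});
        }
    }
    return tiles;
}
```

For a 10000 x 5000 image and a 4096 limit this yields a 3 x 2 grid of tiles; only the tiles that intersect the invalidated region would need to be copied and drawn.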

Related

glReadPixels specify resolution

I am trying to capture a screenshot on iOS from an OpenGL view using glReadPixels at half of the native resolution.
glReadPixels is quite slow on retina screens so I'd like to somehow force reading every second pixel and every second row, resulting in a non-retina screenshot (1/4 of the resolution).
I tried setting these:
glPixelStorei(GL_PACK_SKIP_PIXELS, 2);
glPixelStorei(GL_PACK_SKIP_ROWS, 2);
before calling glReadPixels, but they don't seem to change anything. Instead, I just get 1/4 of the original image, because the width and height I'm passing to glReadPixels are the view's non-retina size.
Alternatively, if you know any more performant way of capturing an OpenGL screenshot, feel free to share it as well.
I don't think there's a very direct way of doing what you're looking for. As you already found out, GL_PACK_SKIP_ROWS and GL_PACK_SKIP_PIXELS do not have the functionality you intended. They only control how many rows/pixels are skipped at the start, not after each row/pixel. And I believe they control skipping in the destination memory anyway, not in the framebuffer you're reading from.
One simple approach to a partial solution would be to make a separate glReadPixels() call per row, which you can then make for every second row. You would still have to copy every second pixel from those rows, but at least it would cut the amount of data you read in half. And it does reduce the additional amount of memory to almost a quarter, since you would only store one row at full resolution. Of course you have overhead for making many more glReadPixels() calls, so it's hard to predict if this will be faster overall.
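The "copy every second pixel" step mentioned above is plain CPU work once the data has been read back. Here is a minimal sketch, assuming a tightly packed RGBA8 buffer such as one filled by glReadPixels with GL_RGBA and GL_UNSIGNED_BYTE. DecimateByTwo is a hypothetical helper, not an OpenGL call, and for simplicity it takes a whole frame rather than one row at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keep every second pixel of every second row from a tightly packed RGBA8
// buffer (4 bytes per pixel). Odd dimensions round up, so the last column
// and row are kept. No filtering is done: pixels are simply dropped.
std::vector<std::uint8_t> DecimateByTwo(const std::vector<std::uint8_t>& src,
                                        std::size_t width, std::size_t height) {
    constexpr std::size_t kBytesPerPixel = 4;
    assert(src.size() == width * height * kBytesPerPixel);
    const std::size_t outWidth = (width + 1) / 2;
    const std::size_t outHeight = (height + 1) / 2;
    std::vector<std::uint8_t> dst(outWidth * outHeight * kBytesPerPixel);
    for (std::size_t y = 0; y < outHeight; ++y) {
        for (std::size_t x = 0; x < outWidth; ++x) {
            // Source pixel (2x, 2y) maps to destination pixel (x, y).
            const std::size_t s = ((2 * y) * width + 2 * x) * kBytesPerPixel;
            const std::size_t d = (y * outWidth + x) * kBytesPerPixel;
            for (std::size_t c = 0; c < kBytesPerPixel; ++c) {
                dst[d + c] = src[s + c];
            }
        }
    }
    return dst;
}
```

Because pixels are dropped rather than averaged, the result can look more aliased than a properly downscaled frame.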
The nicer approach would be to produce a half-resolution frame that you can read directly. To do that, you could either:
If your toolkits allow it, re-render the frame at half the resolution. You could use an FBO as render target for this, with half the size of the window.
Copy the frame, while downscaling it in the process. Again, create an FBO with a render target half the size, and copy from default framebuffer to this FBO using glBlitFramebuffer().
You can also look into making the readback asynchronous by using a pixel pack buffer (see the GL_PIXEL_PACK_BUFFER argument to glBindBuffer()). This will most likely not make the operation faster, but it allows you to continue feeding commands to the GPU while you're waiting for the glReadPixels() results to arrive. It might help you take screenshots while being less disruptive to the gameplay.

AS3 AIR iOS - How to control when BitmapData is cached/uncached from the GPU?

This question's kind of a 4-parter:
1. Is it true that all BitmapData is immediately cached to the GPU as soon as it's created (even if it's never applied to a Bitmap or added to the stage)?
2. Does this still happen if the GPU texture buffer is already full? Bonus points: if so, what's the preferential swap method the GPU uses to select which textures to remove from memory?
3. If (1), then does setting the width/height of any BitmapData uncache it, and/or does replacing its pixels upload the new pixels to the same memory address on the GPU? Bonus: What if the size changes?
4. To bring this all together, would a hybrid class that extends BitmapData but stores its actual data in a ByteArray be able to use setPixels/getPixels on itself to control upload/download from the GPU as necessary, to buffer a large number of bitmaps? Bonus: Would speed improve for actually placing them in Bitmaps if the instances of this class were static?
Here are some answers
1. No. In AIR, you upload bitmaps to the GPU manually and have control over WHEN to do it.
2. As far as I know, if the buffer is full, you simply get an error; the GPU cannot choose what to do. Removing a random texture wouldn't be nice if it's important to you, right? :)
3. You can check, for example, Starling and how it uploads textures to the GPU. Once you force it to do so, it doesn't care what you do with the bitmap. It's like taking a photo of an object so that you can just show it instead of explaining it with words. It won't matter if you change the object; the photo will still be the same.
4. Simplified answer: no. Again, it's best to check out how textures are created and how you upload stuff to the GPU.

Handle large images in iOS

I want to allow the user to select a photo, without limiting the size, and then edit it.
My idea is to create a thumbnail of the large photo with the same size as the screen for editing, and then, when the editing is finished, use the large photo to make the same edit that was performed on the thumbnail.
When I use UIGraphicsBeginImageContext to create a thumbnail image, it will cause a memory issue.
I know it's hard to edit the whole large image directly due to hardware limits, so I want to know if there is a way to downsample the large image to less than 2048*2048 without memory issues.
I found that on Android there is a BitmapFactory class with an inSampleSize option that can downsample a photo. How can this be done on iOS?
You need to load the image using UIImage, which doesn't actually load the image data into memory, and then create a bitmap context at the size of the resulting image that you want (so this will be the amount of memory used). Then iterate over the original image, creating tiles with CGImageCreateWithImageInRect (this is where parts of the image data are loaded into memory) and drawing each one into the destination context with CGContextDrawImage.
See this sample code from Apple.
Large images don't fit in memory. So loading them into memory to then resize them doesn't work.
To work with very large images you have to tile them. There are lots of solutions out there already; for example, see if this can solve your problem:
https://github.com/dhoerl/PhotoScrollerNetwork
I implemented my own custom solution but that was specific to our environment where we had an image tiler running server side already & I could just request specific tiles of large images (madea server, it's really cool)
The reason tiling works is that you only ever keep the visible pixels in memory, and there aren't that many of those. All tiles not currently visible are moved out to the disk cache (or flash memory cache, as it were).
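As a rough illustration of why so few tiles need to be resident, here is a sketch in plain C++ (not iOS API code) that works out which fixed-size tiles overlap the current viewport. TileIndex and VisibleTiles are hypothetical names, and the tile size and coordinates are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: position of a tile in the tile grid.
struct TileIndex {
    int column, row;
};

// Return the tiles of a tiled image that overlap the viewport rectangle.
// All coordinates are in image pixels; tiles are tileSize x tileSize.
std::vector<TileIndex> VisibleTiles(int viewX, int viewY, int viewW, int viewH,
                                    int imageW, int imageH, int tileSize) {
    assert(tileSize > 0 && viewW > 0 && viewH > 0);
    // Clamp the viewport to the image bounds (x1 and y1 are exclusive).
    const int x0 = std::max(viewX, 0);
    const int y0 = std::max(viewY, 0);
    const int x1 = std::min(viewX + viewW, imageW);
    const int y1 = std::min(viewY + viewH, imageH);
    std::vector<TileIndex> tiles;
    if (x0 >= x1 || y0 >= y1) {
        return tiles;  // viewport lies entirely outside the image
    }
    const int firstCol = x0 / tileSize;
    const int lastCol = (x1 - 1) / tileSize;
    const int firstRow = y0 / tileSize;
    const int lastRow = (y1 - 1) / tileSize;
    for (int row = firstRow; row <= lastRow; ++row) {
        for (int col = firstCol; col <= lastCol; ++col) {
            tiles.push_back({col, row});
        }
    }
    return tiles;
}
```

With a 320 x 480 viewport and 256-pixel tiles, at most a 3 x 3 block of tiles is touched regardless of how large the full image is.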
Take a look at this work by Trevor Harmon. It improved my app's performance. I believe it will work for you too.
https://github.com/coryalder/UIImage_Resize

Reducing Flash file size in KB?

I have a Flash file that I need to reduce the size of.
The reason that I need to reduce its size is that I will need to convert this into an iPhone app.
Currently it only has 2 buttons and 2 TLF text fields on the stage, on layer one, and the size of the file is 355KB.
I have also placed the code on layer 2.
Is there any way to reduce its size so I won't have problems when publishing and submitting to the App Store?
Thanks
The biggest portion of that file size will be related to TLF. TLF (Text Layout Framework) is huge and is generally not recommended on mobile (as it has pretty high CPU usage).
If you're not using any TLF-specific features, then it would be wise to change your text fields to use classic text instead (DF3).
Beyond TLF, make sure you're using vector objects instead of bitmaps wherever you can, as that will drastically reduce file size. If you are using bitmaps, you can play around with the compression settings to optimize file size further. You can do this globally in the Publish Settings (JPEG Quality) or individually in a graphic's properties menu.
One note on vector graphics and mobile: simple vectors will run OK, but complex vectors will run terribly. Make sure to set cacheAsBitmap = true; on any complex (or even all) vectors to improve performance. Or, in Flash Pro, click on a MovieClip and in the properties panel go to the "Display" twirl-down and set cache as bitmap in the Render setting.

Antialiasing/Multisampling in D3D9

I'm writing a 3D modeling application in D3D9 that I'd like to make as broadly compatible as possible. This means using few hardware-dependent features such as multisampling. However, while the realtime render doesn't need to be flawless, I do need to provide nice-looking screen captures, which, without multisampling, look quite aliased and poor.
To produce my screen captures, I create a temporary surface in memory, render the scene to it once, then save it to a file. My first thought on how to achieve an antialiased capture was to create my off-screen depth-stencil surface as multisampled, but of course DX wouldn't allow that since the device itself had been initialized with D3DMULTISAMPLE_NONE.
To start off, here's a sample of exactly how I create the screen capture. I know that it'd be simpler to just save the backbuffer of an already-rendered frame; however, I need the ability to save images with dimensions different from the actual render window, which is why I do it this way. Error checking, code for restoring state, and releasing resources are omitted here for brevity. m_d3ddev is my LPDIRECT3DDEVICE9.
// Get the current presentation parameters
LPDIRECT3DSWAPCHAIN9 sc;
D3DPRESENT_PARAMETERS pp;
m_d3ddev->GetSwapChain(0, &sc);
sc->GetPresentParameters(&pp);
// Create a new surface to which we'll render
LPDIRECT3DSURFACE9 ScreenShotSurface = NULL;
LPDIRECT3DSURFACE9 newDepthStencil = NULL;
LPDIRECT3DTEXTURE9 pRenderTexture = NULL;
m_d3ddev->CreateDepthStencilSurface(_Width, _Height, pp.AutoDepthStencilFormat, pp.MultiSampleType, pp.MultiSampleQuality, FALSE, &newDepthStencil, NULL);
m_d3ddev->SetDepthStencilSurface(newDepthStencil);
m_d3ddev->CreateTexture(_Width, _Height, 1, D3DUSAGE_RENDERTARGET, pp.BackBufferFormat, D3DPOOL_DEFAULT, &pRenderTexture, NULL);
pRenderTexture->GetSurfaceLevel(0, &ScreenShotSurface);
// Render the scene to the new surface
m_d3ddev->SetRenderTarget(0, ScreenShotSurface);
RenderFrame();
// Save the surface to a file
D3DXSaveSurfaceToFile(_OutFile, D3DXIFF_JPG, ScreenShotSurface, NULL, NULL);
You can see the call to CreateDepthStencilSurface(), which is where I was hoping I could replace pp.MultiSampleType with, e.g., D3DMULTISAMPLE_4_SAMPLES, but that didn't work.
My next thought was to create an entirely different LPDIRECT3DDEVICE9 as a D3DDEVTYPE_REF, which always supports D3DMULTISAMPLE_4_SAMPLES (regardless of the video card). However, all of my resources (meshes, textures) have been loaded into m_d3ddev, my HAL device, so I couldn't use them for rendering the scene under the REF device. Note that resources can be shared between devices under Direct3D 9Ex (Vista), but I'm working on XP. Since there are quite a lot of resources, reloading everything to render this one frame, then unloading them, is too time-inefficient for my application.
I looked at other options for antialiasing the image post-capture (e.g. a 3x3 blur filter), but they all generated pretty poor results, so I'd really like to try to get an antialiased scene right out of D3D if possible.
Any wisdom or pointers would be GREATLY appreciated...
Thanks!
Supersampling by either rendering to a larger buffer and scaling down or combining jittered buffers is probably your best bet. Combining multiple jittered buffers should give you the best quality for a given number of samples (better than the regular grid from simply rendering an equivalent number of samples at a multiple of the resolution and scaling down) but has the extra overhead of multiple rendering passes. It has the advantage of not being limited by the maximum supported size of your render target though and allows you to choose pretty much an arbitrary level of AA (though you'll have to watch out for precision issues if combining many jittered buffers).
The article "Antialiasing with Accumulation Buffer" at opengl.org describes how to modify your projection matrix for jittered sampling (OpenGL but the math is basically the same). The paper "Interleaved Sampling" by Alexander Keller and Wolfgang Heidrich talks about an extension of the technique that gives you a better sampling pattern at the expense of even more rendering passes. Sorry about not providing links - as a new user I can only post one link per answer. Google should find them for you.
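To make the jittering idea a bit more concrete, here is a sketch that generates sub-pixel offsets and converts them to normalized device coordinates. The choice of a Halton (base 2, base 3) sequence is my own substitution for illustration, not something taken from either reference above, and RadicalInverse, JitterOffset and HaltonJitterOffsets are hypothetical names. For a left-handed D3D perspective matrix, where clip-space w equals view-space z, adding dx and dy to the _31 and _32 elements should shift the image by that amount; treat that as an assumption to verify against your own matrix convention.

```cpp
#include <cassert>
#include <vector>

// Offsets in normalized device coordinates (the viewport spans 2 units).
struct JitterOffset {
    float dx, dy;
};

// Radical inverse of index in the given base: the building block of the
// Halton low-discrepancy sequence. Returns a value in [0, 1).
float RadicalInverse(unsigned index, unsigned base) {
    float result = 0.0f;
    float fraction = 1.0f / base;
    while (index > 0) {
        result += (index % base) * fraction;
        index /= base;
        fraction /= base;
    }
    return result;
}

// Generate sampleCount jitter offsets, each within half a pixel of the
// pixel center, for a render target of widthPx x heightPx pixels.
std::vector<JitterOffset> HaltonJitterOffsets(int sampleCount,
                                              int widthPx, int heightPx) {
    assert(sampleCount > 0 && widthPx > 0 && heightPx > 0);
    std::vector<JitterOffset> offsets;
    offsets.reserve(sampleCount);
    for (int k = 0; k < sampleCount; ++k) {
        // Sub-pixel position in [-0.5, 0.5) pixels on each axis.
        const float px = RadicalInverse(k + 1, 2) - 0.5f;
        const float py = RadicalInverse(k + 1, 3) - 0.5f;
        // One pixel is 2 / width NDC units wide and 2 / height tall.
        offsets.push_back({2.0f * px / widthPx, 2.0f * py / heightPx});
    }
    return offsets;
}
```

Each offset corresponds to one rendering pass; the resulting frames are then averaged.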
If you want to go the route of rendering to a larger buffer and downsampling, but don't want to be limited by the maximum allowed render target size, then you can generate a tiled image using off-center projection matrices as described here.
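A sketch of the tiled approach, under the assumption that you know the left/right/bottom/top extents of the full view frustum at the near plane: each tile gets the matching slice of those extents, which could then be passed to an off-center projection builder such as D3DXMatrixPerspectiveOffCenterLH. FrustumBounds and TileFrustum are hypothetical names.

```cpp
#include <cassert>

// Extents of a view frustum on the near plane, in view-space units.
struct FrustumBounds {
    float left, right, bottom, top;
};

// Return the sub-frustum for one tile of a columns x rows grid.
// Column 0 is the leftmost tile and row 0 is the top row of the image.
FrustumBounds TileFrustum(const FrustumBounds& full, int column, int row,
                          int columns, int rows) {
    assert(columns > 0 && rows > 0);
    assert(column >= 0 && column < columns && row >= 0 && row < rows);
    const float tileW = (full.right - full.left) / columns;
    const float tileH = (full.top - full.bottom) / rows;
    return {full.left + tileW * column,
            full.left + tileW * (column + 1),
            full.top - tileH * (row + 1),
            full.top - tileH * row};
}
```

Rendering every tile at the full render target size and stitching the results gives an image columns x rows times larger than any single target, which can then be scaled down.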
You could always render to a texture that is twice the width and height (i.e. 4x the pixel count) and then downsample it.
Admittedly you'd still get problems if the card can't create a texture 4x the size of the back buffer ...
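The scale-down step could look roughly like this on the CPU once the double-size surface has been read back. It is a plain 2x2 box filter over a tightly packed 4-channel 8-bit buffer; BoxDownsample2x is a hypothetical helper, and the buffer layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average each 2x2 block of a tightly packed 4-channel, 8-bit image into
// one output pixel. Width and height must both be even.
std::vector<std::uint8_t> BoxDownsample2x(const std::vector<std::uint8_t>& src,
                                          std::size_t width,
                                          std::size_t height) {
    constexpr std::size_t kChannels = 4;
    assert(width % 2 == 0 && height % 2 == 0);
    assert(src.size() == width * height * kChannels);
    const std::size_t outW = width / 2;
    const std::size_t outH = height / 2;
    std::vector<std::uint8_t> dst(outW * outH * kChannels);
    for (std::size_t y = 0; y < outH; ++y) {
        for (std::size_t x = 0; x < outW; ++x) {
            for (std::size_t c = 0; c < kChannels; ++c) {
                // Top-left sample of the block, then the row below it.
                const std::size_t row0 = (2 * y * width + 2 * x) * kChannels + c;
                const std::size_t row1 = row0 + width * kChannels;
                const unsigned sum = src[row0] + src[row0 + kChannels] +
                                     src[row1] + src[row1 + kChannels];
                // Add 2 before dividing so the average rounds to nearest.
                dst[(y * outW + x) * kChannels + c] =
                    static_cast<std::uint8_t>((sum + 2) / 4);
            }
        }
    }
    return dst;
}
```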
Edit: There is another way that comes to mind.
If you render the frame n times with tiny jitters applied to the view matrix, you can generate as many images as you like and then add them together (and average them) to form a very highly anti-aliased image. The bonus is that it will work on any machine that can render the image. It is, obviously, slower though. Still, 256xAA really does look good when you do this!
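Adding the jittered frames together is straightforward on the CPU. Below is a minimal sketch, assuming each frame has been read back as a packed 8-bit buffer of the same size; FrameAccumulator is a hypothetical class. Sums are kept in float, which represents 256 frames of 8-bit values (at most 65280 per channel) exactly, so precision should not be a problem at that sample count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Accumulates equally sized 8-bit frames and resolves them to their average.
class FrameAccumulator {
public:
    explicit FrameAccumulator(std::size_t valueCount)
        : sum_(valueCount, 0.0f) {}

    // Add one rendered frame (one value per channel per pixel).
    void Add(const std::vector<std::uint8_t>& frame) {
        assert(frame.size() == sum_.size());
        for (std::size_t i = 0; i < frame.size(); ++i) {
            sum_[i] += frame[i];
        }
        ++frameCount_;
    }

    // Average of all frames added so far, rounded to the nearest integer.
    std::vector<std::uint8_t> Resolve() const {
        assert(frameCount_ > 0);
        std::vector<std::uint8_t> out(sum_.size());
        for (std::size_t i = 0; i < sum_.size(); ++i) {
            out[i] = static_cast<std::uint8_t>(sum_[i] / frameCount_ + 0.5f);
        }
        return out;
    }

private:
    std::vector<float> sum_;
    unsigned frameCount_ = 0;
};
```

Usage would be one Add call per jittered render, followed by a single Resolve before saving the image.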
This article http://msdn.microsoft.com/en-us/library/bb172266(VS.85).aspx seems to imply that you can use the render state flag D3DRS_MULTISAMPLEANTIALIAS to control this. Can you create your device with antialiasing enabled but turn it off for screen rendering and on for your offscreen rendering using this render state flag?
I've not tried this myself though.
