Does using a canvas element as texture source involve CPU/GPU sync and memory copy? - webgl

I have a HTMLCanvasElement with some pixels rendered on it. Now I want to upload its data to another framebuffer (via color attachment texture).
gl.texImage2D(...otherArgs, canvasElement);
I'm wondering if this operation involves CPU/GPU sync and memory copy (from GPU memory to CPU memory and then to GPU memory again).
If it is implementation specific, my app will run in chromium (electron/edge webview).

Looking at the source code today, there are various GPU-to-GPU fast paths for canvas sources, including 2D canvases. So today (and since 2013) there is no CPU readback for Canvas2D texImage sources in Chromium.


Fast display of image using openCV

I have written an image processing application using Visual C++ forms and OpenCV on a Windows machine. Everything seems to work OK, but displaying the images is very slow - only a few fps. I would like to be able to get to 30 or so. I am currently using the standard imshow(...) followed by waitKey(1).
My question is: Is there a better (i.e. faster) way to get an image from memory to the monitor?
The Mat structure used by OpenCV is essentially a fancy header pointing to a contiguous block of unsigned char values.
Edit:
I tested my code with the VS2013 profiler and it claims that I am spending 50% of the execution time in imshow/waitkey.
I've seen several discussions on this in the OpenCV Q/A forum and they always end with "you shouldn't be using imshow except for debugging" but nobody is suggesting anything else to use, so I thought I'd try here.
guy
Without seeing what you have, here is the approach I would take to achieve what you want.
Have a dedicated thread for frame acquisition from the camera. Insert the acquired frames into a synchronized queue, that is consumed by:
Image processing thread. Takes frames from the queue, processes them into images suitable for display. It changes a synchronized output image, and notifies GUI about it.
Main (GUI) thread is only dedicated to display. When it is notified of an image update, it swaps the synchronized output image with its current working image. (To avoid copying and extra allocations, we just reuse those two image buffers.) Then it invalidates the window. In a WM_PAINT handler, it then displays the image using BitBlt.
Some notes:
Minimize allocation/deallocation of buffers. For acquisition, you could have a pre-allocated pool of buffers to cycle through.
Prepare the output images in format and size that suit display.
Keep track of the number of frames in the queue and set some upper limit. Define an algorithm for dropping excess frames, so that you don't run out of memory and don't lag too much.
If you just want to ditch the sleep in waitKey and want something simpler, have a look at this question
Instrument your code -- add timing of the crucial parts using high resolution timer. Log them, and/or keep statistics, history.

App keep crashing due to memory pressure

My app is saving and retrieving data from Parse.com. And showing images, buttons, scrollviews, etc.. (the normal stuff). Then when I got near finishing my app, it started to receive memory warnings and the app started crashing often. I checked it in the Instruments and noticed the live bytes was extremely high at some points, and I can't figure out why.
Is the app crashing because of the high live bytes? What should value of the live bytes be?
Obviously something is going on in the VM. But I have no idea what this is. What is the VM: CG raster data? And this: VM: CG Image? I am not using CGImages, only UIImages.
Is the app crashing because of the high live bytes?
Yes.
What should value of the live bytes be?
There's no fixed number. The limits change from OS version to OS version, and sometimes depend on the device and what else is going on at the moment. The right thing to do is (a) try not to use so much, and (b) heed the warnings and dispose of stuff you don't need.
Obviously something is going on in the VM. But I have no idea what this is. What is the VM: CG raster data? And this: VM: CG Image? I am not using CGImages, only UIImages.
A UIImage is just a wrapper around a CGImage.
You have too many images alive at the same time. That's the problem you have to fix.
So, how many is too many? It depends on how big they are.
Also, note that the "raster data" is the decompressed size. A 5 Mpix RGBA image at 8 bits per channel (4 bytes per pixel) takes 20 MB of RAM for its raster data, whether the file is 8 MB or 8 KB.
I still feel the number is too high though, or is 30-40 MB an okay number for handling 3-6 full-screen-sized images at a time? This is when tested on a 4-year-old iPhone 4 on iOS 7, if that matters.
On an iPhone 4, "full-screen" means 640x960 pixels. RGBA at 8 bits per channel means 4 bytes per pixel. So, with 6 such images, that's 640*960*4*6 ≈ 14MB. So, that's the absolute minimum storage you should expect if you've loaded and drawn 6 full-screen images.
So, why do you actually see more than twice that?
Well, as Images and Memory Management in the class reference says:
In low-memory situations, image data may be purged from a UIImage object to free up memory on the system. This purging behavior affects only the image data stored internally by the UIImage object and not the object itself. When you attempt to draw an image whose data has been purged, the image object automatically reloads the data from its original file. This extra load step, however, may incur a small performance penalty.
So think of that 14MB as basically a cache that iOS uses to speed things up, in case you want to draw the images again. If you run a little low on memory, it'll purge the cache automatically, so you don't have to worry about it.
So, that leaves you with 16-24MB, which is presumably used by the buffers of your UI widgets and layers and by the compositor behind the scenes. That's a bit more than the theoretical minimum of 14MB, but not horribly so.
If you want to reduce memory usage further, what you probably need to do is not draw all 6 images. If they're full-screen, there's no way the user can see more than 1 or 2 at a time. So, you could load and render them on demand instead of preloading them (or, if you can predict which one will usually be needed next, preload 1 of them instead of all of them), and destroy them when they're no longer visible. Since you'd then only have 2 images instead of 6, that should drop your memory usage from 16-24MB + a 14MB cache to 5-9MB + a 5MB cache. This obviously means a bit more CPU—it probably won't noticeably affect responsiveness or battery drain, but you'd want to test that. And, more importantly, it will definitely make your code more complicated.
Obviously, if it's appropriate for your images, you could also do things like using non-Retina images (which will cut memory by 75%) or dropping color depth from RGBA-8 to ARGB-1555 (50%), but most images don't look as good that way (which is why we have high-color Retina displays).

what is byte alignment (cache line alignment) for Core Animation? Why it matters?

I am loading images on a scroll view in a non-lazy way, so that the stutter behavior is not seen. The code works and the FPS is close to 60.
BUT, I do not understand what is byte alignment (or cache line alignment) for Core Animation?
As mentioned here and here this is an important thing to do. However, I noticed as long as I do the steps mentioned here, byte-alignment or not does not really matter.
Anyone knows what exactly it is?
When the CPU copies something from memory into the CPU cache, it does so in chunks. Those chunks are cache lines, and they are of a fixed size. When data is stored in the CPU cache, it's stored as lines. Making your data fit the cache line size of your target architecture can be important for performance because it affects data locality.
ARMv7 uses 32 byte cache lines (like PowerPC). The A9 processor uses 64 byte cache lines. Because of this, you will see the most benefit by rendering into a rectangle that is on a 64 byte boundary and has dimensions that are a multiple of 64 bytes.
On the other hand, the graphics accelerator does prefer working with image data that is a square power of two in dimensions. This doesn't have anything to do with cache lines or byte alignment. This is another thing that can have a large impact on performance.
In the specific cases you linked to, the Apple API being called (Core Animation, QT, etc.) is performing these kinds of optimizations on the caller's behalf. In the case of Core Animation, the caller is giving it data that it then optimizes for the hardware. According to what Path wrote in the documentation you linked to, they suggest giving Core Animation data it will not have to optimize (in this case, optimizing by making a copy) to avoid the optimization step.
So if your images are some multiple of 64 bytes in dimension and each dimension is a square power of two, you're good to go ;) Rendering that image into an area of the screen that is on a 64 byte boundary is also good, but is not always realistic for anything but a full screen application like a game.
That said, use Instruments. Build your application, profile it with Instruments and a representative workload (UIAutomation is great for this). If you see scrolling performance problems Instruments will give you everything you need to zero in on where the bottleneck is.
I can honestly say that all of the scrolling performance problems I have seen have not involved byte alignment or cache lines. Instead it's been other forms of Core Animation abuse (not using rasterization and caching), or doing too much other work on the main thread, etc.
The guidance on the effect of byte alignment on performance is mentioned in the Quartz 2D Programming Guide
This is the format that Core Animation is optimizing images to when it does a copy. If you already have your data in the format Core Animation wants, it will skip the potentially expensive optimization step.
If you want to know more about how the iOS graphics pipeline works, see:
WWDC 2012 Session 238 "iOS App Performance: Graphics and Animations"
WWDC 2012 Session 235 "iOS App Performance: Responsiveness"
WWDC 2011 Session 121 "Understanding UIKit Rendering"
iOS Device Compatibility Reference: OpenGL ES Graphics

AS3 AIR iOS - How to control when BitmapData is cached/uncached from the GPU?

This question's kind of a 4-parter:
Is it true that all BitmapData is immediately cached to the GPU as soon as it's created (even if it's never applied to a Bitmap or added to stage?)
Does this still happen if the GPU texture buffer is already full? Bonus points: if so, what's the preferential swap method the GPU chooses to select which textures to remove from memory?
If (1), then does setting the width/height of any BitmapData uncache it and/or does replacing its pixels therefore upload the new pixels to the same memory address on the GPU? Bonus: What if the size changes?
To bring this all together, would a hybrid class that extends BitmapData but stores its actual data in a ByteArray be able to use setPixels/getPixels on itself to control upload/download from the GPU as necessary, to buffer a large number of bitmaps? Bonus: Would speed improve for actually placing them in Bitmaps if the instances of this class were static?
Here are some answers
No. In AIR, you manually upload bitmaps to the GPU and have control over WHEN to do it.
As far as I've seen, if the buffer is full, you simply get an error - the GPU cannot make a choice about what to do. Removing a random texture wouldn't be nice if it's important to you, right? :)
You can check, for example, Starling and how it uploads textures to the GPU. Once you force it to do so, it doesn't care what you do with the bitmap. It's like taking a photo of an object so that you can show the photo instead of describing the object with words. It won't matter if you later change the object; the photo will stay the same.
Simplified answer: no. Again - it's best to check out how textures are created and how you upload stuff to GPU.

in opengl es 2 how do I free up a texture (ios hard crash)

I have an iOS OpenGL ES 2.0 app that needs to use a TON of large textures. Ideally 4096x4096. I have a struct array that contains all the info about the textures, and as I need each one I glGenTextures a new texture id and load the image there, free the UIImage, etc. That all works great.
My app uses a bunch of textures for UI, image processing, etc. About 4-5 of the 15 I'm using for all of that are 4k x 4k. Rest are smaller. And then these load-as-needed textures are also 4k.
On loading about the 4th-5th of those the app crashes HARD. No console or debug. Just quits to the springboard in the middle of trying to load the next texture.
I don't have a memory leak - I ran instruments. I'm using ARC. I can post the crash report from the Organizer but it doesn't have much info. Just that my app's rpages was 170504.
I could post the image load code but its the same code I've used on all my apps for years. The new thing is pushing the system that hard and trying to load that many large textures.
Q1: Anyone have experience with using a ton of large textures?
So I've resigned myself to doing the preview-res stuff at 1024x1024 and the final-res stuff at 4096. The 1k images are now loading as needed and staying loaded. The 4k images will all be loaded one at a time into the same texture to be used, and then it moves on to the next.
I wrote a preview parameter into my image loader, and when it's set it shrinks the image to fit in 1024 during the load. Now instead of crashing on the 4th or 5th, I can add textures 'all day'. My GUESS is that I could do 16x as many as before. But I only need like 20-30 at a time. (only!) So far I've tried 20 with no memory warnings or crashes.
However, if the app keeps running, because my textures are loaded at unique texture ids, at some point I'll hit the spot where I need to unload one that's no longer needed in order to load the next one. This is probably very simple, but...
Q2: How do I free up a texture that's at an texture id when I no longer need it?
Q3: Will a memory warning tell me that I need to free up an open gl texture?
Q4: Aren't textures loaded on the PVR chip? Are they, or how are they even taking up the phone's memory?
Thanks!
Removing Texture:
You have to issue this GL call from the thread that owns your GL context (typically the main thread):
glDeleteTextures(1, &_texture);
A memory warning is a general call to the application. It will not give you specific information. It is always better to remove unwanted textures from memory once they are no longer needed. E.g., we usually remove the textures used in the menu when the user moves to the in-game screens, and reload them when the user navigates back. This is a much easier way to manage memory than waiting for the system's memory warning.
When you load a PNG image, the data is decompressed and stored as a raw array of per-pixel colors. A 1K texture will use 4 MB regardless of the content/colors in the image. PVR (PVRTC) is a compressed texture format that the PowerVR GPU decompresses in hardware, in real time, as the texture is used, so for PVR the file size you see is close to the memory it uses.
