I have a question regarding OpenGL ES context size.
I have two OpenGL contexts running on iPad retina using GLKView. The view
is configured with no depth, stencil, or multisampling buffers, only a 32-bit color framebuffer.
A single buffer takes 12 MB (2048 * 1536 * 4 bytes). Profiling my application reveals three IOKit allocations of 12 MB each, plus one 12 MB allocation from Core Animation. I suspect they are all related. My guess is that Core Animation caches the resulting framebuffer, which would explain the one 12 MB coming from it.

Also, I'm calling deleteDrawable on whichever GLKView is hidden, so I would have expected a single 12 MB buffer from IOKit and maybe another one from Core Animation. Does anyone have experience with OpenGL memory consumption and how to reduce it, and why do I see three IOKit allocations although only a single GLKView is visible at any given time?
I believe that iOS devices use triple buffering internally, which would explain the extra allocations. This was mentioned by John Carmack in an email printed here.
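For reference, the drawable-freeing pattern the question describes might look like the sketch below (glView is an assumed GLKView property; this is illustrative, not the asker's actual code):

#import <GLKit/GLKit.h>

// Hypothetical example: release a hidden GLKView's framebuffer storage.
- (void)hideGLView
{
    self.glView.hidden = YES;
    [EAGLContext setCurrentContext:self.glView.context];
    [self.glView deleteDrawable]; // frees the view's drawable (~12 MB at retina iPad size)
}

- (void)showGLView
{
    self.glView.hidden = NO;
    // GLKView recreates its drawable automatically on the next draw cycle.
    [self.glView setNeedsDisplay];
}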
I am making an iOS app which renders a large number of textures that I stream from disk on the fly, using an NSCache as an LRU cache of textures. There is one screen with a 3D model and one screen with a full-screen detail view of a texture, where the texture can be changed by swiping: a very simple carousel. The app never takes more than 250 MiB of RAM on 1 GiB devices, and the texture cache works well.
For the full-screen view I have a cache of VBOs based on the screen resolution and texture resolution (different texture coordinates). I never delete these VBOs and always check that a VBO is still valid (glIsBuffer()). The screens are separate UIViewControllers and I use the same EAGLContext in both of them, with no context sharing. This is OK as it is all on the same thread.
All of this is OpenGL ES 2.0 and everything works well. I can switch between the 3D/2D screens and change the textures. The textures are created and deleted on the fly as needed, based on the available memory.
BUT sometimes I get a random crash when rendering a full-screen quad by calling:
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
This can happen when I receive a lot of memory warnings in a row. Sometimes I get hundreds of memory warnings in a few seconds and the app keeps working, but sometimes it crashes while swiping to a new full-screen texture quad. This happens even for textures that were already rendered in full screen. It never crashes on the 3D model, which uses the same textures.
The crash report is always on the glDrawArrays call (in my code), with an EXC_BAD_ACCESS (KERN_INVALID_ADDRESS at 0x00000018) exception. The last call in the stack trace is always gleRunVertexSubmitARM. This happens on various iPads and iPhones.
It looks like system memory pressure corrupts some GL memory, but I do not know when, where, or why.
I have also tried switching from VBOs to the old way of keeping vertex data on the heap, where I first check that the vertex data is not NULL before calling glDrawArrays. The result is the same: random crashes in low-memory situations.
Any ideas what could be wrong? The address 0x00000018 in the EXC_BAD_ACCESS is obviously invalid (near NULL), but I do not know what it was supposed to point to. Could a deallocated texture or shader cause an EXC_BAD_ACCESS in glDrawArrays?
After several days of intensive debugging I finally figured it out. The problem was the NSCache storing OpenGL textures. Under memory pressure, NSCache starts evicting items to free memory. The problem is that in that situation it does so on its own background thread (com.apple.root.utility-qos). There is no GL context current on that thread (and no sharegroup with the main context), so the texture name is not valid there, the texture cannot be deleted, and the GL memory leaks. After a series of memory warnings there were a lot of leaked GL textures, memory filled up, and the app finally crashed.
TL;DR:
Do not use NSCache for caching OpenGL objects because it will leak them after a memory warning.
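One way to keep the cache but make eviction safe, sketched under the assumption of a single EAGLContext owned by the main thread (the class name and properties here are invented for illustration): wrap each texture name in an object whose deallocation hops back to the context's thread.

#import <Foundation/Foundation.h>
#import <OpenGLES/EAGL.h>
#import <OpenGLES/ES2/gl.h>

// Hypothetical wrapper: NSCache may evict (and release) entries on a
// background thread, so defer the GL delete to the main thread, where
// the context that created the texture is current.
@interface TextureHandle : NSObject
@property (nonatomic) GLuint name;
@property (nonatomic, strong) EAGLContext *context;
@end

@implementation TextureHandle
- (void)dealloc
{
    GLuint name = _name;
    EAGLContext *context = _context;
    dispatch_async(dispatch_get_main_queue(), ^{
        [EAGLContext setCurrentContext:context];
        glDeleteTextures(1, &name); // now a valid name for this context
    });
}
@end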
My app saves and retrieves data from Parse.com, and shows images, buttons, scroll views, etc. (the normal stuff). When I got near to finishing the app, it started receiving memory warnings and crashing often. I checked it in Instruments and noticed that the live bytes figure was extremely high at some points, and I can't figure out why.
Is the app crashing because of the high live bytes? What should the value of the live bytes be?
Obviously something is going on in the VM. But I have no idea what it is. What is VM: CG raster data? And this: VM: CG Image? I am not using CGImages, only UIImages.
Is the app crashing because of the high live bytes?
Yes.
What should the value of the live bytes be?
There's no fixed number. The limits change from OS version to OS version, and sometimes depend on the device and on what else is going on at the moment. The right thing to do is (a) try not to use so much, and (b) heed the warnings and dispose of stuff you don't need.
Obviously something is going on in the VM. But I have no idea what this is. What is VM: CG raster data? And this: VM: CG Image? I am not using CGImages, only UIImages.
A UIImage is just a wrapper around a CGImage.
You have too many images alive at the same time. That's the problem you have to fix.
So, how many is too many? It depends on how big they are.
Also, note that the "raster data" is the decompressed size. A 5-megapixel RGBA image at 8 bits per channel takes 20 MB of RAM for its raster data, whether the file is 8 MB or 8 KB.
I still feel the number is too high, though. Or is 30-40 MB an okay number when handling 3-6 full-screen-sized images at a time? This was tested on a 4-year-old iPhone 4 running iOS 7, if that matters.
On an iPhone 4, "full-screen" means 640x960 pixels. RGBA at 8 bits per channel means 4 bytes per pixel. So, with 6 such images, that's 640 * 960 * 4 * 6 ≈ 14 MB. That's the absolute minimum storage you should expect if you've loaded and drawn 6 full-screen images.
So, why do you actually see more than twice that?
Well, as the Images and Memory Management section of the UIImage class reference says:
In low-memory situations, image data may be purged from a UIImage object to free up memory on the system. This purging behavior affects only the image data stored internally by the UIImage object and not the object itself. When you attempt to draw an image whose data has been purged, the image object automatically reloads the data from its original file. This extra load step, however, may incur a small performance penalty.
So think of that 14MB as basically a cache that iOS uses to speed things up, in case you want to draw the images again. If you run a little low on memory, it'll purge the cache automatically, so you don't have to worry about it.
So, that leaves you with 16-24MB, which is presumably used by the buffers of your UI widgets and layers and by the compositor behind the scenes. That's a bit more than the theoretical minimum of 14MB, but not horribly so.
If you want to reduce memory usage further, what you probably need to do is not draw all 6 images. If they're full-screen, there's no way the user can see more than 1 or 2 at a time. So, you could load and render them on demand instead of preloading them (or, if you can predict which one will usually be needed next, preload 1 of them instead of all of them), and destroy them when they're no longer visible. Since you'd then only have 2 images instead of 6, that should drop your memory usage from 16-24MB + a 14MB cache to 5-9MB + a 5MB cache. This obviously means a bit more CPU—it probably won't noticeably affect responsiveness or battery drain, but you'd want to test that. And, more importantly, it will definitely make your code more complicated.
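A rough sketch of that on-demand strategy for a paged scroll view (imageViews and loadImageForPage: are invented placeholders):

// Hypothetical lazy loading: keep only the current page and its
// immediate neighbours decoded; release everything else.
- (void)scrollViewDidScroll:(UIScrollView *)scrollView
{
    NSInteger page = (NSInteger)(scrollView.contentOffset.x / scrollView.bounds.size.width);
    [self.imageViews enumerateObjectsUsingBlock:^(UIImageView *iv, NSUInteger i, BOOL *stop) {
        if (labs((NSInteger)i - page) <= 1) {
            if (iv.image == nil) {
                iv.image = [self loadImageForPage:i]; // placeholder loader
            }
        } else {
            iv.image = nil; // lets the decompressed raster data go away
        }
    }];
}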
Obviously, if it's appropriate for your images, you could also do things like using non-Retina images (which will cut memory by 75%) or dropping color depth from RGBA-8 to ARGB-1555 (50%), but most images don't look as good that way (which is why we have high-color Retina displays).
I am loading images into a scroll view in a non-lazy way, so that no stutter is seen. The code works and the FPS is close to 60.
BUT I do not understand what byte alignment (or cache line alignment) is for Core Animation.
As mentioned here and here, this is an important thing to do. However, I noticed that as long as I do the steps mentioned here, byte alignment or not does not really seem to matter.
Does anyone know what exactly it is?
When the CPU copies something from memory into the CPU cache, it does so in chunks. Those chunks are cache lines, and they are of a fixed size. When data is stored in the CPU cache, it is stored as lines. Making your data fit the cache line size of your target architecture can be important for performance because it affects data locality.
ARMv7 uses 32-byte cache lines (like PowerPC). The A9 processor uses 64-byte cache lines. Because of this, you will see the most benefit when rendering into a rectangle that starts on a 64-byte boundary and whose dimensions are a multiple of 64 bytes.
On the other hand, the graphics accelerator prefers working with image data whose dimensions are a power of two. This has nothing to do with cache lines or byte alignment; it is another thing that can have a large impact on performance.
In the specific cases you linked to, the Apple API being called (Core Animation, QT, etc.) is performing these kinds of optimizations on the caller's behalf. In the case of Core Animation, the caller is giving it data that it then optimizes for the hardware. According to what Path wrote in the documentation you linked to, they suggest giving Core Animation data it will not have to optimize (where optimizing means making a converted copy), to avoid that optimization step.
So if your image rows are a multiple of 64 bytes and each dimension is a power of two, you're good to go ;) Rendering that image into an area of the screen that starts on a 64-byte boundary is also good, but that is not always realistic for anything but a full-screen application like a game.
That said, use Instruments. Build your application and profile it with Instruments under a representative workload (UIAutomation is great for this). If you see scrolling performance problems, Instruments will give you everything you need to zero in on the bottleneck.
I can honestly say that all of the scrolling performance problems I have seen have not involved byte alignment or cache lines. Instead it's been other forms of Core Animation abuse (not using rasterization and caching), or doing too much other work on the main thread, etc.
The guidance on the effect of byte alignment on performance is in the Quartz 2D Programming Guide.

That is the format Core Animation optimizes images into when it makes a copy. If you already have your data in the format Core Animation wants, it will skip the potentially expensive optimization step.
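As an illustration, the format usually cited as Core Animation friendly on iOS is BGRA with premultiplied alpha in host byte order, with rows padded for alignment. A sketch (treat the 64-byte row padding as an assumption and verify against the guides above):

#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper: redraw an image into a bitmap Core Animation
// should be able to use without an extra optimize-and-copy pass.
static CGImageRef CreateCompatibleImage(CGImageRef image)
{
    size_t width  = CGImageGetWidth(image);
    size_t height = CGImageGetHeight(image);
    size_t bytesPerRow = ((width * 4) + 63) & ~(size_t)63; // pad rows to 64 bytes

    CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
    CGContextRef ctx = CGBitmapContextCreate(NULL, width, height, 8, bytesPerRow,
        space, kCGImageAlphaPremultipliedFirst | kCGBitmapByteOrder32Host);
    CGColorSpaceRelease(space);

    CGContextDrawImage(ctx, CGRectMake(0, 0, width, height), image);
    CGImageRef result = CGBitmapContextCreateImage(ctx);
    CGContextRelease(ctx);
    return result; // caller is responsible for CGImageRelease
}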
If you want to know more about how the iOS graphics pipeline works, see:
WWDC 2012 Session 238 "iOS App Performance: Graphics and Animations"
WWDC 2012 Session 235 "iOS App Performance: Responsiveness"
WWDC 2011 Session 121 "Understanding UIKit Rendering"
iOS Device Compatibility Reference: OpenGL ES Graphics
I have an iOS OpenGL ES 2.0 app that needs to use a TON of large textures, ideally 4096x4096. I have a struct array that contains all the info about each texture, and as I need each one I glGenTextures a new texture ID and load the image into it, free the UIImage, etc. That all works great.
My app uses a bunch of textures for UI, image processing, etc. About 4-5 of the 15 I'm using for all of that are 4k x 4k; the rest are smaller. And then these load-as-needed textures are also 4k.
On loading about the 4th or 5th of those, the app crashes HARD. No console output or debug break; it just quits to the Springboard in the middle of trying to load the next texture.
I don't have a memory leak; I ran Instruments. I'm using ARC. I can post the crash report from the Organizer, but it doesn't have much info, just that my app's rpages was 170504.
I could post the image-loading code, but it's the same code I've used in all my apps for years. The new thing is pushing the system this hard and trying to load this many large textures.
Q1: Anyone have experience with using a ton of large textures?
So I've resigned myself to doing preview-resolution work at 1024x1024 and final-resolution work at 4096. The 1k images are now loaded as needed and stay loaded. The 4k images will all be loaded one at a time into the same texture, used, and then replaced by the next.
I wrote a preview parameter into my image loader, and when it is set the loader shrinks the image to fit within 1024 pixels during the load. Now, instead of crashing on the 4th or 5th texture, I can add textures 'all day'. My GUESS is that I could do 16x as many as before, but I only need 20-30 at a time. (Only!) So far I've tried 20 with no memory warnings or crashes.
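For reference, a minimal sketch of that kind of preview parameter (the method name is made up; the 1024 cap matches what's described above):

// Hypothetical loader: when preview is set, shrink the image so its
// longest side fits in 1024 pixels before it becomes a texture.
- (UIImage *)loadImageAtPath:(NSString *)path preview:(BOOL)preview
{
    UIImage *image = [UIImage imageWithContentsOfFile:path];
    if (!preview || (image.size.width <= 1024 && image.size.height <= 1024)) {
        return image;
    }
    CGFloat scale = 1024.0 / MAX(image.size.width, image.size.height);
    CGSize target = CGSizeMake(floor(image.size.width * scale),
                               floor(image.size.height * scale));
    UIGraphicsBeginImageContextWithOptions(target, NO, 1.0); // 1.0 = exact pixel size
    [image drawInRect:(CGRect){CGPointZero, target}];
    UIImage *shrunk = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();
    return shrunk;
}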
However, if the app keeps running, because each of my textures is loaded at a unique texture ID, at some point I will hit the spot where I need to unload one that's no longer needed in order to load the next one. This is probably very simple, but...
Q2: How do I free a texture at a given texture ID when I no longer need it?
Q3: Will a memory warning tell me that I need to free up an OpenGL texture?
Q4: Aren't textures loaded on the PVR chip? If so, how are they even taking up the phone's memory?
Thanks!
Removing a texture:
You have to make this GL call from the thread that owns your GL context (typically the main thread):
glDeleteTextures(1, &_texture);
A memory warning is a general signal to the application; it will not give you specific information. It is always better to remove unwanted textures from memory as soon as they are no longer needed. E.g.: we usually remove the textures used in the menu when the user moves to the in-game screens, and reload them when the user navigates back. This is a much easier way to manage memory than waiting for the system to send a memory warning.
When you load a PNG image, the data is decompressed and stored raw, as an array of colors per pixel; a 1K x 1K texture will use 4 MB regardless of the content or colors in the image. PVRTC, on the other hand, is a compressed texture format that the PowerVR hardware decompresses on the fly as the GPU samples it, so a PVRTC texture occupies roughly its file size in memory.
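A hypothetical version of that menu example, tying texture lifetime to the view controller's visibility (loadTextureNamed: and _menuTexture are invented names):

// Free screen-specific textures when the screen goes away, reload
// them when it comes back, instead of waiting for memory warnings.
- (void)viewWillAppear:(BOOL)animated
{
    [super viewWillAppear:animated];
    if (_menuTexture == 0) {
        _menuTexture = [self loadTextureNamed:@"menu_atlas"]; // placeholder
    }
}

- (void)viewDidDisappear:(BOOL)animated
{
    [super viewDidDisappear:animated];
    if (_menuTexture != 0) {
        glDeleteTextures(1, &_menuTexture);
        _menuTexture = 0; // 0 is never a valid texture name
    }
}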
I have a texture-heavy OpenGL game that I'd like to tune based on how much RAM the device has. The highest resolution textures I have work fine on an iPhone 4 or iPad2, but earlier devices crash in the middle of loading the textures. I have low-res versions of these textures, but I need to know when to use them.
My current tactic is to detect specific older devices (the 3GS has a low-res screen; the original iPad has no camera), and then only load the hi-res textures for iPad 2 and above and iPhone 4 and above; I suppose I'll need to do something for the iPod touch too. But I'd much rather use feature detection than hard-coded device models, since model detection is fragile against future changes to APIs and hardware.
Another possibility I'm considering is to load the hi-res textures first, then drop and replace them with lo-res versions the moment I get a low-memory warning. However, I'm not sure I'll get the chance to respond; I've noticed that the app often dies before any notification appears on the debug console.
How do I detect whether the device I'm running on has insufficient RAM to load hi-res versions of my textures?
Taking a step back, is there some other adaptive technique I can use that's specific to OpenGL texture memory?
Notes:
I've searched on and off SO for answers related to available RAM detection, but they all basically advise profiling memory usage and eliminating waste (minimising the lifetime of temporaries, and all that guff). I've done as much of that as I can, and there is no way I am going to squeeze the hi-res textures into the older devices.
PVRTC isn't an option. The textures contain data to be used by fragment shaders and must be stored in a lossless format.
To get the total (maximum) physical RAM of the device use [NSProcessInfo processInfo].physicalMemory.
See Documentation.
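For example, choosing a texture tier from total RAM might look like this (the 512 MB threshold is an arbitrary example, not a recommendation):

// Pick hi-res textures only on devices with more than 512 MB of RAM.
uint64_t ram = [NSProcessInfo processInfo].physicalMemory;
BOOL useHiResTextures = (ram > 512ull * 1024 * 1024);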
Total physical RAM is available via sysctl(), as documented in this blog post and implemented as a nice clean API here (see the implementation of totalMemory in the corresponding .m file).
I've lifted the blog's code for convenience and posterity:
#include <sys/sysctl.h>

// Query total physical RAM via sysctl(). Note that HW_PHYSMEM reports
// at most a 32-bit value; HW_MEMSIZE (read into a uint64_t) is the
// 64-bit-safe variant.
size_t phys_mem(void)
{
    int mib[] = { CTL_HW, HW_PHYSMEM };
    size_t mem = 0;
    size_t len = sizeof(mem);
    if (sysctl(mib, 2, &mem, &len, NULL, 0) != 0) {
        return 0; // query failed
    }
    return mem;
}
I don't know if Apple will approve an app that uses sysctl() in this manner. It is documented, but only for Mac OS X.
The most important thing you need to know for memory management in this case is whether to use high- or low-res textures. The simplest check I use is this:
CGFloat scale = [[UIScreen mainScreen] scale];
if ((scale > 1.0) || (self.view.frame.size.width > 320)) {
    highRes = YES;
}
This works for all devices so far and should be future-proof: newer devices will use the high-res assets.
You might also calculate the aspect ratio right there (it helps later with iPad vs iPhone):
aspect = self.view.frame.size.width / self.view.frame.size.height;
Don't load the hi-res textures first; it kills your app's load time. On my 3G, most of the startup is spent loading the (even low-res) textures. Just run this test right at the beginning and don't touch the hi-res stuff.
On older devices the program will die without warning due to big textures; it may have something to do with the debugger not being able to trap the video memory consumption and dying itself.
For further optimization, consider tinting your mipmap levels to check the smallest texture size that's actually being used (only relevant if you're using 3D objects).
Forget the video RAM size issue: memory is shared, so you're competing for system memory. On older devices there was a hard limit on how much you could use, but it's still system memory.
As for memory management, there are many ways to do it. The simplest is to mark which textures are loaded and which are needed; when a memory warning comes, dump the textures that are loaded but not needed...
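A bare-bones sketch of that bookkeeping (every property name here is invented, and thread safety is ignored):

// Hypothetical response to a memory warning: delete every texture
// that is loaded but not in the currently-needed set.
- (void)didReceiveMemoryWarning
{
    [super didReceiveMemoryWarning];
    NSMutableSet *unneeded = [self.loadedTextureKeys mutableCopy];
    [unneeded minusSet:self.neededTextureKeys];
    for (NSString *key in unneeded) {
        GLuint name = [self.textureNames[key] unsignedIntValue];
        glDeleteTextures(1, &name);
        [self.textureNames removeObjectForKey:key];
        [self.loadedTextureKeys removeObject:key];
    }
}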
As far as I know, the most important things one can do are:
Implement - (void)didReceiveMemoryWarning and respond when iOS sends level 1 and 2 warnings.
Profile the code in Instruments, trying to find leaks and more memory-efficient implementations.
Detect device types and maybe use that info.
Use some form of texture compression like PVRTC to save space.
I think you are doing most of them. The thing is, one does not even know accurately how much RAM an iOS device has; Apple does not publish tech specs for iOS devices.
Also, you cannot assume that you will only get a memory warning after, say, 100 MB of consumption. The warnings iOS gives vary depending on the current state of the device, what other apps are running, and how much memory they are consuming. So it gets tricky.
I can suggest two must-read sections: Best Practices for Working with Texture Data and Tuning Your OpenGL ES Application.