gpus_ReturnGuiltyForHardwareRestart Crash in [EAGLContext presentRenderbuffer] - ios

I'm getting a lot of crashes in EAGLContext presentRenderbuffer on iOS 11, but only on iPhone 6/6+ and older.
As per this post, I think we've already ruled out VBO-related problems by rewriting everything to not use VBO/VAOs, but the crash wasn't fixed by that.
There are a few other questions on SO about this but no solution -- has anyone else been seeing the uptick in this crash and been able to resolve it?
TL;DR:
Here is what we know so far:
The crash is specific to iOS 11 on the iPhone 5S/6/6+. It doesn’t occur on the 6S and up.
The core of the OpenGL stack returns gpus_ReturnGuiltyForHardwareRestart
It occurs when we attempt to invoke [EAGLContext presentRenderbuffer] from a CAEAGLLayer
We don’t have a repro.
What we have tried so far:
Remove any reference to VBO/VAO in our rendering stack. Didn’t help.
We have tried to reproduce it with a wide range of drawing scenarios (rotation, resize, background/foreground). No luck.
As far as we can tell, nothing in our application logic differs between the iPhone 6 family and the iPhone 6S family.
Some clues (that could be relevant but not necessarily):
We know that when presentRenderbuffer is invoked off the main thread while CATransactions are occurring on the main thread, the crash rate goes up.
When presentRenderbuffer is invoked on the main thread (along with the whole drawing pipeline), the crash rate goes down slightly but not drastically.
A substantial chunk (~20%) of the crashes occur when the layer goes off screen and/or is removed from the view hierarchy.
Here is the stack trace:
0 libGPUSupportMercury.dylib gpus_ReturnGuiltyForHardwareRestart
1 AGXGLDriver gldUpdateDispatch
2 libGPUSupportMercury.dylib gpusSubmitDataBuffers
3 AGXGLDriver gldUpdateDispatch
4 GLEngine gliPresentViewES_Exec
5 OpenGLES -[EAGLContext presentRenderbuffer:]

In my experience, this sort of crash shows up in these cases:
Calling the OpenGL API while the application is in the UIApplicationStateBackground state.
Using objects (textures, VBOs, etc.) that were created in an OpenGL context with a different sharegroup. This can happen if you don't call [EAGLContext setCurrentContext:..] before rendering or doing other work with OpenGL objects.
Invalid geometry. For example, this can happen if you allocate an index buffer larger than you need, fill only part of it with values, and then render using the size you allocated. Sometimes this works (the tail of the buffer is filled with 0, and you don't see any visual glitches). Sometimes it crashes (when the tail of the buffer is filled with junk that references a vertex out of bounds).
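The invalid-geometry case can be caught in debug builds by checking the indices before the draw call. The following is a minimal sketch in plain C, assuming 16-bit indices (as used with GL_UNSIGNED_SHORT); indices_in_bounds is a hypothetical helper, not part of OpenGL ES.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical debug helper, not an OpenGL ES call.
 * Returns true when the first index_count entries of indices all refer
 * to a vertex in the range [0, vertex_count). */
bool indices_in_bounds(const uint16_t *indices, size_t index_count,
                       size_t vertex_count)
{
    assert(indices != NULL || index_count == 0);
    for (size_t i = 0; i < index_count; ++i) {
        if (indices[i] >= vertex_count) {
            return false;  /* this index would read past the vertex data */
        }
    }
    return true;
}
```

Passing the allocated size instead of the filled size as index_count is exactly the mistake this check would flag, since the unfilled tail may hold arbitrary values.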
Hope this helps in some way.
P.S. Could you share some more information about your application? I work on an application that renders vector maps on iOS and haven't had any trouble with iOS 11 so far. The rendering pipeline is pretty simple: CADisplayLink calls our callback on the main thread when we can render the next frame. Each view with an OpenGL scene can have several background contexts to load resources in the background (of course they share the same sharegroup with the main context).

Related

Random crash when calling glDrawArrays

I am making an iOS app which can render a large number of textures which I stream from disk on the fly. I use an NSCache as an LRU cache of textures. There is one screen with a 3D model and one screen with a full screen detail of a texture where this texture can be changed by swiping. Kind of a very simple carousel. The app never takes more than 250MiB of RAM on 1GiB devices, and the texture cache works well.
For the fullscreen view I have a cache of VBOs based on the screen resolution and texture resolution (different texture coordinates). I never delete these VBOs and always check if the VBO is OK (glIsBuffer()). The screens are separate UIViewControllers and I use the same EAGLContext in both of them, no context sharing. This is OK as it is on the same thread.
All this is OpenGL ES 2.0 and everything works well. I can switch between the 3D/2D screens and change the textures. The textures are created/deleted on the fly as needed based on the available memory.
BUT sometimes I get a random crash when rendering a full screen quad when calling:
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
This can happen when I receive a lot of memory warnings in a row. Sometimes I can get hundreds of memory warnings in a few seconds and the app works OK, but sometimes it will crash while swiping to a new full screen texture quad. This happens even for textures that were already rendered in full screen. It never crashes on the 3D model which uses the same textures.
The crash report is always on the glDrawArrays call (in my code) with an EXC_BAD_ACCESS KERN_INVALID_ADDRESS at 0x00000018 exception. The last call in the stack trace is always gleRunVertexSubmitARM. This happens on various iPads and iPhones.
It looks like a system memory pressure corrupts some GL memory but I do not know when, where and why.
I have also tried switching from VBO to the old way of having vertex data on the heap, where I first check if the vertex data is not NULL before calling glDrawArrays. The result is the same, random crashes in low memory situations.
Any ideas what could be wrong? The address 0x00000018 in the EXC_BAD_ACCESS is really bad but I do not know whose address it should be. Could a de-allocated texture or shader cause an EXC_BAD_ACCESS in glDrawArrays?
After several days of intensive debugging I finally figured it out. The problem was the NSCache storing OpenGL textures. Under memory pressure, NSCache starts evicting items to free memory. The problem is that it does so on its own background thread (com.apple.root.utility-qos), where there is no current GL context (and no sharegroup shared with the main context), so the texture name is not valid there, the texture cannot be deleted, and its GL memory leaks. After several memory warnings there were a lot of leaked GL textures, memory filled up, and the app finally crashed.
TL;DR:
Do not use NSCache for caching OpenGL objects because it will leak them after a memory warning.
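One way to keep a cache that may evict on an arbitrary thread is to never delete GL objects at eviction time, and instead queue the texture names for the thread that owns the context. The following is a minimal sketch in plain C, not the code from this answer: texture_retire, texture_take_retired, TextureName and MAX_PENDING are all hypothetical names, and the actual glDeleteTextures call is left to the caller.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PENDING 256            /* hypothetical queue capacity */

typedef unsigned int TextureName;  /* stands in for GLuint */

static TextureName pending[MAX_PENDING];
static size_t pending_count = 0;
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;

/* Safe to call from any thread, e.g. when a cache evicts an entry.
 * Makes no GL calls; it only records the name.
 * Returns 0 on success, -1 if the queue is full. */
int texture_retire(TextureName name)
{
    int result = -1;
    pthread_mutex_lock(&pending_lock);
    if (pending_count < MAX_PENDING) {
        pending[pending_count++] = name;
        result = 0;
    }
    pthread_mutex_unlock(&pending_lock);
    return result;
}

/* Call on the thread where the GL context is current, e.g. once per frame.
 * Moves up to capacity queued names into out and returns how many were
 * moved; the caller would then pass them to glDeleteTextures(count, out). */
size_t texture_take_retired(TextureName *out, size_t capacity)
{
    assert(out != NULL);
    pthread_mutex_lock(&pending_lock);
    size_t taken = pending_count < capacity ? pending_count : capacity;
    for (size_t i = 0; i < taken; ++i) {
        out[i] = pending[i];
    }
    /* Shift any names that did not fit to the front of the queue. */
    for (size_t i = taken; i < pending_count; ++i) {
        pending[i - taken] = pending[i];
    }
    pending_count -= taken;
    pthread_mutex_unlock(&pending_lock);
    return taken;
}
```

With this arrangement the eviction path only touches the queue, so it does not matter which thread the cache chooses for eviction.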

Why does Xcode 5.1 OpenGLES frame capture cause app to crash?

I'm trying to debug some hand written OpenGLES 2.0 code on iOS 7. The code runs fine in so far as it doesn't crash, run out of memory or behave erratically on both the simulator and on an actual iPhone device but the graphical output is not what I'm expecting.
I'm using the Capture OpenGLES frame feature in Xcode 5.1 to try to debug the OGL calls but I find that when I click the button to capture a frame the app crashes (in OpenGL rendering code - glDrawArrays() to be exact) with an EXC_BAD_ACCESS, code = 1.
To repeat, the code will run fine with no crashes for arbitrarily long and it is only when I click the button in the debugger to capture the frame that the crash occurs.
Any suggestions as to what I may be doing wrong that would cause this to happen?
I don't know exactly what you are doing, but here is what I was doing that caused normal (although different than expected) rendering, and crash only when attempting to capture the frame:
Load texture asynchronously (own code, not GLKit, but similar method) in background thread, by using background EAGLContext (same share group as main context). Pass a C block as 'completion handler', that takes the created texture as the only argument, to pass it back to the client when created.
On completion, call the block (Note that this is from the texture loading method, so we are still running in the background thread/gl context.)
From within the completion block, create a sprite using said texture. The sprite creation involves generating a Vertex Array Object with the vertex/texture coords, shader attribute locations etc. Said code does not call OpenGL ES functions directly, but instead uses a set of wrapper functions that cache OpenGL ES state on the client (app) side, and only call the actual functions if the values involved do change. Because GL state is cached on the client side, a separate data structure is needed for each GL context, and the caching functions must always know which context they are dealing with. The VAO generating code was not aware that it was being run on the background context, so the caching was likely corrupted/out of sync.
Render said sprite every frame: nothing is drawn. When attempting OpenGL Frame Capture, it crashes with EXC_BAD_ACCESS at: glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_SHORT, 0);
I do not really need to create the geometry on a background thread, so what I did is force calling the completion block on the main thread (see this question), so that once the texture was ready all the sprites are created using the main gl context.
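The per-context caching described above can be sketched as follows. This is a simplified illustration in plain C, not the wrapper code from this answer: StateCache, cached_bind_vertex_array, MAX_CONTEXTS and the idea of identifying a context by a small index are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONTEXTS 4  /* hypothetical upper bound on GL contexts */

/* Client-side shadow of one piece of GL state, kept once per context. */
typedef struct {
    unsigned int bound_vertex_array;  /* last VAO bound through the wrapper */
    int known;                        /* 0 until the first bind on this context */
} StateCache;

static StateCache caches[MAX_CONTEXTS];

/* Hypothetical wrapper. Returns 1 when the real bind would have to be
 * issued, 0 when the cache for that context says the VAO is already bound. */
int cached_bind_vertex_array(size_t context_index, unsigned int vao)
{
    assert(context_index < MAX_CONTEXTS);
    StateCache *cache = &caches[context_index];
    if (cache->known && cache->bound_vertex_array == vao) {
        return 0;  /* redundant bind skipped */
    }
    cache->bound_vertex_array = vao;
    cache->known = 1;
    /* Real code would call glBindVertexArrayOES(vao) here. */
    return 1;
}
```

If the wrapper is handed the wrong context index (for example, the main context's index while the background context is current), the cache and the driver state drift apart, and a bind that is actually needed can be skipped.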

iOS OpenGL context memory consumption

I have a question regarding OpenGL ES context size.
I have two OpenGL contexts running on iPad retina using GLKView. The view is configured to have no depth / stencil / multisampling but only a 32 bit framebuffer.
A single buffer takes 12MB (2048*1536*4 bytes). Profiling my application reveals I have 3 IOKit allocations of 12MB plus one allocation of 12MB from Core Animation. I suspect they are all related. My guess is that Core Animation caches the resulting frame buffer, which explains the one 12MB allocation coming from it. Also, I'm calling deleteDrawable on the GLKView which is hidden, which means that I would have expected a single 12MB buffer from IOKit and maybe another one from Core Animation. Does anyone have any experience with OpenGL memory consumption and how to reduce it, and why do I see three IOKit allocations although I have only a single GLView at any given time?
I believe that iOS devices use triple buffering internally, which would explain the extra allocations. This was mentioned by John Carmack in an email printed here.
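The arithmetic behind these numbers can be checked with a short sketch in plain C. The function names are hypothetical, and the buffer count of three is the triple-buffering assumption from this answer, not a documented figure.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one color buffer with no depth, stencil or multisampling. */
size_t color_buffer_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Total size in bytes if the system keeps buffer_count such buffers alive. */
size_t total_buffer_bytes(size_t width, size_t height, size_t bytes_per_pixel,
                          size_t buffer_count)
{
    assert(buffer_count > 0);
    return color_buffer_bytes(width, height, bytes_per_pixel) * buffer_count;
}
```

For a 2048 x 1536 drawable at 4 bytes per pixel, color_buffer_bytes gives 12,582,912 bytes (12 MiB), and three such buffers come to 37,748,736 bytes (36 MiB), which matches the three 12MB IOKit allocations reported in the question.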

Significant application performance difference between iOS Simulator and iPhone

Problem in a nutshell
I have been building an iOS application in recent weeks and have run into some trouble. The application plays an animation by manipulating and then drawing an image raster multiple times per second. The image is drawn by assigning it to a UIView's CALayer like so: self.layer.contents = (id)pimage.CGImage; The calculation and rendering are separated into two CADisplayLinks.
This animation technique achieves satisfactory performance on the iPhone 6.1 simulator, but when it is built for the physical device (iPhone 4S running iOS 6.1.3) it experiences a significant slowdown. The slowdown is so bad that it actually makes the application unusable.
Suspected Issues
I have read, in this question Difference of memory organization between iOS device and iPhone simulator , that the simulator is allowed to use far more memory than the actual device. However, while observing my app's memory usage in Instruments, I noticed that the total memory usage never exceeds 3 MB. So I'm unsure if that is actually the problem, but it's probably worth pointing out.
According to this question, Does the iOS-Simulator use multiple cores? , the iOS simulator runs on an Intel chip while my actual device uses an Apple A5 chip. I suspect that this may also be the cause of the slowdown.
I am considering rewriting the animation in OpenGL; however, I'd first like to try to improve the existing code before I take any drastic steps.
Any help in identifying what the problem is would be greatly appreciated.
Update
Thanks to all those who offered suggestions.
I discovered while profiling that the main bottleneck was actually clearing the image raster for the next animation. I decided to rewrite the rendering of the animations in OpenGL. It didn't take as long as anticipated. The app now achieves a pretty good level of performance and is a little bit simpler.
This is a classic problem. The simulator is using the resources of your high-powered workstation/laptop.
Unfortunately the only solution is to go back and optimize your code, especially the display stuff.
Typically, you want to separate the drawing time from the computation time, which it sounds like you are doing, but make sure you don't compute on the main thread.
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0ul);
dispatch_async(queue, ^{
    // Do the computation here, off the main thread
});
You can use Instruments while running on the device, so the Core Graphics instrument is available to see what is using all the time and point to the offending code. Unfortunately, you probably already know what it is and it's just going to come down to optimizations.
The slowdown is most likely related to blitting the images. I assume you are using a series of still images that get changed in the display link callback. I believe that if you can use CALayers that get added to your primary view/layer (while removing the old one), and which already contain CGImageRefs, you can then use CGContextDrawImage() to blit the image in the layer's drawInContext: method. Set the context to copy rather than blend, so it just replaces the old bits.
You can use a dispatch queue to create CALayer subclasses containing an image on a secondary thread; then of course the drawing happens on the main queue. You can use some throttling to maintain a queue of 10 or so CALayers, and replenish them as they are consumed.
If this doesn't do it, then OpenGL may help, but again none of this helps with moving bits between the processor and the GPU (since you are using stacks of images, not just animating one).

Multithreading OpenGL ES on iOS for games

Currently, I have a fixed timestep game loop running on a second thread in my game. The OpenGL context is on the same thread, and rendering is done once per frame after any updates. So the main "loop" has to wait for drawing each frame before it can proceed. This wasn't really a problem until I wrote my particle system. With upwards of 1500 particles and a physics step of 16 ms, the framerate drops to just below 30, and any more makes it worse. The particle rendering can't be optimized any further without losing capability, so I decided to try moving OpenGL to a third thread. I know this is somewhat of an extreme case, but I feel it should be able to handle it.
I've thought of running two loops concurrently, one for the main stepping (fixed timestep) and one for drawing (however fast it can go). However, the rendering calls pass in data that may be changed on each update, so I was concerned that locking would slow it down and negate the benefit. After implementing a test to do this, I'm just getting EXC_BAD_ACCESS after less than a second of runtime. I assume that is because both threads are trying to access the same data at the same time? I thought the system automatically handled this?
When I was first learning OpenGL on the iPhone, I had OpenGL set up on the main thread, and would call performSelectorOnMainThread:withObject:waitUntilDone: with the rendering selector, and these errors would happen any time waitUntilDone was false. If it was true, it would happen randomly sometimes, but sometimes I could let it run for 30 mins and it would be fine. Same concept as what's happening now, I assume. I am getting the first frame drawn to the screen before it crashes though, so I know something is working.
How should this be handled properly, and would doing so even provide the speedup I'm looking for? Or would the concurrent access slow it down just as much?

Resources