My application performs several rendering operations on the first frame (I am using Metal, although I think the same applies to GLES). For example, it renders to targets that are used in subsequent frames but not updated after that. I am trying to debug some of the draw calls from these rendering operations, and I would like to use the 'GPU Capture Frame' functionality to do so. I have used it in the past for on-demand GPU frame debugging, and it is very useful.
Unfortunately, I can't seem to find a way to capture the first frame. For example, this option is unavailable when stopped in the debugger (with a breakpoint set before the first frame). The Xcode behaviors also don't seem to allow for capturing the frame once debugging starts. There also doesn't appear to be an API for performing GPU captures, either in the Metal APIs or in CAMetalLayer.
Has anybody done this successfully?
I've come across this again, and figured it out properly now. I'll add this as a separate answer, since it's a completely different approach from my other answer.
First, some background. There are three components to capturing a GPU frame:
Telling Xcode that you want to capture a GPU frame. In typical documented use, you do this manually by clicking the GPU Frame Capture "camera" button in Xcode.
Indicating the start of the next frame to capture. Normally, this occurs at the next occurrence of MTLCommandBuffer presentDrawable:, which is invoked to present the framebuffer to the underlying view.
Indicating the end of the frame being captured. Normally, this occurs at the next-but-one occurrence of MTLCommandBuffer presentDrawable:.
In capturing the first frame, or activity before the first frame, only the third of these is available, so we need an alternate way to perform the first two items:
To tell Xcode to begin capturing a frame, add a breakpoint in Xcode at a line in your code somewhere before the point at which you want to start capturing a frame. Right-click the breakpoint, select Edit Breakpoint... from the pop-up menu, and add a Capture GPU Frame action to the breakpoint.
To indicate the start of the frame to capture, before the first occurrence of MTLCommandBuffer presentDrawable:, you can use the MTLCommandQueue insertDebugCaptureBoundary method. For example, you could invoke this method as soon as you instantiate the MTLCommandQueue, to immediately begin capturing everything submitted to the queue. Make sure the breakpoint in item 1 will be triggered before the point this code is invoked.
To indicate the end of the captured frame, you can either rely on the first normal occurrence of MTLCommandBuffer presentDrawable:, or you can add a second invocation of MTLCommandQueue insertDebugCaptureBoundary.
Finally, the MTLCommandQueue insertDebugCaptureBoundary method does not actually cause the frame to be captured. It just marks a boundary point, so you can leave it in your code for future debugging use. Wrap it in a DEBUG compilation conditional if you want it gone from production code.
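Putting items 2 and 3 together, here is a minimal Objective-C sketch; the device and queue names are placeholders, not from the question:
#import <Metal/Metal.h>
// Minimal sketch: capture everything submitted to a queue from its creation.
id<MTLDevice> device = MTLCreateSystemDefaultDevice();
id<MTLCommandQueue> commandQueue = [device newCommandQueue];
// Item 2: mark the start of the capture. The breakpoint with the
// "Capture GPU Frame" action (item 1) must have fired before this line runs.
[commandQueue insertDebugCaptureBoundary];
// ... encode and commit your first-frame command buffers here ...
// Item 3 (optional): mark the end explicitly instead of waiting for the
// first presentDrawable:.
[commandQueue insertDebugCaptureBoundary];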
Try...
[myMTLCommandEncoder insertDebugSignpost:@"com.apple.GPUTools.event.debug-frame"];
To be honest, I haven't tried it myself, but it's analogous to the
glInsertEventMarkerEXT(0, "com.apple.GPUTools.event.debug-frame")
documented for OpenGL ES, and there is some mention on the web of it working for Metal.
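If you want to try it, here is a hedged sketch of where the call would go; commandBuffer and passDescriptor are placeholders for objects you already have in your render loop:
// Untested sketch, as noted above: insert the magic signpost on a command encoder.
id<MTLRenderCommandEncoder> encoder =
    [commandBuffer renderCommandEncoderWithDescriptor:passDescriptor];
[encoder insertDebugSignpost:@"com.apple.GPUTools.event.debug-frame"];
// ... draw calls ...
[encoder endEncoding];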
First, some context: I usually use Metal for parallel compute, and in that case the GPU Frame Capture button is always greyed out. There are two approaches I have found that work so far.
In iOS 11 and later:
you can use [[MTLCaptureManager sharedCaptureManager] startCaptureWithDevice:m_Device] to capture a frame programmatically, so you can profile compute shader performance.
On versions earlier than iOS 11 (MTLCaptureManager and MTLCaptureScope are new in iOS 11.0):
you can use a breakpoint and edit it to add a Capture GPU Frame action.
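For the iOS 11 path, a hedged Objective-C sketch; device and commandQueue are placeholders:
#import <Metal/Metal.h>
// Sketch of the programmatic capture path for compute work.
MTLCaptureManager *captureManager = [MTLCaptureManager sharedCaptureManager];
[captureManager startCaptureWithDevice:device];
id<MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
// ... encode your compute pass here ...
[commandBuffer commit];
[commandBuffer waitUntilCompleted];
[captureManager stopCapture];   // Xcode opens the capture in the GPU debugger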
I was following raywenderlich's Metal tutorial, but got stuck rendering a texture on a plane: it seems to show only one color of the image, not the entire image. I'm running on an iPad with iOS 12.3.
Here's a repo for the project: https://github.com/TheJoseph-Dev/MyMetalProgram
Can anyone help me?
In your Renderer implementation, set a breakpoint on the line that reads:
private lazy var device = metalView.device
And run your code.
At the point at which this line is executed, the metalView exists, but the device on that metalView is nil. Similar problems can be seen for the other lazy properties of the renderer.
You may wish to use a less complex property style, as the lazy properties are being resolved while the view is not yet in the state you expect. I suspect that the view will not create resources like its device until it is attached to a window, which happens after viewDidLoad.
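One hedged way around this is to create the device explicitly instead of relying on the view already having one when the lazy properties are first touched. A Swift sketch, with names that are illustrative rather than taken from the linked project:
import Metal
import MetalKit

// Sketch only: the renderer owns a device it created itself and hands it to the view.
class Renderer: NSObject {
    let metalView: MTKView
    let device: MTLDevice
    let commandQueue: MTLCommandQueue

    init?(metalView: MTKView) {
        guard let device = MTLCreateSystemDefaultDevice(),
              let queue = device.makeCommandQueue() else { return nil }
        metalView.device = device        // give the view its device up front
        self.metalView = metalView
        self.device = device
        self.commandQueue = queue
        super.init()
    }
}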
I noticed a performance issue when using GMSMapView as part of my view hierarchy. Important note: the map doesn't take up the whole screen, it is used as a table view header. These issues affect the behaviour of the table view itself - low FPS, which results in a bad user experience, so I'm trying to resolve these issues or at least understand what GMSMapView is doing.
Using Time Profiler I found out that this is caused by the GMSMapView re-rendering on every frame (as far as I can tell), because the heaviest stack trace on the main thread is called from:
CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION
(I guess it's just how the internals of Google Maps work: it registers a CADisplayLink with a run loop and re-renders the map on each frame.)
This results in heavy CPU usage (up to 99%).
Note that in the selected area of the trace nothing is really happening in the app: it just displays a static map and a table view (no user interaction, no frame changes, nothing).
If I conduct the same test without a GMSMapView as a subview, the CPU usage in the same place is almost 0%.
Now to the question: why is this happening, and how can I stop this behaviour?
I found a method called - (void)stopRendering on GMSMapView and tested it. The results are good, and I get the same performance as when I completely remove the map view. However, this method is marked as deprecated, and the documentation states that it will be removed in a future release of the SDK, which makes it a bad candidate for a long-term solution.
Any help, explanation or clues will be appreciated!
The problem with stopRendering is that it places the burden of managing the map state on the developer.
Instead of using stopRendering, you should limit the frame rate of the map. GMSMapView has a property called preferredFrameRate for exactly this. The documentation says that by default preferredFrameRate is set to the maximum, i.e. re-render on every frame.
preferredFrameRate takes an enum called GMSFrameRate, which has the values:
kGMSFrameRatePowerSave
kGMSFrameRateConservative
kGMSFrameRateMaximum
You should use kGMSFrameRateConservative to ensure that the map stays fluid during user interaction but doesn't render needlessly; by default preferredFrameRate is set to kGMSFrameRateMaximum.
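A minimal Objective-C sketch, assuming mapView is the GMSMapView used as your table header:
#import <GoogleMaps/GoogleMaps.h>
// Throttle the map's internal CADisplayLink-driven rendering.
mapView.preferredFrameRate = kGMSFrameRateConservative;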
The app uses OpenGL ES 2 and the GLKit framework, with the render/update loop provided by GLKViewController. It used to run at a steady 60 FPS on my iPad 2 with iOS 7.1, but once I updated the iPad 2 to iOS 8.1, the exact same code now fluctuates between 56-59 FPS. (CPU utilization, however, remains at 40-60% as before.)
Profiling reveals that the OpenGL drawing commands are using a much larger proportion of CPU time than they used to. The biggest change seems to be that calls to "GLKBaseEffect prepareToDraw" are taking much longer than they used to.
(The app uses a single GLKBaseEffect which is reconfigured at various points during the render loop, necessitating a call to prepareToDraw each time. I realise it may be possible to optimize by having multiple instances of GLKBaseEffect, and that is something I was considering for later; however, the performance as it stood was solid on iOS 7.1.)
I'm now examining the OpenGL ES Analyzer trace in Instruments to determine the OpenGL calls generated by "GLKBaseEffect prepareToDraw", to see if anything seems unusual, and will update the post accordingly once I've managed to figure anything out.
I'd be very grateful for any guidance on how to progress at this point - why might calls to GLKBaseEffect prepareToDraw take longer on iOS8.1?
The cause of the problem was identified by Jim Hillhouse and confirmed by Frogblast on the Apple Dev Forums thread "OpenGL Performance Drops > 50% in iOS 8 GM": setting the text property of a UITextField (or UILabel, in my case) in a view that is a subview of a GLKView causes the GLKView (its superview) to lay out, which in turn causes framebuffers to be deallocated and reallocated. This wasn't happening in iOS 7.
Jim Hillhouse's workaround was to place the subview inside a UIViewController, and embed that in GLKView. I've done the same, using a Container View to hold the view controller, and can confirm that it works.
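For anyone doing the embedding in code rather than with a storyboard Container View, here is a hedged Objective-C sketch of the same idea; hudController, scoreLabel, and self.glkView are placeholder names:
// Child view controller hosting the label (placeholder names).
UIViewController *hudController = [[UIViewController alloc] init];
UILabel *scoreLabel = [[UILabel alloc] initWithFrame:CGRectMake(20, 20, 200, 40)];
[hudController.view addSubview:scoreLabel];
// Embed the child controller's view over the GLKView from the hosting view controller.
[self addChildViewController:hudController];
hudController.view.frame = self.glkView.bounds;
[self.glkView addSubview:hudController.view];
[hudController didMoveToParentViewController:self];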
So, I'm using D3D in a windowed application.
I inited D3D with the following parameters:
windowed: true;
backbufferformat: D3DFMT_X8R8G8B8;
presentinterval: D3DPRESENT_INTERVAL_ONE;
swapeffect: DISCARD
Each time OnPaint is called, I render the image to the backbuffer and present it to front.
As far as I know (and MSDN says the same), once I set D3DPRESENT_INTERVAL_ONE, vsync will work.
But in this case, the image tears when dragging horizontally.
(It seems there's a line across the image; the part below the line shows on the monitor first and the part above follows.)
Some sites say D3DPRESENT_INTERVAL_ONE will not work in windowed applications.
How can I enable vsync anyway?
P.S. I finally found that D3D vsync is enabled, but some window settings are not right, so perhaps the window itself is not synced. I haven't tracked down which settings yet, though.
I assume you're using D3D9? You should add that tag. On your D3DPRESENT_PARAMETERS variable:
if (bVsyncEnabled)
{
    // Present after waiting for the vertical blank (vsync on).
    presentParams.PresentationInterval = D3DPRESENT_INTERVAL_ONE;
    presentParams.FullScreen_RefreshRateInHz = D3DPRESENT_RATE_DEFAULT;
}
else
{
    // Present immediately without waiting for vsync.
    // (In windowed mode the refresh rate must be 0 either way.)
    presentParams.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;
    presentParams.FullScreen_RefreshRateInHz = 0;
}
If you've done this and you're using the old GDI stuff, it's not your vsync setting that's wrong, but the window settings. You must enable double buffering or you'll still get tearing.
You cannot vsync in windowed mode, only in fullscreen. However, you could potentially hack it by querying the default display for its refresh rate and then throttling your renderer to only render at that rate... although I wouldn't suggest that route.
Historically, windowed D3D applications haven't been able to vsync; this has only recently become possible, when Aero is enabled on Windows Vista or Windows 7 and the app isn't presenting with D3DPRESENT_INTERVAL_IMMEDIATE.
How often do you call ::OnPaint? The reason I am asking is that you must be calling ::OnPaint more often than the refresh rate of your attached monitor.
For me, I solved the refresh issue by forcing an ::OnPaint whenever the message loop is idle, by invalidating the window. What happens if you do that is that the D3D Present call will WAIT until the graphics card has finished presenting, which gives you very precise timing of ::OnPaint in sync with the actual monitor refresh rate!
I am having good success with this, and the statements above that windowed mode cannot vsync are definitely not true. Even with DirectX 9 on Windows XP, this just works.
Oh and last but not least, if you have more than one display attached, make sure to vsync with the actual display which presents your window. This seems a bit more tricky.
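A rough Win32 sketch of that idle-driven loop; hWnd and the OnPaint/Render plumbing are placeholders, not from the original posts:
#include <windows.h>
// Pump messages; when the queue is empty, invalidate so WM_PAINT fires again.
MSG msg = {};
while (msg.message != WM_QUIT)
{
    if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    else
    {
        // Queue is idle: force another ::OnPaint. Present() with
        // D3DPRESENT_INTERVAL_ONE then blocks until the next vblank,
        // pacing the loop at the monitor's refresh rate.
        InvalidateRect(hWnd, NULL, FALSE);
        UpdateWindow(hWnd);
    }
}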
Not exactly D3D, but MPC-HC's AntiTearing.html describes how it uses windowed EVR et al. to try to avoid tearing of a windowed display. The links here may be useful for synchronizing too (albeit something of a workaround): http://betterlogic.com/roger/2012/05/gdi-vsync-to-avoid-tearing/
I have a bit of a problem setting up my DirectX 10 (Win32/C++) application for fullscreen mode. The problem is that I want to have my app running in fullscreen right from the start. This can be done by calling the IDXGISwapChain::SetFullscreenState function. This works, but I get a small notice in my Visual C++ 2008 debugger which states:
"DXGI Warning: IDXGISwapChain::Present: Fullscreen presentation inefficiencies incurred due to application not using IDXGISwapChain::ResizeBuffers appropriately, specifying a DXGI_MODE_DESC not available in IDXGIOutput::GetDisplayModeList, or not using DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH."
What this means is that DirectX will not take full ownership of the graphics card and flip the back and front buffers, but will instead blit them, which is much slower.
Now, I do have the DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH flag enabled, and I did try to resize my buffers, but I have absolutely no idea what the best way is to go into fullscreen mode. I have looked on MSDN, but there they mostly assume you will only go into fullscreen by pressing Alt+Enter, which lets DXGI do all the work. If someone could please post a bit of code which takes DirectX 10 into fullscreen mode and takes full advantage of the "flipping", it would be greatly appreciated!
For anybody interested in the code used on resize:
ReleaseCOM(m_pD3DRenderTargetView);
ReleaseCOM(m_pD3DDepthStencilView);
ReleaseCOM(m_pD3DDepthStencilBuffer);
DXGI_MODE_DESC* mod = new DXGI_MODE_DESC;
mod->Format = DXGI_FORMAT_R8G8B8A8_UNORM;
mod->Height = m_ScreenHeight;
mod->Width = m_ScreenWidth;
mod->RefreshRate.Denominator = 0;
mod->RefreshRate.Numerator = 0;
mod->ScanlineOrdering = DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
mod->Scaling = DXGI_MODE_SCALING_UNSPECIFIED;
delete mod; mod = 0;
m_pSwapChain->ResizeTarget(mod);
HR(m_pSwapChain->ResizeBuffers(1, m_ScreenWidth, m_ScreenHeight, DXGI_FORMAT_R8G8B8A8_UNORM, DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH))
throw(Exception(GET_BUFFER_FAIL, AT));
//problem area
m_pSwapChain->SetFullscreenState(TRUE, NULL);
ID3D10Texture2D* pBackBuffer;
HR( m_pSwapChain->GetBuffer(0, __uuidof(ID3D10Texture2D), (LPVOID*)&pBackBuffer))
throw(Exception(GET_BUFFER_FAIL, AT)); //continues as usual
Is there any reason you delete your mode desc?
Have you also tried putting your mode desc through "FindClosestMatchingMode"?
Check out http://msdn.microsoft.com/en-us/library/cc627095(VS.85).aspx The "Full-Screen issues" section contains a lot of useful information.
There are some prerequisites for enabling flipping in DXGI (which is the most efficient fullscreen presentation mode):
1) You should go into the fullscreen state specifying a mode that exists in the system (you can do that either by using a mode from IDXGIOutput::GetDisplayModeList or by finding one using IDXGIOutput::FindClosestMatchingMode). Your code just specifies the screen resolution, so most likely the mode is being set correctly.
2) After SetFullscreenState, you should call ResizeBuffers with the right buffer size matching the mode; this is where DXGI sets up the flipping mode.
Typically this should happen naturally as a reaction to the WM_SIZE message sent by the SetFullscreenState transition, so if your app doesn't call ResizeBuffers on WM_SIZE, it probably should.
You can call ResizeBuffers manually after SetFullscreenState and that should work as well.
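A hedged sketch of that sequence; m_pSwapChain, pOutput, and the screen-size members are placeholder names in the question's style, and error handling is omitted:
// 1) Find a mode the output actually supports and switch the target to it.
DXGI_MODE_DESC desired = {};
desired.Width  = m_ScreenWidth;
desired.Height = m_ScreenHeight;
desired.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
DXGI_MODE_DESC closest = {};
pOutput->FindClosestMatchingMode(&desired, &closest, NULL);
m_pSwapChain->ResizeTarget(&closest);
// 2) Go fullscreen, then resize the buffers to match the mode (normally done
//    from the WM_SIZE handler; all old back-buffer references must be
//    released first, as in the question's code).
m_pSwapChain->SetFullscreenState(TRUE, NULL);
m_pSwapChain->ResizeBuffers(1, closest.Width, closest.Height, closest.Format,
                            DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH);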
And yeah, MSDN has a good article about DXGI practices, including fullscreen transition:
http://msdn.microsoft.com/en-us/library/cc627095(VS.85).aspx#Full_Screen_Issues