I'm adding several CALayers as sublayers of the layer of a UIView. The contents of each layer is a different image downloaded from a server. Each layer is animated from offscreen to a randomly generated position. The image data is downloaded asynchronously. Each image is approx 300x300 or smaller.
As a result of the random placement, the layers overlap and some are obscured by the layers above them. This is all good.
I'm removing layers as they become completely obscured from view, using the suggestion in the answer to this question. The calculations to determine coverage occur on a separate thread.
I have a UIPanGestureRecognizer that allows the user to drag the layers around on the screen.
I'm encountering a performance problem once the number of layers approaches 25-30, and it gets progressively worse from there. The animations become choppy or are completely absent (the newly added layers simply appear in their final position), and the pan gestures are either ignored or result in choppy repositioning of the selected layer.
I'm supposing that I'm overloading the GPU with all of the overlapping layers plus another layer animating above them?
Any suggestions on how to improve performance?
Best practices for dealing with large numbers of layers?
Is it better to host the layer that's being animated in a separate view/layer from the previously added layers?
Thanks!
Couple of quick things to check.
Run the Core Animation instrument and look for opacity. Just setting the layer's opaque flag to YES is not enough: if the underlying image has an alpha component, the layer will still take that into consideration and blend it.
If the images you are getting from your server have alpha, then you should redraw them with Quartz and save the files locally in a format that does not include alpha.
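If you do need to strip the alpha, here is a minimal sketch of the Quartz redraw (the helper name and the white background are my assumptions, not anything from the question):

    // Hypothetical helper: redraw a downloaded image into an opaque bitmap
    // so Core Animation no longer has to blend it.
    - (UIImage *)opaqueImageFromImage:(UIImage *)image
    {
        CGRect rect = (CGRect){ CGPointZero, image.size };
        // Opaque context: the resulting image has no alpha channel.
        UIGraphicsBeginImageContextWithOptions(image.size, YES, image.scale);
        [[UIColor whiteColor] setFill];   // or whatever your layer's background color is
        UIRectFill(rect);
        [image drawInRect:rect];
        UIImage *flattened = UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();
        return flattened;
    }

    // Saving locally as JPEG also guarantees no alpha channel:
    // [UIImageJPEGRepresentation(flattened, 0.9) writeToFile:path atomically:YES];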
Make doubly sure that you are not putting one-megapixel images into 100x100 tiles. Again in the Core Animation instrument, switch on 'Color Misaligned Images' and look for yellow.
30 to 50 layers should be no problem.
If all the layers don't fit in the GPU's memory (or the portion of memory available to your app), things will slow down a lot. Loading and unloading GPU memory is reported to be quite slow compared to compositing layers already resident in GPU memory. You'll have to experiment to find this memory limit, as older devices have less GPU memory available than more recent devices.
Related
I've written an app with an object detection model and process images when an object is detected. The problem I'm running into is when an object is detected with 99% confidence but the frame I'm processing is very blurry.
I've considered analyzing the frame and attempting to detect blurriness or detecting device movement and not analyzing frames when the device is moving a lot.
Do you have any other suggestions for only processing non-blurry frames, or solutions other than the ones I've proposed? Thanks
You might have issues detecting "movement" when, for instance, driving in a car. In that case, looking at something inside the car is not movement, while looking at something outside is (unless it's far away). There can be many other cases like this.
I would start by checking whether the camera is in focus. It is not the same as checking whether the frame is blurry, but it may be very close.
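One way to check that (just a sketch, not a full solution) is to key-value observe the capture device's adjustingFocus flag and skip frames while it is YES. The device and skipDetection properties here are assumptions about your setup:

    // Observe the AVCaptureDevice feeding the session; only run detection
    // on frames captured while the camera is not hunting for focus.
    - (void)startObservingFocus
    {
        [self.device addObserver:self
                      forKeyPath:@"adjustingFocus"
                         options:NSKeyValueObservingOptionNew
                         context:NULL];
    }

    - (void)observeValueForKeyPath:(NSString *)keyPath
                          ofObject:(id)object
                            change:(NSDictionary *)change
                           context:(void *)context
    {
        if ([keyPath isEqualToString:@"adjustingFocus"]) {
            // Skip the detection pass while the camera is still refocusing.
            self.skipDetection = [change[NSKeyValueChangeNewKey] boolValue];
        }
    }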
The other option I can think of is simply comparing two or more sequential frames and seeing whether they are roughly the same. To do that, it is best to define a grid, for instance 16x16, over which you evaluate values. You would need to mipmap your frames, which done manually means repeatedly halving the image until you get down to a 16x16 image (2000x1500 would become roughly 1024x1024 -> 512x512 -> 256x256 and so on). Then grab those 16x16 pixels and store them. Once you have enough frames (at least 2) you can start comparing the values. The GPU is perfect for the resizing, but the 16x16 values are probably best evaluated on the CPU. What you need to do is basically compute the average pixel difference between two sequential 16x16 buffers, and use that to decide whether detection should run.
This procedure may still not be perfect, but it should be relatively cheap from a performance perspective. There may be shortcuts, as some tools already do the resizing so you don't need to "halve" the images manually. In theory you are creating sectors and computing their average color; if all the sectors have almost the same color across two or more frames, there is a good chance the camera did not move much in that time and the image should not be blurred by movement. Still, if the camera is not in focus, you can have multiple sequential frames that are exactly the same yet all blurry. The same limitation applies if you only detect phone movement.
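For the comparison step, a minimal sketch of the CPU side, assuming each frame has already been reduced to a 16x16 grayscale buffer (how you downsample is up to you; the threshold name is a placeholder):

    #include <stdint.h>

    // Average per-pixel difference between two 16x16 grayscale buffers.
    // Run detection only while this stays below some experimentally chosen threshold.
    static float AverageDifference16x16(const uint8_t *previous, const uint8_t *current)
    {
        int total = 0;
        for (int i = 0; i < 16 * 16; i++) {
            int d = (int)previous[i] - (int)current[i];
            total += (d < 0) ? -d : d;
        }
        return (float)total / (16.0f * 16.0f);
    }

    // Usage: if (AverageDifference16x16(lastFrame, thisFrame) < kStillThreshold) { /* run detection */ }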
I've got a CAReplicatorLayer with a lot of replications of a small CAShapeLayer (a circle).
Then I animate another CAShapeLayer (a square). It's constantly moving left and right.
The Core Animation FPS shown by the profiler is very low (around 20-30), and you can see how much it lags on the device (iPad 3, iOS 8.1).
I know that I could potentially increase the performance a little by rasterizing parts of the scene, but I'm looking for another approach here. How can I improve the performance of this demo project without rasterizing? (In my main project there are so many layers that I can't use rasterization, as it would take up too much memory.)
Here's the project:
https://dl.dropboxusercontent.com/u/40859730/CoreAnimationPerformanceTest.zip
EDIT: I forgot to say that I'm CPU-bound; GPU usage is very low, while CARenderServer's CPU usage goes up to 100-120%.
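For reference, a rough reconstruction of the setup being described (the real code is in the linked project; the counts, sizes and colors here are just placeholders):

    // Many replicated circles...
    CAReplicatorLayer *replicator = [CAReplicatorLayer layer];
    replicator.frame = self.view.bounds;
    replicator.instanceCount = 200;
    replicator.instanceTransform = CATransform3DMakeTranslation(12.0, 0.0, 0.0);

    CAShapeLayer *circle = [CAShapeLayer layer];
    circle.path = [UIBezierPath bezierPathWithOvalInRect:CGRectMake(0, 0, 8, 8)].CGPath;
    circle.fillColor = [UIColor blueColor].CGColor;
    [replicator addSublayer:circle];
    [self.view.layer addSublayer:replicator];

    // ...plus one square that constantly animates left and right.
    CAShapeLayer *square = [CAShapeLayer layer];
    square.path = [UIBezierPath bezierPathWithRect:CGRectMake(0, 0, 40, 40)].CGPath;
    square.fillColor = [UIColor redColor].CGColor;
    square.position = CGPointMake(40.0, 200.0);
    [self.view.layer addSublayer:square];

    CABasicAnimation *move = [CABasicAnimation animationWithKeyPath:@"position.x"];
    move.fromValue = @(40.0);
    move.toValue = @(280.0);
    move.duration = 1.0;
    move.autoreverses = YES;
    move.repeatCount = HUGE_VALF;
    [square addAnimation:move forKey:@"leftRight"];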
I just generated a gradient with transparency programmatically by combining a solid color and a gradient with an image mask. I then applied the resulting image to my UIView's layer.contents. The visuals are fine, but when I scroll content under the transparent area, the app gets choppy. Is there a way to speed this up?
My initial thought was caching the resulting gradient. Another thought was to create a gradient that is only one pixel wide and stretch it to cover the desired area. Will either of these approaches help the performance?
Joe
I recall reading (though I don't remember where) that Core Graphics gradients can have a noticeable effect on performance. If you can, using a PNG for your gradient instead should resolve the issue you are seeing.
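If you do keep generating the gradient in code, one way to combine both of the asker's ideas (a sketch; the function name and colors are mine) is to render a one-point-wide gradient once, cache it, and let the layer stretch the cached image:

    // Draw a 1pt-wide vertical gradient once and reuse it; stretching the cached
    // bitmap is done by the GPU when it's set as a layer's contents.
    static UIImage *CachedGradient(void)
    {
        static UIImage *image = nil;
        if (image == nil) {
            CGSize size = CGSizeMake(1.0, 256.0);
            UIGraphicsBeginImageContextWithOptions(size, NO, 0.0);
            CGContextRef ctx = UIGraphicsGetCurrentContext();
            CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
            CGFloat components[8] = { 0, 0, 0, 1,   0, 0, 0, 0 };   // black -> transparent
            CGGradientRef gradient = CGGradientCreateWithColorComponents(space, components, NULL, 2);
            CGContextDrawLinearGradient(ctx, gradient, CGPointZero, CGPointMake(0, size.height), 0);
            CGGradientRelease(gradient);
            CGColorSpaceRelease(space);
            image = UIGraphicsGetImageFromCurrentImageContext();
            UIGraphicsEndImageContext();
        }
        return image;
    }

    // Usage: myView.layer.contents = (__bridge id)CachedGradient().CGImage;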
Given one texture sheet, is it better to have one or multiple CCSpriteBatchNodes? Or does this not affect the GPU cost of processing the non-visible CCSprite quads at all?
I am thinking about performance, referring to this question and an answer I got. Basically it suggests that I should use more than one CCSpriteBatchNode even if I have only one file. I don't understand whether the sentence "Too many batched sprites still affects performance negatively even if they are not visible or outside the screen" still applies when I have two CCSpriteBatchNodes instead of one. In other words, does that sentence refer to this: "The GPU is responsible for cancelling draws of quads that are not visible due to being entirely outside the screen. It still needs to process those quads."? And if so, does that mean it doesn't really matter how many CCSpriteBatchNode instances I have using the same texture sheet?
How can I optimize this? I mean, how can I avoid the GPU having to process the non visible quads?
Would you be able to answer at least the questions in bold?
First case: too many nodes (or sprites) in the scene, with many of them outside the screen/visible area. In this case, for each sprite the GPU has to check whether it is outside the visible area or not. Too many sprite nodes means too much load on the GPU.
Adding more CCSpriteBatchNodes should not affect performance, because the sprite-sheet bitmap is loaded into GPU memory and an array of coordinates is kept by the application for drawing the individual sprites. So whether you put 2 images in 2 different CCSpriteBatchNodes or 2 images in 1, it will be the same for both CPU and GPU.
How to optimize?
The best way would be to remove the invisible nodes/sprites from the parent. But it depends on your application.
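A rough sketch of that idea (assuming cocos2d 2.x, an update: method scheduled with scheduleUpdate, and a batchNode property; re-adding sprites when they come back on screen is left out):

    - (void)update:(ccTime)delta
    {
        CGSize winSize = [[CCDirector sharedDirector] winSize];
        CGRect screenRect = CGRectMake(0.0, 0.0, winSize.width, winSize.height);

        // Collect first, then remove, to avoid mutating the child list while iterating.
        NSMutableArray *offscreen = [NSMutableArray array];
        for (CCSprite *sprite in self.batchNode.children) {
            if (!CGRectIntersectsRect([sprite boundingBox], screenRect)) {
                [offscreen addObject:sprite];
            }
        }
        for (CCSprite *sprite in offscreen) {
            [self.batchNode removeChild:sprite cleanup:YES];
        }
    }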
FPS drops mainly for two reasons:
fill rate - when a lot of sprites overlap each other (and additionally when a high-res texture is rendered into a small sprite)
redundant state changes - in this case the heaviest are shader and texture switches
You can render off-screen sprites in a single batch and this doesn't drop performance significantly. Note that rendering a sprite with zero opacity (or a fully transparent texture) takes the same time as a non-transparent sprite.
First of all, this really sounds like a case of premature optimization. Do a test with the number of sprites you expect to be on screen, and some added, others removed. Do you get 60 fps on the oldest supported device? If yes, good, no need to optimize. If no, tweak the code design to see what actually makes a difference.
I mean, how can I avoid the GPU having to process the non visible quads?
You can't, unless you're going to rewrite how cocos2d handles drawing of sprites/batched sprites.
it doesn't really matter how many CCSpriteBatchNode instances I have using the same texture sheet, right?
Each additional sprite batch node adds a draw call. Given how many sprites they can batch into a single draw call, the benefit far outweighs the drawbacks. Whether you have one, two or three sprite batch nodes makes absolutely no difference.
I have been doing some experimenting with iOS drawing. As a practical exercise I wrote a bar chart component. The following is the class diagram (well, I wasn't allowed to upload images), so let me describe it in words: I have an NGBarChartView, which inherits from UIView and has 2 protocols, NGBarChartViewDataSource and NGBarChartViewDelegate. The code is at https://github.com/mraghuram/NELGPieChart/blob/master/NELGPieChart/NGBarChartView.m
To draw the bar chart, I created each bar as a separate CAShapeLayer. The reason I did this is twofold: first, I can just create a UIBezierPath and attach it to a CAShapeLayer object; and second, I can easily track whether a bar item is touched by using the layer's hitTest: method. The component works pretty well. However, I am not comfortable with the approach I have taken to draw the bars, hence this note. I need expert opinion on the following:
1. By using CAShapeLayer and creating bar items I am really not using a UIGraphicsContext; is this a good design?
2. My approach creates several CALayers inside a UIView. Is there a limit, based on performance, to the number of CALayers you can create in a UIView?
3. If a good alternative is to use CGContext* methods, then what's the right way to identify whether a particular path has been touched?
4. From an animation point of view, such as the bar blinking when you tap on it, is the layer design better or the CGContext design better?
Help is very much appreciated. BTW, you are free to look at my code and comment. I will gladly accept any suggestions to improve.
Best,
Murali
IMO, generally, any kind of shape drawing needs heavy processing power, and compositing a cached bitmap on the GPU is much cheaper than drawing everything again. So in many cases we cache all drawing into a bitmap, and on iOS, CALayer is in charge of that.
Anyway, if your bitmaps exceed the video memory limit, Quartz cannot composite all the layers at once; consequently, it has to draw a single frame in multiple passes, which means reloading some textures into the GPU. That can hurt performance. I am not sure how much, because the iPhone's VRAM is known to be shared with system RAM, but it still means more work even in that case. If system memory itself becomes insufficient, the system can purge existing bitmaps and ask you to redraw them later.
CAShapeLayer will do all of the CGContext (I believe you meant this) work for you. You can do it yourself if you feel the need for lower-level optimization.
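For example (a sketch; the identifiers here are assumptions, not from the linked code), a bar can stay a CAShapeLayer and the same path can answer the touch question without any manual rasterization:

    // One bar as a CAShapeLayer; CAShapeLayer rasterizes the path for you.
    CAShapeLayer *bar = [CAShapeLayer layer];
    bar.path = [UIBezierPath bezierPathWithRect:CGRectMake(40.0, 100.0, 30.0, 180.0)].CGPath;
    bar.fillColor = [UIColor orangeColor].CGColor;
    [chartView.layer addSublayer:bar];

    // In the touch handler: reuse the same path to find the tapped bar.
    CGPoint point = [touch locationInView:chartView];
    for (CALayer *sublayer in chartView.layer.sublayers) {
        if ([sublayer isKindOfClass:[CAShapeLayer class]] &&
            CGPathContainsPoint(((CAShapeLayer *)sublayer).path, NULL, point, false)) {
            // This bar was tapped; e.g. animate its opacity for the "blink".
        }
    }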
Yes. Obviously, everything has a limit from a performance point of view. If you're using hundreds of layers with large alpha-blended graphics, it will cause performance problems. Generally, though, that doesn't happen, because layer composition is GPU-accelerated. If your graph lines are not too numerous and they're basically opaque, you'll be fine.
All you have to know is that once drawings are composited, there is no way to decompose them, because composition itself is a kind of lossy optimization. So you have only two options: (1) redraw all the graphics whenever a change is required, or (2) keep a cached bitmap of each display element (like a graph line) and composite them as needed. This is exactly what CALayers are doing.
The layer-based approach is absolutely better. Any kind of free-form shape drawing (even when it's done on the GPU) needs a lot more processing power than simple bitmap composition (which amounts to two textured triangles) on the GPU. That holds, of course, as long as your layers don't exceed the video memory limit.
I hope this helps.