I have a single container view inside a UIScrollView, and that container view has several CAShapeLayer-based subviews (about 200). Each CAShapeLayer contains a very simple CGPath (a filled polygon of about 10 points). You can see it as a sort of map.
The container view itself is big (about 1000x2500 points), but I have implemented zooming using the transform property (I'm not using UIScrollView's built-in zooming), so at small scales it's entirely visible.
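The zooming is essentially this (a sketch; zoomScale is assumed to be the current scale factor):

    // Zoom by scaling the container view directly instead of using
    // UIScrollView's built-in zooming.
    containerView.transform = CGAffineTransformMakeScale(zoomScale, zoomScale);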
At high scales (only a part of the container view is visible on screen) this works great and scrolls smoothly, even on old hardware.
However, at small scales (when most of the container view is visible), this results in very bad scrolling performance on old hardware (40 fps on an iPhone 4S). And if I go over 200 subviews (which is something I would like to do), it is much worse (down to 15 fps), even on newer hardware.
I have done some profiling but I can't find the bottleneck. During scrolling I take the following measurements (on average):

Activity Monitor instrument:
- CPU: 8%

GPU Driver instrument:
- Device utilization: 40%
- Renderer utilization: 35%
- Tiler utilization: 8%
- FPS: 40
Here is what I have tried:
Setting shouldRasterize with the proper rasterizationScale on all the CAShapeLayers. It makes things worse.
Setting shouldRasterize with the proper rasterizationScale on the layer of the container view. This improves scrolling at very small scales (when the entire container is visible inside the scroll view), but makes it much worse at bigger scales. Activating rasterization only at small scales leaves a gap (between approximately 0.5 and 2.0) where both options drop frames. (A sketch of these attempts follows the list.)
Using layers only instead of views. Doesn't improve anything.
Using CATiledLayer for the container view layer. Doesn't improve anything.
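For reference, the rasterization attempts looked roughly like this (a sketch; zoomScale is assumed to be the current zoom factor):

    // Rasterize the container's layer at the current zoom scale.
    containerView.layer.shouldRasterize = YES;
    containerView.layer.rasterizationScale = zoomScale * [UIScreen mainScreen].scale;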
Any ideas? If there were a hardware limitation, shouldn't I see 100% utilization somewhere? What else should I be profiling?
At this point the main thing I'm asking for is help on how to profile my app and understand what is causing the lost frames. Then maybe, depending on what is happening and what is doable, I will try to improve things. I'm not asking for every single possible tweak in the book that might improve things a little if I'm lucky.
EDIT
Just found out that although my app is running at 8% CPU, backboardd runs at 100% CPU during scrolling. So the bottleneck is in the Core Animation render server!
Now I just have to figure out why. Maybe there are just too many layers... Does anyone know the maximum number of layers the render server can process every 16 ms? I can't find any official documentation about that.
EDIT 2
So after some more profiling, it looks like it has nothing to do with the number of layers. If I combine all the shapes into a small number of layers (around 10) by merging some of the paths (and as a result using more complex paths), the result is approximately the same: backboardd still runs at 100% CPU during scrolling.
So I guess it's the path processing/drawing that takes time, regardless of whether the shapes are split into smaller/simpler paths. Maybe it's simply bound to the number of points. I'm starting to think that there is nothing I can do...
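For reference, the combining experiment from EDIT 2 looked roughly like this (a sketch; shapeLayers stands in for the original ~200 layers, and a single fill color is used for simplicity):

    // Append every polygon path into one CGMutablePath and display it in one layer.
    // In practice the paths were grouped into ~10 layers (e.g. by fill color).
    CGMutablePathRef combined = CGPathCreateMutable();
    for (CAShapeLayer *shape in shapeLayers) {
        CGPathAddPath(combined, NULL, shape.path);
    }
    CAShapeLayer *merged = [CAShapeLayer layer];
    merged.path = combined;
    merged.fillColor = [UIColor grayColor].CGColor;
    CGPathRelease(combined);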
Related
I have a CATiledLayer inside a UIScrollView. When I pinch to zoom in as far as possible, the memory occupied by "other processes" momentarily becomes very high. Sometimes it is enough for jetsam to kill my app, and sometimes my iPad will spontaneously reboot.
The large amount of memory deallocates when I release my fingers. It copes if I zoom in only very slowly. The worst scenario is when I put two fingers on the screen and move them steadily apart at a moderate speed.
Comparison at rest and during pinch (screenshots omitted).
For context, I am rendering PDF content and it works fine apart from this issue. Normally I use 4 levels of detail and 1024x1024 tiles.
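For reference, the described setup corresponds roughly to this sketch (assuming a UIView subclass backed by a CATiledLayer):

    // Back the view with a CATiledLayer.
    + (Class)layerClass {
        return [CATiledLayer class];
    }

    // Configure it with the values mentioned above (tileSize is in pixels).
    CATiledLayer *tiledLayer = (CATiledLayer *)self.layer;
    tiledLayer.tileSize = CGSizeMake(1024, 1024);
    tiledLayer.levelsOfDetail = 4;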
I can still reproduce the problem consistently using all default values on CATiledLayer for tile size, detail and bias. I have also tried not rendering any content into the tiles (just filling them white) and the problem occurs the same way.
Does anyone know how to prevent this "other processes" memory building up so much?
I am making a game, and it involves a sandstorm. I decided that the basic concept would be that I would make an image that looks roughly like a sandstorm, and then decorate it with some particles/whatever else it takes.
I ran into an issue at step one. I threw together a simple image for testing purposes (image omitted).
I added that to my game, and the FPS dropped by 60%. I was surprised by the effect one image had, but I wasn't too worried about it. I cut the resolution of the image in half, and again, lots of lag.
Is SpriteKit/iOS really that bad at handling moderately sized images with alpha? I read in another question that the simulator is bad at rendering, but that can't be the entire problem.
Is there any hope for getting this to render without slicing my performance? The particles work well, everything else runs at 60fps just fine, but the addition of this image is apparently a severe drain on resources.
EDIT: I tested my game out on my phone, and I got no lag. So apparently, the simulator is just really bad at rendering after all. At the same time, I am curious as to how to speed up performance, as there is clearly some kind of lag going on.
I'm no expert on SpriteKit, but I had similar experiences with plain core animation and layering.
The issue is that an image with alpha, even in its "opaque" parts, triggers a redraw of all the sublayers underneath it every time it moves. First check whether this is actually the problem, and then try one of these (see the sketch after the list) and see if it improves:
An SKCropNode could prevent the elements underneath it from being rendered
Tile the image so only the border has alpha
Snapshot the layers underneath
Reduce the amount of nodes being rendered; hide the ones that are "under the sandstorm"
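A minimal sketch of the snapshot idea, assuming an SKScene subclass with a hypothetical backgroundNode holding the static content under the storm:

    // Flatten the static nodes into one texture and swap them for a single sprite.
    SKTexture *snapshot = [self.view textureFromNode:self.backgroundNode];
    SKSpriteNode *flattened = [SKSpriteNode spriteNodeWithTexture:snapshot];
    flattened.position = self.backgroundNode.position;   // positioning may need adjusting
    [self.backgroundNode removeFromParent];
    [self addChild:flattened];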
And you should be using real devices to test the performance of your game; you cannot rely on the simulator for that.
I have a collection view whose cells display three images, two labels, and one attributed string (the strings have different colors and font sizes, and the values are not unique for every cell). One of the images comes from the web, and I used AFNetworking to do the downloading and caching. The collection view displays 15 cells simultaneously.
When I scroll I can only achieve 25 frames/sec.
Below are the things I did:
-Processing of data was done ahead of time and cached in objects
-Images and views are opaque
-Cells and views are reused
I have done all the optimizations I know of, but I still can't achieve at least 55 frames/sec.
Could you share other techniques to speed up cell reuse?
I was even thinking of pre-rendering the subviews off screen and caching the result somewhere, but I am not sure how that is done (a sketch of the idea is below).
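A minimal sketch of that idea, assuming containerView holds the cell's static subviews and its drawing is main-thread safe:

    // Rasterize the subview hierarchy once and cache the resulting image;
    // display it in a plain UIImageView instead of the live subviews.
    // YES (opaque) assumes the views have no transparency, as stated above.
    UIGraphicsBeginImageContextWithOptions(containerView.bounds.size, YES, 0.0);
    [containerView.layer renderInContext:UIGraphicsGetCurrentContext()];
    UIImage *cached = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();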
When I run the app on the iPhone it is fast, since it only shows about four cells at a time.
The first thing you need to do is fire up Instruments and find out whether you're CPU-bound (computation or regular I/O is the bottleneck) or GPU-bound (the graphics card is struggling). Depending on which of these is the issue, the solution varies.
Here's a video tutorial that shows how to do this (among other things). This one is from Sean Woodhouse at Itty Bitty Apps (they make the fine Reveal tool).
NB: In the context of performance tuning we usually talk about being I/O-bound or CPU-bound as separate concerns; I've grouped them together here to mean "due to either slow computation or slow I/O, data is not getting to the graphics card fast enough". If this is indeed the problem, the next step is to find out whether it is related to waiting on I/O or to a maxed-out CPU.
Instruments can be really confusing at first, but the above videos helped me to harness its power.
And here's another great tutorial from Anthony Egerton.
What is the size of the image that you use?
One optimization technique that would work:
Resize the image so that it matches the size of the view you are displaying it in.
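A minimal sketch of that, assuming bigImage is the downloaded UIImage and targetSize is the displayed view's size in points:

    // Redraw the large image into a context of the display size.
    UIGraphicsBeginImageContextWithOptions(targetSize, NO, 0.0);   // NO preserves alpha
    [bigImage drawInRect:CGRectMake(0, 0, targetSize.width, targetSize.height)];
    UIImage *resized = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();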
I have a custom view (inherited from UIView) in my app. The custom view overrides
- (void) drawRect:(CGRect) rect
The problem is: the drawRect: executes many times longer on iPad 3 than on iPad 2 (about 0.1 second on iPad 3 and 0.003 second on iPad 2). It's about 30 times slower.
Basically, I am using some pre-created layers and drawing them in drawRect:. The last call
CGContextDrawLayerAtPoint(context, CGPointZero, m_currentLayer);
takes most of the time (about 95% of total time in drawRect:)
What might be slowing things so much and how should I fix the cause?
UPDATE:
There are no threads directly involved. I call setNeedsDisplay in one thread and drawRect: gets called from another, but that's it. The same goes for locks (no locks are used).
The view gets redrawn in response to touches (it's a coloring book app). On iPad 2 I get reasonable delay between a touch and an update of the screen. I want to achieve the same on iPad 3.
So, the iPad 3 is definitely slower in a lot of areas. I have a theory about this. Marco Arment noted that the method renderInContext: is ridiculously slow on the new iPad. I also found this to be the case when trying to create a magnifying glass for a custom text view. In the end I had to forgo renderInContext: in favor of custom Core Graphics drawing.
I've also been having problems hitting the dreaded wait_fences errors in my Core Graphics drawing here: Only on new iPad 3: wait_fences: failed to receive reply: 10004003.
This is what I've figured out so far. The iPad 3 obviously has four times the pixels to drive. This can cause problems in two places:
First, the CPU. All Core Graphics drawing is done by the CPU. In the case of rotation events, if the CPU takes too long to draw, it hits the wait_fences error, which I believe is simply a call that tells the device to wait a little longer to actually perform the rotation, hence the delay.
Second, transferring images to the GPU. The GPU obviously handles the retina resolution just fine (see Infinity Blade 2). But when Core Graphics draws, it draws its images directly into the GPU buffers to avoid a memcpy. However, either the GPU buffers haven't changed since the iPad 2 or they just weren't made large enough, because it's remarkably easy to overload them. When that happens, I believe the CPU writes the images to standard memory and then copies them to the GPU when the GPU buffers can handle it. This extra copy, I think, is what causes the performance problems: it is time-consuming with so many pixels and slows things down considerably.
To avoid memcpy I recommend several things:
Only draw what you need. Avoid drawing anything offscreen at all costs. If you're drawing a large view, but only display part of that view (subviews covering it, for example) try to find a way to only draw what is visible.
If you have to draw a large view, consider breaking the view up into parts, either as subviews or sublayers (probably sublayers in your case), and only redraw what you need. Take the Notability app, for example: when you zoom in, you can literally watch it redraw one square at a time. Or in Safari you can watch it update squares as you scroll. Unfortunately, I haven't had to do this myself, so I'm uncertain of the methodology.
Try to keep your drawings simple. I had an awesome-looking custom Core Text view that had to redraw on every character entered. Very slow. I changed the background to plain white (in Core Graphics) and it sped up considerably. Even better would be to not redraw the background at all.
I would like to point out that my theory is conjecture. Apple doesn't really explain what exactly they do. My theory is just based on what they have said and how the iPad responds as well as my own experimentation.
UPDATE
So Apple has now released the 2012 WWDC Developer videos. They have two videos that may help you (requires developer account):
iOS App Performance: Responsiveness
iOS App Performance: Graphics and Animation
One thing they talk about that I think may help you is the method setNeedsDisplayInRect:(CGRect)rect. Using this method instead of the normal setNeedsDisplay, and making sure that your drawRect: method only draws the rect given to it, can greatly help performance. Personally, I use the function CGContextClipToRect(context, rect) to clip my drawing to the rect provided (a sketch is below).
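A minimal sketch of that pattern (changedRect is assumed to be computed by whatever code knows what changed):

    - (void)drawRect:(CGRect)rect {
        CGContextRef context = UIGraphicsGetCurrentContext();
        CGContextClipToRect(context, rect);   // never touch pixels outside the dirty rect
        // ... expensive drawing, ideally limited to `rect` ...
    }

    // Elsewhere, invalidate only the region that actually changed:
    [self setNeedsDisplayInRect:changedRect];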
As an example, I have a separate class I use to draw text directly into my views using Core Text. My UIView subclass keeps a reference to this object and uses it to draw its text rather than using a UILabel. I used to refresh the entire view (setNeedsDisplay) when the text changed. Now I have my Core Text object calculate the changed CGRect and use setNeedsDisplayInRect: to update only the portion of the view that contains the text. This really helped my performance when scrolling.
I ended up using the approach described in @Kurt Revis's answer to a similar question.
I minimized the number of layers used, added a UIImageView, and set its image to a UIImage wrapping my CGImageRef. Please read the mentioned answer for more details about the approach.
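A minimal sketch of that setup (m_currentImage is a hypothetical CGImageRef standing in for the previous CGLayer content):

    // Wrap the pre-rendered CGImageRef in a UIImage and let a UIImageView
    // display it; no custom drawRect: is needed anymore.
    UIImage *image = [UIImage imageWithCGImage:m_currentImage];
    UIImageView *imageView = [[UIImageView alloc] initWithImage:image];
    imageView.frame = self.bounds;
    [self addSubview:imageView];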
In the end my application became even simpler than before, and it works at almost identical speed on iPad 2 and iPad 3.
A few months ago I found a really awesome sample from the Apple site. The sample is called "LargeImageDownsizing"; the wonderful thing about it is that it explains a lot about how images are read from resources and then rendered on screen. Digging into that code I found something that is disturbing me a little. The downsized image is passed to a view that has a CATiledLayer, but each tile is not fed its own piece of the image to improve memory performance; the code just sets the tile size and then loads the whole image (I'm simplifying to get to the concept).

So my question basically is: why? Why use a CATiledLayer if it is not fed the right way? They could have used a normal UIImageView... So I made a few tests to understand if I was right, modifying the code by simply adding a scroll view with an image view as a subview and implementing the scroll view delegate for zooming. Testing on device and simulator, I came to these conclusions:
-The memory impact and footprint is exactly the same, even during zooming and scrolling, and it doesn't surprise me at all: the image is decompressed in memory
-The Time Profiler says that the tiled view takes more time to draw during scroll/zoom operations than the UIImageView, and that doesn't surprise me either: the UIImageView is already drawn
-If I send a memory warning, nothing changes between the two solutions (only on the simulator)
-Testing Core Animation performance I get the same results, around 60 FPS
So what's the deal between those two views/layers? Why should I pick one instead of the other in this specific case? UIImageView seems to win the battle.
I hope that someone could help me to understand that.
They might perform the same for small images, because then the only difference in terms of performance is that CATiledLayer draws on a background thread. Depending on the tile size, CATiledLayer could even be slower, because it has to draw multiple tiles for one image.
BUT ...
the point of CATiledLayer is that you don't need to draw all tiles, especially when zooming into a very, very large image. It is smart about knowing which parts are actually needed, and about evicting tiles that are no longer needed.
For this mechanism to work you need to provide the individual parts of the image separately. We're talking about an image whose total size probably cannot be held in memory uncompressed. A sketch of tile-based drawing is below.
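A minimal sketch of feeding tiles individually, assuming a hypothetical tileForRect: that loads just the pre-cut tile image covering the requested rect from disk (coordinate flipping elided for brevity):

    // CATiledLayer calls this once per needed tile, on a background thread,
    // with the context already clipped to that tile's rect.
    - (void)drawLayer:(CALayer *)layer inContext:(CGContextRef)ctx {
        CGRect tileRect = CGContextGetClipBoundingBox(ctx);
        UIImage *tile = [self tileForRect:tileRect];   // hypothetical loader
        UIGraphicsPushContext(ctx);
        [tile drawInRect:tileRect];
        UIGraphicsPopContext();
    }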