Cocos2d: GPU having to process the non-visible CCSprite quads (iOS)

Given one texture sheet, is it better to have one or multiple CCSpriteBatchNodes? Or does this have no effect at all on the GPU cost of processing the non-visible CCSprite quads?
I am thinking about performance, referring to this question and the answer I got. Basically it suggests that I should use more than one CCSpriteBatchNode even if I have only one file. I don't understand whether the sentence "Too many batched sprites still affects performance negatively even if they are not visible or outside the screen" still applies when I have two CCSpriteBatchNodes instead of one. In other words, does that sentence refer to this: "The GPU is responsible for cancelling draws of quads that are not visible due to being entirely outside the screen. It still needs to process those quads."? And if so, that should mean it doesn't really matter how many CCSpriteBatchNode instances I have using the same texture sheet, right?
How can I optimize this? I mean, how can I avoid the GPU having to process the non-visible quads?
Would you be able to answer at least the questions in bold?

First case: too many nodes (or sprites) in the scene, with many of them outside the screen/visible area. In this case, for each sprite the GPU has to check whether it is outside the visible area or not. Too many sprite nodes means too much load on the GPU.
Adding more CCSpriteBatchNodes should not affect performance, because the sprite-sheet bitmap is loaded into GPU memory and an array of coordinates is kept by the application for drawing the individual sprites. So whether you put 2 images in 2 different CCSpriteBatchNodes or 2 images in 1, it is the same for both the CPU and the GPU.
How to optimize?
The best way would be to remove the invisible nodes/sprites from the parent. But it depends on your application.
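This idea is largely framework-agnostic. As a minimal sketch of it (written in Swift with SpriteKit-style APIs rather than cocos2d's Objective-C; `visibleRect` and the pooling helper are assumptions, not part of either framework):

import SpriteKit

// Sketch: remove children whose frames fall outside the visible area so the
// renderer never has to consider their quads, and keep them in a pool so they
// can be re-added cheaply when they scroll back into view.
// `visibleRect` is a hypothetical rect computed from your camera/scroll position.
func cullOffscreenNodes(in parent: SKNode, visibleRect: CGRect, pool: inout [SKNode]) {
    for child in parent.children where !visibleRect.intersects(child.frame) {
        child.removeFromParent()
        pool.append(child)
    }
}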

FPS drops are usually caused by two things:
fill rate - when a lot of sprites overlap each other (and additionally when a high-res texture is rendered into a small sprite)
redundant state changes - in this case the heaviest are shader and texture switches
You can render sprites outside the screen in a single batch and this doesn't hurt performance significantly. Note that rendering a sprite with zero opacity (or a fully transparent texture) takes the same time as a non-transparent sprite.

First of all, this really sounds like a case of premature optimization. Do a test with the number of sprites you expect to be on screen, with some being added and others removed over time. Do you get 60 fps on the oldest supported device? If yes, good, no need to optimize. If no, tweak the code design to see what actually makes a difference.
I mean, how can I avoid the GPU having to process the non visible quads?
You can't, unless you're going to rewrite how cocos2d handles drawing of sprites/batched sprites.
it doesn't really matter how many CCSpriteBatchNode instances I have using the same texture sheet, right?
Each additional sprite batch node adds a draw call. Given how many sprites they can batch into a single draw call, the benefit far outweighs the drawbacks. Whether you have one, two or three sprite batch nodes makes absolutely no difference.

Related

XNA problem when drawing many different batches

I have a 2D game with four-directional movement, and I'm having FPS (or GPU) problems because I have to draw a lot of textures.
I've read a lot about techniques to optimize performance, but I don't know what I can do anymore.
The main problem is that on some occasions I have about 200 creatures, and for each one I have to draw its body (a single sprite) but also spells and other animations on top of it. I think this is where the trouble starts: the loop that draws each creature must change textures for every creature, that is body > animation1 > animation2 > animation3, about 200 times (once per creature) at 60 fps. This lowers the fps to about 40-50.
Any suggestions?
The issue is probably - as you have already suggested - the constant switching between different textures. This is much slower than drawing the same number of sprites with the same texture.
To change this, consider putting all your textures into a single big texture (a texture atlas). You then always draw that texture. Drawing the whole atlas would obviously look wrong, so you also have to tell XNA which part of the texture you want to draw. For that, you can use the SourceRectangle parameter that can be passed to SpriteBatch.Draw(...). That way, you always render the same texture but can still show different images on screen.
See also this answer about texture atlases for more details.
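The same source-rectangle idea exists outside XNA as well. Purely as an illustration of the concept (this sketch is Swift/SpriteKit, not XNA, and the atlas image and sub-rectangles are invented):

import SpriteKit

// One big atlas texture shared by every creature/spell sprite.
let atlas = SKTexture(imageNamed: "creatures_atlas")   // hypothetical atlas image

// Sub-rectangles are given in unit coordinates (0...1) of the atlas;
// these particular rects are made up for the example.
let bodyRect  = CGRect(x: 0.00, y: 0.0, width: 0.25, height: 0.25)
let spellRect = CGRect(x: 0.25, y: 0.0, width: 0.25, height: 0.25)

// Both textures point into the same underlying atlas, so sprites using them
// can usually be drawn without a texture switch in between.
let body  = SKSpriteNode(texture: SKTexture(rect: bodyRect,  in: atlas))
let spell = SKSpriteNode(texture: SKTexture(rect: spellRect, in: atlas))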

SKLabelNodes drop fps

I have a little game based on SpriteKit.
In this game I use lots of nodes with letters (or combinations of letters) on them that the user can move around to build words.
Those nodes are basically SKSpriteNodes with SKLabelNodes on them.
When I have a considerably large number of nodes, the draw count increases and FPS drops dramatically.
Obviously, when I remove the SKLabelNodes, the draw count stays low. But I still need those letters.
The question is, what is the best way to add those letters without dropping FPS?
There are three ways to do this, each a different blend of compromises.
The first, and it would be the easiest, is to use shouldRasterize on the existing labels. Unfortunately, this doesn't seem to exist for labels in Sprite Kit. DOH!
The second is to use bitmapped textures as letters on your objects, actually as sprites, the thing that Sprite Kit handles best. This involves using a bitmap font generator, such as the excellent BMGlyph, as pointed out by Whirlwind in the comments.
This won't be easy, because the coding part will be a little more labour-intensive, but you should get the absolute best performance this way.
You can still swap letters, too, but you will need to think of them as subsections of the texture rather than as letters. An array or dictionary mapping each letter to its position in the texture will be both performant and easy to use, but labour-intensive to set up, much more so than SKLabelNode.
Or, you could go wild and create textures in code, by using an SKLabelNode on an off-screen node and then "rendering" or "drawing" it to a texture, and then using those textures for the letters on your objects/sprites (a sketch of this follows below). This is similar to how BMGlyph works, but MUCH more time-consuming and much less flexible.
BMGlyph is the best combination of speed and ease of use, and it has some pretty fancy effects for creating nice looking text, too.
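As a rough sketch of that render-to-texture idea (a minimal sketch only; the cache type, font, and sizes are assumptions, and it needs a live SKView to do the rendering):

import SpriteKit

// Render each letter's SKLabelNode to a texture once, cache it, and reuse
// the texture for every sprite that shows that letter.
final class LetterTextureCache {
    private var cache: [Character: SKTexture] = [:]
    private let view: SKView   // any SKView can rasterize nodes into textures

    init(view: SKView) { self.view = view }

    func texture(for letter: Character) -> SKTexture? {
        if let cached = cache[letter] { return cached }

        let label = SKLabelNode(fontNamed: "AvenirNext-Bold")
        label.text = String(letter)
        label.fontSize = 48
        label.fontColor = .white

        // texture(from:) rasterizes the node once; after that the letter is a
        // plain sprite texture and no longer adds a label draw per node.
        guard let texture = view.texture(from: label) else { return nil }
        cache[letter] = texture
        return texture
    }
}

// Usage (letter tiles become ordinary SKSpriteNodes):
// let cache = LetterTextureCache(view: skView)
// let letterTile = SKSpriteNode(texture: cache.texture(for: "A"))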

SKPhysicsBody slowing down program

I have a random maze generator that starts by building small mazes and then progresses into massive levels. The "C"s are collectables, the "T"s are tiles, and the "P" is the player's starting position. I included a sample tile map below.
The performance issue is not when I have a small 6x12 pattern like here; it shows up when I've got a 20x20 pattern for example.
Each character is a tile, and each tile has its own SKPhysicsBody. The tiles are not square; they are complex polygons, and the tiles don't quite touch each other.
The "C"s need to be removable one at a time, and the "T"s are permanent for the level and don't move. Also, the maze only shows a 6x4 section of tiles at a time and moves the background so the view stays centered on the player.
I've tried making the T's and C's rectangles, which drastically improves performance (though still slower than desired), but the user won't accept this; the shape of the tiles is just too different.
Are there any performance tricks you pros can muster up to fix this?
TTTTTT
TCTTCT
TCCCCT
TTCTCT
TCCTCT
TTCCTT
TTTCTT
TTCCCT
TCCTCT
TCTTCT
TTCCCT
TTPTTT
The tiles are not square, they are complex polygons
I think this is your problem. Also, if your bodies are dynamic, making them static will drastically improve performance. You can also try pooling. And be aware that performance on the simulator is drastically lower than on a real device.
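For example, a minimal sketch of making the wall bodies static (assuming a hypothetical `mazeNode` that holds all the "T" tile sprites):

import SpriteKit

// Static bodies still collide, but are excluded from the dynamics simulation,
// which is much cheaper for the physics engine to handle.
func makeWallsStatic(in mazeNode: SKNode) {
    for tile in mazeNode.children {
        tile.physicsBody?.isDynamic = false
    }
}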
What kind of collision method are you using?
SpriteKit provides several ways to define the shape of an SKPhysicsBody. A rectangle or a circle gives the best performance:
myPhysicsBody = SKPhysicsBody(rectangleOf: mySprite.size)
You can also define more complex shapes, like a triangle, which perform worse.
Using the texture (SpriteKit will use all non-transparent pixels to detect the shape by itself) has the worst performance:
myPhysicsBody = SKPhysicsBody(texture: mySprite.texture!, size: mySprite.size)
Activating 'usesPreciseCollisionDetection' will also have a negative impact on your performance.
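Putting these points together, a minimal sketch for one wall tile might look like this (the asset name is made up):

import SpriteKit

let tile = SKSpriteNode(imageNamed: "wallTile")    // hypothetical asset name
let body = SKPhysicsBody(rectangleOf: tile.size)   // cheap rectangle instead of a complex polygon
body.isDynamic = false                             // maze walls never move
body.usesPreciseCollisionDetection = false         // the default, but worth being explicit
tile.physicsBody = body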

cocos2d flicker

I have a pool of CCSprites numbering 1200 in each of two arrays, displayGrid1 and displayGrid2. I turn them visible or invisible when showing walls or floors. Floors have a number of different textures and are not z-order dependent. Walls also have several textures and are z-order dependent.
I am getting about 6-7 frames per second when moving, which is okay because it's a turn-based isometric rogue-like. However, I am also getting a small amount of flicker, which I think is performance related, because there is no flicker on the simulator.
I would like to improve performance. I am considering using an array of CCSpriteBatchNodes for the floor, which is not z-order dependent, but I am concerned about the cost of frequently adding and removing sprites between the elements of this array, which I think would be necessary.
Can anyone please advise as to how I can improve performance?
As mentioned in the comments, you're using many small sprite files loaded individually, which can cause performance issues because memory is wasted storing excess pixel data around each individual sprite. For performance reasons, OpenGL texture dimensions are typically padded up to powers of two; although I believe OpenGL ES under iOS handles this automatically, it can come with a big performance hit. Grouping sprites together into a single, correctly sized texture can be a tremendous boon to rendering performance.
You can use an app like Zwoptex to group all these smaller sprite files into larger, more manageable sprite sheets/texture atlases, and use one CCSpriteBatchNode for each sprite sheet/texture atlas.
Cocos2D has pretty good support for sprite sheets with texture atlases, and converting your code to use these instead of individual files can be done with little effort. Creating individual sprites from a texture atlas is easy: you create the sprite by frame name instead of by file name.
CCSpriteBatchNodes group the OpenGL calls for their sprites together, a process known as batching, which means the operating system and OpenGL make fewer round trips to the GPU and greatly improves performance. Unfortunately, a CCSpriteBatchNode can only draw sprites that use the texture backing it (enter sprite sheets/texture atlases).
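Purely as an illustration of the atlas-by-name idea (this sketch uses SpriteKit's SKTextureAtlas in Swift; cocos2d's CCSpriteFrameCache/CCSpriteBatchNode API is the analogous mechanism, and the atlas and frame names are invented):

import SpriteKit

// Load one texture atlas and create sprites by frame name rather than by
// individual file name; all the sprites share the same backing texture.
let tileAtlas   = SKTextureAtlas(named: "Tiles")   // hypothetical atlas name
let floorSprite = SKSpriteNode(texture: tileAtlas.textureNamed("floor_stone"))
let wallSprite  = SKSpriteNode(texture: tileAtlas.textureNamed("wall_brick"))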

CALayer vs CGContext, which is a better design approach?

I have been doing some experimenting with iOS drawing. As a practical exercise I wrote a BarChart component. The following is the class diagram (well, I wasn't allowed to upload images, so let me describe it in words): I have an NGBarChartView, which inherits from UIView and has two protocols, NGBarChartViewDataSource and NGBarChartViewDelegate. The code is at https://github.com/mraghuram/NELGPieChart/blob/master/NELGPieChart/NGBarChartView.m
To draw the bar chart, I have created each bar item as a different CAShapeLayer. The reason I did this is twofold: first, I could just create a UIBezierPath and attach it to a CAShapeLayer object, and second, I can easily track whether a bar item was touched by using the [layer hitTest] method. The component works pretty well. However, I am not comfortable with the approach I have taken to draw the bars, hence this note. I need expert opinion on the following:
1. By using CAShapeLayer and creating bar items I am not really using a UIGraphicsContext; is this a good design?
2. My approach will create several CALayers inside a UIView. Is there a limit, based on performance, to the number of CALayers you can create in a UIView?
3. If a good alternative is to use CGContext* methods, then what's the right way to identify whether a particular path has been touched?
4. From an animation point of view, such as the bar blinking when you tap on it, is the layer design or the CGContext design better?
Help is very much appreciated. BTW, you are free to look at my code and comment. I will gladly accept any suggestions to improve.
Best,
Murali
IMO, drawing shapes of any kind generally needs heavy processing power, and compositing cached bitmaps with the GPU is much cheaper than drawing everything again. So in many cases we cache all drawings into a bitmap, and in iOS, CALayer is in charge of that.
However, if your bitmaps exceed the video-memory limit, Quartz cannot composite all the layers at once. Consequently, Quartz has to draw a single frame over multiple passes, which requires reloading some textures into the GPU, and this can hurt performance. I am not sure how much this matters in practice, because the iPhone's VRAM is known to be integrated with system RAM, but it still means more work even in that case. If even system memory becomes insufficient, the system can purge existing bitmaps and ask you to redraw them later.
1. CAShapeLayer will do all the CGContext (I believe you meant this) work for you. You can do it yourself if you feel the need for lower-level optimization.
2. Yes. Obviously, everything has a performance limit. If you're using hundreds of layers with large alpha-blended graphics, it will cause performance problems. Generally, though, that doesn't happen, because layer composition is GPU-accelerated. If your graph bars are not too numerous and they're basically opaque, you'll be fine.
3. All you have to know is that once drawings are composited, there is no way to decompose them again, because composition itself is a kind of lossy optimization. So you have only two options: (1) redraw all the graphics whenever a change is required, or (2) keep a cached bitmap of each display element (like a graph bar) and composite them as needed. The latter is exactly what CALayers do.
4. The layer-based approach is absolutely better. Any kind of free-form shape drawing (even when done on the GPU) needs a lot more processing power than simple bitmap composition (which becomes just two textured triangles) on the GPU. Of course, this holds only as long as your layers don't exceed the video-memory limit.
I hope this helps.
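As a rough sketch of the layer-based approach described above (one CAShapeLayer per bar, hit-tested on touch; the class name, geometry, and animation values are invented for the example, not taken from the linked project):

import UIKit

final class BarChartView: UIView {
    private var barLayers: [CAShapeLayer] = []

    // Build one CAShapeLayer per bar; Core Animation caches each bar's bitmap
    // and composites the layers on the GPU.
    func setValues(_ values: [CGFloat]) {
        barLayers.forEach { $0.removeFromSuperlayer() }
        barLayers.removeAll()

        let barWidth = bounds.width / CGFloat(max(values.count, 1))
        for (index, value) in values.enumerated() {
            let rect = CGRect(x: CGFloat(index) * barWidth,
                              y: bounds.height - value,
                              width: barWidth * 0.8,
                              height: value)
            let bar = CAShapeLayer()
            bar.path = UIBezierPath(roundedRect: rect, cornerRadius: 2).cgPath
            bar.fillColor = UIColor.systemBlue.cgColor
            layer.addSublayer(bar)
            barLayers.append(bar)
        }
    }

    // Identify the touched bar without any CGContext drawing: ask each layer's
    // path whether it contains the touch point.
    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let point = touches.first?.location(in: self) else { return }
        if let hitBar = barLayers.first(where: { $0.path?.contains(point) ?? false }) {
            // Make the tapped bar "blink" with a brief opacity animation.
            let blink = CABasicAnimation(keyPath: "opacity")
            blink.fromValue = 0.3
            blink.toValue = 1.0
            blink.duration = 0.25
            hitBar.add(blink, forKey: "blink")
        }
    }
}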
