I profiled one of my apps with the Allocations instrument and found that every call to a particular method increases "Live Bytes" by 300 KB. I have no idea what could be causing this.
The following line of code is the culprit:
CNTile *newTile = [self getTileAtPosition:3];
The associated method reads like this:
- (CNTile *)getTileAtPosition:(int)pos
{
    CNTile *tileToReturn = nil;
    for (int x = 0; x < [row count]; x++)
    {
        for (int y = 0; y < [col count]; y++)
        {
            // The code here generates four CGPoints and a CGMutablePathRef,
            // then uses CGPathContainsPoint to determine which CNTile to return.
        }
    }
    return tileToReturn;
}
I should mention that my CNTile class contains only a UIView and UIImageView, as well as a few simple variables (such as ints and BOOLs).
Any help would be greatly appreciated!
How do you create the CGMutablePathRef? With CGPathCreateMutable? If so, you own the path it returns, so make sure you release it with CGPathRelease once you are done with it:
CGMutablePathRef thePath = CGPathCreateMutable();
...
CGPathRelease(thePath);
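To illustrate the pairing inside a nested loop like the one in the question, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). Path, path_create and path_release are hypothetical stand-ins for CGMutablePathRef, CGPathCreateMutable and CGPathRelease, and the two counters exist only to make the balance visible. The real hit test from the question is elided.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a Core Graphics path object. */
typedef struct { int placeholder; } Path;

int paths_created = 0;    /* how many paths were handed out */
int paths_released = 0;   /* how many paths were given back */

/* Stand-in for CGPathCreateMutable(): the caller owns the result. */
Path *path_create(void) {
    Path *p = malloc(sizeof *p);
    if (p) paths_created++;
    return p;
}

/* Stand-in for CGPathRelease(): passing NULL is tolerated. */
void path_release(Path *p) {
    if (p) {
        paths_released++;
        free(p);
    }
}

/* Walks a rows x cols grid, creating one path per cell and releasing it
   before the next iteration. Returns the number of paths still alive. */
int scan_grid(int rows, int cols) {
    for (int x = 0; x < rows; x++) {
        for (int y = 0; y < cols; y++) {
            Path *path = path_create();
            if (!path) continue;
            /* ... the hit test against the path would go here ... */
            path_release(path);   /* balance the create in the same iteration */
        }
    }
    return paths_created - paths_released;
}
```

If the release is left out, every cell visited leaves one path behind, which is the kind of steady growth that shows up as "Live Bytes" in Allocations.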
I'm trying to run some basic cellular automata in a compute shader (DirectCompute) without double buffering, so I'm using an unordered access view to a RWTexture2D<uint> for the data. However, I'm getting a really strange hang/crash. I was able to reduce it to a very small snippet that reproduces the issue:
int w = 256;
for (int x = 0; x < w; ++x)
{
    for (int y = 1; y < w; ++y)
    {
        if (map[int2(x, y - 1)])
        {
            map[int2(x, y)] = 10;
            map[int2(x, y-1)] = 30;
        }
    }
}
where map is RWTexture2D<uint>.
If I remove the if or one of the assignments, it works. I thought it could be some kind of limit, so I tried looping over just 1/4 of the texture, but the problem persists. The code is dispatched with (1,1,1) and the kernel's numthreads is (1,1,1) too. In my real-world scenario I want to loop from bottom to top and fill the voids (0) with the pixel I'm currently visiting (think of a "falling sand" kind of effect), so it can't be parallel except across columns, since each pixel depends on the one below it.
I don't understand what is causing the shader to hang, though. There's no error or anything; it simply hangs and never even times out.
EDIT:
After some further investigation, I came across something really intriguing: when I pass that w value in a constant buffer, it all works fine. I have no idea what would cause that. Maybe it's a compiler optimization gone wrong; maybe the compiler tries to unroll the loop, which causes some issue, and passing the value in a constant buffer disables that. However, I'm compiling the shaders in debug with no optimization, so I don't know.
I've had issues declaring variables in global scope like this before. I believe the cause is that w is not declared static const. Most likely the compiler treats a non-static global as a member of a constant buffer (with some default name), and because you are not binding a buffer its contents are undefined, which leads to undefined results. Declaring it as static const should fix that, so the following code should work:
static const int w = 256;
for (int x = 0; x < w; ++x)
{
    for (int y = 1; y < w; ++y)
    {
        if (map[int2(x, y - 1)])
        {
            map[int2(x, y)] = 10;
            map[int2(x, y-1)] = 30;
        }
    }
}
I implemented EZAudioPlotGL in about 4 different view controllers. At times only the top part of the plot is showing, even though shouldMirror is set to YES at all times. Any suggestions?
I used the EZAudioPlot class ("EZAudioPlot.h") rather than EZAudioPlotGL ("EZAudioPlotGL.h"), which resolves the inconsistent waveform and otherwise works the same as EZAudioPlotGL.
I also implemented a clear method in "EZAudioPlot.m", because the class does not have one.
- (void)clear
{
    // Zero out the scroll history
    for (int i = 0; i < _scrollHistoryLength; i++)
    {
        _scrollHistory[i] = 0.0f;
    }
    _scrollHistoryIndex = 0;
    // Redraw the plot from the cleared buffer
    [self setSampleData:_scrollHistory
                 length:(!_setMaxLength ? kEZAudioPlotMaxHistoryBufferLength : _scrollHistoryLength)];
}
My script below does exactly what I need it to do: it adds blocks to a scene at random positions in the view. The only problem is that, as I increase the number of "block" nodes, they tend to overlap one another and clump up. Is there a way to add a "barrier" around each block node so that they cannot overlap but still give a random feel? My current code is below:
- (void)addBlocks:(int)count {
    for (int i = 0; i < count; i++) {
        SKSpriteNode *blocks = [SKSpriteNode spriteNodeWithImageNamed:@"Ball"];
        blocks.physicsBody = [SKPhysicsBody bodyWithCircleOfRadius:(blocks.size.width/2)];
        blocks.physicsBody.dynamic = NO;
        blocks.position = CGPointMake(self.frame.size.width/2, self.frame.size.height/2);
        [self addChild:blocks];
        int blockRandomPositionX = arc4random() % 290;
        int blockRandomPositionY = arc4random() % 532;
        blockRandomPositionY = blockRandomPositionY + 15;
        blockRandomPositionX = blockRandomPositionX + 15;
        blocks.position = CGPointMake(blockRandomPositionX, blockRandomPositionY);
    }
}
Any help highly appreciated, thanks!
To prevent two nodes from overlapping, check each newly created node at its random position with intersectsNode: against the nodes you have already placed. To do that, keep every successfully placed node in an array, run the intersectsNode: check against each element, and pick a new random position whenever the check reports an overlap.
Look at the SKNode Class Reference for detailed information.
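The retry loop described above can be sketched in plain C, independent of SpriteKit. This is only an illustration under stated assumptions: it swaps intersectsNode: for a distance check between circle centres (the blocks in the question have circular physics bodies), and Point, next_random, circles_overlap, place_blocks, the gap parameter and the attempt limit are all made-up names, not SpriteKit API. The `gap` value plays the role of the "barrier" asked about. Width and height are assumed to be positive.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { float x, y; } Point;

/* Small deterministic generator so the sketch is reproducible; an app
   would more likely use arc4random_uniform(). */
uint32_t next_random(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

/* True when two circles of equal radius, padded by `gap`, overlap. */
bool circles_overlap(Point a, Point b, float radius, float gap) {
    float dx = a.x - b.x, dy = a.y - b.y;
    float min_dist = 2.0f * radius + gap;
    return dx * dx + dy * dy < min_dist * min_dist;
}

/* Tries to place `count` circles inside a width x height area and writes
   the accepted centres to `out`. Returns how many were placed, which can
   be fewer than `count` if max_attempts runs out in a crowded area. */
int place_blocks(Point *out, int count, int width, int height,
                 float radius, float gap, int max_attempts, uint32_t seed) {
    int placed = 0;
    uint32_t state = seed;
    for (int attempt = 0; attempt < max_attempts && placed < count; attempt++) {
        Point candidate;
        candidate.x = (float)(next_random(&state) % (uint32_t)width);
        candidate.y = (float)(next_random(&state) % (uint32_t)height);

        /* Compare the candidate with every position accepted so far. */
        bool ok = true;
        for (int i = 0; i < placed; i++) {
            if (circles_overlap(candidate, out[i], radius, gap)) {
                ok = false;
                break;
            }
        }
        if (ok) out[placed++] = candidate;   /* remember accepted positions */
    }
    return placed;
}
```

The attempt limit matters: without it, asking for more blocks than can fit would loop forever.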
I'm working on a SpriteKit tower defence game. ARC is enabled. (And I intend to run this code in the background, though presently it's just running on the main thread.)
In my update loop (which is running up to 60 times a second) I call a method called getTargetsForTowers. After profiling this method, I've found two items in the list that are chewing up my CPU time: objc_object::sidetable_retain/release, and I'm trying to find out what they are.
I'd like to understand more about what this is and if I can improve performance by reducing them or getting rid of them altogether.
There are 300 enemies and 446 towers in my test scenario. The majority of the CPU time is reported in the tower loop.
- (void)getTargetsForTowers {
    NSArray *enemiesCopy = [enemiesOnMap copy];
    for (CCUnit *enemy in enemiesCopy) {
        float edte = enemy.distanceToEnd;
        CGPoint enemyPos = enemy.position;
        [self calculateTravelDistanceForEnemy:enemy];
        if (enemy.actualHealth > 0) {
            NSArray *tiles = [self getTilesForEnemy:enemy];
            for (CCTileInfo *tile in tiles) {
                NSArray *tileTowers = tile.towers;
                for (CCSKTower *tower in tileTowers) {
                    BOOL hasTarget = tower.hasTarget;
                    BOOL passes = !hasTarget;
                    if (!passes) {
                        CCUnit *tg = tower.target;
                        float tdte = tg.distanceToEnd;
                        passes = edte < tdte;
                    }
                    if (passes) {
                        BOOL inRange = [self circle:tower.position withRadius:tower.attackRange collisionWithCircle:enemyPos collisionCircleRadius:1];
                        if (inRange) {
                            tower.hasTarget = YES;
                            tower.target = enemy;
                        }
                    }
                }
            }
        }
    }
}
Screenshots from Time Profile (after 60 seconds of running):
image one http://imageshack.com/a/img22/2258/y18v.png
image two http://imageshack.com/a/img833/7969/7fy3.png
(I've been reading about blocks, ARC, strong/weak references, etc., so I tried making the variables (such as CCSKTower *tower) __weak, which did get rid of those two items, but that added a whole bunch of new items related to retaining/creating/destroying the weak variables, and I think they consumed more CPU time than before.)
I'd appreciate any input on this. Thanks.
EDIT:
There's another method that I would like to improve as well which is:
- (NSArray *)getTilesForEnemy:(CCUnit *)enemy {
    NSMutableArray *tiles = [[NSMutableArray alloc] init];
    float enemyWidthHalf = enemy.size.width/2;
    float enemyHeightHalf = enemy.size.height/2;
    float enemyX = enemy.position.x;
    float enemyY = enemy.position.y;
    CGVector topLeft = [self getVectorForPoint:CGPointMake(enemyX-enemyWidthHalf, enemyY+enemyHeightHalf)];
    CGVector topRight = [self getVectorForPoint:CGPointMake(enemyX+enemyWidthHalf, enemyY+enemyHeightHalf)];
    CGVector bottomLeft = [self getVectorForPoint:CGPointMake(enemyX-enemyWidthHalf, enemyY-enemyHeightHalf)];
    CGVector bottomRight = [self getVectorForPoint:CGPointMake(enemyX+enemyWidthHalf, enemyY-enemyHeightHalf)];
    CCTileInfo *tile = nil;
    for (float x = topLeft.dx; x < bottomRight.dx+1; x++) {
        for (float y = bottomLeft.dy; y < topRight.dy+1; y++) {
            if (x > -(gameHalfCols+1) && x < gameHalfCols) {
                if (y < gameHalfRows && y > -(gameHalfRows+1)) {
                    int xIndex = (int)(x+gameHalfCols);
                    int yIndex = (int)(y+gameHalfRows);
                    tile = tileGrid[xIndex][yIndex];
                    if (tile != nil) {
                        [tiles addObject:tile];
                    }
                }
            }
        }
    }
    return tiles;
}
I've looked over it repeatedly and there's nothing I really can see. Perhaps there's nothing more that can be done.
Screenshots:
One issue is that you create a new reference to tower.target, but only use that reference once. So simply rewriting that section should improve your performance, e.g.
if (!passes) {
    float tdte = tower.target.distanceToEnd;
    passes = edte < tdte;
}
Based on your comment, it seems that there's no way to avoid a retain/release if you access a property on tower.target. So let's try radical surgery. Specifically, try adding a distanceToEnd property to the tower, to keep track of the distanceToEnd for the tower's current target. The resulting code would look like this.
- (void)getTargetsForTowers {
    // initialization to copy 'distanceToEnd' value to each tower that has a target
    for ( CCSKTower *tower in towersOnMap )
        if ( tower.hasTarget )
            tower.distanceToEnd = tower.target.distanceToEnd;
    NSArray *enemiesCopy = [enemiesOnMap copy];
    for (CCUnit *enemy in enemiesCopy) {
        float edte = enemy.distanceToEnd;
        CGPoint enemyPos = enemy.position;
        [self calculateTravelDistanceForEnemy:enemy];
        if (enemy.actualHealth > 0) {
            NSArray *tiles = [self getTilesForEnemy:enemy];
            for (CCTileInfo *tile in tiles) {
                NSArray *tileTowers = tile.towers;
                for (CCSKTower *tower in tileTowers) {
                    if ( !tower.hasTarget || edte < tower.distanceToEnd ) {
                        BOOL inRange = [self circle:tower.position withRadius:tower.attackRange collisionWithCircle:enemyPos collisionCircleRadius:1];
                        if (inRange) {
                            tower.hasTarget = YES;
                            tower.target = enemy;
                            tower.distanceToEnd = edte; // update 'distanceToEnd' on the tower to match new target
                        }
                    }
                }
            }
        }
    }
}
My impression is that there's not much to be done about the getTilesForEnemy method. Looking at the Running Time image for getTilesForEnemy, it's clear that the load is fairly evenly spread among the various components of the method, with only three items above 10%. The top item, getVectorForPoint, isn't even in the innermost loop. The second item, insertObject, is apparently the result of the addObject call in the inner loop, but there's nothing to be done about that call; it's required to generate the final result.
At the next level up (see the wvry.png image), you can see that getTilesForEnemy is now 15.3% of the total time spent in getTargetsForTowers. So even if it were possible to reduce getVectorForPoint from 17.3% to 7.3% there would not be a significant reduction in running time. The savings in getTilesForEnemy would be 10%, but because getTilesForEnemy is only 15.3% of the time in getTargetsForTowers, the overall savings would only be 1.53%.
In conclusion, because the components of getTilesForEnemy are balanced and below 20%, and because getTilesForEnemy is only 15.3% of the higher-level method, no significant savings will be gained by trying to optimize getTilesForEnemy.
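The arithmetic behind that estimate can be written as a one-line helper. This is only an illustration in plain C; overall_saving is a made-up name, and both arguments are fractions taken from the profile figures quoted above.

```c
/* Fraction of the parent's running time saved when a child that accounts
   for `child_share` of the parent is sped up by `saving_in_child`.
   Both values are fractions, e.g. 0.153 for 15.3%. */
double overall_saving(double child_share, double saving_in_child) {
    return child_share * saving_in_child;
}
```

With the figures above, overall_saving(0.153, 0.10) comes to 0.0153, which is the 1.53% quoted.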
So once again the only option is radical surgery, and this time I mean a total rewrite of the algorithm. Such action should only be taken if the app still isn't performing up to spec. You've run into the limitations of ARC and NSArrays. Both of those technologies are extremely powerful and flexible, and are perfect for high-level development. However, they both have significant overhead, which limits performance. So the question becomes, "How do you write getTargetsForTowers without using ARC and NSArrays?" The answer is to use arrays of C structs to represent the objects. The resulting top-level pseudocode would be something like this:
copy the enemy information into an array of C structs
copy the tower information into an array of C structs
(note that the target for a tower is just an 'int', which is the index of an enemy in the enemy array)
for ( each enemy in the enemy array )
{
    create an array of C structs for the tiles
    for ( each tile )
        for ( each tower in the tile )
            update the tower target if needed
}
copy the updated tower information back into the NSArray of tower objects
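As a rough illustration of the pseudocode above, here is a minimal sketch in plain C. It rests on several assumptions that are not in the original code: the struct layouts, the names EnemyInfo, TowerInfo, NO_TARGET, in_range and assign_targets are made up, the range test assumes the circle collision method is a plain distance check with an enemy radius of 1, and the per-tile lookup is left out so that every tower is tested against every enemy. Copying data in and out of the NSArrays is also omitted.

```c
#include <stdbool.h>

#define NO_TARGET (-1)

typedef struct {
    float x, y;
    float distance_to_end;
    float health;
} EnemyInfo;

typedef struct {
    float x, y;
    float attack_range;
    int   target;   /* index into the enemy array, or NO_TARGET */
} TowerInfo;

/* Distance check between the tower's range circle and an enemy circle
   of radius 1 (an assumption about the original collision method). */
bool in_range(const TowerInfo *t, const EnemyInfo *e) {
    float dx = t->x - e->x, dy = t->y - e->y;
    float reach = t->attack_range + 1.0f;
    return dx * dx + dy * dy <= reach * reach;
}

/* For each living enemy, retarget every tower that either has no target
   or whose current target is further from the end than this enemy. */
void assign_targets(TowerInfo *towers, int tower_count,
                    const EnemyInfo *enemies, int enemy_count) {
    for (int e = 0; e < enemy_count; e++) {
        if (enemies[e].health <= 0) continue;
        for (int t = 0; t < tower_count; t++) {
            TowerInfo *tower = &towers[t];
            bool passes = tower->target == NO_TARGET ||
                enemies[e].distance_to_end <
                    enemies[tower->target].distance_to_end;
            if (passes && in_range(tower, &enemies[e])) {
                tower->target = e;   /* plain int store, no retain/release */
            }
        }
    }
}
```

Because a target is just an index, the inner loop touches no Objective-C objects at all, which is the point of the rewrite.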
For your second method, this part seems unclear and inefficient:
for (float x = topLeft.dx; x < bottomRight.dx+1; x++) {
    for (float y = bottomLeft.dy; y < topRight.dy+1; y++) {
        if (x > -(gameHalfCols+1) && x < gameHalfCols) {
            if (y < gameHalfRows && y > -(gameHalfRows+1)) {
For instance, there's no point in spinning the y loop if your x is out of bounds. You could just do this:
for (float x = topLeft.dx; x < bottomRight.dx+1; x++) {
    if (x > -(gameHalfCols+1) && x < gameHalfCols) {
        for (float y = bottomLeft.dy; y < topRight.dy+1; y++) {
            if (y < gameHalfRows && y > -(gameHalfRows+1)) {
More importantly, the point of the first for loop is to start x at some minimum and increment it to some maximum, and the if statement is there to make sure x is at least some minimum and less than some maximum, so there's no reason to have both a for() and an if(). I don't know what the values might look like for topLeft.dx and gameHalfCols, so I can't tell you the best way to do this.
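But, for example, if topLeft.dx is always integral, you might say (the lower bound is the smallest integer strictly greater than -(gameHalfCols+1), to match the strict > in the original if):
for (float x = MAX(topLeft.dx, floorf(-(gameHalfCols+1)) + 1); x < bottomRight.dx+1 && x < gameHalfCols; x++) {
    for (float y = ...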
You could similarly improve the 'y' loop this way. This isn't just fewer lines of code; it also prevents the loops from spinning a bunch of extra times with no effect. The 'if' statements just make the loops spin quickly to their ends, but including the logic inside the 'for's themselves makes them loop only over values that you'll actually use in computations.
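Here is a small sketch of the same idea with both loops clamped up front, in plain C. It assumes integral tile coordinates and integral half sizes, so the valid x range allowed by the original conditions is -half_cols to half_cols - 1 (and likewise for y). The function name visit_tiles and its parameters are made up for illustration, and it only counts cells where the real code would read tileGrid.

```c
/* Counts the grid cells visited for a bounding box given in tile
   coordinates (inclusive bounds). The loop limits are clamped once,
   so no per-iteration bounds checks are needed. */
int visit_tiles(int min_x, int max_x, int min_y, int max_y,
                int half_cols, int half_rows) {
    int x0 = min_x > -half_cols ? min_x : -half_cols;
    int x1 = max_x < half_cols - 1 ? max_x : half_cols - 1;
    int y0 = min_y > -half_rows ? min_y : -half_rows;
    int y1 = max_y < half_rows - 1 ? max_y : half_rows - 1;

    int visited = 0;
    for (int x = x0; x <= x1; x++) {
        for (int y = y0; y <= y1; y++) {
            int x_index = x + half_cols;   /* within 0 .. 2*half_cols - 1 */
            int y_index = y + half_rows;   /* within 0 .. 2*half_rows - 1 */
            (void)x_index;                 /* a real version would look up */
            (void)y_index;                 /* tileGrid[x_index][y_index]   */
            visited++;
        }
    }
    return visited;
}
```

A box that lies entirely outside the grid makes the clamped range empty, so the loops do not run at all.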
To expand my comments to a complete answer:
The normal, correct Objective-C behaviour when returning an object property is to retain and then autorelease it. That's because code like this (imagine you're in the world before ARC):
TYTemporaryWorker *object = [[TYTemporaryWorker alloc] initWithSomeValue:value];
NSNumber *someResult = object.someResult;
[object release];
return someResult;
would otherwise be invalid. object has been deallocated so if someResult hasn't been retained and autoreleased then it will become a dangling pointer. ARC makes this sort of slightly less direct (the strong reference in someResult would have retained the number beyond the lifetime of object but then it would have been autoreleased for the return) but the principle remains and, in any case, whether an individual .m file has been compiled with ARC is not supposed to affect callers.
(Aside: notice that weak isn't just strong without retains. It has related costs, because the runtime has to establish a link from the object to the weak reference in order to find it again and nil it if the object begins deallocation.)
Suppose you wanted to create a new type of property that isn't strong and isn't unsafe_unretained, but is instead defined so that the object returned is safe to use for as long as the original owner is alive and unsafe afterwards. So it's a strong set but an unsafe_unretained get.
It's untested but I think the correct means to do that would be:
// we want a synthesised getter that doesn't attempt to retain
// or autorelease, so we'd better flag up the pointer as potentially being
// unsafe to access
@property (nonatomic, unsafe_unretained) NSNumber *someResult;
...
@implementation TYWhatever
{
    NSNumber *_retainedResult; // a strong reference, since
                               // I haven't said otherwise;
                               // this reference is not publicly exposed
}
- (void)setSomeResult:(NSNumber *)number
{
    // set the unsafe and unretained version,
    // as the default setter would have
    _someResult = number;
    // also take a strong reference to the object passed in,
    // to extend its lifecycle to match ours
    _retainedResult = number;
}
@end
It's going to get quite verbose as you add more properties but what you're doing is contrary to normal Objective-C conventions so limited compiler help is probably to be expected.
I'm having trouble inserting multiple children of the same sprite and accessing them (or setting their positions at runtime). Kindly suggest a suitable method or, preferably, point out my mistake. Here is my approach.
// In the init method...
// int i is defined at the start.
for (i = 1; i < 4; i++)
{
    hurdle = [CCSprite spriteWithFile:@"hurdle1.png"];
    [self addChild:hurdle z:i tag:i];
    hurdle.position = CGPointMake(150 * i, 0);
}
It spreads all the sprites across the canvas. Then, in an update function, I'm calling this:
hurdle.position = CGPointMake(hurdle.position.x - 5, 10);
if (hurdle.position.x <= -5) {
    hurdle.position = ccp(480, 10);
}
It works but as expected only one instance moves horizontally. I want all the instances to be moved so I am trying to use this....
for (i = 1; i < 4; i++) {
    [hurdle getChildByTag:i].position = CGPointMake(hurdle.position.x - 5, 10);
    // OR
    [hurdle getChildByTag:i].position = CGPointMake([hurdle getChildByTag:i].position.x - 5, 10);
}
I've tried logging in various places and realized that getChildByTag doesn't work the way I'm trying to use it.
The problem is in the last block of code. You should make a local reference to each CCSprite within your for loop.
Since you added the sprites to self, retrieve them as children of self:
for (i = 1; i < 4; i++) {
    // getChildByTag: returns a CCNode, so cast it to the type you added
    CCSprite *enumHurdle = (CCSprite *)[self getChildByTag:i];
    enumHurdle.position = CGPointMake(enumHurdle.position.x - 5, 10);
}
Be careful if you create any other sprites this way in the same scene. It is bad design to give any two sprites the same tag.
EDIT about avoiding duplicate tags:
If you know how many sprites you will have, use an enum of tags and refer to the sprites by name.
If not, knowing how many groups there are and putting a limit on the size of each group could make it manageable.
For example, say you have 3 parts of code where you are generating sprites like this. You can include an enum in your .m (under the @implementation line) and put the limits there:
// Choose names that describe the groups of sprites
enum {
    kGroupOne = 0,     // limiting the size of each group to 100
    kGroupTwo = 100,   // (besides the last group, but that is not important)
    kGroupThree = 200,
};
Then when you create each group
// group 1
for (i = kGroupOne; i < kGroupOne + 4; i++) {
    // set up code here
}
// group 2
// g2_size is made up, insert whatever you want
// (the upper bound is offset by the group's base tag)
for (i = kGroupTwo; i < kGroupTwo + g2_size; i++) {
    // set up code here
}
.
.
.
Then to retrieve in groups
for (i = kGroupOne; i < kGroupOne + 4; i++) {
    CCSprite *enumHurdle = (CCSprite *)[self getChildByTag:i];
    enumHurdle.position = CGPointMake(enumHurdle.position.x - 5, 10);
}
.
.
.
Hopefully that sparks your creativity. Now have some fun.
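The tag ranges from the answer above can be wrapped in two small helpers so that the offset arithmetic lives in one place. This is only a sketch in plain C: kGroupSize, tag_for and tag_in_group are made-up names, and unlike the answer it caps every group, including the last one, at 100 tags.

```c
#include <stdbool.h>

enum {
    kGroupOne   = 0,     /* tags 0..99 */
    kGroupTwo   = 100,   /* tags 100..199 */
    kGroupThree = 200,   /* tags 200..299 */
    kGroupSize  = 100    /* hypothetical cap per group */
};

/* Tag for the index-th sprite of a group, or -1 if the index would
   spill into the next group's range. */
int tag_for(int group_base, int index) {
    if (index < 0 || index >= kGroupSize) return -1;
    return group_base + index;
}

/* True when `tag` falls inside the range owned by `group_base`. */
bool tag_in_group(int tag, int group_base) {
    return tag >= group_base && tag < group_base + kGroupSize;
}
```

A loop over a group would then run index from 0 to the group's size and pass tag_for(kGroupTwo, index) to getChildByTag:, instead of mixing the base into the loop bounds by hand.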
Something that I do often is group objects of a like kind that I want to act on in a similar way by adding them to a CCNode and adding that CCNode to the layer.
I would create a class that derives from CCNode.
Then I can put all my logic in that node and access the sprites via [self children]:
for (CCSprite *hurdle in [self children]) {
    // Do what you need to do
}