iPhone Performance Killers

On March 8, 2011, in Programming, by Steffen Itterheim

Have a look at the following code, and then answer these questions before reading on:

  1. Which function will run faster?
  2. What will be the framerate for each function when run 100 times per frame on an iPhone 3G?
  3. Will wrapping the 100 calls to function1 in an NSAutoreleasePool show any difference?

[cc lang=”ObjC” height=”465″]
-(void) function1
{
CGPoint pos = [self position];
id x = [NSNumber numberWithFloat:pos.x];
id y = [NSNumber numberWithFloat:pos.y];
id objects = [NSArray arrayWithObjects:x, y, nil];
id keys = [NSArray arrayWithObjects:@”x”, @”y”, nil];
id dict = [NSDictionary dictionaryWithObjects:objects forKeys:keys];
dict; // avoid compiler warning, is a noop
}

-(void) function2
{
CGPoint pos = [self position];
id x = [[NSNumber alloc] initWithFloat:pos.x];
id y = [[NSNumber alloc] initWithFloat:pos.y];
id objects = [[NSArray alloc] initWithObjects:x, y, nil];
id keys = [[NSArray alloc] initWithObjects:@”x”, @”y”, nil];
id dict = [[NSDictionary alloc] initWithObjects:objects forKeys:keys];
[x release];
[y release];
[objects release];
[keys release];
[dict release];
}
[/cc]

The Answers

  1. Which function will run faster? Answer: function1
  2. What will be the framerate for each function when run 100 times per frame on an iPhone 3G? Answer: 27 fps for function1 and 24 fps for function2.
  3. Will wrapping the 100 calls to function1 in an NSAutoreleasePool show any difference? Answer: no, but memory of temporary objects is released immediately.

Needless to say, on an iPod (4th Generation) and an iPad these tests all run at 60 fps and give no indication whatsoever that the performance on an iPhone 3G would suffer this much (and neither does the Simulator, of course). All the more reason to test early and often on older devices.

To autorelease or not?

Common wisdom may tell you that alloc/release is faster than autorelease. Even Apple recommends avoiding autorelease, right?

Not quite, because this is often misunderstood: Apple recommends to avoid autorelease but only for functions which create a lot of temporary objects and because of the constrained memory – not because it’s slow or even dangerous – autorelease is not dangerous.

Since memory is so constrained on 1st and 2nd generation iOS devices, it’s best to release that memory as soon as possible and don’t leave it allocated for longer than necessary. To achieve this, you can choose to do two things in this case: use alloc/release or enclose the loop in an NSAutoreleasePool. The latter option is preferred since it will release the memory right away, and not some time later. And autorelease is generally preferable because you will never, ever forget to send a release message to an object – which means it’ll be leaked and forever use up memory.

You can write well-performing, even better-performing code by using autorelease and using NSAutoreleasePool around tight loops creating many temporary autorelease objects.

Innocent looking code kills framerate

Did you expect that creating 100 rather simple NSDictionary instances each frame would drag the framerate down to around 24-27 fps? Me neither. I knew the code wasn’t going to be blazing fast, but I never expected it to have such an impact. However, it can be optimized somewhat since I’m unnecessarily creating two NSArray instances to hold the keys and values respectively before using them to create the NSDictionary. In fact we can get rid of them by using dictionaryWithObjectsAndKeys and doing this in a single step:

[cc lang=”ObjC”]
-(void) function1Optimized
{
CGPoint pos = [self position];
id x = [NSNumber numberWithFloat:pos.x];
id y = [NSNumber numberWithFloat:pos.y];
id dict = [NSDictionary dictionaryWithObjectsAndKeys:x, @”x”, y, @”y”, nil];
dict; // avoid compiler warning, is a noop
}
[/cc]

Sometimes it helps to look around what other ways there are to run the same code. In terms of performance this is an order of a magnitude faster and now clocks in at 42 fps. Still not good enough for realtime rendering obviously but an improvement of over 50% by cutting two NSArray allocations is a very simple and effective optimization.

Just as a general guideline, when I get rid of the two NSNumber instances and simply pass empty strings for x and y the framerate went back up to 60 fps. Of course that’s over-optimizing to the point where the code doesn’t work anymore. It just goes to show how expensive the creation of NSDictionary and NSArray are, as is wrapping simple types in NSNumber or NSValue objects.

If you can avoid allocation and temporary objects, avoid it. If you can’t, at least avoid creating temporary objects every frame. Re-use objects as much as possible. Unfortunately, that’s not an option for NSNumber objects since you can’t change the value of a NSNumber instance.