When you embark on a project, the first thing a developer ought to do is to run some basic math. Especially if you already have some specs regarding the number and sizes of assets. Because otherwise you may end up trying hard to work around a memory related issue which perhaps even modern desktop computers would struggle with.

So today, I’ll do some math for you, the things you should consider before starting a project or adding one more of those big new shiny features to your app. Kind of like an addendum to my popular article about memory optimization and reducing bundle size.

How much wood texture would a woodchuck choke on if a woodchuck could choke on wood textures?

A texture is an in-memory representation of an image made up of individual pixels. Each pixel uses a certain amount of memory to represent its color. A texture’s memory size is therefore simply the product of width * height * sizeof(color).

Before I go any further, I like to stress it again: the size of an image file is much smaller than the size of the texture generated from the image. Don’t use image file sizes to make memory usage estimations.

Most common are 32-Bit and 16-Bit textures which use 4 and 2 Bytes respectively per pixel. A 4096×4096 image with 32-Bit color depth therefore uses 64 MB memory. Let that sink in for a moment …

At 16-Bit it only uses half of that, though without color dithering (TexturePacker does this for you) this might not look too good depending on the image.

This is pretty much what you’re stuck with unless you export textures as .pvr.ccz. Not only does this format load tremendously faster than PNG (not to speak of JPG which are unbearably slow to load in cocos2d), the .pvr.ccz format also reduces the texture memory size because the texture can stay compressed in memory.

It’s extremely difficult to estimate how much smaller a PVR texture’s memory footprint will be without actually giving it a try. But you can expect anywhere between 10% to 50% reduction.

To the non-power-of-two!

Continue reading »

Due to technical issues (blank page) I had to split the previous article (now focuses only on memory optimization) in two. This is the second part which (mostly) focuses on reducing the app bundle size.

Loading Assets In Sequence

Here’s the code that I use to load textures or other assets asynchronously (in background, on another thread).

Imagine loadAssetsThenGotoMainMenu being a scheduled method that runs every frame or perhaps less often. The assetLoadCount and loadingAsset variables are declared in the @interface as int and BOOL respectively.

When this method runs the first case statement is executed. To avoid accidentally loading the same image multiple times every time the selector runs, the loadingAsset flag is set to YES. When the texture cache has completed loading this texture, it will call the increaseAssetLoadCount. This then ensures that the next case statement is executed the next time loadAssetsThenGotoMainMenu runs.

The cool thing about this solution is that you can easily add more switch statements to add more textures to load. Because the default case is where the scene changes, and that only happens if there are no more switch cases to process.

Be sure not to skip a number in the switch cases because that will also run the default case.

Decreasing the size of your app

Besides the memory usage advantage, reducing the color bit depth of textures to 16 bit will also significantly reduce their size. But there are other options that will allow you to reduce your app’s size, perhaps significantly.

TexturePacker PNG Optimization

If for some reason you still want to use PNG files instead of the highly recommended .pvr.ccz file format, TexturePacker has a slider named “Png Opt Level” to help reduce the size of PNGs (doesn’t affect loading time though):

As far as I understand it, it tries a given number of optimizations and picks the one that creates the smallest file size. The downside is that the maximum level can take very long for large texture atlases, in some cases 10 to 20 minutes on a Late 2009 27″ iMac. The task is multi-threaded, so it should be a lot faster on quad core systems.

Fortunately there’s really no need to do this unless you’re ready to release the app. Question is, how much can it reduce the size of PNG files?

Continue reading »

I’m currently completing one last contract project. One of the last things I had to deal with was to optimize the game’s memory usage.

In today’s iDevBlogADay article I’ll explain how I was able to cut down memory usage by about 25-30 MB (down to 90-95 MB, ie fixing memory warning related crashes) as well as reducing the size of the app bundle from around 25 MB to below 20 MB (which would have been more awesome if Apple hadn’t already increased the over-the-air download limit from 20 MB to 50 MB some time ago).

I’ll also explain how to animate the loading screen while you’re loading resource files, and I’ll add some best practices and common wisdom too.

What’s using 90% of the memory?

Take a guess.

In almost all cases, it’s textures that consume most of the app’s memory. So textures is where you look to optimize first and foremost if you’re having memory warning troubles.

Avoid loading PNG/JPG Textures one after another

The problem with texture loading in cocos2d is that it happens in two steps: first, a UIImage is created from the image file. Then a CCTexture2D object is created from that UIImage. This means while a texture is being loaded, it will consume twice as much memory for a short time period.

The problem used to be so bad that if you loaded 4 textures one after another in the same method, at the end of the method each texture would still consume twice as much memory as it ought to, probably because of the way autorelease works.

I’m not sure if this is still the case, or whether this only applies to manual reference counting but not ARC. I made it a habit to load textures in sequence, waiting at least one frame before trying to load another. This will allow any texture loading overhead to be released from memory. Besides, as you’ll see later, if you want to load textures and other assets in the background this asset-load-sequencing is something you’ll do anyway.

Continue reading »

iPhone Performance Killers

On March 8, 2011, in Programming, by Steffen Itterheim

Have a look at the following code, and then answer these questions before reading on:

  1. Which function will run faster?
  2. What will be the framerate for each function when run 100 times per frame on an iPhone 3G?
  3. Will wrapping the 100 calls to function1 in an NSAutoreleasePool show any difference?

[cc lang=”ObjC” height=”465″]
-(void) function1
{
CGPoint pos = [self position];
id x = [NSNumber numberWithFloat:pos.x];
id y = [NSNumber numberWithFloat:pos.y];
id objects = [NSArray arrayWithObjects:x, y, nil];
id keys = [NSArray arrayWithObjects:@”x”, @”y”, nil];
id dict = [NSDictionary dictionaryWithObjects:objects forKeys:keys];
dict; // avoid compiler warning, is a noop
}

-(void) function2
{
CGPoint pos = [self position];
id x = [[NSNumber alloc] initWithFloat:pos.x];
id y = [[NSNumber alloc] initWithFloat:pos.y];
id objects = [[NSArray alloc] initWithObjects:x, y, nil];
id keys = [[NSArray alloc] initWithObjects:@”x”, @”y”, nil];
id dict = [[NSDictionary alloc] initWithObjects:objects forKeys:keys];
[x release];
[y release];
[objects release];
[keys release];
[dict release];
}
[/cc]

The Answers

  1. Which function will run faster? Answer: function1
  2. What will be the framerate for each function when run 100 times per frame on an iPhone 3G? Answer: 27 fps for function1 and 24 fps for function2.
  3. Will wrapping the 100 calls to function1 in an NSAutoreleasePool show any difference? Answer: no, but memory of temporary objects is released immediately.

Needless to say, on an iPod (4th Generation) and an iPad these tests all run at 60 fps and give no indication whatsoever that the performance on an iPhone 3G would suffer this much (and neither does the Simulator, of course). All the more reason to test early and often on older devices.

To autorelease or not?

Common wisdom may tell you that alloc/release is faster than autorelease. Even Apple recommends avoiding autorelease, right?

Not quite, because this is often misunderstood: Apple recommends to avoid autorelease but only for functions which create a lot of temporary objects and because of the constrained memory – not because it’s slow or even dangerous – autorelease is not dangerous.

Since memory is so constrained on 1st and 2nd generation iOS devices, it’s best to release that memory as soon as possible and don’t leave it allocated for longer than necessary. To achieve this, you can choose to do two things in this case: use alloc/release or enclose the loop in an NSAutoreleasePool. The latter option is preferred since it will release the memory right away, and not some time later. And autorelease is generally preferable because you will never, ever forget to send a release message to an object – which means it’ll be leaked and forever use up memory.

You can write well-performing, even better-performing code by using autorelease and using NSAutoreleasePool around tight loops creating many temporary autorelease objects.

Innocent looking code kills framerate

Did you expect that creating 100 rather simple NSDictionary instances each frame would drag the framerate down to around 24-27 fps? Me neither. I knew the code wasn’t going to be blazing fast, but I never expected it to have such an impact. However, it can be optimized somewhat since I’m unnecessarily creating two NSArray instances to hold the keys and values respectively before using them to create the NSDictionary. In fact we can get rid of them by using dictionaryWithObjectsAndKeys and doing this in a single step:

[cc lang=”ObjC”]
-(void) function1Optimized
{
CGPoint pos = [self position];
id x = [NSNumber numberWithFloat:pos.x];
id y = [NSNumber numberWithFloat:pos.y];
id dict = [NSDictionary dictionaryWithObjectsAndKeys:x, @”x”, y, @”y”, nil];
dict; // avoid compiler warning, is a noop
}
[/cc]

Sometimes it helps to look around what other ways there are to run the same code. In terms of performance this is an order of a magnitude faster and now clocks in at 42 fps. Still not good enough for realtime rendering obviously but an improvement of over 50% by cutting two NSArray allocations is a very simple and effective optimization.

Just as a general guideline, when I get rid of the two NSNumber instances and simply pass empty strings for x and y the framerate went back up to 60 fps. Of course that’s over-optimizing to the point where the code doesn’t work anymore. It just goes to show how expensive the creation of NSDictionary and NSArray are, as is wrapping simple types in NSNumber or NSValue objects.

If you can avoid allocation and temporary objects, avoid it. If you can’t, at least avoid creating temporary objects every frame. Re-use objects as much as possible. Unfortunately, that’s not an option for NSNumber objects since you can’t change the value of a NSNumber instance.

cocos2d Book, Chapter 6: Spritesheets & Zwoptex

On July 30, 2010, in Announcements, book, cocos2d, by Steffen Itterheim

Chapter 6 – Spritesheets and Zwoptex

In this chapter the focus will be on Spritesheets (Texture Atlas), what they are and when, where and why to use them. Of course a chapter about Spritesheets wouldn’t be complete without introducing the Zwoptex tool. The graphics added in this chapter will then be used for the game created in the following chapter.

The chapter will be submitted on Friday, August 6th.

Anything about Spritesheets you always wanted to know?

Just let me know. I’ll be researching what kind of issues people were and are having regarding Spritesheets. I want to make sure that they are all covered in the book.

Please leave a comment or write me an email.

Summary of working on Chapter 5 – Game Building Blocks

I finally found a better title for the chapter. A big part is about working with Scenes and Layers. A LoadingScene class is implemented to avoid the memory overlap when transitioning between two scenes. Layers are used to modify the game objects seperately from the static UI. I explain how to use targeted touch handlers to handle touch input for each individual layer, either swallowing touches or not.

The issue of whether to subclass CCSprite or not is discussed and an example is given how to create game objects using composition and without subclassing from CCNode and how that changes touch input and scheduling.

At the end the remaining specialized CCNode classes such as CCProgressTimer, CCParallaxNode and the CCRibbon class with the CCMotionStreak are given a treatment.

As you can see from the pictures, I’m also making good progress at becoming a great pixel artist. Only I have a looooooong way ahead of me still. But I admit, the little I know about art and how much less I’ve practiced it, I’m pretty happy about the results and having fun with it. The cool aspect of it is that this should be instructive art. It doesn’t have to be good. So I just go ahead and do it and tend to be positively surprised by the results. I’ll probably touch this subject in the next chapter about Spritesheets: doing your own art. It’s better than nothing, it’s still creative work even if it may be ugly to others, and it’s a lot more satisfying to do everything yourself, even if it takes a bit longer and doesn’t look as good. At least it’s all yours, you’re having fun, and learn something along the way. And you can always find an artist sometime later who will just draw over your existing images or who replaces your fart sound effects with something more appropriate.

Btw, if you’re looking for a decent and free image editing program for the Mac, I’ve been using Seashore for about a year now and I’m pretty happy with it.

While helping others solve their cocos2d project issues over the past year it became obvious that many projects have at least one major problem in one of these areas:

  • memory management
  • resource management
  • code structure

Examples

Memory management issues normally range from allocating too much memory, either by loading too many textures up front which are only going to be needed later, or by memory leaks such as scenes not deallocating when switching scenes. Resource management problems range from not adding the right resources to the right target, often resulting in increased App size because resources are added to the bundle but never used by the App. It could also mean loading identical resource files except that they have different filenames (copies?), using up additional memory. Or not tightly packing sprites into Texture Atlases but instead using one Texture Atlas per game object – while this is understandable from a standpoint of logical seperation it does waste opportunities for optimization.

Finally, code structure or lack thereof regularly leads to “everything in one class” code design which is most likely an evolutionary process rather than intentional. It’s not uncommon to see classes with thousands of lines of code, sometimes even going past 10,000 lines of code in one class. Other things are using too many CCLayers without them adding a clear benefit, for example just to group all nodes at a specific z order together or to group them by functionality, eg one layer for enemies, one for players, one for background, one for UI, one for score, one for particle effects, and so on – without any of these layers being used for what they’re really good at: modifying multiple nodes at once, like moving, scaling, rotating or z-reordering them. And of course there’s the copy & paste hell, large blocks of code reproduced in various places only to modify some parameters instead of creating a method which takes the modifiable parameters as arguments. Even professionals I worked with got so used to doing that it became hard just to overcome the resistance of letting go of old habits. But they learned.

Summary

Nothing of this code design and structuring strikes me as odd or surprising. I’ve written code like this myself. I also believe if it’s good enough and works, then why the hell not? It’s a matter of experience and it’s only with experience that you clearly see how to improve things. This boils down to the regular learning curve where only training and tutoring and just simply making mistakes and learning from them helps in the long run. That’s how we learn things.

On the other hand, the things like Memory and Resource Management can also be learned but they have a different nature. They can be statistically assessed, they could be calculated and verified automatically. This makes me wonder if there isn’t some kind of automation and information tools that would help developers achieve better results in terms of memory usage and resource management? In the meantime it’s all about raising awareness …

Raising Memory Awareness

Most importantly I think we need to raise more awareness to these issues to cocos2d developers. One step towards that would be for cocos2d to display a “available memory counter” alongside the FPS counter. I used to patch CCDirector to simply display memory instead of FPS since that was always more important to me. Fellow cocos2d developer Joseph sent me his version to display both – I simply didn’t think of the obvious. So if you’d like to see FPS and available memory next to each other I think you can handle the changes to CCDirector outlined here:

Raising awareness to leaking Scenes

In addition I highly, strongly and with utmost reinforcement (without pulling out a gun) recommend to cocos2d developers to frequently check your scene’s dealloc methods. Preferably add a breakpoint there, or at the very least add the logging line: CCLOG(@”dealloc: %@”, self). If you want a more visible but less intrusive method you could do something like flashing the screen or playing a sound whenever the last scene is deallocated, so that you get so used to it that when you’re not seeing or hearing it anymore it immediately raises your attention.

If at any time during the development of your project the dealloc method of a scene isn’t called when you change scenes, you’re leaking memory. Leaking the whole scene is a memory leak of the worst kind. You want to catch that early while you can still retrace your steps that might have caused the problem. Once you get to using hundreds of assets and thousands of lines of code and then realize the scene isn’t deallocated, you’ll be in for a fun ride trying to figure out where that’s coming from. In that case, removing nodes by uncommenting them until you can close in on the culprit is probably the best strategy, next to using Instruments (which I haven’t found too helpful in those cases).

I ran into such a problem once because I was passing the CCScene object to subclasses so that they have access to the scene’s methods. The subclass retained the scene and was itself derived from CCNode and added to the CCScene as child. The problem with that: during cleanup of the scene it correctly removed all child nodes but some of the child nodes still retained the scene. Because of that their dealloc method was never called, and in turn the scene was never deallocated.