AlBlue’s Blog

Macs, Modularity and More

Reflections on Objective-C

2011, objectivec

I've been using Objective-C on and off for a couple of decades, and although it has a few warts and historical dependencies, it remains one of my favourite languages to write in. That's partly because it was my first OO language (Java wasn't invented until three years afterwards), and partly because the language has a number of powerful features that don't occur in other languages.

Many older developers may feel the same way about Smalltalk; certainly, a number of its patterns survived, even though Smalltalk itself never had a significant impact on the industry.

(One can argue that only one company seriously backs Objective-C; but then again, only one company backs VB.NET, and that's widely used too.)

What is more difficult to understand is the number of developers who code in C++ but either haven't tried, or don't want to try, Objective-C. gcc, the main compiler in the open source world, has had Objective-C support since before Java was invented. Yes, there have been a few recent changes to the language (blocks, properties) but it's still a viable toolchain.

One of the main reasons to prefer Objective-C over C++ is a sane and consistent memory management model. With any C-based language, you have to know whether you are responsible for freeing the result of a call or, if not, whether you need to strcpy the result into a new bit of memory. This results in significant churn, as each caller in the stack may spend its time bit-shuffling data around just to ensure ownership of it. Quite often this results in APIs designed with pass-the-pointer semantics instead, where a block of memory is pre-allocated up the stack, then passed down through the layers until an API fills it.

None of this helps with dynamically generated content, and all of it is error-prone. A significant chunk of the code can end up being either error handling or bit-shuffling, masking what is actually happening under the covers.
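To make those two conventions concrete, here's a small sketch of each using the C library from C++. The function names (makeGreeting, fillGreeting, conventionsDemo) are invented for illustration and don't come from any real API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Convention 1: the callee allocates, and the caller must remember to free().
// Nothing in the signature says so; only the documentation does.
char* makeGreeting(const char* name) {
    const char* prefix = "Hello, ";
    std::size_t len = std::strlen(prefix) + std::strlen(name) + 1;
    char* out = static_cast<char*>(std::malloc(len));
    if (out != nullptr) {
        std::snprintf(out, len, "%s%s", prefix, name);
    }
    return out;
}

// Convention 2: the caller pre-allocates a buffer and passes it down.
// Returns false if the buffer was too small to hold the whole result.
bool fillGreeting(char* buffer, std::size_t size, const char* name) {
    int needed = std::snprintf(buffer, size, "Hello, %s", name);
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}

// Exercises both conventions; returns true if each behaved as described.
bool conventionsDemo() {
    char* owned = makeGreeting("world");
    assert(owned != nullptr);
    bool ok = std::strcmp(owned, "Hello, world") == 0;
    std::free(owned);   // forget this line and the memory leaks

    char small[8];
    ok = ok && !fillGreeting(small, sizeof small, "world");   // result truncated

    char big[32];
    ok = ok && fillGreeting(big, sizeof big, "world")
            && std::strcmp(big, "Hello, world") == 0;
    return ok;
}
```

In the first case the ownership rule lives only in the documentation; in the second, the caller has to guess a buffer size up front and handle the case where the guess was wrong.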

Objective-C gets this right by using refcounting (or gc, but more on that later). When initially created with "new" (or "alloc/init"), an object's refcount is set to 1. Refcounts are incremented with each "retain" call, and decremented with each "release". Logic in the release call checks whether it's the last one out the door (refcount 0) and, if so, switches off the lights (the "dealloc" call).

This is built into NSObject, so everything has these semantics. Objective-C users never call "dealloc" on an object themselves, whereas C++ users have to call its counterpart, delete, all the time.
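For the C++ readers, the retain/release rules can be sketched as a toy class. This is a model of the idea, not how NSObject is actually implemented; the names Object, live and refcountDemo are made up for the example.

```cpp
#include <cassert>

// Toy model of NSObject-style reference counting (illustrative only).
class Object {
public:
    static int live;   // how many objects have not yet been deallocated

    Object() : refcount_(1) { ++live; }   // creation: the creator owns one reference

    Object* retain() { ++refcount_; return this; }

    void release() {
        if (--refcount_ == 0) {
            delete this;   // the "dealloc" step, only ever triggered from here
        }
    }

    int retainCount() const { return refcount_; }

protected:
    // Protected so that callers cannot destroy the object directly;
    // they can only give up their reference with release().
    virtual ~Object() { --live; }

private:
    int refcount_;
};

int Object::live = 0;

// Walks one object through its lifetime; returns the number still alive.
int refcountDemo() {
    Object* o = new Object();        // refcount 1
    o->retain();                     // refcount 2: a second owner appears
    assert(o->retainCount() == 2);
    o->release();                    // refcount 1: one owner is done with it
    assert(Object::live == 1);       // still alive
    o->release();                    // refcount 0: last one out switches off the lights
    return Object::live;
}
```

Making the destructor inaccessible is what forces every caller through release(), which is roughly the discipline that Objective-C gives you by convention.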

OK, so Objective-C does refcounting consistently. Fine - there are C++ libraries that support this model as well. What makes Objective-C better?

The trick up Objective-C's sleeve is to realise that there are situations where you want to return a temporary result - say, the result of a sprintf type operation - and then immediately discard the result. Clearly the API can't free the memory before it returns it, but nor do we want the callers to suffer the same fate as their C and C++ brethren where the only way to know if they should free it is to read the API docs.

Objective-C has a unique way of dealing with this problem through the use of autorelease pools. An autorelease pool is a global list of objects to which any thread has access. Any object which is created in a temporary fashion can be added to the autorelease pool, so that it isn't deallocated before the caller gets to use it. This allows a return-and-forget method to add a transient object to the autorelease pool, after which neither it nor any of its callers need worry about it again.

Left unchecked, this autorelease pool would grow in size and eventually exhaust memory. Instead, the pool is flushed periodically ("drained"), releasing the objects it holds, and replaced with a new autorelease pool. That way, memory is recycled and caller code doesn't have to worry about doing extra work.
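Here's a rough C++ model of that return-and-forget idea. It's a sketch under simplifying assumptions (one explicit pool passed around by hand, no threads), and the names Str, Pool, greeting and autoreleaseDemo are invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy refcounted string, standing in for an NSString-like object.
class Str {
public:
    static int live;
    explicit Str(std::string s) : text(std::move(s)), refcount_(1) { ++live; }
    void retain() { ++refcount_; }
    void release() { if (--refcount_ == 0) delete this; }
    std::string text;
private:
    ~Str() { --live; }
    int refcount_;
};
int Str::live = 0;

// Toy autorelease pool: a list of objects that are each owed one release.
class Pool {
public:
    Str* add(Str* s) { pending_.push_back(s); return s; }
    void drain() {
        for (Str* s : pending_) s->release();   // pay off the deferred releases
        pending_.clear();
    }
private:
    std::vector<Str*> pending_;
};

// Return-and-forget: the result is handed to the pool, so neither this
// function nor its caller has to release it.
Str* greeting(Pool& pool, const std::string& name) {
    return pool.add(new Str("Hello, " + name));
}

// Returns how many strings are still alive just after the drain.
int autoreleaseDemo() {
    Pool pool;
    Str* temp = greeting(pool, "world");   // use it and forget it
    Str* kept = greeting(pool, "again");
    kept->retain();                        // this one we want to hold on to
    assert(temp->text == "Hello, world");
    pool.drain();                          // temp is gone; kept survives
    int aliveAfterDrain = Str::live;
    kept->release();                       // we retained it, so we release it
    return aliveAfterDrain;
}
```

The caller only does extra work for the object it wants to keep; everything else is cleaned up at the drain.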

The autorelease pool is normally set up as part of the system when loading an app; you can see in the main.m of a generated project that there's an autorelease pool creation at the start of the code. If you don't have one, you see messages about objects being autoreleased with no pool in place and just leaking; but since you rarely change main.m this isn't usually a problem.

Most iOS and OS X apps are runloop based. This means that at the top level of the app there's a "while(true) doStuff" type loop that checks the possible inputs (network activity, keyboard, mouse etc) and, if one of them has something to report, runs the appropriate action. (Rather than just spinning the CPU, it uses external triggers to decide when to take action.) What this means is that there is a place which can be used to regularly drain the autorelease pool, which is done at the end of each cycle. So a mouse move event can generate all sorts of temporary objects as it works its way through the stack, and at the end of that action the autorelease pool is emptied.
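As a sketch of that cycle in C++: a fixed list of events stands in for the real inputs (a real runloop blocks until something happens), and std::shared_ptr stands in for retain/release. Pool, handleEvent and runLoop are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy pool: holds one reference to each temporary until it is drained.
struct Pool {
    std::vector<std::shared_ptr<std::string>> pending;

    const std::string& autorelease(std::shared_ptr<std::string> s) {
        pending.push_back(std::move(s));
        return *pending.back();
    }

    // Drops every held reference; returns how many were dropped.
    std::size_t drain() {
        std::size_t n = pending.size();
        pending.clear();
        return n;
    }
};

// Hypothetical event handler: creates throwaway strings while it works.
void handleEvent(Pool& pool, const std::string& event) {
    pool.autorelease(std::make_shared<std::string>("log: " + event));
    pool.autorelease(std::make_shared<std::string>("reply to " + event));
}

// Processes each event in turn, draining the pool at the end of every cycle.
// Returns the largest number of temporaries that were alive at once.
std::size_t runLoop(const std::vector<std::string>& events) {
    Pool pool;
    std::size_t peak = 0;
    for (const std::string& e : events) {
        handleEvent(pool, e);
        std::size_t freed = pool.drain();   // end of this cycle
        if (freed > peak) peak = freed;
    }
    return peak;
}
```

However many events arrive, the number of live temporaries never exceeds what a single event creates.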

Although I indicated above that the pool is global, there can actually be more than one. The autorelease pool uses thread-local storage to store the current pool (which, by the way, means that each additional thread or runloop needs its own autorelease pool allocated). When you invoke "autorelease" on an object, it asks NSAutoreleasePool for the pool in use, which is obtained from thread-local storage.

In addition, you can have multiple autorelease pools per thread. It acts as if there were a stack of autorelease pools and the current pool is the top one in the stack. (It's also this code which prints out the "just leaking" message if the stack of pools is empty.)

To push a new pool on the stack, just create a new autorelease pool. Any subsequent autoreleased objects which get created are added to this pool instead of the original one. Once that pool is released, it gets popped off the pool stack, and subsequent autoreleases will go back to the prior pool.
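The per-thread stack of pools can be modelled in C++ along these lines. This is a sketch of the behaviour described above, not of the real NSAutoreleasePool internals; poolStack, Pool, autorelease and nestedPoolsDemo are names invented for the example, and std::shared_ptr again stands in for retain/release.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Pool;

// Each thread gets its own stack of pools; the current pool is the top one.
thread_local std::vector<Pool*> poolStack;

class Pool {
public:
    Pool() { poolStack.push_back(this); }   // creating a pool pushes it
    ~Pool() { poolStack.pop_back(); }       // releasing it pops it and frees its contents
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    std::vector<std::shared_ptr<std::string>> pending;
};

// Adds the object to whichever pool is on top of the calling thread's stack.
// Returns false in the "just leaking" case where the stack is empty. (In this
// toy nothing actually leaks, because the shared_ptr cleans up after itself.)
bool autorelease(std::shared_ptr<std::string> s) {
    if (poolStack.empty()) return false;
    poolStack.back()->pending.push_back(std::move(s));
    return true;
}

// Returns how many objects ended up in the outer pool.
int nestedPoolsDemo() {
    Pool outer;
    autorelease(std::make_shared<std::string>("a"));       // goes to outer
    {
        Pool inner;                                        // pushed on top
        autorelease(std::make_shared<std::string>("b"));   // goes to inner
        autorelease(std::make_shared<std::string>("c"));   // goes to inner
        assert(inner.pending.size() == 2);
        assert(outer.pending.size() == 1);
    }                                                      // inner popped, b and c freed
    autorelease(std::make_shared<std::string>("d"));       // back to outer
    return static_cast<int>(outer.pending.size());
}
```

Because the stack is thread_local, a second thread would start with an empty stack and hit the "no pool" case until it created a pool of its own.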

It's not generally necessary to create your own autorelease pools, but if you have a long-running tight loop that generates a lot of temporary objects, it can improve performance to create an autorelease pool at the top of the loop body and drain it at the end of each iteration - or every other, or every third... (Draining a pool also disposes of it, so each batch gets a fresh one.)
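To see why this helps, here's a toy C++ measurement of how many temporaries pile up for different batch sizes. The pool is drained simply by going out of scope at the end of each batch; Pool and peakTemporaries are hypothetical names, and std::shared_ptr stands in for retain/release.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy pool: keeps temporaries alive until the pool itself is destroyed.
struct Pool {
    std::vector<std::shared_ptr<std::string>> pending;
    void autorelease(std::shared_ptr<std::string> s) {
        pending.push_back(std::move(s));
    }
};

// Runs `iterations` passes, each creating one temporary, with a fresh pool
// for every `batch` passes. Returns the peak number of live temporaries.
// A batch of 0 is treated as "never drain until the loop has finished".
std::size_t peakTemporaries(std::size_t iterations, std::size_t batch) {
    if (batch == 0) batch = iterations;
    std::size_t peak = 0;
    std::size_t i = 0;
    while (i < iterations) {
        Pool pool;   // fresh pool for this batch
        for (std::size_t n = 0; n < batch && i < iterations; ++n, ++i) {
            pool.autorelease(std::make_shared<std::string>(std::to_string(i)));
            if (pool.pending.size() > peak) peak = pool.pending.size();
        }
    }                // pool destroyed here, taking the batch's temporaries with it
    return peak;
}
```

With no intermediate draining the peak equals the iteration count; with a pool per iteration it stays at one, and batching every third iteration trades a slightly higher peak for fewer pool set-ups.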

The autorelease mechanism is just one of the things which makes Objective-C superior to C++, in my opinion. It enables patterns such as "if you init, you ownit" whilst others like NSString's stringWithFormat will give you a pre-autoreleased string back.

Having a standard retain count enables the "you keep it, you retain it" pattern (now somewhat superseded by the retain property generator). It also allows tools like the Clang static analyser from the llvm project to perform semantic analysis on the code to verify that the retain/release calls are balanced.

Like other reference counting mechanisms, it doesn't prevent circular references from pinning memory, leaks caused by forgetting to release, or errors caused by releasing twice. However, as a standard language feature it gives the developer far more time to concentrate on writing the app and not debugging memory related errors. (The OS X runtime permits the use of zombie tracking, as does the "Leaks" app, to assist with such cases.)
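The circular reference problem is easy to reproduce in a toy C++ model. Node, setPeer and the two demo functions are invented for this sketch; setPeer follows the "you keep it, you retain it" rule.

```cpp
#include <cassert>

// Toy refcounted node that keeps (and therefore retains) one peer.
class Node {
public:
    static int live;
    Node() : peer_(nullptr), refcount_(1) { ++live; }
    void retain() { ++refcount_; }
    void release() { if (--refcount_ == 0) delete this; }

    // Retain the new peer before releasing the old one, so that setting
    // the same peer twice is safe.
    void setPeer(Node* p) {
        if (p) p->retain();
        if (peer_) peer_->release();
        peer_ = p;
    }

private:
    ~Node() {
        if (peer_) peer_->release();   // give up our reference to the peer
        --live;
    }
    Node* peer_;
    int refcount_;
};
int Node::live = 0;

// Two nodes retain each other; returns how many are left pinned in memory.
int cycleDemo() {
    int before = Node::live;
    Node* a = new Node();
    Node* b = new Node();
    a->setPeer(b);
    b->setPeer(a);    // a and b now keep each other alive
    a->release();
    b->release();     // our references are gone, but each count is still 1
    return Node::live - before;
}

// Same set-up, but the cycle is broken by hand before releasing.
int brokenCycleDemo() {
    int before = Node::live;
    Node* a = new Node();
    Node* b = new Node();
    a->setPeer(b);
    b->setPeer(a);
    b->setPeer(nullptr);   // b lets go of a, breaking the cycle
    a->release();          // a is deallocated and releases b on the way out
    b->release();          // b is deallocated
    return Node::live - before;
}
```

In the first demo both nodes are unreachable yet never deallocated, which is exactly the kind of leak that refcounting alone can't detect.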

Hopefully this post will reach my C++ counterparts to explain one of the key differences between the languages. It's such a key feature - like RAII is in C++ - that it's a core part of how to write Objective-C code. By explaining the principles, and without using any scary at signs or square brackets, I hope this conveys how memory management in Objective-C works. And for those who are new to the Objective-C platform, I hope it's instructive to know what's going on under the covers.