Jeremy W. Sherman

stay a while, and listen

Leak-Free Recursive Blocks

Sometimes, you want a block to be able to call itself. That means it needs a reference to itself. And that means you have a wonderful opportunity to create a strong reference cycle that will endure till the end of time, or at least till your app exits.

The Solution

__weak __block block_t recurse;
block_t block;
recurse = block = ^(id val) {

Getting There

The prototypical recursive function example is the factorial function:

factorial(uintmax_t n)
    NSParameterAssert(n >= 0);
    if (n == 0) return 1;
    return n * factorial(n);

So that’s a function. Now we want to make it a block.

As a block, that factorial self-reference is tricky. A block only gets a name when you assign it to a variable. That variable assignment won’t happen till after the block is created. So you need a __block variable:

uintmax_t (^__block factorial)(uintmax_t n) = ^uintmax_t (uintmax_t n) {
    NSParameterAssert(n >= 0);
    if (n == 0) return 1;
    return n * factorial(n);

Sadly, that’s a __strong __block reference by default. Oops. If you don’t want to leak, you either need to take care to break the strong reference manually by NULLing out the factorial variable – good luck – or you need a weak reference.

So you retag the block variable as weak:

uintmax_t (^__block __weak factorial)(uintmax_t n) = …

Only now there’s never a strong reference to your block, so your block is eligible for deallocation right after it’s returned.

So you need both a strong and a weak reference to your block. And the block needs to be stored in the strong reference first, so you anchor it to this world. So maybe you do this:

uintmax_t (^__block __weak weakFactorial)(uintmax_t n);
uintmax_t (^factorial)(uintmax_t n) = …
weakFactorial = factorial;

That fixes it. It works. And it’s nice that we could drop the __block from the main factorial reference, since that stays stable; it’s only the weakFactorial self-reference that gets updated after the block first captures it.

But having to do this follow-on assignment is kind of ugly, and it’s pretty forgettable way down there at the end of your block, and a few revisions later they’ll get separated, and then maybe you’ll accidentally delete the weak assignment, and then you’ll have to track down a bug. Ugh.

So try this instead:

uintmax_t (^__block __weak weakFactorial)(uintmax_t n);
uintmax_t (^factorial)(uintmax_t n);
weakFactorial = factorial = …

You’re not going to forget that assignment. It’s still ugly, but that’s what happens at the edges of ARC, and it’s at least not too fragile. You could even snippetize it, if you’re into that sort of thing.

Caveat: Asynchronous Recursion

Justin Spahr-Summers rightly notes that this approach works only for synchronous recursion:

Once your strong reference goes out of scope, wouldn’t it just stop recursing because all that’s left is a weak reference? (Twitter)

It’s actually worse than that: you’ll get a segfault, because you try to invoke a NULL block as a function. If you have some sort of concurrent or asynchronous recursion – maybe invoking the block kicks off a gradual countdown – then you’ll need to handle the case where the block is dead and has been zeroed but doesn’t know it yet.

Use the standard trick of trying to obtain a strong reference and then testing whether that reference is nil or not to decide how to proceed:

block_t recurse = weakRef;
bool zeroed = !recurse;
if (zeroed) /* bail out */;
/* use |recurse| to reference yourself, so that you don't segfault */

If you want the recursion to run to completion rather than fizzling out, I imagine you can work out the appropriate juggling act.

This isn’t a problem for the common case of synchronous block handlers, where you set the handler and don’t change it, and the object owning the strong reference is guaranteed to outlive the handler block’s recursion, but it is something to watch out for as you get cleverer or otherwise start destroying references while the block might be running.