# Read: A tutorial on the universality and expressiveness of fold

By: Jeremy W. Sherman. Published: . Categories: reading-notes.

Hutton, Graham. A tutorial on the universality and expressiveness of fold. 1999. 16 pages (+ 2 more of references). Accessed 2015-08-18.

Fold (reduce, inject, cata) lets you take a list and swap out its constructors for a combining function and base value of your own devising, like swizzling in `+` for `cons`/`:` and `0` for `nil`/`[]`:

``````[1 ,  2 ,  3]
== EXPAND THE SUGAR ==>
(1 : (2 : (3 : [])))
== FOLD (+) 0 ==>
(1 + (2 + (3 + 0 )))
``````

Notice how all the `:` were rewritten into the provided function (`(+)`), and `[]` was rewritten into the provided value (`0`). Also notice how everything groups to the right; this `fold` operator is consequently sometimes called `foldr`, since it folds to the right.

## Universality

You can write `fold` as a recursive function:

``````fold f v [] = v
fold f v (x : xs) = x `f` (fold f v xs)
``````

or:

``````let g = fold f v in
(* Equation 1: *) g [] = v
(* Equation 2: *) g (x : xs) = f x (g xs)
``````

If you can mash a function `g` into being defined by those two equations, then you’ve solved for `f` and `v` such that `g = fold f v`, and you can write `g` without needing to explicitly recurse.

### The Universal Property of Fold

The universal property of `fold` is that you get a bi-implication between the two equations and `g = fold f v`:

• substitute in `fold f v` for `g`, and you’ve defined `fold`;
• prove the two equations, and you’ve got the makings of a proof by induction that your function is simply a fold.

“Taken as a whole, the universal property states that for finite lists the function `fold f v` is not just a solution to its defining equations, but in fact the unique solution” (p. 358 in the journal, which is p. 4 in the PDF).

#### Substitution: An aide-mémoire

The substitution bit makes it easy to remember the universal property: Write the obvious recursive definition of fold, then substitute away the `fold f v`.

#### Proofs and Definitions

The other bit is the part you’ll actually use. It lets you prove that a function is equivalent to a fold. To do so, you check that your function gives the same result in both the base case (equation 1) and the step case (equation 2). (For examples, check the section “Universality as a proof principle”.)

One case where this is handy is when it’s obvious what the fold does and less obvious what the other function does.

An arguably more useful case is when you have 2 different functions you want to prove equal. Rather than writing out an induction proof longhand, you can rewrite them both as folds; if the two folds are identical, then they’re equal.

It’s also useful when you want to rewrite a function you’ve written using explicit recursion to instead be a fold. In this case, you end up setting up an equality between some function you hope is a fold and those equations, then solving for `v` and then `f` using equational reasoning. (For examples, check the section “Universality as a definition principle”.)

### Fusion: Pushing function composition inside the fold

You can also use the universal property to derive the fusion property of fold, which gives the requirements to be met to push a function applied to a fold into the fold itself: `h . fold g w => fold fusedFunction fusedValue`. (There’s a subsection on that, too.)

#### Fusion as optimization

Fusion plays a small role in this paper, but it’s very useful for practical reasons of program optimization. There’s a good chunk of machinery in some functional language compilers dedicated to automatically fusing (“deforesting”) functions, since this reduces the overhead of communicating between parts of a program via intermediate values: with fusion, the values get consumed as they’re produced, rather than accumulated into a data structure and then handed off for the next step of processing.

## Expressiveness

### Tuples give you primitive recursion

Generating tuples lets you pull a fun trick where you rebuild the entire input in the second part of a 2-tuple and then throw it away. This is useful because it makes the original input available to the function you’re folding.

• This lets you define `dropWhile p xs` as a fold.
• This lets you define a generic primitive recursive function as a fold.
• Primitive recursion: `fold` is as powerful as any program you can write that has a bounded number of repetitions in any loop.

### First-class functions give you general recursion

Generating functions pushes the expressiveness from primitive recursion to general recursion, as demonstrated by writing the Ackermann function (evidently the standard example of a non-primitive recursive function) as a `fold` (using lists of length n to stand in for the natural number n).

• General recursion: `fold` is as powerful as any program you can write. It can perform unbounded iteration. Enjoy your halting problem.

### First-class functions give you foldl via foldr

Generating functions also lets you write `foldl` – the function that turns `[1, 2, 3]` into `(((0 + 1) + 2) + 3)`, i.e., which folds everything down to the left – using `fold`. Turns out the order `fold` processes the list in comes down to the function it’s provided.

#### Foldr is more powerful than foldl because strictness

You can’t write `fold` in terms of `foldl` because `foldl` is strict in the tail of its list. What that means in practice is that `foldr (\a b -> a) 0 [0..]` evaluates to 0 lickety-split, but `foldl (\b a -> a) 0 [0..]` runs forever (“diverges”).

## Questions I hesitate to call “frequently asked”

I don’t know that these are frequently asked, but they’re definitely questions I asked myself at some point in the past.

### Why is it called “fold”?

I explain calling it a “fold” or “reduction” to myself by thinking of the example that started this post, which collapses (folds up) a list of values to a single value. In that case, the list `[1, 2, 3]` collapses down to the single value `6`.

This explanation makes less sense when you’re building a structured value, like say reversing a list, or doubling each element, so that `[1, 2, 3]` goes to `[1, 1, 2, 2, 3, 3]`, but let’s just ignore that unfortunate fact.

### What’s up with the argument order for f in foldr vs foldl?

The arguments to the function provided to a fold reflect the fold’s left/right bias.

Say you want to fold a list of strings down to the sum of each string’s length.

If you use `foldr`, then it has a right-to-left bias, so you need the function `\ string sum -> (length string) + sum`. With `foldr`, the base case value appears on the right, and the iteration step value on the left.

`foldl`'s bias is for left-to-right, so its function has the base case value appearing on the left and the iteration step value on the right. With `foldl`, you’d need the function `\ sum string -> (length string) + sum`.

This has the fun result that the order of arguments to the function is flipped depending on if you’re dealing with a `foldl` or a `foldr`, as you can see with these side-by-side:

``````sumLengthsRTL :: [String] -> Int
sumLengthsRTL = foldr (\ string sum -> (length string) + sum) 0

sumLengthsLTR :: [String] -> Int
sumLengthsLTR = foldl (\ sum string -> (length string) + sum) 0
``````