Jeremy W. Sherman

stay a while, and listen

Beyond Our Ken

The more I poke around, the more convinced I become that actually knowing what a piece of software is supposed to do is truly rather rare and generally beyond mortal ken. Making it do what you think it should do is nearly beyond our grasp.

If we’re honest with ourselves, we need every tool we can get just to wrangle software into behaving. That means types, that means tests, and that means, yes, even: proofs.

And that also means that proofs need tests, too.

What drove this home was reading a couple papers related to combining proving and testing.

Types and Tests

I’m on record for arguing in favor of using both types and tests to their utmost in both Types Complement Tests Complement Types and Beyond Type Wars.

It’ll come as no surprise what I’m going to recommend here: Use proofs and tests. And also types.

(It’s even less surprising if you’ve run across the Curry-Howard isomorphism, which relates logical proofs and propositions to exhibiting an instance of a type – Propositions as Types – or, more broadly, the notion of computational Trinitarianism. There are some deep connections here, and we should wring them for every last ounce of help they can give us in crafting correct and elegant software.)

Use Proofs AND Tests

This time, it’s not gonna be me saying it, though.

Really, you should use both tests and proofs, not just one or just the other:

This also reinforces the general idea that testing and proving are synergistic activities, and gives us hope that a virtuous cycle between testing and proving can be achieved in a theorem prover. (Zoe Paraskevopoulou et al., “Foundational Property-Based Testing”, 2015)

If you don’t, you’re going to screw up. In small ways often, basically just tripping over your feet, but sometimes in big ways, where no-one can see how to bail you out:

Second, tests complement proofs. We encountered five papers in which explicitly claimed theorems are false as stated. […] In every case, though, rudimentary testing discovered errors missed with pencil-and-paper proofs.

Indeed, we claim that tests complement even machine-checked proofs. As one example, two of the POPLmark solutions that contain proofs of type soundness use call-by-name beta in violation of the specification (Crary and Gacek, personal communication). We believe unit testing would quickly reveal this error.

Even better, one can sometimes test propositions that cannot be validated via proof. […] Testing also removes another obstacle to proof, the requirement that we first state the proposition of interest. Due to its exploratory nature, testing can inadvertently falsify unstated but desired propositions, e.g., that threads block without busy waiting (section 4.4). This is especially true for system-level and randomized testing. To some degree, the same is true of proving, but testing seems to be more effective at covering a broad space of system behaviors. (Casey Klein et al., “Run Your Research: On the Effectiveness of Lightweight Mechanization”, 2012)

Use ALL The Tools

We don’t have to choose just tests.

We don’t have to choose just types.

We don’t have to choose just proofs.

We have an abundance of tools waiting for us to take them up and apply them to our problems. It’s simple and reassuring to reject a whole class of them out of hand; if we pick just one, perhaps we can convince ourselves of our expertise. And you can indeed get quite far with just one. But if you can stomach your own ignorance, you might find you can get even farther by striving to master all these many disciplines.

That Said, Tests Are a Really Mature Technology

Automated testing stopped being rocket science at least a decade ago. If you do nothing else, at least write some automated tests.

Humans suck at repeating mechanical tasks, we’re bad at documenting them, bad at following them, we get bored really easily, and we’re really slow. Be virtuously lazy and sic a computer on your testing, for everyone’s sake.

If you’re going to pick just one of these, pick automated testing, and work it for all it’s worth. (Quite a lot, honestly!)