When is a proof not a proof? When you think you’ve proven one thing, but actually, you’ve proven something else.
This example is interesting in part because it is actually not an example of a dependent type, so it’s something we can talk about almost entirely in terms of Swift, absent the computer-assisted theorem proving that is Coq’s special sauce. Tackling theorems in Swift is a perfect fit for a generative testing framework.
Putting the example in context
The book has introduced a
bool type as an example of defining
an enumeration type.
It then defines a function
negb that performs Boolean negation
to demonstrate function definition using pattern matching.
The book then proceeds to prove a couple theorems about the
negb function as part of teaching the reader how to do proofs using Coq.
Rewriting the example in Swift
boolis not a dependent type, so Swift and SwiftCheck are up to the task.
The book doesn’t get to dependent types till 5 chapters later.
bool is an enumeration type, and
negb relies on non-dependent
pattern matching, all of which Swift can do.
This means we can pretty directly translate the example definitions and
theorem from Gallina into Swift:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
I’ve translated the
negb_inverse theorem into
a SwiftCheck test
and provided a generator of arbitrary
for the test generator to use.
Running the test as-is gives:
1 2 3
You can run this and see for yourself by cloning my Negb project repository.
David’s concern is that,
“It’s up to the programmer to realize that we have actually not created all of the proofs required to prove correctness.”
This concern arises from his thinking that the proof for
negb_inverse goes through
even if you alter
negb so it always returns
The version of
negb he defines that always returns
actually already fails at the
without need to proceed to trying to prove
You can see this in practice by pasting the bogus
negb definition into
coqtop and then trying to complete the proof of the theorem,
but you can probably convince yourself without needing to do that
simply by considering the
The theorem’s claim is that
negb(negb(b)) == b.
In the false case, with the always-true
negb, we get:
1 2 3 4 5 6 7 8
If you flip the
case .True: return .False to
return .True in the Swift version of
you’ll see it quickly find this failing case
using the test, as well:
1 2 3 4 5 6
If you go the
you’ll be able to discharge the
true case with
but using the
simpl. tactic in the false case leaves you
at the end step of our reasoning above,
trying to prove that
true = false.
If you’re in
coqtop, you’ll also quickly notice that
the proof David copied in for
negb_inverse was incomplete.
destruct. leaves you with two subgoals left to prove,
corresponding to a version of the theorem with
instantiated at each case of the enum.
You can complete the proof via
which the book gets to a half-page later after some discussion.
Proving the right thing and human misapprehension
A sound system won’t draw bogus conclusions, but that doesn’t mean we can’t misinterpret our own claims, whether they’ve been proven true or not.
David’s core concern about knowing you’ve proved what you want to prove is a real problem.
The problem doesn’t appear where he thought it did, in a theorem that he knew shouldn’t be provable but he thought still was.
The problem shows up in the same place that it did with the dependently typed flipped Boolean discussion, where the human is giving names and external meaning to the mathematical expression. In a sound system, the proofs we provide will be correct, but we can still misunderstand what the theorem we have just proved really means.
If we’re building on top of that proof, that misunderstanding will undoubtedly come to light as we try to use it to prove new claims. We just won’t be able to make a chain of reason from what we want to prove to what we did prove, because our new goal will depend on what we thought we proved, which was something different.
If what we thought we proved was the final achievement of our development, though, then we might not run into trouble till we try to actually use it to do something. Then we’ll see it breakdown at runtime, when it does what it in fact tells us on its face it does, which just so happens not to be what we read it as doing.