Jeremy W. Sherman

stay a while, and listen

When is a proof not a proof?

When is a proof not a proof? When you think you’ve proven one thing, but actually, you’ve proven something else.

In the last post, I addressed a good question raised in the first half of David Owens II’s article “Dependent Types: I’m Missing Something”, where he addresses my discussion of dependent types.

This post looks at the latter half, which talks about an example drawn from early in the chapter “Introducing Inductive Types” of Adam Chlipala’s Certified Programming with Dependent Types.

This example is interesting in part because it is actually not an example of a dependent type, so it’s something we can talk about almost entirely in terms of Swift, absent the computer-assisted theorem proving that is Coq’s special sauce. Tackling theorems in Swift is a perfect fit for a generative testing framework.

Putting the example in context

The book has introduced a bool type as an example of defining an enumeration type. It then defines a function negb that performs Boolean negation to demonstrate function definition using pattern matching. The book then proceeds to prove a couple theorems about the negb function as part of teaching the reader how to do proofs using Coq.

Rewriting the example in Swift

bool is not a dependent type, so Swift and SwiftCheck are up to the task.

The book doesn’t get to dependent types till 5 chapters later. bool is an enumeration type, and negb relies on non-dependent pattern matching, all of which Swift can do. This means we can pretty directly translate the example definitions and theorem from Gallina into Swift:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
enum Bool_ {
  case True
  case False
}

func negb(b: Bool_) -> Bool_ {
  switch b {
  case .True: return .False
  case .False: return .True
  }
}

/// Generative testing of the `negb_inverse` theorem's claim.
func testNegbInverse() {
  property("negb is its own inverse") <- forAll { (b: Bool_) in
    return negb(negb(b)) == b
  }
}

extension Bool_: Arbitrary {
  static var arbitrary : Gen<Bool_> {
    return Gen.oneOf([Gen.pure(.True), Gen.pure(.False)])
  }
}

I’ve translated the negb_inverse theorem into a SwiftCheck test and provided a generator of arbitrary Bool_ values for the test generator to use.

Running the test as-is gives:

1
2
3
Test Case '-[NegbTests.NegbTests testTheoremNegbInverse]' started.
*** Passed 100 tests
.

You can run this and see for yourself by cloning my Negb project repository.

Mistaken proof

David’s concern is that, “It’s up to the programmer to realize that we have actually not created all of the proofs required to prove correctness.” This concern arises from his thinking that the proof for negb_inverse goes through even if you alter negb so it always returns true.

The version of negb he defines that always returns true actually already fails at the negb_inverse theorem, without need to proceed to trying to prove negb_ineq.

You can see this in practice by pasting the bogus negb definition into coqtop and then trying to complete the proof of the theorem, but you can probably convince yourself without needing to do that simply by considering the false case.

The theorem’s claim is that negb(negb(b)) == b. In the false case, with the always-true negb, we get:

1
2
3
4
5
6
7
8
negb(negb(b)) == b  // the negb_inverse claim
--> consider the case where b = false
negb(negb(false)) == false
--> evaluate the inner negb application
negb(true) == false
--> evaluate the remaining negb application
true == false
--> contradiction! <--

If you flip the case .True: return .False to return .True in the Swift version of negb, you’ll see it quickly find this failing case using the test, as well:

1
2
3
4
5
6
Test Case '-[NegbTests.NegbTests testTheoremNegbInverse]' started.
*** Failed! Proposition: negb is its own inverse
Falsifiable (after 1 test):
False
*** Passed 0 tests
.

If you go the coqtop route, you’ll be able to discharge the true case with simpl. trivial., but using the simpl. tactic in the false case leaves you at the end step of our reasoning above, trying to prove that true = false.

If you’re in coqtop, you’ll also quickly notice that the proof David copied in for negb_inverse was incomplete. destruct. leaves you with two subgoals left to prove, corresponding to a version of the theorem with b instantiated at each case of the enum. You can complete the proof via reflexitivity. Qed., which the book gets to a half-page later after some discussion.

Proving the right thing and human misapprehension

A sound system won’t draw bogus conclusions, but that doesn’t mean we can’t misinterpret our own claims, whether they’ve been proven true or not.

David’s core concern about knowing you’ve proved what you want to prove is a real problem.

The problem doesn’t appear where he thought it did, in a theorem that he knew shouldn’t be provable but he thought still was.

The problem shows up in the same place that it did with the dependently typed flipped Boolean discussion, where the human is giving names and external meaning to the mathematical expression. In a sound system, the proofs we provide will be correct, but we can still misunderstand what the theorem we have just proved really means.

If we’re building on top of that proof, that misunderstanding will undoubtedly come to light as we try to use it to prove new claims. We just won’t be able to make a chain of reason from what we want to prove to what we did prove, because our new goal will depend on what we thought we proved, which was something different.

If what we thought we proved was the final achievement of our development, though, then we might not run into trouble till we try to actually use it to do something. Then we’ll see it breakdown at runtime, when it does what it in fact tells us on its face it does, which just so happens not to be what we read it as doing.