gTHM: Does the World Need Another Theorem Prover?
If there is one thing the world does not need is another theorem prover. So why am I thinking of building gTHM, a grid-based theorem prover in Boyer-Moore tradition? In a nutshell, because the gTHM will use the grid to help find mechanical proofs. At the moment we are building a prototype, and we’ll use our experiences here to determine if further development is warranted.
I have been toying the idea of writing a new, grid-based theorem prover for some time. My first instinct, of course, was to consider grid-enabling ACL2, my theorem prover of choice. But this would be a massive undertaking for many reasons, one I was unwilling to commit to. Instead I started thinking of writing a smaller (OK, call it a toy) theorem prover. What would it look like?
I have settled on the following criteria:
- it is to be grid-enabled
- the object language should be Scheme, a LISP dialect
- it should support rewriting and induction
- it should be based on the Boyer-Moore “waterfall” architecture
- it should be small
The first choice is my main motivation for doing this project. I want to explore the viability of grid computing technology for theorem proving applications. The other choices are a reflection of my background, namely the Boyer-Moore school of theorem proving.
I chose Scheme as my base language because I find it to be a beautiful dialect of LISP. It does not have all the bells and whistles of Common LISP, but I consider that an advantage. I want a small language! gTHM is an experimental vehicle, so it does not make sense to choose a large language like Common LISP, as ACL2 does.
Scheme also provides some interesting possibilities for the future. If gTHM is successful, then many of Scheme’s features will present interesting challenges from the verification perspective: functions as first-class citizens, continuations, and arithmetic. Nqthm (the final version of the Boyer-Moore theorem prover) and ACL2 reason about strictly first-order dialects of LISP (though Nqthm’s v&c$ primitive is a powerful evaluator). But the legacy of LISP is one where functions can act on other functions, e.g., mapcar, apply, eval. In Scheme functions are first-class citizens. They can be stored in variables, they can be returned as the results of other functions, etc. So a full Scheme theorem prover should handle this.
Continuations present another challenge. Essentially a continuation is a program stack frozen in time in such a way that it can be restarted later. Calling a continuation is much like calling a function, except the continuation never returns. No doubt modeling these in a formal way will be very interesting.
Finally, there is arithmetic. LISP has traditionally offered support for exact arithmetic: There are no such things as overflows or rounding errors in pure LISP. ACL2 takes advantage of this by providing a mathematical presentation of numbers, i.e., the typical field axioms. Common LISP does include another set of numbers, the floating-point numbers. Although ACL2 has libraries dedicated to reasoning about floating-point numbers, these are not directly supported in the ACL2 language. Scheme throws one more quirk into this discussion by considering exact and inexact numbers, and mathematical operations (e.g., abs, max, min, +, -, *) operate on both of these. In fact, a variable may hold an exact number at one moment and an inexact one the next. This opens up wonderful opportunities for reasoning, especially considering my previous work with non-standard analysis in ACL2(r).
Naturally gTHM will not support any of these features in its first version; in fact, the language of gTHM will be a small subset of the language supported by THM, the original Boyer-Moore theorem prover! But with luck future versions will tackle these issues, and even some others like side-effects and (pseudo-)random numbers.