The Z3 Theorem Prover released under MIT license(github.com) |
The Z3 Theorem Prover released under MIT license(github.com) |
When I was prototyping a programming language with support for contracts (or refined types, e.g. `x : int if x > 0`), I tested a few different SMT solvers, including AltErgo, CVC and Yices (these seemed to be the most used open source solvers).
However, none of them were nearly as useful as Z3. In particular, they had no concept of a stack that allows the user to define different assertions locally, and then cancel them later. Without the stack, one would need to restart the solver and re-declare all variables (with all their constraints) for every function call with a constraint, compared to a single SMT script for each function verified in Z3.
Also, Z3 supports much more theories, in particular it supports non-linear integer arithmetics (which is an undecidable, incomplete theory), either by using some heuristics, or by translating the statements into the theory of real arithmetics (which is complete and decidable).
Also take a look at the "licensing" section to see why switching to the MIT license is a big deal. Even open source software projects like LLVM have had to worry about whether using Z3 was OK or not. Now they don't any more!
For example, a friend and I were playing a board game where each player had hit points, which they represented with two types of cards: type A had "5" on one side and "1" on the other, and type B had a "20" on one side and a "10" on the other. You'd represent your current hitpoints by displaying some number of these cards (out of 6 of the first type and 3 of the second type) face up, and adding up the totals. So, I wanted to find out: what's the smallest hit point value that I can't represent with some combination of cards? The traditional mathematical approach would be generating functions, but that would require at least pencil and paper. With Z3, I execute:
(declare-const a1 Int)
(declare-const a2 Int)
(declare-const a3 Int)
(declare-const a4 Int)
(declare-const a5 Int)
(declare-const a6 Int)
(declare-const b1 Int)
(declare-const b2 Int)
(declare-const b3 Int)
(assert (or (= a1 0) (= a1 1) (= a1 5)))
(assert (or (= a2 0) (= a2 1) (= a2 5)))
(assert (or (= a3 0) (= a3 1) (= a3 5)))
(assert (or (= a4 0) (= a4 1) (= a4 5)))
(assert (or (= a5 0) (= a5 1) (= a5 5)))
(assert (or (= a6 0) (= a6 1) (= a6 5)))
(assert (or (= b1 0) (= b1 10) (= b1 20)))
(assert (or (= b2 0) (= b2 10) (= b2 20)))
(assert (or (= b3 0) (= b3 10) (= b3 20)))
(assert (= (+ a1 a2 a3 a4 a5 a6 b1 b2 b3) i))
(check-sat)))))))
If you run this for different values of i (using push and pop if you're clever; I just called Z3 100 times), you can get a "sat" or "unsat" which tells you whether a particular total is reachable.My problem is as follows. I have a microcontroller which runs firmware. I want to verify various things about the hardware+firmware combination. For instance, I may be interested in proving that user mode firmware cannot change the MMU configuration.
There are many parts to this problem. First is verifying that the kernel doesn't expose routines that let you do these kinds of things. This is a kind of s/w verification problem, somewhat well studied. Second, you want to make sure there are no h/w bugs in the microcontroller itself which the user mode software can exploit to do bad stuff. This is traditional h/w verification, also well-studied. The third is that there aren't weird corners of the microcontroller ISA specification itself which can be exploited. This is also a kind of h/w bug albeit a specification bug.
The third part is where Z3 comes in because often, there isn't any specification, and if there is, it's a bunch of PDFs or word documents. What we want is to generate a formal specification, which you can examine using formal tools to prove that it satisfies certain properties. And then we want to prove that our implementation conforms to this specification. We're using some techniques from program synthesis with Z3 as the solver for this.
There is a whole field of research in program synthesis, much of which can be done efficiently on theorem-provers like Z3. You can read these slides[2] for a good overview.
[1]: http://jelv.is/chlorophyll.pdf
[2]: http://www.cs.berkeley.edu/~bodik/cs294/fa12/Lectures/L1/L1-...
Check out MS's Code Digger, which uses Z3 as well: http://research.microsoft.com/en-us/projects/codedigger/
One of the more famous places where these constraint solvers are used is to deal with dependency management in package systems. For example, zeroinstall uses a SAT solver to resolve dependency conflicts.
http://0install.net/solver.html
A sillier example is the time I used a sat solver to find the solutions for a flash puzzle game. The part I'm proudest of is that the sat solver could handle the highly symmetric cases that stumped the hand-written backtracking algorithms. And if I had used an SMT solver instead of SAT I would not have had to convert everything all the way doen to boolean formulas.
https://github.com/hugomg/hexiom
In the readme I explain a bit why the sat solver helped.
[1] http://www.microsoftstore.com/store/msus/en_AU/pdp/Microsoft...
It's cheap enough for a company to buy it without many signatures, but also on the upper scale of what people who "don't want Open Source" can pay.
It's probably just that the volume of sales isn't going to make sense when comparing to the size of the team.
Glitch-free compiling and installing under FreeBSD. The only problem was finding documentation. Rise4fun.com is mentioned in another post, the exact URL for getting started is http://rise4fun.com/Z3/tutorial/guide. Also, very useful information here-- http://www.smt-lib.org/
From the Microsoft Research page, I read:
> Z3 is a high-performance theorem prover being developed at Microsoft Research. Z3 supports linear real and integer arithmetic, fixed-size bit-vectors, extensional arrays, uninterpreted functions, and quantifiers. Z3 is integrated with a number of program analysis, testing, and verification tools from Microsoft Research.
And with my current knowledge (which is low on the matter) I can't reconcile arithmetic, arrays, quantifiers, etc. with general program verification...
Anyway, here's a great post on solving a Project Euler problem using Z3/SMT-lib, which may be inspiring for other novices like myself:
Maybe Idris or Coq integrate Z3.
It's all pretty nifty!
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.306...
HOL4 also has Z3 (and Yices) support, but I think the Z3 support is a little out of date (in particular, it errors immediately on start).
[1] http://www21.in.tum.de/~boehmes/proofrec.pdf
[2] http://stackoverflow.com/questions/11531589/difference-betwe...
I haven't run into that issue myself, but I've mostly written toy code for it.
Pro's/Con's of the two?
As for Z3, its strength is about contraint solving. You can specify a set of constraints and it will try to find if there is a viable solution that satisfies the constraints. It includes a SAT (boolean satisfiability)solver and beefs it up with support for some higher level things, like integer arithmetic, bit vectors, etc. This kind of constraint solver is very useful for finding bugs: for example, the notorious intel floating point bug was found out by asking a SAT solver to find an input where the processor's circuit produced the wrong value.
Pointers in SLAM2: Efficient evaluation of pointer
predicates with Z3 SMT Solver in SLAM2This is in pretty stark contrast to Prolog, which includes features which make it undecidable. Therefore you can make Prolog loop infinitely, while that is not possible in an SMT solver.
Imo it's a much nicer modeling language, especially for incremental formalization, in part because of implementing a nonmonotonic rather than classical logic. It also has a nice generative-programming style where you can define generative spaces via "choice rules", where the solver is explicitly given a free choice for some variables, subject to constraints, and returns a satisfying artifact/model. That allows using it to construct constrained generative systems (e.g. generative music [2], level generation [3]). But Z3 far outperforms ASP solvers on problems where variables range over large domains (like "32-bit integer") because of the MT part of SMT. There's some experimental work on trying to combine the approaches, using Z3 (or another SMT solver) as the solver backend for a logic-programming system [4], but I don't know how usable it is.
[1] http://potassco.sourceforge.net/
[2] http://arxiv.org/abs/1006.4948
[3] http://pcgbook.com/wp-content/uploads/chapter08.pdf
[4] http://peace.eas.asu.edu/joolee/papers/aspmt-system-jelia.pd...
The neat thing about Z3 is that it has efficient solvers for the more specific subproblems (linear programming, bit vectors, etc) that you can't have if you state everything using "general" constraint programming. At the same time, Z3 lets you work at a higher level of abstraction - for example, you can reason about bit verctors or arithmetic directly instead of having to encode it with boolean formulas.
http://research.microsoft.com/en-us/um/redmond/projects/z3/z... http://research.microsoft.com/en-us/um/people/leonardo/files... http://research.microsoft.com/en-us/events/z3dtu/dtu-jan4-z3...
- Unsatisfied (unsat)
- Satisified (sat)
- Unknown (unknown)
It was tongue-in-cheek, I enjoyed the positivity in your original comment.
I.e. mutate random values for a while, then use an SMT solver to try to find any remaining code paths that haven't been followed by AFL.
I have a feeling AFL would be faster for "simple" paths, but an SMT solver may be handy for getting those last few paths.
The tree of resolution deductions that SAT provers (modulo theories or otherwise) like Z3 produce is of course a proof in a valid proof system. Typically that is some variant of a propositional resolution proof system. The problem is that the reasoning is classical: to show A, falsity is deduced from (not A). This is not constructively valid, you can only deduce (not not A) if (not A) implies falsity.
So the resolution proof that Z3 outputs can be converted to a classical proof and expressed in classical proof systems like (Isabelle/)HOL, but not in constructive systems like Coq and Agda.
In this understanding, high-level (ASP, CSP) contrasts with low-level (SAT and SMT) for the same problem domain: just as with high- and low-level programming languages like Java are compiled down to low-level programming languages like x86 machine code, high-level formalism like ASP, CSP, ... can be compiled down to low-level SAT/SMT.
SAT/SMT is the assembly language of combinatorial problems because you have to state explicitly all forbidden solutions using CNF (conjuncive normal form). For examle if you have variables x, y and z, then the CNF ( x \/ y \/ not z ) /\ ( not x /\ not y ) has as models all maps from {x, y, z} to {0, 1} except {x=0, y=0, z=1}, {x=1, y=1, z=1} and {x=1, y=1, z=0}. The problem with using CNFs to enumerate forbidden solutions is that the formulae can be long (just like assembler programs can be long). With high-level formalisms like ASP you often get a much more succinct specification.
With a bit of squinting, you can even see first-order logic as a high-level formalism that compiles down to propositional logic (via skolemisation, resolution, unification, herbrand etc).
Would a SAT solver be useful for compiler technologies?
A compiler is basically solving the problem of: here is a sequence of things I want done; here is a description of what the machine can do; find a combination of the latter which achieves the former. The details are the tricky bit, particularly for unpleasantly asymmetric instruction sets like the old 8-bit architectures.
It would be so nice if I could just throw a description of the problem at some sort of machine reasoning system and have it Just Work...
Perhaps the most relevant example project here would be Rosette[4], which is basically a framework for easily writing synthesizers for different tasks. It would let you implement a description of your machine in Scheme (ie write a simple interpreter) and automatically generate a synthesizer for it that could compile programs in your language to formulas in a solver of some sort.
[1]: http://www.cs.berkeley.edu/~bodik/cs294/fa12/Lectures/L1/L1-...
[2]: http://jelv.is/chlorophyll.pdf
I'm afraid your paper is largely beyond me, though, but I'll go read up on program synthesis. Thanks!
But current SAT solvers aren't powerful enough to do so at much more than a few tens of instructions long. Still useful though.