If you're looking for a fun, substantial introductory Z3 project, try writing a program to compare two sets of firewall rules for functional equivalence. Stated another way: does there exist a packet which is accepted by one firewall but not the other?
I was lucky enough to be tasked with solving this problem in Azure! We were changing the method by which node firewall rules were automatically generated, and wanted a way to verify equivalence between the old and new versions for each node. The brute force approach was clearly out: we'd have had to check 2^72 IPv4 packets (32-bit source address field, 16-bit source port field, 16-bit destination port field, 8-bit protocol field). I bummed around trying to figure out a way to transform the firewall rules into some kind of minimal normalized form for direct comparison, but the Z3 approach turned out to be shockingly simple. When Z3 is really well-suited for a problem, it feels like straight up magic.
If you want to spoil the implementation details for yourself, it goes like this:
1) Set up bit vector variables of the proper size for each field in the IPv4 packet header.
2) Encode each firewall rule as a restriction on the variables from (1), then conjoin them to form a logical representation of a firewall.
3) Task Z3 with finding values for the variables in (1) for which one firewall is true, and the other is false.
Here's a simple example which you can run directly in the linked demo page, using two integer fields (source and destination) for simplicity:
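Since the snippet itself doesn't appear here, the following is a minimal SMT-LIB sketch of the idea. The two rule sets are made up for illustration, chosen so that the firewalls disagree only when dst = 20:

```
; Two integer fields standing in for a packet
(declare-const src Int)
(declare-const dst Int)

; Hypothetical firewall A: accept src 0-10 and dst 0-20
(define-fun fwA () Bool
  (and (>= src 0) (<= src 10) (>= dst 0) (<= dst 20)))

; Hypothetical firewall B: identical, except dst only goes to 19
(define-fun fwB () Bool
  (and (>= src 0) (<= src 10) (>= dst 0) (<= dst 19)))

; Is there a packet accepted by A but rejected by B?
(assert (and fwA (not fwB)))
(check-sat)   ; sat
(get-model)   ; e.g. src = 0, dst = 20
```

For full equivalence you'd check both directions as well, i.e. also ask for a packet satisfying (and fwB (not fwA)), or assert (xor fwA fwB) once; unsat in both directions means the firewalls are equivalent.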
This finds the model [0, 20] - a "packet" which is accepted by one firewall but not the other.
In practice this solution ran very quickly - less than a second for firewalls with hundreds of rules. It also gave a counterexample if the firewalls were not equivalent, which could be matched with offending rules and was very useful for debugging.
I'm actually trying to adapt some of the work in Network Optimized Datalog and the other Azure/MSR NSDI papers to a related problem (or set of problems) for one of my projects at work. I'd love to chat some time. [I'm at a large Megacorp still piloting "cloud", and I'm involved in some efforts to build open tools to make config and security work nicer/saner for large user bases.]

This was running user-facing, in production? Do you have any experience with scaling Z3? I'm thinking large problems running on many machines, contra many small problems on many processes.
For some reason, I always imagined that SMT was used to verify stuff before putting it into production, but this opens up a whole new world of ideas for me.
It wasn't running as any sort of persistent service, if that's what you mean. It just ran during deployments to check output of old & new firewall generation methods for equivalence.
Something like what you describe was implemented in an Azure system called SecGuru, which is sort of a generalized version of firewall comparison to check security rule enforcement. Here is the whitepaper: https://www.microsoft.com/en-us/research/wp-content/uploads/...
If you want a fun example of using Z3, I gave a talk recently that walks you through verifying programs in a tiny little imperative language by compiling them to Z3 formulas. It's easier than it sounds!
It was recorded [1] and I have example code and additional resources (papers, etc) on my site [2].
If you want to do more, you can extend the same process to synthesizing programs, not just verifying them. It doesn't scale too far, but otherwise it's like magic.
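To give a flavor of the verification encoding (a hand-rolled toy, not taken from the talk): to check that the statement x := x + 1 takes precondition x > 0 to postcondition x > 1, you assert the precondition, the statement's effect, and the negated postcondition, then ask Z3 for a counterexample:

```
(declare-const x Int)    ; x before the statement
(declare-const x1 Int)   ; x after the statement
(assert (> x 0))         ; precondition
(assert (= x1 (+ x 1)))  ; effect of x := x + 1
(assert (not (> x1 1)))  ; negated postcondition
(check-sat)              ; unsat: no counterexample exists, so the property holds
```

An unsat answer means no execution violates the postcondition; a sat answer would come with a model, i.e. a concrete input that breaks the program.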
I guess since F* is similar to F# and ML, you could have a pretty cohesive solver-to-production toolchain. Having OCaml as an output option is great too.
I was looking at Idris with the C backend as a way of formalizing my code for production use. This gives me another option to evaluate before deciding. I realize they are quite different, but I am just experimenting at the moment.
Since Z3 is on here, I just want to mention Triton ( http://triton.quarkslab.com/ ), a master's thesis project from a French (?) author, and Pin from Intel ( https://software.intel.com/en-us/articles/pin-a-dynamic-bina... ). With Triton and Pin you can instrument binary software and gather SMT formulas from its execution. It essentially lets you reason about a binary in SMT terms.
I had to use it in a practical lab to figure out an input to some hashing function (one that would hash to 0). Once all the constraints are known, solving for what the input should be becomes quite easy (at least compared to brute force).
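The shape of that exercise, on a toy scale (the "hash" below is made up, not the one from the lab): ask Z3 for an 8-bit input that a small linear hash maps to 0:

```
(declare-const input (_ BitVec 8))
; Toy hash: h(x) = 31*x + 7 in 8-bit (mod 256) arithmetic
(assert (= (bvadd (bvmul input #x1f) #x07) #x00))
(check-sat)   ; sat
(get-model)   ; input = #xe7 (231), since 31*231 + 7 = 7168 = 28*256
```

In the real setting the constraints come from the instrumented binary rather than being written by hand, but the final solving step looks just like this.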