If you're looking for a fun, substantial introductory Z3 project, try writing a program to compare two sets of firewall rules for functional equivalence. Stated another way: does there exist a packet which is accepted by one firewall but not the other?
I was lucky enough to be tasked with solving this problem in Azure! We were changing the method by which node firewall rules were automatically generated, and wanted a way to verify equivalence between the old and new versions for each node. The brute force approach was clearly out: we'd have had to check 2^72 IPv4 packets (32-bit source address field, 16-bit source port field, 16-bit destination port field, 8-bit protocol field). I bummed around trying to figure out a way to transform the firewall rules into some kind of minimal normalized form for direct comparison, but the Z3 approach turned out to be shockingly simple. When Z3 is really well-suited for a problem, it feels like straight up magic.
If you want to spoil the implementation details for yourself, it goes like this:
1) Set up bit vector variables of the proper size for each field in the IPv4 packet header.
2) Encode each firewall rule as a restriction on the variables from (1), then conjoin them to form a logical representation of a firewall.
3) Task Z3 with finding values for the variables in (1) for which one firewall is true, and the other is false.
Here's a simple example which you can run directly in the linked demo page, using two integer fields (source and destination) for simplicity:
This finds the model [0, 20] - a "packet" which is accepted by one firewall but not the other.
In practice this solution ran very quickly - less than a second for firewalls with hundreds of rules. It also gave a counterexample if the firewalls were not equivalent, which could be matched with offending rules and was very useful for debugging.
I'm actually trying to adapt some of the work in the network optimized datalog and other azure/msr nsdi papers to a related problem and or set of problems for one of my prjects at work. I'd love to chat some time. [im at a large Megacorp still piloting "cloud", and I'm involved in some efforts to build some open tools to make config and security stuff nicer/saner for large user bases ]
This was running user-facing, in production? Do you have any experience with scaling Z3? I'm thinking large problems running on many machines, contra many small problems on many processes.
For some reason, I always imagined that SMT was used to verify stuff before putting it into production, but this opens up a whole new world of ideas for me.
It wasn't running as any sort of persistent service, if that's what you mean. It just ran during deployments to check output of old & new firewall generation methods for equivalence.
Something like what you describe was implemented in an Azure system called SecGuru, which is sort of a generalized version of firewall comparison to check security rule enforcement. Here is the whitepaper: https://www.microsoft.com/en-us/research/wp-content/uploads/...
I was lucky enough to be tasked with solving this problem in Azure! We were changing the method by which node firewall rules were automatically generated, and wanted a way to verify equivalence between the old and new versions for each node. The brute force approach was clearly out: we'd have had to check 2^72 IPv4 packets (32-bit source address field, 16-bit source port field, 16-bit destination port field, 8-bit protocol field). I bummed around trying to figure out a way to transform the firewall rules into some kind of minimal normalized form for direct comparison, but the Z3 approach turned out to be shockingly simple. When Z3 is really well-suited for a problem, it feels like straight up magic.
If you want to spoil the implementation details for yourself, it goes like this:
1) Set up bit vector variables of the proper size for each field in the IPv4 packet header.
2) Encode each firewall rule as a restriction on the variables from (1), then conjoin them to form a logical representation of a firewall.
3) Task Z3 with finding values for the variables in (1) for which one firewall is true, and the other is false.
Here's a simple example which you can run directly in the linked demo page, using two integer fields (source and destination) for simplicity:
This finds the model [0, 20] - a "packet" which is accepted by one firewall but not the other.In practice this solution ran very quickly - less than a second for firewalls with hundreds of rules. It also gave a counterexample if the firewalls were not equivalent, which could be matched with offending rules and was very useful for debugging.