As a person not heavily invested in Prolog, can you explain how that works? Is it just branching over the choices from the different groups, backtracking/failing back up to the last choice if the constraints aren't met?
I must say I'm not familiar at all with constraint programming, but it looks like something I've been wanting to learn for a long time.
A practical problem I have right now is writing a device driver that needs to configure Phase-Locked Loops in order to generate certain frequencies. However, the setup is not straightforward: I have to configure a set of divisors, multipliers and some muxes. To make matters worse, at each step I have constraints on the possible frequency range at that point. And finally, I sometimes have several of those PLLs in series.
The only solution I have right now is a C program that brute-forces all combinations, checks whether the constraints are satisfied and outputs an array of configs for the subset of frequencies I need. If I need a new frequency I have to re-run my program and update the array. It works but is far from ideal.
Unfortunately all the tutorials I find online for this CP thing use Prolog or some third-party library in C++, which is of course not practical for kernel programming...
If I could implement a clever solver that would run fast enough (unlike my brute-force solution) I could just do it dynamically in the driver and it'd be much more elegant.
But I really don't know where to start. Any pointers/things to google for?
Are you proposing to write a simple constraint solver in C? If it's specific to your problem i.e. not too general, it's not the craziest idea I've ever heard.
I don't mind fielding questions over the basic theory if you want to email me (iain at my domain).
I think the best advice I could give you is to be aware that what you're hoping for -- an algorithm with performance bounds tight enough that you can use it in a device driver -- probably doesn't exist. There is some chance, admittedly, that you will get lucky and find some structure in the problem that you can exploit to produce such an algorithm. But it's also entirely possible that there isn't any such structure, and the fact that you've resorted to brute force search to this point suggests there may not be.
Even if there isn't, there is still hope for ways to search the space faster. This problem sounds like something that an SMT solver might work well on. I would give that a try.
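For a rough idea of what that could look like, here is a toy sketch in Z3's Python bindings (pip install z3-solver). Every number below (the reference clock, the VCO range, the divisor/multiplier bounds and the target) is a made-up placeholder, not taken from any real PLL:

    # Toy sketch: one multiplier stage and two divider stages in series,
    # with a range constraint on the intermediate (VCO) frequency.
    # All numeric values are placeholders.
    from z3 import Int, Solver, sat

    f_in = 25_000_000              # reference clock in Hz (assumed)
    target = 48_000_000            # desired output frequency (assumed)

    m, d1, d2 = Int('m'), Int('d1'), Int('d2')
    f_vco, f_out = Int('f_vco'), Int('f_out')

    s = Solver()
    s.add(1 <= m, m <= 64, 1 <= d1, d1 <= 8, 1 <= d2, d2 <= 8)
    s.add(f_vco == f_in * m)                              # VCO = f_in * multiplier
    s.add(600_000_000 <= f_vco, f_vco <= 1_600_000_000)   # VCO range constraint
    s.add(f_out * d1 * d2 == f_vco)                       # dividers, kept multiplicative
    s.add(f_out == target)

    if s.check() == sat:
        print(s.model())
    else:
        print("no feasible configuration")

Whether the solving time is predictable enough for a kernel driver is another matter, but it's a quick way to check that a constraint formulation even captures the problem.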
Here is a collection of open source Operations Research projects (CP and LP both fall under the Operations Research umbrella): http://www.coin-or.org/projects/
There are many fantastic open source libraries in there for handling many types of optimization or assignment problems.
First we start with what 'mathematical' programming (optimization) is. For a reasonably general definition, let R denote the set of real numbers and m and n be positive integers. Let R^n be real n-dimensional space, that is, the set of n-tuples of real numbers. Let x be in R^n.

Next we have two functions. We have an 'objective' function f: R^n --> R, and we have a 'constraint' function g: R^n --> R^m. We can think of g as m functions mapping R^n --> R. Then the optimization problem is to find x so that f(x) is as large as possible while g(x) = 0, where here 0 is the point in R^m with all components 0. Or we can ask that g(x) <= 0, by which we mean that each of the m components of g(x) is <= 0 in R. In 'linear programming', functions f and g are both linear. So, function f is given by just n coefficients, and function g is given by an m x n matrix.

If there exists x in R^n so that g(x) = 0 (or <= 0 for that version of the problem), then x is a 'feasible' solution. If x is feasible and for any feasible solution u in R^n we have that f(x) >= f(u), then x is an 'optimal' solution.
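To make those shapes concrete, here is a tiny numerical sketch of the linear case, assuming SciPy is available; the numbers are arbitrary, and linprog minimizes, so the objective is negated to maximize f(x):

    # f is given by n coefficients c; g is given by an m x n matrix A,
    # with g(x) = A x - b <= 0. linprog minimizes, so pass -c to maximize.
    import numpy as np
    from scipy.optimize import linprog

    c = np.array([3.0, 2.0])            # objective coefficients (n = 2)
    A = np.array([[1.0, 1.0],
                  [2.0, 1.0]])          # constraint matrix (m = 2, n = 2)
    b = np.array([4.0, 5.0])

    res = linprog(-c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
    print(res.x, -res.fun)              # optimal here: x = (1, 3), f(x) = 9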
Linear programming where we ask that all the components of x be integers is 'integer' linear programming (ILP) and is NP-complete. Industry is awash in important ILP problems, e.g., airline scheduling. There is an enormous literature on practical means of solving ILP problems; Google Nemhauser, long at Georgia Tech.

The problem may fail to be feasible. If the problem is feasible, it may fail to have an optimal solution.
Constraint programming is the same except we have no objective function and are looking for just feasible solutions.

Since in principle finding a feasible solution is as hard as finding an optimal solution, constraint programming is essentially the same subject as mathematical programming (optimization).

Constraint programming is nice when a planner is just trying to get some work done and knows that the cost will be the same for all alternatives. E.g., maybe yesterday they killed 5000 hogs, overnight chilled them down, today are cutting them into pieces and putting the pieces into boxes, and during the day want to back 18-wheel refrigerated trucks into the small loading dock in the right order to get all the boxes loaded and each truck with the boxes it needs for its deliveries. Once a real problem.
SAP in Germany has wanted to offer constraint programming as part of its software suite. Since constraint programming is essentially the same as mathematical programming (mostly just different in how the user sees it and, thus, the user interface), at one time SAP worked with CPLEX, which has some of the best software for linear and integer programming, the company of R. Bixby, long at Rice U.
The xkcd restaurant problem is a special case of the knapsack problem, where we have some items to carry, a value for each, a weight for each, and a knapsack with a maximum weight we can carry, and we want to know which items to put in the knapsack to maximize the value without exceeding the maximum weight.

For the restaurant problem, for each item on the menu, just set both the value and the weight to the price, and for the maximum weight of the knapsack just use the budget.

Now, knapsack problems are NP-complete, as I recall, but they are, in practice, among the easiest to solve. The usual recommendation for solving a large knapsack problem is via dynamic programming.
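If it helps, here is a minimal sketch of that reduction and the dynamic-programming recurrence in Python (prices in cents, and items may be taken more than once since you can order an appetiser repeatedly):

    # Unbounded knapsack with value = weight = price (in cents) and the
    # budget as the capacity, as described above.
    prices = [215, 275, 335, 355, 420, 580]   # menu prices from the comic
    budget = 1505

    best = [0] * (budget + 1)                 # best[w] = max value within capacity w
    for w in range(1, budget + 1):
        for p in prices:
            if p <= w:
                best[w] = max(best[w], best[w - p] + p)

    # The budget can be spent exactly iff best[budget] == budget.
    print(best[budget] == budget)             # True for this instance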
Something is missing from this discussion. The sample problem in the xkcd comic is from a class of NP-complete problems, but being small, it can be solved in "pseudo-polynomial time"[1] by a variety of tools, as demonstrated in these comments.
Even though this simple example can be solved, the general problem (from a set of N integers, find all subsets that add up to a given integer x) is nonetheless NP-complete. The method of computation (Excel, Prolog, constraint programming, Turing machine) does not change this fact.
Why 0..10? I think it would be instructional to mention how you chose these domains. Is minizinc smart enough to only go as high as necessary? If you had put 0..1000, would it have tested all 1000^6 combinations, or is it smart enough to ignore fruit > 7 and salad > 2?
There are three phases to Constraint Programming: (1) Model (2) Constraints (3) Search & Propagation.
The first step in solving a problem with CP is to come up with a representation for a solution. This involves deciding what your variables will represent (in this case, "How many orders of each appetiser") and what is their possible domain of values (in this case, "0 to 10" has been chosen, quite arbitrarily).
The solver itself is generally dumb. It will propagate some constraints sensibly but mostly it will just search over the entire space - equal to the size of the domain to the power of the number of variables. 11^6 = 1.8 million, quite a small CP problem.
There's a bit of an art in creating a good model for a CSP (Constraint Satisfaction Problem) to keep the search space as small as it needs to be.
EDIT: Just to be clear...
The way constraint solvers work is to make decisions in a depth-first-search manner and judge the validity of the new state after each choice. So after fruit > 7 and salad > 2, bounds consistency on the constraint will rule out the possibility of reaching a solution from that node in the search tree.
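A toy version of that search in Python, for the same appetiser problem (domain 0..10 for each count, prices in cents); the only "propagation" here is a crude bound check, but it shows the decide / check / backtrack loop:

    prices = [215, 275, 335, 355, 420, 580]
    target = 1505

    def search(i, spent, counts, solutions):
        if spent > target:                    # prune: no solution below this node
            return
        if i == len(prices):
            if spent == target:
                solutions.append(tuple(counts))
            return
        for n in range(11):                   # try each value in the domain 0..10
            counts.append(n)
            search(i + 1, spent + n * prices[i], counts, solutions)
            counts.pop()

    sols = []
    search(0, 0, [], sols)
    print(len(sols), sols)                    # all orders hitting the total exactly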
> The solver itself is generally dumb. It will propagate some constraints sensibly but mostly it will just search over the entire space - equal to the size of the domain to the power of the number of variables
This is false. The cornerstone of solving interesting constraint programming problems is the ability to effectively propagate constraints. Naturally, since constraint programming models NP-complete problems, a large part of all problems will take exponential time. However, when propagation does not help and the solver effectively searches through all combinations, that is a failure mode. It is precisely the cases where propagation reduces the amount of combinations explored that are interesting.
From the similarity of our answers below I see we've got a similar understanding of constraint programming.
For loosely constrained problems, yes, constraint programming will enumerate exponential combinations and that's not a good fit for this methodology. The point I was trying to guard against is that the solver isn't smart, and therefore, when asked "Is it smart enough to know X?", I'd say it depends.
If the solver in question has some decent bounds propagation on the six or so binary arithmetic constraints in the problem, then no, it won't enumerate all 1001^6 combinations (assuming domains of size 1001).
A more helpful answer from me would have been, "It's unknown if it's smart enough, but the more constrained i.e. the more combinations your constraints reject, the better".
Seeing as there were four solutions out of 1.8 million states (in the domains of size 11 example), that's very constrained.
I think it is a mistake to try to characterize when CP works based on constrainedness or on the number of solutions. In fact, being loosely constrained is typically used to imply that a problem is easy. The problem for CP is when a problem is NOT loosely constrained, but propagation is weak. This means that whatever reasoning we can do with the set of constraints, we cannot express as value prunings.
As for the number of solutions vs constrainedness: as far as I know, the only domain in which there exists a clear theoretical understanding of number of solutions vs difficulty is random problems, where we see an "easy-hard-less hard" pattern as we increase the density for a fixed size problem. What happens there is that there is an exponential number of solutions in the easy region which form a small (polynomial) number of very big clusters. As we get to the hard region, the number of solutions remains exponential, but the solution space shatters into an exponentially large number of clusters. The definition of a cluster in this case is: if two solutions have constant Hamming distance, they are in the same cluster. On the other hand, some unsatisfiable problems (so 0 solutions) are very, very hard for CP and related approaches.
All this is just to say that the picture is much more complicated than you say and the best way to figure out if CP is a good fit for an application is to try it. People develop intuition over time of course, but the issue is far from understood.
The solver I work on (Minion), which is not optimised for arithmetic problems, or easy problems, does 86 search nodes when finding all 4 solutions. It makes no difference if the domains are 0..10 or 0..100000, in terms of either speed or search size. I would be amazed if any CP solver doesn't tighten the bounds straight away to something like:
Yup, pure laziness as it was a toy problem. Though in my experiments with a Gecode sudoku model its propagation was amazing, reducing the search space to a handful of branches. Another toy problem of course, but one of a significantly larger size. Am I overestimating the power of propagation?
Propagation power really depends on the problem. For some problems, propagation is amazingly good, and for some it does not do much at all.
One can characterize propagation algorithms for the amount of propagation they do. A domain consistent propagation algorithm will make all deductions that are possible to make in a certain state, removing all values for variables that are not a part of any solution to that constraint. Loosely, a bounds consistent propagation algorithm will do the same, but only for the bounds of the variables.
The constraint used in Sudoku (called all_different or distinct) has really good propagation algorithms (both bounds and domain consistent), and is one of the cornerstones of constraint programming.
No, you're not overestimating it but it very much depends on the problem and constraints. It's good to do the d^n search space check to give you a feel for the starting size of the problem but if your problem is heavily constrained i.e. the constraints make finding a solution hard, then propagation is very powerful.
Consider variables a[1:1000] and b[1:10] and constraint a + b = 10. A good solver will use propagation to reduce that to a[1:9] and b[1:9] because there is no support for a >= 10 or b >= 10. Even reducing one extra value from a variable's domain is reducing the search space by a factor so propagation over tight constraints is extremely powerful.
Most arithmetic constraints like those used in your program will use bounds consistency (find support for the lower and upper bound values). Special constraints such as "all different" have their own propagation algorithms that efficiently remove potential values from the domains of multiple variables. These are known as Global Constraints.
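A hand-rolled illustration of that bounds reasoning (not any particular solver's API), using the a + b = 10 example from above:

    def propagate_sum(a, b, total):
        """a, b are (lo, hi) bounds on two variables constrained by a + b = total."""
        a_lo, a_hi = a
        b_lo, b_hi = b
        # a = total - b, so clip a's bounds to total - b_hi .. total - b_lo, and vice versa
        a_lo, a_hi = max(a_lo, total - b_hi), min(a_hi, total - b_lo)
        b_lo, b_hi = max(b_lo, total - a_hi), min(b_hi, total - a_lo)
        if a_lo > a_hi or b_lo > b_hi:
            return None                        # domain wipe-out: constraint unsatisfiable
        return (a_lo, a_hi), (b_lo, b_hi)

    print(propagate_sum((1, 1000), (1, 10), 10))   # ((1, 9), (1, 9))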
Yup though the solver does this for us anyway. Constraint propagation allows the solver to reduce the search space, reducing the domains of variables before traversing the tree.
It is "smart" enough. Before and then during searching the problem space, the solver performs constraint propagation, which is a bunch of clever algorithms (the maths for which I am yet to grok) that reduce the domains of variables. This allows it to reduce the problem space to something reasonable.
So why 10? I guess I was being lazy! But you are right, I will do an update with some larger domains.
You can set an upper limit of total/price. I guess he chose 0..10 since it's obvious that 10 of anything is already too much, and I think minizinc tightens the bounds anyway, since that's something it can do independently of giving a complete solution.
This is just a linear programming problem. One-liner solution in Mathematica, just for kicks:
In[1]:= Solve[
Rationalize[2.15 a + 2.75 b + 3.35 c + 3.55 d + 4.20 e + 5.80 f == 15.05] &&
a >= 0 && b >= 0 && c >= 0 && d >= 0 && e >= 0 && f >= 0,
{a, b, c, d, e, f}, Integers
]
Out[1]= {{a -> 1, b -> 0, c -> 0, d -> 2, e -> 0, f -> 1},
{a -> 7, b -> 0, c -> 0, d -> 0, e -> 0, f -> 0}}
Linear programming as it's usually defined allows for the solution to involve arbitrary real numbers, and it's that flexibility that allows general LPs to be solved in polynomial time. The restriction to integers turns this into an integer programming problem, which is NP-Complete. Though that doesn't seem to bother Mathematica (which presumably has a similar sort of constraint solver working under the hood). :-)
> So it’s not even that I figured out something clever. It’s just I have the patience to do the thing that no one would think of, or that anyone would ever bother to do. And sometimes that’s key. It’s just like I have that much free time.
Is the Z3 theorem prover an example of a CP tool? For those interested, Z3 is one of the world's fastest theorem provers and has been open sourced by Microsoft. Theorem provers have many interesting applications, but one of my favorites is using them to represent a program's state and reasoning about the exploitability of bugs. It has bindings to C, C++, .NET and Python. Check it out if it interests you:
http://z3.codeplex.com/ or try it online at http://rise4fun.com/z3py
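For the curious, here is roughly what the xkcd appetiser problem looks like in Z3's Python bindings (prices in cents to stay in integer arithmetic); the loop blocks each model found so that all solutions get enumerated:

    from z3 import Ints, Solver, Or, sat

    a, b, c, d, e, f = Ints('a b c d e f')
    s = Solver()
    s.add(215*a + 275*b + 335*c + 355*d + 420*e + 580*f == 1505)
    s.add(a >= 0, b >= 0, c >= 0, d >= 0, e >= 0, f >= 0)

    while s.check() == sat:
        m = s.model()
        print([(v, m[v]) for v in (a, b, c, d, e, f)])
        s.add(Or([v != m[v] for v in (a, b, c, d, e, f)]))   # block this solution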
Before you download Z3 and dive in, be aware that Z3 is not Open Source. It's licensed under a non-commercial license[0], which is not compatible with the Open Source definition[1]. Muddying the term Open Source will diminish its descriptive usefulness.
Z3 is based on SMT (satisfiability-modulo-theories), not constraint programming. You can probably do some constraint programming in it, but that's not what it's designed for.
We use CPLEX for LPs but I've not read much on the CPLEX CP solver. Maybe because it isn't free so the academics stick with the OSS ones? How does it compare to gecode?
One of the driving forces behind the development of Gecode was that academics used to use the precursor to IBM CP Optimizer, ILOG Solver, in research.
As for comparing Gecode with CP Optimizer, the largest difference is that IBM CP Optimizer is a closed-source commercial system, with paid support. CP Optimizer also has some (as far as I've seen) amazing automatic search heuristics, automatic symmetry breaking, and more advanced scheduling constraints. Gecode, on the other hand, is open for modifications, has nice parallel search built in, and can be embedded easily. Given a problem where both systems have the constraints, and the search heuristic is fixed, my guess is that they would be mostly equivalent in speed.
Note that CPLEX and CP Optimizer are two completely different products. AFAIK, a license key for one of them can not be used for the other. But I've never actually used either myself, so don't take my word for it.
Also, since you seem to be interested in using Minizinc for modelling, you can use other back-ends than the built-in one for solving models.
For some more code examples, see also Håkan Kjellerstrand's blog (http://www.hakank.org/constraint_programming_blog/) which, among other things, contains numerous solutions to the xkcd problem using different systems.
I have tested out using gecode as a minizinc backend as I like the look of gint. I've been skimming hakank.org but it is a little advanced for me at the moment.
Is there a way to get a minizinc model into cp optimizer?
I have no idea if it is possible to use Minizinc models with CP Optimizer, but I would guess not. It doesn't feel like a feature that would be of business interest to IBM to develop.
CP is fascinating. If people are interested, here's a write-up from a similar exploration, using Choco (a constraint solver in Java) to solve a problem on Stack Overflow:
I agree, it is! I did a light comparison[0] between a logic programming solution to sudoku (Prolog) and one using constraint programming (Scala + JaCoP). I thought Prolog was a little nicer because it was shorter and more declarative, but the JaCoP solution was still great compared to any non CP-solution.
It is not, because it is a single linear equality and all variables are >= 0. This is not just nitpicking: solving Diophantine problems is in general undecidable, but this problem can be solved in pseudo-polynomial time (linear in the sum of all coefficients)
The study of solutions to equations only in the positive integers, or other subsets of integers also falls under Diophantine equations, as far as I know.
Subset sum problems are obviously decidable since the number of values that must be tried to find a solution is finite. However, subset sum is NP-complete even when restricted to positive integers [Garey and Johnson, p. 223], so there is no known polynomial-time algorithm.
It is NP-complete in the weak sense: the reduction from, say, CNF-SAT, introduces very large coefficients (in the order of 2^n). The pseudo-polynomial algorithm has complexity linear in the magnitude of the coefficients, not in the size of their binary representation (hence pseudo). If the coefficients are small, that algorithm is efficient. See: http://en.wikipedia.org/wiki/Pseudo-polynomial_time
Does anyone know what algorithm this CP implementation is using behind the scenes? I'd be a lot more impressed if it were using dynamic programming than plain brute force.
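For comparison, here is a sketch of the pseudo-polynomial dynamic-programming approach on this instance (prices in cents); the table is indexed by the running total, which is what makes it pseudo-polynomial rather than polynomial:

    prices = [215, 275, 335, 355, 420, 580]   # in cents
    target = 1505

    # ways[t] holds every distinct combination of item counts summing to exactly t
    ways = [[] for _ in range(target + 1)]
    ways[0] = [(0,) * len(prices)]
    for i, p in enumerate(prices):            # item-major order counts each
        for t in range(p, target + 1):        # combination exactly once
            for combo in ways[t - p]:
                c = list(combo)
                c[i] += 1
                ways[t].append(tuple(c))

    print(ways[target])                       # the exact-total orders

It finds the same two orders as the Mathematica one-liner above.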