An Intro to Integer Programming for Engineers: Simplified Bus Scheduling

An Intro to Integer Programming for Engineers: Simplified Bus Scheduling (blog.remix.com)

122 points by dget 9 years ago | 23 comments

glangdale 9 years ago

Integer programming often seems magical. I remember early grad school and seeing "Optimal and Near-optimal Global Register Allocation Using 0–1 Integer Programming" by Goodwin et al., where the very crunchy problem of register allocation was solved largely by punting it to an ILP solver. Gotchas abounded, but it was eye-opening to see how many hard problems could be solved (or at least adequately approximated) in this fashion.

I think this is one of those techniques that every serious computer scientist should have in their toolbox. Someone who knew some statistics, a smattering of something like SMT (enough to drive Z3) and some linear programming could routinely resemble Gandalf at any shop that has people struggling with difficult problems.

The main thing is not to turn into a cookbook guy. I've known a few people whose ability to just build an actual algorithm to solve some interesting problem has atrophied since they reach for an ILP solver the minute anything gets difficult.

petters 9 years ago

Agreed that it seems magical. This is one of the areas where the open-source tools are way behind commercial solvers. CPLEX and Gurobi are really impressive pieces of software.

repsilat 9 years ago

Doubly magical now that computers (and solvers) are so fast. A few times I've thought, "Oh, I could reduce this to a max-flow problem, and write/dig up an algorithm to solve it" before realising it'd just be easier to write it as a linear program (or an integer program "just in case".)

And then if weird constraints come along that break the max-flow reduction, I can usually shoehorn them into the formulation.

Can't remember how to write Dijkstra's algorithm? Meh, integer programming it is. Want to write a sudoku solver and you can't be arsed with dancing links or backtracking or Prolog or... Whatever, CBC will do it. It'll be nasty and a hell of a lot slower than doing it "the right way" in a fast language, but it might not be slower than doing it in Python or JavaScript.

graycat 9 years ago

CPLEX and Gurobi -- R. Bixby.

Alphasite_ 9 years ago

What's the distinction between Integer Programming and Constraint Programming? From a cursory glance they appear to be the same solution to the same problem...

barrkel 9 years ago

Integer programming is effectively searching the boundary of a convex polygon (a polytope in higher dimensions) for the best value of an objective function that is defined in terms of the coordinates of points in the space. The inequalities are planes that divide the space into permitted and non-permitted regions.

Constraint programming generates lots of potential solutions (combinatorially) and prunes the search tree to make the large numbers tractable.
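As a minimal sketch of that generate-and-prune idea (plain backtracking, not a real constraint programming solver), the N-queens puzzle can be searched one row at a time, abandoning a partial placement as soon as it breaks a constraint. The function name `count_queens` is made up for this illustration.

```python
def count_queens(n):
    """Count ways to place n non-attacking queens, one per row, by backtracking."""
    cols, diag1, diag2 = set(), set(), set()

    def place(row):
        if row == n:
            return 1  # every row has a queen: one complete solution
        total = 0
        for col in range(n):
            # Prune: skip any square attacked by a queen that is already placed.
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            cols.add(col)
            diag1.add(row - col)
            diag2.add(row + col)
            total += place(row + 1)
            cols.remove(col)
            diag1.remove(row - col)
            diag2.remove(row + col)
        return total

    return place(0)
```

The pruning test is what keeps the tree small: whole subtrees below an attacked square are never generated.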

The intuitions behind the two techniques are quite different.

optimali 9 years ago

I strongly encourage anyone applying mathematical optimization to use Julia. Optimization is (through JuMP[1], for example) one of the areas in which Julia really stands out.

[1] https://jump.readthedocs.io/en/latest/quickstart.html

graycat 9 years ago

What the OP is describing is, except for the coding, some now classic applied math, i.e., operations research. That applied math got used much less often than one might have expected because (A) the data gathering was too much pain, expense, and botheration, (B) there was too much software to write, and writing it was too clumsy and expensive, (C) the computing was too slow and too expensive, and (D) even when you got a good solution ready for production, the production situation commonly changed so fast that the work of (A)-(C) could not be revised fast enough to keep up. And, of course, the custom, one-shot software was vulnerable to bugs. Net, in practice, a lot of projects failed. Mostly, successful projects needed big bucks and lots of unusually insightful sponsorship high in a big organization.

But, now (A)-(D) are no longer so difficult. This should be the beginning of a new Golden Age for such work.

Sure, since the results of such optimization can look darned smart, smarter than the average human, one might call the work artificial intelligence. Really, though, the work is mostly just some classic applied math now enabled in practice by the progress in computer hardware and software.

The OP is a nice introduction to vehicle routing via integer linear programming (ILP) set partitioning. For the linear programming and the case of ILP, I'll give a quick view below. But, now, let's just dig in:

Here is an explanation of the secret approach, technique, or trick that can work for vehicle routing and many other problems: The real problems can have some just awful non-linear cost functions and just absurdly tricky constraints, e.g., from labor contract work rules, equipment maintenance schedules, or something you can't do near point A around lunch. They can even involve random events that have to be handled in real time (dynamically). Yet you can still have a good shot at getting a least cost solution or nearly so. The "nearly so" part can mean saving a lot of money not available otherwise. When there is randomness, try to get the least expected cost.

So, first, the trick is to do the work in two steps.

Call the first step evaluation and the second, optimization.

From 50,000 feet up, all the tricky, non-linear, goofy stuff gets handled by essentially enumeration in the first part leaving some relatively simple data for the optimization in the second part.

In practice, this first step typically needs a lot of data on, say, the streets of a city and requires writing some software unique to the specific problem. The second step, the optimization, may require deriving some math and/or writing some unique software, but the hope is that the step can be done just by routine application of some existing optimization software.

The OP mentions the now famous optimization software Gurobi from R. Bixby and maybe some people from Georgia Tech, e.g., from George Nemhauser and Ellis Johnson (long at IBM Research and behind IBM's Optimization Subroutine Library (OSL) and its application to crew scheduling at American Airlines).

First Step.

Suppose you are in Chicago and have 20,000 packages to deliver and 300 trucks. Okay, what trucks deliver what packages to make all the deliveries on time, not overload any trucks, and minimize the cost of driving the trucks? You do have for each package the GPS coordinates and street address. And you have a lot of data on the streets, where and when traffic is heavy during the day, etc.

Okay, let's make some obvious, likely doable progress: Of those 20,000 packages, maybe there are only 15,000 unique addresses. So, for each address, bundle all the packages that go to that address. Then regard the problem as visiting 15,000 addresses instead of delivering 20,000 packages.
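The bundling step is just a group-by on the address. A minimal sketch, assuming each package arrives as a `(package_id, address)` pair (a made-up format, as are the sample values):

```python
from collections import defaultdict

def bundle_by_address(packages):
    """Group package ids by delivery address."""
    stops = defaultdict(list)
    for package_id, address in packages:
        stops[address].append(package_id)
    return dict(stops)

# Hypothetical sample data: three packages, two distinct addresses.
packages = [("p1", "12 Oak St"), ("p2", "9 Elm Ave"), ("p3", "12 Oak St")]
stops = bundle_by_address(packages)
```

From here on, each key of `stops` is one address to visit, however many packages it holds.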

So, you write some software to enumerate. The enumeration results in a collection of candidate routes, stops, and packages to be delivered for a single truck. For each of those candidates, you adjust the order in which the stops are made to minimize cost -- so here you get some early, first-cut, simple optimization. You keep only those candidates that get the packages delivered on time, meet other criteria, etc. You may have 1 million candidate single truck routes. For each of the candidates, you find the (expected) operating cost.
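A toy version of that enumeration might look like the sketch below. Everything concrete in it is an assumption for illustration: stops are points on a grid, cost is Manhattan distance from a depot at (0, 0) and back, and a single `max_cost` threshold stands in for "delivered on time" and the other criteria. Trying every subset and every ordering only works at toy sizes; a real first step would generate candidates far more selectively.

```python
from itertools import combinations, permutations

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def route_cost(order, coords, depot=(0, 0)):
    """Length of the trip depot -> stops in `order` -> depot."""
    points = [depot] + [coords[s] for s in order] + [depot]
    return sum(manhattan(p, q) for p, q in zip(points, points[1:]))

def candidate_routes(coords, max_stops, max_cost):
    """Enumerate single-truck candidates as (stop_set, best_order, cost)."""
    candidates = []
    for k in range(1, max_stops + 1):
        for subset in combinations(sorted(coords), k):
            # First-cut optimization: try every visiting order, keep the cheapest.
            best = min(permutations(subset), key=lambda o: route_cost(o, coords))
            cost = route_cost(best, coords)
            if cost <= max_cost:  # stand-in for the on-time and other checks
                candidates.append((frozenset(subset), best, cost))
    return candidates

# Hypothetical stops on a grid.
coords = {"A": (1, 0), "B": (2, 0), "C": (0, 3), "D": (0, 4)}
routes = candidate_routes(coords, max_stops=2, max_cost=10)
```

Each surviving candidate becomes one column of the table described next, with its cost in the extra top row.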

So, suppose you have n = 1 million candidate single truck routes.

Also you have m = 15,000 addresses to visit.

So, you have a table with m = 15,000 rows and n = 1 million columns. Each column is for some one candidate route. Each row is for some one address. In each column there is a 1 in the row of each address that candidate route visits and a 0 otherwise. One more row at the top of the table is, for each column, the operating cost of that candidate route.

So, you have a table of 0's and 1's with m = 15,000 rows and n = 1 million columns. You have a row with 1 million costs, one cost for each column.

Again, you have 300 trucks. So, you want to pick, from the n columns, at most 300 columns so that all the m addresses get served and the total cost of the columns selected is minimized. That is, if you add the selected columns as column vectors, you get all 1's.
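To make the table concrete, here is a small sketch with four addresses and six candidate routes. The routes and costs are invented for illustration, and `build_table` and `is_partition` are hypothetical helper names.

```python
def build_table(addresses, routes):
    """Return A as a list of rows; A[i][j] is 1 if route j visits address i."""
    return [[1 if addr in route else 0 for route in routes] for addr in addresses]

def is_partition(table, chosen):
    """True if the chosen columns add up to a column of all 1's."""
    return all(sum(row[j] for j in chosen) == 1 for row in table)

addresses = ["A", "B", "C", "D"]
routes = [{"A", "B"}, {"C", "D"}, {"A", "C"}, {"B"}, {"D"}, {"A", "B", "C"}]
costs = [4, 8, 8, 4, 8, 11]  # the extra top row: one cost per column
A = build_table(addresses, routes)
```

Picking columns 0 and 1 serves every address exactly once, so `is_partition(A, [0, 1])` holds. Columns 0 and 2 visit A twice and never reach B or D, so that selection fails.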

Second Step.

Well, consider variables x_i for i = 1 to n = 1 million. Then we want x_i = 1 if we use the route in column i and 0 otherwise. Let the cost of the route in column i be c_i. We want the total cost (TeX notation):

z(x) = sum_{i = 1}^n x_i c_i

to be minimized. So, right, we take the big table of m = 15,000 rows and n = 1 million columns and call it the m x n matrix A = [a_{ij}]. We let the m x 1 column vector b have all 1's. We regard x as n x 1 where row j, for j = 1 to n, holds x_j. Then we get the linear program

minimize z(x)

subject to

Ax = b

x >= 0

So, this is a case of linear programming.

Except in our problem we have one more constraint -- each x_i is 0 or 1, and in this case our problem is 0-1 integer linear programming.
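For a table small enough, the 0-1 problem can be solved by simply trying every selection of columns, which shows what the solver is being asked for. This is a brute-force sketch, not how a real ILP solver works; the table and costs are invented. The limit on the number of trucks from the first step would be one more constraint (the sum of the x_i at most 300), and it appears here as the hypothetical parameter `max_trucks`.

```python
from itertools import combinations

def solve_set_partition(table, costs, max_trucks):
    """Try every choice of at most `max_trucks` columns; return (cost, columns)."""
    n = len(costs)
    best = None
    for k in range(1, max_trucks + 1):
        for chosen in combinations(range(n), k):
            # Feasible when A x = b: every address is covered exactly once.
            if all(sum(row[j] for j in chosen) == 1 for row in table):
                total = sum(costs[j] for j in chosen)
                if best is None or total < best[0]:
                    best = (total, chosen)
    return best  # None when no selection is feasible

# Rows are four addresses; columns are six hypothetical candidate routes.
table = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]
costs = [4, 8, 8, 4, 8, 11]
best = solve_set_partition(table, costs, max_trucks=2)
```

With n = 1 million columns this enumeration is hopeless, which is why the rest of the discussion is about linear programming and smarter search.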

Linear Programming.

In linear programming with n variables, with the real numbers R, we go into the n-dimensional vector space R^n. The

Ax = b

x >= 0

are the constraints, and the set of all x that satisfy those is the feasible region F, a subset of R^n.

In R^n, a closed half space is a plane (a hyperplane) and everything on one side of it.

Then F can be regarded as an intersection of finitely many closed half spaces: each equation in Ax = b is the intersection of two of them, and each x_j >= 0 is one more. So, F has flat sides, straight edges, and some sharp points (extreme points).

Well, if there is an optimal solution, then there is an optimal solution at one or more of those extreme points. So, the famous Dantzig simplex algorithm looks for optimal solutions in iterations where each iteration starts at an extreme point, moves along an edge, and stops at the next extreme point. That's for the geometric view; the algebraic view is a tweak of the standard Gaussian elimination algorithm.
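The extreme point fact can be checked by hand in two dimensions. The sketch below is not the simplex algorithm: it just intersects every pair of constraint boundaries, keeps the feasible intersection points, and evaluates the objective at each. The constraints and objective are made up, and the region is written with inequalities (a*x + b*y <= c) rather than in the Ax = b form used above.

```python
from fractions import Fraction
from itertools import combinations

def extreme_points(constraints):
    """Vertices of {(x, y): a*x + b*y <= c for each (a, b, c)} in the plane."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet in a single point
        # Cramer's rule for the point where both boundaries hold with equality.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return points

# x >= 0, y >= 0, x + y <= 4, x + 3y <= 6, each written as a*x + b*y <= c.
constraints = [(-1, 0, 0), (0, -1, 0), (1, 1, 4), (1, 3, 6)]
vertices = extreme_points(constraints)
# Minimize z = -x - 2y by looking only at the extreme points.
best = min(vertices, key=lambda p: -p[0] - 2 * p[1])
```

The simplex algorithm avoids listing all the vertices. It walks from one to a neighboring one along an edge, only in directions that improve the objective.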

Linear programming and the simplex and other algorithms have a huge collection of nice properties, including some surprisingly good performance both in practice and in theory.

But asking for each x_i to be an integer is, in principle and usually in practice, a huge difference and gives us an NP-hard problem (the decision version is NP-complete) -- at one time, this was a huge, bitter surprise.

Warning: At one time, the field of the applied math of optimization in operations research sometimes had an attitude, that is, placed a quasi-religious importance on optimal solutions and was contemptuous of any solutions even 10 cents short of optimal. Well, that attitude was costly for all concerned. Instead of all the concentration on saving the last 10 cents, consider saving the first $1 million. Commonly in practice, we can get close to optimality, may be able to show that we are within 1% of optimality, so close the rest wouldn't even buy a nice dinner, and see no way in less than two weeks more of computer time to try to get an optimal solution.

So, concentrate on the big, fat doughnut, not the hole.

How to solve ILP problems is a huge subject -- e.g., can start with George Nemhauser -- but a major fraction of the techniques exploit some surprisingly nice properties of the simplex algorithm. Right, likely the best known approach is the tree search technique of branch and bound.
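Here is a minimal sketch of branch and bound on a toy set partitioning problem. One substitution should be stated plainly: a real solver gets its bound from the linear programming relaxation, usually via the simplex algorithm, while this sketch uses a cruder lower bound (the cheapest cost per address that any route offers, summed over the addresses still uncovered). The routes, costs, and function name are invented, and the truck limit is left out to keep it short.

```python
def branch_and_bound(routes, costs, addresses):
    """Minimum-cost exact cover of `addresses` by `routes` (sets), by tree search."""
    # Cheapest cost per address that any route offers for each address. Summed
    # over the uncovered addresses, this never overestimates the remaining cost.
    share = {
        a: min((c / len(r) for r, c in zip(routes, costs) if a in r),
               default=float("inf"))
        for a in addresses
    }
    best = {"cost": float("inf"), "chosen": None}

    def search(uncovered, chosen, cost_so_far):
        if not uncovered:
            if cost_so_far < best["cost"]:
                best["cost"], best["chosen"] = cost_so_far, list(chosen)
            return
        # Bound: prune when even the optimistic estimate cannot beat the incumbent.
        if cost_so_far + sum(share[a] for a in uncovered) >= best["cost"]:
            return
        # Branch: some route must cover the first uncovered address.
        target = min(uncovered)
        for j, route in enumerate(routes):
            if target in route and route <= uncovered:
                chosen.append(j)
                search(uncovered - route, chosen, cost_so_far + costs[j])
                chosen.pop()

    search(frozenset(addresses), [], 0)
    return best["cost"], best["chosen"]

addresses = ["A", "B", "C", "D"]
routes = [{"A", "B"}, {"C", "D"}, {"A", "C"}, {"B"}, {"D"}, {"A", "B", "C"}]
costs = [4, 8, 8, 4, 8, 11]
result = branch_and_bound(routes, costs, addresses)
```

On this data the first complete selection found costs 12, and the bound then cuts off the two other branches from the root without exploring them. A tighter bound, such as the one from the LP relaxation, prunes more.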

techwizrd 9 years ago

I took a number of Operations Research courses in undergrad, and one course was almost entirely on Linear Programming and Integer Programming. I strongly recommend engineers to get familiar with them because they are powerful tools for solving a whole class of really tricky problems.

ninjamayo 9 years ago

Nice post but sorry guys, never been a fan of integer programming. Too involved with genetic algorithms.