Hello everybody, welcome back to our unit on Linear Programming. Today, we're finally going to get to an actual algorithm for solving linear programs. In particular, we're going to talk about the simplex method, which is basically the oldest algorithm for solving linear programs, and as it turns out, it's still one of the most efficient. Now unfortunately, as we'll see, the runtime of this algorithm isn't quite as good as we would like, but it's still pretty reasonable in many contexts. So first off, remember from our lecture last time that this algorithm solves a specific formulation of linear programming: in particular, it solves optimization from a starting point. So if you want to use this method to solve a full optimization problem, you'll have to remember how to reduce to that formulation based on the last lecture. So, what's the idea here? We start at some vertex of our polytope, and we know that the optimum is at another vertex. So we're going to find a path between vertices such that, as we go, our objective gets better and better until we reach the optimum. So, how do vertices work? Well, you get a vertex of your polytope when you look at the intersection of n of the defining equations. You take n of your defining inequalities and make them all tight; they intersect at a vertex, and you can solve for that vertex using Gaussian elimination. Now if you relax one of these equations, then instead of a zero-dimensional solution set you have a one-dimensional one: you get an edge, a set of points of the form p + tw. The constraint that you relaxed now requires that t be non-negative. So this gives you an edge, which continues from t equals zero up until you violate some other constraint in your linear program, and there you get another vertex at the other end of the edge.
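To make this concrete, here's a small sketch in Python of solving for a vertex and an edge direction. The square polytope, the choice of tight constraints, and the variable names are all illustrative assumptions, not anything specific from the lecture:

```python
import numpy as np

# Hypothetical 2D example: the unit square 0 <= x <= 1, 0 <= y <= 1,
# written as A @ x <= b.  A vertex lies where n = 2 constraints are tight.
A = np.array([[1.0, 0.0],    # x <= 1
              [0.0, 1.0],    # y <= 1
              [-1.0, 0.0],   # -x <= 0, i.e. x >= 0
              [0.0, -1.0]])  # -y <= 0, i.e. y >= 0
b = np.array([1.0, 1.0, 0.0, 0.0])

# Make constraints 0 and 1 tight and solve the resulting square system
# (this is the Gaussian-elimination step from the lecture).
tight = [0, 1]
p = np.linalg.solve(A[tight], b[tight])
print(p)  # the vertex (1, 1)

# Relax constraint 1: the solutions of the one remaining tight equation
# A[0] @ x = b[0] form a line p + t*w, with w in its null space
# (in 2D, just the vector perpendicular to row 0).
w = np.array([-A[0, 1], A[0, 0]])
# Orient w so the relaxed inequality holds exactly for t >= 0.
if A[1] @ w > 0:
    w = -w
print(w)  # direction along the edge x = 1, heading back into the square
```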
Now if v dot w is bigger than 0, where v is the objective vector, then following this edge gives a larger value of the objective at the new vertex. So here's the pseudocode for the simplex algorithm; it's actually pretty easy. You start at a vertex p and repeat the following. For each equation passing through p, you relax that equation to get an edge. If traveling along that edge improves your objective, you replace p by the vertex at the other end of that edge, break, and go back to the loop over equations through p. If, however, there was no improvement — you tried every edge going out of p and none of them did any better — then you actually know that you're at the optimal vertex, and you return p. Now, what does it mean to go to the other end of this edge? It's basically what we said before. Your vertex p was defined by n equations. You relax one equation, and now the general solution of the remaining n − 1 equations is the set of points of the form p + tw over real numbers t; you solve for this using Gaussian elimination. The inequality that you relaxed requires, say, that t be at least 0, and each other inequality in the system might put further bounds on which t give feasible points. Some of them might put upper bounds on t: they might say t is at most 7, or at most 10, or whatever. Of those upper bounds you take the smallest and call it t0. Then p + t0·w is the vertex at the other end of that edge. Now, to show that this is correct, we need to prove the following theorem: if p is a vertex that is not optimal, then there is some vertex adjacent to p that does better. This means that we will keep finding better and better vertices until we find the best one, the optimum, and then we can return that. Now the proof of this isn't so hard. You've got a vertex p; it's the intersection of n equations, E1 through En.
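The loop just described can be sketched in Python along the following lines. This is only a rough illustration under simplifying assumptions (the LP is A x ≤ b maximizing c · x, every vertex is non-degenerate, n ≥ 2), and all names here are invented for the sketch:

```python
import numpy as np

def simplex_step(A, b, c, tight):
    """One step of the simplex method: from the vertex defined by the
    n tight constraints in `tight`, try relaxing each one in turn."""
    n = A.shape[1]
    p = np.linalg.solve(A[tight], b[tight])  # Gaussian elimination
    for k in range(n):
        # Relax the k-th tight constraint; the other n-1 equations
        # have solution set p + t*w, with w spanning their null space.
        rest = [tight[i] for i in range(n) if i != k]
        w = np.linalg.svd(A[rest])[2][-1]
        if A[tight[k]] @ w > 0:     # orient w so t >= 0 stays feasible
            w = -w
        if c @ w <= 1e-9:           # this edge doesn't improve the objective
            continue
        # Find the smallest upper bound t0 imposed by the other constraints.
        t0, entering = np.inf, None
        for j in range(A.shape[0]):
            if j in tight:
                continue
            aw = A[j] @ w
            if aw > 1e-9:           # constraint j bounds t from above
                t = (b[j] - A[j] @ p) / aw
                if t < t0:
                    t0, entering = t, j
        if entering is None:
            raise ValueError("objective unbounded along this edge")
        return rest + [entering], p + t0 * w  # the adjacent, better vertex
    return None, p                  # no improving edge: p is optimal

# Maximize x + y over the unit square, starting at the origin,
# where constraints 2 and 3 (x >= 0, y >= 0) are tight.
A = np.array([[1., 0.], [0., 1.], [-1., 0.], [0., -1.]])
b = np.array([1., 1., 0., 0.])
c = np.array([1., 1.])
tight = [2, 3]
while True:
    new_tight, p = simplex_step(A, b, c, tight)
    if new_tight is None:
        break
    tight = new_tight
print(p)  # the optimal vertex
```

On this toy square the walk visits the origin, then one side corner, then the optimum, matching the one-edge-at-a-time picture from the lecture.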
Now, to prove that p is optimal, what you'd like to do is use the dual program: you want to find a positive linear combination of E1 through En that gives an upper bound, x dot v is at most something. Of course, you can't always do this, but it is basically always the case that you can write x dot v less than or equal to something as some linear combination of these constraints, possibly using negative coefficients. Now, if all the coefficients are non-negative, we have an actual solution to the dual program certifying that p is optimal. If, however, some coefficient is negative, you can show that relaxing the equation with that negative coefficient gives you an edge along which the objective gets better. And that proves the theorem: if we're not at the best point, then we can find an edge to follow that does better. Okay, so how long does the simplex algorithm take? Well, taking one step on this path isn't so bad; it's a nice polynomial-time operation involving trying a bunch of edges and doing some Gaussian elimination. But we do have to follow the whole path from wherever we started to the optimum, and how long the algorithm takes will depend on how long that path is. Unfortunately, the path might be somewhat long. So suppose that our polytope is the sort of almost-cube-like thing defined by this linear system. We're trying to optimize the height; we're trying to go up as far as possible. We start at the marked vertex and use the simplex method to travel to other vertices, increasing the height as we go. Now the question, of course, is: what's the largest number of steps that we might need for the simplex algorithm to complete on this example? Well, unfortunately, it might take as many as seven steps to find the optimum.
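The dual certificate described above has a simple computational form: express the objective vector as a combination of the tight constraint rows and check the signs of the coefficients. A minimal sketch, reusing the hypothetical unit-square example (the names and numbers are assumptions for illustration):

```python
import numpy as np

# At the vertex (1, 1) of the unit square, the tight constraints are
# x <= 1 and y <= 1.  Writing the objective c as A_tight.T @ y gives
# the candidate dual coefficients.
A_tight = np.array([[1., 0.],   # x <= 1
                    [0., 1.]])  # y <= 1
c = np.array([1., 1.])          # objective: maximize x + y

y = np.linalg.solve(A_tight.T, c)
print(y)
# All coefficients non-negative: a dual solution certifying optimality.
# A negative coefficient would instead name a constraint worth relaxing.
print(all(y >= 0))
```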
Because it's possible that we take the path shown, which actually passes through every single vertex of the polytope. Now some paths are better — other paths could take as few as three steps — but seven is big. And, in fact, in an n-dimensional version of this, you might have only 2n inequalities and yet take two to the n steps to get where you're going, and this really isn't so good. So it turns out the runtime of simplex is proportional to the path length. Now, in practice the path length is very often very reasonable; however, there are some unusual examples where the path length is actually exponential, and this means that the simplex algorithm sometimes takes quite a long time to finish. Now there's one other technical problem: degeneracy. In this analysis, we assumed that only n hyperplanes intersect at a vertex, and that's not always the case. For example, in the picture that we have here, we've got a pyramid, and the apex of that pyramid lies on four of the defining hyperplanes, even though we're in only three dimensions. If you have some of these degenerate vertices, it's actually a little bit hard to run the algorithm, because we don't know which equation we're supposed to relax in order to follow an edge — we'd actually have to relax two of our equations to get an edge, and we wouldn't know which ones to use. So there's a fix for this, which is not very hard: if you take all of your equations and tweak them by just a tiny, tiny bit, this basically doesn't change your solution at all, but it avoids having any of these degenerate intersections. Now, in fact, if you're willing to be a little bit more sophisticated, you can run a version of this fix that doesn't involve actually changing anything: you can make infinitesimal changes, changes that are sort of only formally there. So you number your constraints 1, 2, 3, 4, 5, etc.
You strengthen the first constraint by epsilon, the next by epsilon squared, the next by epsilon cubed, and so on, where epsilon is just some incredibly tiny number, so tiny that it just doesn't matter for anything. Then in practice, when you want to solve the system, you don't actually need to change any of your equations. If you're at a degenerate point, you need to keep track of which n equations you are really on — which are formally the n equations defining this point. And when you're travelling along an edge that hits a degenerate point, you need to figure out which new equation you're actually on. For this, you should always add the lowest-numbered constraint at the new corner: if you hit a new corner that has, say, three candidate hyperplanes passing through it, you pick the lowest-numbered of them to be the real one. Now when you do this, you might in fact have some edges that pass from a degenerate corner to itself, and that's fine, as long as you keep track of which n hyperplanes you're formally on at any given time, and you make sure you only actually use such an edge if it improves your objective — if the edge that you followed, which had zero length, was at least in a direction that made things better for you. But when you do this, we have a nice algorithm called the simplex method. It solves linear programs by moving between adjacent vertices, trying to hit an optimum. It actually works pretty well in practice, but it is potentially exponential time. And if you're really worried about that, come see the next optional lecture that we're doing on the ellipsoid algorithm, which is a much more modern technique for solving linear programs.
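To see how the perturbation splits a degenerate corner apart, here's a small numerical sketch. The pyramid, the choice of epsilon, and the variable names are assumptions for illustration; the lecture's version keeps epsilon purely formal rather than picking an actual number:

```python
import numpy as np
from itertools import combinations

# A degenerate 3D vertex: the apex of a pyramid lies on 4 hyperplanes,
# even though only n = 3 are needed to define a vertex.
A = np.array([[ 1.,  1., 1.],
              [-1.,  1., 1.],
              [ 1., -1., 1.],
              [-1., -1., 1.]])
b = np.array([1., 1., 1., 1.])  # all four planes pass through (0, 0, 1)

# Without perturbation, every choice of 3 tight rows gives the SAME point:
apex = [np.linalg.solve(A[list(s)], b[list(s)])
        for s in combinations(range(4), 3)]
print(apex[0])  # and apex[1], apex[2], apex[3] are all the same point

# Strengthen constraint i by eps^(i+1): the four triples now give four
# distinct nearby vertices, so no corner is degenerate anymore.
eps = 1e-3
b_pert = b - eps ** np.arange(1, 5)
split = [np.linalg.solve(A[list(s)], b_pert[list(s)])
         for s in combinations(range(4), 3)]
```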