[MUSIC] Welcome back to the course on approximation algorithms. In this chapter, we're going to study a new problem, the multiway cut problem, and develop a new interesting linear program and relaxation and use it to design an original version of randomized rounding. What is the multiway cut problem? It's defined by its input and its output. The input is a graph where the edges of the graph has some weights, and in this graph, there's some special vertices called terminals. We have k of them, where k is fixed. Maybe two, maybe three, maybe six as in the case of this picture. Six special vertices called terminals. The goal is to cut those terminals from one another. How? But cutting edges. So, the output is a subset of edges, subset F, such that if you removed those edges from the graph, all the terminals are disconnected from one another, and your goal is to minimize the weight of the edges you cut. So, that's the multiway cut problem. There's a special case which is quite familiar. When k is equal to 2, k=2, 2 terminals, a and b. Disconnect a from b using edges of minimum weight. That's a cut, a cut disconnecting a from b, min cut. That's the famous min cut problem, which, as we know, is polynomial time, can be solved in polynomial time. Next case, k = 3. Three terminals, a, b, c, that must be disconnected from one another by a minimum weight set of edges. That problem is already NP-hard, already for k = 3. This is not really surprising because it's one of many cases of problems in minimal optimization where the case k = 2 is simple, and k = 3 or more is hard and NP-hard to solve exactly. So here's a simple algorithm for k = 3. We take our three terminals, a, b, c. Let's compute the minimum way to cut vertex a from vertices b and c. The minimum way to disconnect a from b and c. Here it is. If you cut the graph this way, this is a minimum way, minimum weight set of edges that disconnects the graph. It's a partition of the vertices into two subsets, such that the edges across to cut have the minimum weight, and it disconnects vertex a from vertices b and c. Second, we solve the same problem, isolating vertex b. Find the minimum cut separating vertex b from vertices a and c. A partition on the graph into vertices on the b side and the vertices on the a or c side. Third, similarly, you compute the minimum cut separating vertex C from the other two vertices. The other two terminals. And now what do you do? You take this three cuts, you choose the two cheapest ones and you output their union. Here it is. In this case, it turns out that the red and the purple cuts are the cheapest of the three, and so, this is your solution. Okay, so that is the algorithm. Very simple. Now, is this algorithm a working algorithm? Can it be solved in polynomial time? Let's see. We need to find as a sub routine a way to separate vertex a from vertices b and c in polynomial time. This can be done easily by a reduction to the mincut problem. Here is how. You create a new vertex called s. And other new vertex called t. You connect s to a with an edge of infinite weight. You connect vertices b and c to t, with edges of infinite weight. In this new graph, you solve the mincut problem, mincut separating s from t. This minimum cut will never use the infinite with edges because that could be too costly. So it will use only edges from the inside of the graph, and so it will give you a solution that has a on the same side as s. b and c on the same side as t, therefore it will separate a from bc. In fact, the cuts are in exact one to one correspondence. The cuts from s to t and from a to bc. So, we can solve this in polynomial time. Is the output correct? Is it a three way cut? Does it separate the three terminals from one another? Well, let's see. Here we take the red and the purple solutions. The red cut separates vertex a from both b and c. So now, there's no connection from a to b and no connection from b to c. There could still be connections between b and c, but thanks to the purple cut that separates c from the other two vertices, c is disconnected from b. And so the solution is a three-way cut. It is a correct solution to the problem. Finally, how good is it? What is the cost of the output? We said over three cuts separating a from bc, b from ac, and c from ab, the output cost is the two cheapest of these three terms. So its cost is at most, two-thirds of the sum. The cost of the output of this simple algorithm is at most two-thirds times the sum of the mincuts. So, we have upperbounded the cost of the output of the algorithm. To continue with the analysis as usual, what we need to do is lowerbound the cost of OPT. Consider OPT. OPT is greater than or equal to the mincut from a to bc. Why? Because the absolution separates all three terminals from one another. So it does separate terminal a from the other two terminals. So it is a feasible cut from a to bc, and therefore, the value of OPT is at least the mincut from a to bc. Similarly, it's at least the mincut from b to ac. And at least the mincut from c to ab. So, if it's at least each of these three terms, then it's at least their average. At least one-third of the sum of the mincuts. The output is at most two thirds of the sum of the mincuts. OPT is at least one third of the sum on the mincuts, therefore the algorithm is a two approximation. We have proof that our solution is a two approximation. Can you do better? You can do better if you do your analysis a little bit more carefully. Let's see. Let's go directly to an extension for k general. We have k terminals, a1, a2, ak. The natural extension of the algorithm is to output the k-1 smallest of all k mincuts where the ith one separates terminal i for all the other terminals. Again, here again, this can be done in polynomial time, because to compute the mincut separating terminal ai from the other terminals, we connect, or solve s with an infinite edge to ai. Sync t with infinite edges to all the other terminals. And then we solve mincut from s to t. So this is a polylimal time algorithm that gives you k way cut, a meltaway cut. And what we need to do is analyze it's value. Let's see. Similarly, we output the k- 1 smallest of k cuts. The cost of this output can be upper bounded by what it would be on average if we had a random choice of k- 1 cuts among k. And so, the cost of the output is at most k minus 1 over k, times the sum of the mincuts. The sum of over every i of the value of the mincut separating terminal ai from all of the other terminals. That's the upper bound on the cost of the output. Now let's do a lower bound on the cost of OPT. Let's analyze OPT. The optimal solution, I claim, defines some cuts separating for each I, terminal AI, from all the other terminals. Indeed consider an edge e, which is part of the OPT solution. Why is e part of OPT? Because if we left e in the graph, then two terminals would be connected. There would be two terminals that would be connected. That's why OPT has to remove edge e. So it must be that, edge e, once we remove it, it separates some terminal ai from some terminal aj. Next, put this at e at a set of edges F(I) and in a set of edges F(J). Each edge of OPT this way, we put it into two sets, F(I). In F(j). If this edge connects, if edge e connects a vertex which is in the cluster of ai, with a vertex that is in the cluster of terminal j. That is the definition of F(i) and F(j). In other words, define F(i) as the set of edges of OPT that separate terminal ai from some terminal aj for j different from i. Every edge e in OPT belongs to two sets, F(i) and F(j). So, if we sum the cost of all these F(i)'s, every edge appears twice. So, we get exactly 2 times OPT. So this gives us a way to express the value of OPT, 2 times OPT is equal to the sum over i of the cost of F(i). Moving on. How do we analyze the cost of F(i)? F(i). F(i) consists of all the edges of OPT that separate terminal ai from some other terminal. So F(i) separates ai from the rest of the terminals. And so the cost of F(i) is at least the minimum cost to separate ai from the rest. So the cost of (Fi) is greater than or equal to the Mincut. Mincut of ai from the other terminals. Let's do this, take this and sum now. When we sum, we get 2 times OPT, which is the sum of the cost of fi is at least the sum over i of all the min cuts. 2 times OPT is at least the sum over i of the mincut separating terminal ai from the other terminals. Now all we have to do is combine this with our upper bound on the cost of the output of the algorithm. The cost of the output is at most k minus 1 over k times the sum of mincuts. That, we proved. Now this sum of the mincuts is at most 2 times OPT. So what does this give you? It shows that the algorithm is a 2 times 1 minus 1 over k approximation. The algorithm is an approximation with an approximation ratio of 2 times 1 minus 1 over k. K = 2. What does this give us? 2 times 1 minus one-half. That's 1. 1 times OPT. The algorithm is optimal. Indeed, it's just computing a mincut. K = 3. 1 minus 1 over k, that's 1 minus one-third. That's two-thirds. 2 times two-thirds, this is a four thirds approximation. So actually, the knowledge assessed earlier was not right. Actually, by doing this more careful analysis, we were able to prove that the algorithm is a four thirds approximation for k = 3. And a 2 times 1 minus 1 over k approximation for k general. Now what we're going to do is we're gonna try to design a better algorithm relying on an interesting LP relaxation.