0:06

Now we will look at Kruskal's algorithm for computing the MST. This is an old algorithm that dates back over 50 years, but it is an effective way to solve the problem.

The idea is very simple. We are going to take the edges, sort them by weight, and consider them in ascending order of weight. The algorithm is simply: add the next edge to the tree, unless that edge would create a cycle. If the edge creates a cycle, ignore it and go on to the next one. Let's look at a demo of Kruskal's algorithm.

Okay. So we have the edges sorted by weight, and we're going to take them in ascending order of weight. The smallest edge is in the MST, and so we add that to the tree; it doesn't create a cycle. The next smallest edge is 2-3. That doesn't create a cycle. The next one is 1-7; that's fine. The next one is 0-2. The next one, 5-7.

Now when we get to 1-3, that one creates a cycle, and the algorithm says: if it creates a cycle, just ignore it. The next one is 1-5; that one also creates a cycle. Again, just ignore it. 2-7 creates a cycle, so it's not in the MST. 4-5, the next smallest edge, does not create a cycle, so we add it to the MST. 1-2 creates a cycle. 4-7 creates a cycle. 0-4 creates a cycle. And finally we get to 6-2, which does not create a cycle, so we add that to the tree.

At this point we have an MST. We could check that we have V - 1 edges and stop, or we can just keep going and see that the other edges in the graph each create a cycle. And when we're done considering all the edges, then we have computed the minimum spanning tree. That's Kruskal's algorithm.
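The trace just narrated can be checked in code. Below is a Python sketch (the lecture's code is in Java; the edge weights here are an assumption, taken from the standard tinyEWG example used in this course, since the transcript names only the edges and their ascending order):

```python
# Kruskal trace on the demo graph. Weights are assumed from the
# course's tinyEWG example; the transcript names only the edges.
edges = [
    (0.16, 0, 7), (0.17, 2, 3), (0.19, 1, 7), (0.26, 0, 2),
    (0.28, 5, 7), (0.29, 1, 3), (0.32, 1, 5), (0.34, 2, 7),
    (0.35, 4, 5), (0.36, 1, 2), (0.37, 4, 7), (0.38, 0, 4),
    (0.40, 6, 2), (0.52, 3, 6), (0.58, 6, 0), (0.93, 6, 4),
]

parent = list(range(8))          # one set per vertex, as in union-find

def find(v):                     # root of v's component
    while parent[v] != v:
        parent[v] = parent[parent[v]]   # path compression
        v = parent[v]
    return v

mst = []
for w, u, v in sorted(edges):    # consider edges in ascending weight order
    ru, rv = find(u), find(v)
    if ru == rv:
        continue                 # edge would create a cycle: ignore it
    parent[ru] = rv              # union the two components
    mst.append((u, v))

print(mst)   # the seven tree edges, in the order the demo adds them
```

Running this reproduces the demo exactly: 0-7, 2-3, 1-7, 0-2, 5-7, 4-5, 6-2 are added, and every other edge is rejected as a cycle.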

Consider the edges in ascending order of weight; add the next edge to T, unless doing so would create a cycle.

Now let's look at what Kruskal's algorithm does on a very large graph, so you get a good feeling for what it's doing. It's taking the small edges, and they coalesce together into little clusters; eventually the edges get longer and longer, and they connect the clusters together. Initially the graph is broken up into many very small clusters, and after a while, if you look at the larger clusters, eventually an edge will come along that connects them together. It's not going to be so easy to see how the algorithm ends, though there's some long edge there. That's the longest edge in the MST, the one that finally connects the graph.

Â 3:08

This simulation really gives a good feeling for Kruskal's algorithm. In fact, when we first had personal computers, I was fortunate to be at Xerox PARC, and I remember very well, in the late 70s, this was one of the first algorithms that I wanted to see visualized on a personal computer. I wrote a program at that time that produced pretty much this image. It's a very interesting way to look at MST algorithms.

All right. So it's easy to implement. We have to prove that it computes the MST, of course, and we've done a lot of work setting that up.

Â 3:54

We do so just by proving that it's a special case of the greedy MST algorithm. So what does that mean? Well, suppose that Kruskal's algorithm colors a given edge black; say it's v-w. We'll define a cut: the set of vertices that are connected to v. It might be just v, but if there are any black edges connecting v to other vertices, we put all of those on v's side of the cut.

For that cut, there's no black crossing edge, because v's side is a whole connected component. And the other thing is that there's no crossing edge with lower weight than v-w. Why is that? Well, we haven't examined any of those edges yet, and they're all longer, because we are considering the edges in increasing order of their weight. So this new edge that connects v somewhere else is a crossing edge of minimal weight. Therefore the algorithm always finds a cut and colors a crossing edge of minimal weight for that cut black, and so it's an instance of the greedy algorithm.
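The cut argument can be written out compactly (my notation; this is just the lecture's reasoning in symbols):

```latex
% Cut chosen when Kruskal's algorithm colors edge v--w black:
C \;=\; \{\, u \in V \;:\; u \text{ is connected to } v \text{ by black edges} \,\}
% C is a whole component, so no black edge crosses the cut (C, V \setminus C).
% Every crossing edge e is still unexamined, and edges are considered in
% ascending order of weight, so
\mathrm{weight}(e) \;\ge\; \mathrm{weight}(v\text{--}w)
\quad \text{for every crossing edge } e,
% hence v--w is a crossing edge of minimal weight for this cut, and
% coloring it black is exactly a step of the greedy MST algorithm.
```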

Now, we're still a bit away from having an implementation of Kruskal's algorithm. The question is: how do we know whether adding a new edge to the tree will create a cycle? You might think about how difficult that is. I've got a tree, represented as a set of edges or however, and I have a new edge, and I want to know whether it creates a cycle or not. How difficult is that going to be?

Well, it turns out that way back in the first lecture of Algorithms, Part 1, we had a way. One thing we could do is just run DFS to check if we can get to w from v; that would take time proportional to V. But the union-find data structure that we studied in the first lecture absolutely does the job. It's just testing whether this edge connects anybody in the equivalence class corresponding to v with anybody in the equivalence class corresponding to w, and if it does, it creates a cycle. So union-find is exactly what we need.
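The cycle test just described can be sketched with a minimal union-find (a Python sketch; the course's own implementation is the weighted quick-union UF type from Part 1, written in Java):

```python
class UnionFind:
    """Minimal union-find for the cycle test (a sketch; the course
    uses the weighted quick-union implementation from Part 1)."""

    def __init__(self, n):
        self.parent = list(range(n))   # each vertex starts in its own set

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path compression
            v = self.parent[v]
        return v

    def connected(self, v, w):
        # True iff v and w are already in the same tree component,
        # i.e. adding edge v-w would create a cycle
        return self.find(v) == self.find(w)

    def union(self, v, w):
        # merge the component containing v with the one containing w
        self.parent[self.find(v)] = self.find(w)

# Cycle test on a tiny example: after adding edges 0-1 and 1-2,
# edge 0-2 would close a cycle, but edge 0-3 would not.
uf = UnionFind(4)
uf.union(0, 1)
uf.union(1, 2)
print(uf.connected(0, 2))   # True  -> edge 0-2 would create a cycle
print(uf.connected(0, 3))   # False -> edge 0-3 is safe to add
```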

Â 6:36

So what we're going to do is maintain a set for each of the connected components in the spanning tree that we've built up so far. If the new edge connects vertices that are in the same set, that means it would create a cycle. Otherwise, we're going to add the edge to the tree, and then we just do the union operation and merge the set containing v with the set containing w. Those are the two things that we need to do, and those are the two things that union-find does for us.

This leads to a very compact implementation of Kruskal's algorithm. Now, to consider the edges in order, we'll use a priority queue, a modern data structure.

So let's take a look at every line of this implementation of Kruskal's algorithm. We are going to have a queue of edges; that's how we are going to represent the MST. It could be a stack or a bag, but we'll use a queue. Then the KruskalMST constructor takes a graph, and it's going to compute the MST into this queue. To do that, it's going to use a min priority queue, a minimum-oriented priority queue of edges. So we'll build that min priority queue. Now, edges are comparable; we have a compareTo() method, so our

Â 8:16

generic priority queue code is going to work fine. And so what we'll do is put all the edges in the graph into the priority queue; that builds a priority queue containing all the edges in the graph. Alternatively, we could put the edges in an array and sort the array, but priority queues are a more elegant way to express this, and a way to get at more efficient algorithms as well. Okay, so that's the priority queue, the first data structure from Part 1 that we're going to make use of. The second one is the union-find data structure. We build a union-find data structure for the V vertices, because the spanning forest divides the vertices into equivalence classes.

Then we go into the main loop, and the main loop stops on two conditions. One is that we run out of edges, in which case we'd have a minimum spanning forest. The other condition is that we get to V - 1 edges in the MST. As long as neither one of those is true, we've got an edge in the priority queue, and we want to take the smallest one. We want to get its vertices v and w with either() and other(), and then we want to check, using union-find, whether they're connected. If they're connected, we don't want to do anything; if they're not connected, then we want to merge them and put that edge onto the MST. That's it. That's the full implementation, and then we have the edges() method for the client, to return the MST.

It's quite a compact implementation of a classic algorithm. Using the basic data structures that we built up in Algorithms, Part 1, the priority queue and union-find, gives a really fine implementation of this MST algorithm.
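The implementation just walked through can be mirrored in a Python sketch (the course's version is in Java, using its MinPQ and UF types; here the standard-library heapq plays the role of the min priority queue, and a small inline union-find plays the role of UF):

```python
import heapq

def kruskal_mst(num_vertices, edges):
    """Kruskal's algorithm. edges is a list of (weight, v, w) tuples;
    returns the MST (or minimum spanning forest) as a list of such tuples."""
    # Build a min-oriented priority queue of all the edges.
    # heapify is the linear-time bottom-up construction.
    pq = list(edges)
    heapq.heapify(pq)

    # Union-find over the vertices: one set per tree in the spanning forest.
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    mst = []
    # The main loop stops on two conditions: the edges run out
    # (minimum spanning forest), or the MST reaches V - 1 edges.
    while pq and len(mst) < num_vertices - 1:
        weight, v, w = heapq.heappop(pq)    # smallest remaining edge
        rv, rw = find(v), find(w)
        if rv == rw:
            continue            # v and w already connected: would create a cycle
        parent[rv] = rw         # union: merge the two components
        mst.append((weight, v, w))
    return mst
```

For a connected graph this returns V - 1 edges; for a disconnected graph the loop simply exhausts the priority queue and returns a minimum spanning forest.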

So what about the running time of this algorithm? Well, it's not hard to see that it's going to compute the MST in time proportional to E log E. It's linearithmic in the number of edges, which means that we can use it for huge graphs.

This table is just a proof that summarizes the costs. We first build the priority queue, and we can do that in linear time using the bottom-up construction method. Then every edge comes off the priority queue, so there are E delete-min operations, at log E per operation. Then for union operations, every vertex is involved in one, and for connected operations, every edge is involved in one. So the total time is dominated by the E log E for the priority queue operations.

One thing to note is that if the edges come in sorted order, for some reason, it's almost linear: the order of growth is E log* E. And actually, we don't always have to sort them all. In real-life situations, we can stop when we get V - 1 edges in the MST; when we have those V - 1 edges we can stop, and in practical situations that's usually way before we see all the edges in the graph. So that's Kruskal's algorithm for computing the MST.
