0:13

We have a full scientific understanding of the

properties of these algorithms, and they've been developed

as practical system sorts and application sorts that

have been heavily used over the past 50 years.

In fact Quicksort, which we'll consider

next time, was honored as one of the top

10 algorithms of the 20th century in science and engineering.

0:57

The idea is very simple.

What we're going to do is divide an array into two halves.

Recursively sort each of the halves.

And then merge the result.

That's the overview of Mergesort.

1:18

John von Neumann realized, during the development of the EDVAC, one of the first general-purpose computers, that it was going to need a sorting method, and he came up with Mergesort.

He's widely credited as the inventor of Mergesort.

1:49

So, we've got an array A and its first half is sorted and its second half is

sorted and the computation we need to perform is to replace that with the sorted

array where those two sub-halves are merged together.

Let's look at a demo.


The method that we're going to use is based

on taking an auxiliary array to hold the data.

This is one of the easiest ways to implement the merge.

So the first thing we do is copy everything over to the auxiliary array.

2:27

Now, once that's done, what we'll want to do is copy

back to the original array to get it in sorted order.

In order to do that, we're going to maintain three indices.

i, the current entry in the left half; j, the current entry in the right half; and k, the current entry in the sorted result. So the first thing we do is compare the two entries pointed to by i and j, take the smaller one, and move it to be the next item in the output.

And whichever one is taken, we increment its pointer.

3:10

Now we compare the minimum again; again, the one pointed to by j is smaller, so we move that one to k, increment the pointer j, and also increment k.

3:50

Now the one pointed to by i, the G, is smallest, so we move that and increment i and k.

Move the M up and increment i and k. Now the last element in the left sub array

is the one that's going to get moved next. And now

that first subarray is exhausted so really all we need to do is take

the rest of the elements from the right part and move them back in.

4:22

That's an abstract in-place merge: taking the two sorted halves of an array, moving them out to an auxiliary array, and then putting them back in, in sorted order. Alright, so here's the

code for merging, which is quite straightforward from the demo.

4:42

In this implementation, to merge an array of Comparables, we pass in a reference to the auxiliary array as well. And we have three arguments: lo, mid, and hi.

So lo is the first index of the part of the array to be sorted.

Mid's the midpoint that divides the first

part from the second, so our conditions are

that from lo to mid is sorted, and from mid plus 1 to hi is sorted.

5:21

And then that sets up for this for loop that accomplishes the merge. We start our i pointer at the left part of the left half, the j pointer at the left part of the right half, that's mid plus 1, and we start the k pointer at the beginning, lo. And for every value of k, what we're most often doing is comparing whether aux[j] is less than aux[i].

5:56

If aux[j] is less, we move that element over and increment j; if it's greater or equal, we move the element at i over and increment i. And then in both cases, we increment k, and that implements the merge.

If the i pointer is exhausted, then we just move over the next jth element. If the j pointer is exhausted, we move over the next ith element. So every time, we're moving a new element into position k, and that's the code that implements the abstract in-place merge. Now with this code,

we're also introducing the idea of making assertions just to make it

easier to debug our code and to have confidence that it's correct.
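Put together, the merge method described in the demo might look like the following sketch, written in the spirit of the lecture's Java code, with the assertions included. The exact names (merge, less, isSorted) and layout are assumptions, not the course's exact code.

```java
import java.util.Arrays;

public class Merge {
    // Abstract in-place merge: precondition is that a[lo..mid] and a[mid+1..hi]
    // are each sorted; postcondition is that a[lo..hi] is sorted.
    public static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        assert isSorted(a, lo, mid);       // precondition: left half sorted
        assert isSorted(a, mid + 1, hi);   // precondition: right half sorted

        for (int k = lo; k <= hi; k++)     // copy everything to the auxiliary array
            aux[k] = a[k];

        int i = lo, j = mid + 1;           // i scans the left half, j the right half
        for (int k = lo; k <= hi; k++) {   // k scans the sorted result
            if      (i > mid)              a[k] = aux[j++]; // left half exhausted
            else if (j > hi)               a[k] = aux[i++]; // right half exhausted
            else if (less(aux[j], aux[i])) a[k] = aux[j++]; // take the smaller entry
            else                           a[k] = aux[i++];
        }

        assert isSorted(a, lo, hi);        // postcondition: whole range sorted
    }

    @SuppressWarnings("unchecked")
    private static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            if (less(a[i], a[i - 1])) return false;
        return true;
    }

    public static void main(String[] args) {
        // Left half E G M R and right half A C E R T are each sorted
        Comparable[] a = {'E', 'G', 'M', 'R', 'A', 'C', 'E', 'R', 'T'};
        merge(a, new Comparable[a.length], 0, 3, 8);
        System.out.println(Arrays.toString(a));
    }
}
```

Note that the loop body is a four-way decision for every k, so the merge uses at most one compare per output position.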

6:40

In this case, the assertion just says that we want to be sure that a[lo..mid] is sorted and that a[mid+1..hi] is sorted before our code runs, and then we want to check that the whole thing is sorted after our code runs.

6:59

And generally, Java programmers know that it's a good idea to use these assertions.

Not only does it help detect bugs, but it

also documents what the code is supposed to do.

And that merge code is a good example of this.

If you put at the beginning of the code what you expect, in the form of an assertion, which is code itself, and you put at the end of the code what you think it's going to do, again in the form of an assertion, you're both testing that these conditions hold and telling someone reading the code what you're trying to do with it.

7:36

So in Java, it's just an assert statement. It takes a boolean condition. In this case, we're using that isSorted() method that we used before, which returns true if the array is sorted and false if it's not.

And what assert will do is it will throw an exception

unless that condition is true.
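As a minimal sketch, the assert statement might be used like this, assuming an isSorted() helper along the lines of the one mentioned (the class and helper names are illustrative):

```java
public class AssertExample {
    // Helper in the spirit of the lecture's isSorted() (signature assumed)
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++)
            if (a[i] < a[i - 1]) return false;
        return true;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5};
        // Throws an AssertionError unless the condition is true --
        // but only when assertions are enabled at runtime
        assert isSorted(a) : "array not sorted";
        System.out.println("ok");
    }
}
```

Running `java AssertExample` skips the check entirely, since assertions are off by default; running `java -ea AssertExample` enables them.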

7:58

Now the thing about assertions in Java is

that you can enable or disable them at runtime.

And that's really important, because it means you can

put them into your code to check while developing.

But it doesn't incur any extra cost at all in production code.

So by default, assertions are disabled.

8:32

So the best practice is to use assertions just as we did in that example with merge, and to assume that they're not going to be there in production code.

You shouldn't use them for things like checking that the input is valid.

Alright, so with that merge implementation, then

the sort implementation is a quite simple recursive procedure, shown here.

8:57

So we use the merge procedure we just showed in our sort procedure. It's recursive, so it checks that we have something to do first. Then it computes the value of the midpoint, the same way as we did for binary search. Sort the first half, sort the second half, and then merge them together.
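A sketch of that recursive sort, using int[] instead of Comparable[] for brevity (names and layout are assumptions):

```java
import java.util.Arrays;

public class MergeSort {
    // Merge a[lo..mid] with a[mid+1..hi] using aux[], as in the merge demo
    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);   // copy to auxiliary array
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }
    }

    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi <= lo) return;                 // check that we have something to do
        int mid = lo + (hi - lo) / 2;         // midpoint, computed as in binary search
        sort(a, aux, lo, mid);                // sort the first half
        sort(a, aux, mid + 1, hi);            // sort the second half
        merge(a, aux, lo, mid, hi);           // merge them together
    }

    public static void sort(int[] a) {
        int[] aux = new int[a.length];        // created ONCE here, not in the recursion
        sort(a, aux, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {7, 10, 5, 3, 8, 4, 2, 9, 6};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

Creating the auxiliary array once in the public method, rather than inside the recursion, matters, as the lecture points out next.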

9:30

Now, it's important not to create the auxiliary array in the recursive routine, because that could lead to the excessive cost of extra array creation.

And you'll sometimes see Mergesort performing poorly because of that bug.

Otherwise, this is a very straightforward implementation.

And it's actually a prototype for algorithm design

that we'll see come up again and again.

It's called divide and conquer.

Solve a problem

by dividing it into two halves, solving the two halves,

and then putting the solutions together to get the appropriate answer.

10:10

Here's a trace of what Mergesort does, and if you haven't studied a recursive program before, it's worthwhile studying this in some detail.

This gives exactly what happens during each of the calls to merge.

We start out with a big problem to solve but we divide it in half,

then we divide that one in half, and then we divide that one in half.

And the very first thing

that we actually do is just compare

and exchange if necessary the first two elements.

And then we do the same thing for the next two elements.

Then merge those two together to get the first four done.

And then we do the same thing for the next four in the array.

So now we have two sorted subarrays of size four.

And we merge those together to get one of size eight.

And then we do the same thing on the right,

and eventually we have two eights that we merge together

to get the final result.

Very instructive to study this trace to

really understand what this recursive algorithm is doing.

11:59

So you can run a Mergesort on huge problems.

It's a very efficient algorithm. And so, for example,

what this table shows is that if you were to try to use insertion sort for a huge file, say a file with a billion elements, on your PC it'd take a few centuries to finish.

12:23

Even on a super computer, if you're using insertion

sort nowadays it'd maybe take a week or more.

But if you have a good algorithm like

Mergesort, and you're trying to do a billion items,

you can do it in just less than half an hour on your PC.

And a supercomputer can do it in an instant.

And smaller problems only take an instant even on your PC.

So you can spend a lot of money or a lot of time, or you can use a good algorithm.

And that's one of our main themes in this course.

A good algorithm is much more effective than spending money, or wasting time, on a bad one.

13:06

So let's look at the analysis of Mergesort. That's a bit of math, but it's very instructive, because it really shows the power of the divide-and-conquer method. It allows us to take a problem that was taking quadratic time with methods like insertion and selection sort, and get it done in N log N time with Mergesort.

So that's the proposition: Mergesort uses at most N lg N compares and 6N lg N array accesses to sort any array of size N.

13:42

And the way to prove this proposition is, from examining the code, to write down what's called a recurrence relation. And all that is, is a mathematical reflection of what's going on in the code.

If we're sorting N items then let C of N denote

the number of compares that we need to sort the N items.

14:08

In order to get that done, we're sorting the left half and the right half. This notation, ceiling of N over 2 and floor of N over 2, means N over 2 rounded up and N over 2 rounded down; those are the sizes of the two subarrays. And we're going to call the same routine for each of those sizes, so the number of compares we need for that is C of ceiling of N over 2 for the left, and C of floor of N over 2 for the right. And then for the merge, we need at most N compares.

14:41

If neither half is exhausted, we use exactly N compares. And that's true as long as N is bigger than 1; if there's only one item, we're not doing any compares at all. So this is a mathematical formula that we derive by examining the code, but it completely describes, mathematically, an upper bound on the number of compares that are going to be needed.

And similarly for the number of array accesses: if you count up the number of times you access an array during a merge, it could be at most 6N.
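Written out, the two recurrences just described are (for N > 1, with the stated base cases):

```latex
C(N) \;\le\; C\!\left(\lceil N/2 \rceil\right) + C\!\left(\lfloor N/2 \rfloor\right) + N,
\qquad C(1) = 0
```

```latex
A(N) \;\le\; A\!\left(\lceil N/2 \rceil\right) + A\!\left(\lfloor N/2 \rfloor\right) + 6N,
\qquad A(1) = 0
```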

15:40

So if you have this recurrence, which is similar to the ones we were just talking about (it's exactly the same when N is a power of 2), let's look at this one.

If D(N) = 2D(N/2) + N, with D(1) = 0, then D(N) = N lg N.

We'll look at three proofs of that, just assuming that N is a power of 2.

If N is a power of 2, then N over 2

is also a power of two, so the recurrence makes sense.

16:11

So this is just a graphical representation. If we want to compute D(N), we have to compute D(N/2) twice, and the extra cost for the merge is N. Since we do it twice, we have two N/2s; then each of those divides into N/4s, and each of the four N/4s has an extra cost of N/4 for its merge. Well, 2 times N/2 is N, and 4 times N/4 is N, and we keep going down like that until we get down to D(2); at every level, the extra cost for the merges adds up to N.

And how many stages do we have here?

Well, it's the number of times you divide N by 2 to get down to 2.

That's exactly log base 2 of N. So the grand total of all the costs for the merges, which is where the compares are, is lg N times N: N lg N. It's kind of a graphical proof, a proof by picture, that that recurrence has that solution.

17:32

So dividing both sides by N, this is D(N)/N = D(N/2)/(N/2) + 1. So now, this is a recurrence that telescopes.

The first term on the right hand side is exactly the same

as the left hand side so we can apply the same formula.

And all it does is divide by 2 again and then throw out another 1.

And we keep doing that until

we get down to D of 1 which is 0.

And when we've done that, we've thrown out lg N 1s.

So we get D(N)/N = lg N, or D(N) = N lg N.

That's another proof by expansion.
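The telescoping expansion, written out:

```latex
\frac{D(N)}{N}
  \;=\; \frac{D(N/2)}{N/2} + 1
  \;=\; \frac{D(N/4)}{N/4} + 1 + 1
  \;=\; \cdots
  \;=\; \frac{D(N/N)}{N/N} + \lg N
  \;=\; \lg N
\quad\Longrightarrow\quad
D(N) = N \lg N
```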

18:14

Using either one of those techniques, you can get the idea that D(N) is close to N lg N, or you can write a program to expand the recurrence and find that.

And then once we have the idea that D of N

equals N lg N, we can plug back into the original formula.

With the inductive hypothesis that D(N) = N lg N, we want to show that D(2N) = 2N lg 2N, using the recurrence D(2N) = 2D(N) + 2N. Plugging in N lg N, we get the desired result.
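The inductive step, written out (with base case D(1) = 0 = 1 · lg 1, and using lg N = lg(2N) − 1):

```latex
D(2N) \;=\; 2\,D(N) + 2N
      \;=\; 2N \lg N + 2N
      \;=\; 2N\left(\lg(2N) - 1\right) + 2N
      \;=\; 2N \lg(2N)
```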

We use this same idea on our initial recurrences for compares and array accesses to show that the number of compares and array accesses is proportional to N lg N for Mergesort.

So that's the running time: Mergesort is fast. The other thing that we usually want to know about is memory.

And one of Mergesort's characteristics is that in

practical applications, it uses extra space proportional to N.

That is, we need that extra auxiliary array for the last merge.

19:51

Insertion sort, selection sort, and shellsort are in-place; they don't use any extra memory.

But with Mergesort, you can really only sort half of what you can fit in memory, because you need that auxiliary array for the other half.

If you want, again, if you think that the things we're studying

are easy, think about the idea of actually doing an in-place merge.

People have come up with methods for getting this done.

So it's theoretically possible, but the methods are generally too complex to be useful in practice, and they're not used.

But there could be, out there, some easy way of doing an in-place merge.

That's another great algorithm waiting to be discovered.

Now, there are a number of practical improvements that we can use to make Mergesort even more efficient than the simple version we've looked at, and we'll take a look at those because they're examples of techniques that we can use for other algorithms.

First thing is that Mergesort is too complicated to use for tiny arrays.

Say the subarrays are only of size two, or three, or four: there's too much overhead with the recursive calls and so forth to get that done efficiently. And what's worse, the recursive nature of the sort means that there are going to be lots of small subarrays to be sorted.

So one improvement that we can make is to cut off and use insertion sort, which is simple and efficient, for small subarrays. Adding this one line of code to Mergesort will make it quite a bit faster, maybe 20% faster.
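A sketch of the cutoff improvement, again with int[] for brevity. The CUTOFF value of 7 is an assumed typical choice, not a value prescribed by the lecture:

```java
import java.util.Arrays;

public class MergeSortWithCutoff {
    private static final int CUTOFF = 7;  // threshold for switching to insertion sort

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            for (int j = i; j > lo && a[j] < a[j - 1]; j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;   // exchange
            }
    }

    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }
    }

    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi <= lo + CUTOFF - 1) {          // the one-line change: small subarray,
            insertionSort(a, lo, hi);         // so use insertion sort instead
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        merge(a, aux, lo, mid, hi);
    }

    public static void sort(int[] a) {
        sort(a, new int[a.length], 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {9, 4, 8, 1, 7, 3, 6, 2, 5, 0};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```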

21:29

The second improvement that we can make that'll improve the performance

for cases when the array is partly sorted, is to just

stop if it's already sorted.

And that's going to happen in the case where the biggest element in the first half is less than or equal to the smallest item in the second half.

That means it's done.

So that's easy.

We just put a test in the recursive Mergesort for that, with this one line of code, to check whether we're done.

That way, for example, if you were to call

Mergesort for an array that's already in order it would

just do this test every time and it would be done in linear time.

That doesn't help in every situation, but there are a lot of situations where it does, so it's pretty helpful.
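A sketch of the already-sorted test; the one added line is the a[mid] <= a[mid+1] check before the merge (the rest of the code repeats the basic int[] version for self-containedness):

```java
import java.util.Arrays;

public class MergeSortSkipMerge {
    private static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }
    }

    private static void sort(int[] a, int[] aux, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);
        if (a[mid] <= a[mid + 1]) return;  // halves already in order: skip the merge
        merge(a, aux, lo, mid, hi);
    }

    public static void sort(int[] a) {
        sort(a, new int[a.length], 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};  // already sorted: every merge is skipped
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

On an already-sorted array, every recursive call hits the test and returns, so only the compares at each level remain and the total work is linear.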

22:18

The other thing that's possible to do, and it's a little mind-bending, so it's recommended only for experts, is to save a little bit of time by noticing that you don't really have to copy over into the auxiliary array. You can switch the roles of the input array and the auxiliary array every time you make a recursive call.

You still need that second array, but you can set up the code in this way: to sort an array, put the result in the other one; to merge an array, put the result back in the first one. So it's recursive argument switchery to get the job done. And it's effective: it means you don't have to actually copy items over, and that saves a little bit of time.
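A sketch of that role-switching version: one clone up front makes the two arrays identical, and after that the sort and merge alternate which array is the source and which is the destination, so the copy step inside merge disappears. (Again int[] for brevity; names are assumptions.)

```java
import java.util.Arrays;

public class MergeSortNoCopy {
    // Merge src[lo..mid] with src[mid+1..hi] INTO dst[lo..hi]: no copy step needed
    private static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         dst[k] = src[j++];
            else if (j > hi)          dst[k] = src[i++];
            else if (src[j] < src[i]) dst[k] = src[j++];
            else                      dst[k] = src[i++];
        }
    }

    // Leaves dst[lo..hi] sorted, using src[] as scratch; the roles of the two
    // arrays switch at every recursive call
    private static void sort(int[] src, int[] dst, int lo, int hi) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(dst, src, lo, mid);        // sort the halves into src ...
        sort(dst, src, mid + 1, hi);
        merge(src, dst, lo, mid, hi);   // ... then merge them back into dst
    }

    public static void sort(int[] a) {
        int[] aux = a.clone();          // the one copy: both arrays start identical
        sort(aux, a, 0, a.length - 1);  // the sorted result ends up in a[]
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 8, 1, 9, 3};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

The base case works because the initial clone makes both arrays identical, so a one-element range is already "sorted" in either array.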

23:06

So here's a visualization

of what the practical Mergesort might look like, with a big cutoff to small subfiles, so you get a visual feeling of how this sort gets the job done. It does the first little chunk, and then the next little chunk, and then merges those together, and so forth and so on.

It's a good visual representation of how Mergesort gets its job done. That's the basic Mergesort algorithm; we're going to look at different versions of it next.