0:18

In the theory of algorithms, when they're looking at upper bounds

on worst case performance, they use these notations big O,

big omega, and big theta to try to capture the order of growth, and

that's what they use for classifying algorithms.

If g(N) = O(f(N)), it means that the ratio of g(N) to

f(N) is bounded from above as N goes to infinity.

And if it's omega, it means it's bounded from below.

And if it's theta, it means that it's bounded from both above and below.

So this one says there's a constant such that g(N) is less than that

constant times f(N).

Now, this one says there's two constants that it's in between.

So that allows classification according to functional growth,

as I mentioned: merge sort is O(N log N), and quick sort is O(N squared) in the worst case.
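Written out, the three definitions just described are (a standard rendering, with the constants mentioned above):

```latex
\begin{aligned}
g(N) = O(f(N))      &\iff \exists\, c > 0 : |g(N)| \le c\,f(N) \text{ for all sufficiently large } N \\
g(N) = \Omega(f(N)) &\iff \exists\, c > 0 : g(N) \ge c\,f(N) \text{ for all sufficiently large } N \\
g(N) = \Theta(f(N)) &\iff \exists\, c_1, c_2 > 0 : c_1 f(N) \le g(N) \le c_2 f(N) \text{ for all sufficiently large } N
\end{aligned}
```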

1:14

So, that's the notation that you often see

throughout the literature describing the performance of algorithms.

But as I mentioned, big O-notation is dangerous,

and I'll have more to say about that in just a minute.

So it's not scientific to use the big O-notation to try to compare algorithms.

If you say the running time is big O(N to the c),

that's not of any use for predicting performance.

It's an upper bound on the worst case.

The actual performance may be much better than the worst case.

And it could be that even the actual bound is less than what’s given

by the big O-notation.

It's fine for a first cut of classifying algorithms, but not useful for comparing.

What we use instead is what’s called the tilde-notation, and so

what we'll typically say is the running time of an algorithm is ~,

a constant, times, say, some function of N, where N is the input size.

That does provide an effective path for predicting performance, and

I'll show some examples of that later on.
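For instance, here is a standard pair of facts from the sorting literature (not stated in this excerpt) that shows what tilde notation buys us. Both methods are in the O(N log N) class on random inputs, but the tilde form exposes the constants, where C_N denotes the expected number of compares for N randomly ordered distinct keys:

```latex
C_N^{\mathrm{quicksort}} \sim 2N\ln N \approx 1.39\,N\lg N,
\qquad
C_N^{\mathrm{mergesort}} \sim N\lg N
```

With the constants in hand, the two algorithms can actually be compared and their running times predicted.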

So we don't use the common big O, big theta, omega notation very much,

except in a specific technical sense that I'll talk about later on,

when we talk about asymptotic approximations.

So [COUGH] big O-notation is useful for a lot of reasons,

and it dates back a few centuries, and we do use it in math.

But it's a common error to think that the big O-notation is useful for

predicting performance.

And I just want to make sure to nip that problem in the bud right away.

So this is what often happens to me when I give talks around the world on this topic.

Typical exchange in, say, the Q&A for my talk,

depending on how formal it is, I'll say okay,

big O notation is dangerous, you can't use it to predict performance.

And somebody will ask or shout out, but

an algorithm that is big O(N log N) surely beats one that is big O(N squared).

And then I trot out the quick sort versus merge sort example and say, well,

not by the definition, big O is just an upper bound on the worst case.

And then they'll say, well, so use the theta notation,

which says that it's in-between.

And I say well, that maybe gets rid of the upper bound,

but you're still typically bounding the worst case.

Is your input a worst case?

And even with that logic and we've got a compelling example like quick sort versus

merge sort, usually what happens is the questioner whispers to one of his

colleagues well, I'd still use the N log N algorithm, wouldn't you?

[LAUGH] And actually, such people usually don't program much and

shouldn't be recommending what practitioners do.

But surely, we can do better than this.

That's part of what analytic combinatorics is all about.

There's another idea that's out there as well, and

that's the concept of a galactic algorithm.

And I found this on a friend's blog, Dick Lipton,

who said let's define an algorithm that will never be used as being galactic.

And why will it never be used?

Because no one would ever notice any effect about this algorithm in

this galaxy.

4:46

Because any savings is going to happen for an input size so

large that it couldn't happen in this galaxy.

So, an example of a galactic algorithm, and

this one is actually maybe only planetary or something,

it's actually close to the real world,

is Chazelle's linear-time triangulation algorithm.

So the problem is to find a way to triangulate a polygon in linear time.

It was a surprising result and a theoretical tour de

force to prove that it was possible to solve this problem in linear time.

But the method is much too complicated for anyone to implement.

And if anyone did implement it,

the cost of the implementation would definitely exceed any savings,

until N is so large that it would take another galaxy to deal with it.

5:43

And [COUGH] this is an interesting situation.

I think one of the problems is that so many of the algorithms

being published in the literature are in this category.

After Lipton introduced the concept,

one of the contributors to his blog estimated that something like 75 to 95% of

the papers in the theoretical computer science conferences are galactic.

And I think the problem is that practitioners aren't necessarily aware

that they're galactic, and they may try to use them,

even when a relatively simple analysis would say there's no point in a practitioner

taking a look at them. Such papers should have asterisks on them or something.

So I think it's okay for

basic research to drive the agenda, and there's nothing wrong with

trying to find the algorithm with the best upper bound and worst case performance.

But we have to do something about the common error,

where people think that a galactic algorithm is actually useful in practice,

there's a lot of denial out there.

Here's another thing that often happens to me, this was an actual exchange

with a very prominent theoretical computer scientist a few years ago.

And he said in a talk, well, the algorithm A that actually is

a pretty straightforward algorithm that's in widespread use is a bad algorithm.

Google and other Internet providers should be interested in my new algorithm,

algorithm B.

[COUGH] And so in the question and answer, I said, well,

what's the matter with algorithm A?

7:41

By the way, we all know that log log N is at most 6 in this universe.

The biggest N we'd ever see is about 2 to the 64th, and if N is 2 to the 64th, log log N is 6.
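The arithmetic behind that remark, written out:

```latex
N \le 2^{64} \implies \lg\lg N \le \lg\lg 2^{64} = \lg 64 = 6
```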

And so not only that, it's just an upper bound.

Not only that, your algorithm is so complicated,

it's certain to run 10 to 100 times faster in any conceivable real world situation.

Why should Google care about algorithm B, as you said?

And then the response was, well, I like algorithm B, I don't care about Google.

And again, that's fine to do research for the intellectual challenge,

just don't say that as some practitioner, I should be interested in it.

Surely, we can do better than that.

8:25

So what I want to talk about specifically is the scientific approach,

say the modern rendition of what Knuth taught us.

That is used for many, many algorithms in the fourth edition of my Algorithms book,

which is co-authored with Kevin Wayne.

So this is described in a lot of detail with examples in Section 1.4 of the book.

Again, as Knuth said, we start with a complete implementation that we can test.

And therefore, we'll be able to

make hypotheses about the performance of that implementation and test them.

And then what we're going to do is maybe try to avoid some of the detail in Knuth.

And we're going to analyze the algorithm by trying to find

an abstract operation that is in the inner loop.

That is, it gets executed more often than any other operation.
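As a minimal sketch of what "count the abstract operation in the inner loop" means in practice (this example is mine, not from the lecture): instrument insertion sort to count compares, the operation in its inner loop, on a randomly ordered distinct input.

```python
import random

def insertion_sort_compares(a):
    """Sort a in place and return the number of compares,
    the abstract operation in insertion sort's inner loop."""
    compares = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            compares += 1                    # one compare per inner-loop iteration
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return compares

random.seed(1)
a = random.sample(range(10**6), 1000)        # randomly ordered, distinct keys
c = insertion_sort_compares(a)
print(c)                                     # roughly N^2/4 on average for random order
```

Counting an abstract operation like this is machine-independent, which is what makes the resulting hypotheses portable.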

10:07

So, what we'll do then is, once we have the hypothesis,

we're going to validate the hypothesis by, first of all,

we want to generate some large inputs according to the model.

So, for example, we'll look at sorting algorithms,

where the model is that the things are randomly ordered and distinct.

And so it's easy to write a program to generate large,

randomly ordered, distinct files.

And then, we can just run the program for a large input to calculate the constant a.

And actually, we can even skip that step.

I'll show you that in a minute.

But we definitely can get an estimate of a that way.

And then that gives us a model that we can use to predict performance for

even larger inputs and check our hypothesis.
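One way to carry out those validation steps, sketched in Python (the counting function and the ~a N lg N cost model are my illustration, not code from the lecture): generate a random distinct input, estimate the constant a from a large run, then predict the count for a larger input and check.

```python
import math
import random

def merge_sort_compares(a):
    """Return a sorted copy of a and the number of key compares."""
    if len(a) <= 1:
        return a[:], 0
    mid = len(a) // 2
    left, cl = merge_sort_compares(a[:mid])
    right, cr = merge_sort_compares(a[mid:])
    merged, i, j, c = [], 0, 0, cl + cr
    while i < len(left) and j < len(right):
        c += 1                               # one compare per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, c

random.seed(2)
# Step 1: generate a large input according to the model (random order, distinct).
n = 1 << 14
a = random.sample(range(10 * n), n)
# Step 2: run the program and estimate the constant a in the model cost ~ a * N lg N.
_, c = merge_sort_compares(a)
const = c / (n * math.log2(n))
# Step 3: predict the count for a larger input and check the hypothesis.
n2 = 2 * n
_, c2 = merge_sort_compares(random.sample(range(10 * n2), n2))
predicted = const * n2 * math.log2(n2)
print(const, abs(c2 - predicted) / c2)       # relative error should be small
```

The prediction lands within a few percent here, which is the kind of agreement that validates (or refutes) the hypothesis.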

11:12

That's actually the hardest part that's often overlooked nowadays.

And then as in any application of a scientific method,

we'll refine and repeat and revise the model and

the algorithm as necessary, or as we learn more about the problem.

As I mentioned earlier, one of the great things that happens with this process is

that often, we learn things about the algorithm that

enable us to develop improved versions by doing this.

We have many, many examples of this for sorting and searching algorithms and

graph algorithms and string processing and data compression.

And many, many other applications, where we successfully apply this

approach to develop good implementations and understand their performance.

Now what this course is going to be all about is the analysis of the frequency

of execution.

That's where the math is.

But I want to give this full context,

so people can understand why we're going through this trouble.

12:44

So that's the notation, so we're just using tilde and that's good.

That simplifies things a bit, and

then we're just left with these basic components.

So, one of the basic components of algorithm analysis involves programming.

So, people who do analysis of algorithms need to be comfortable with

implementing algorithms, running them, and be able to at least count operations.

So you need a good implementation.

Now maybe somebody did the implementation, but still, you want to have experience

with it on your own computer to test various hypotheses that you might have.

Then we've got the mathematical challenge, and that's, again,

going to be most of our emphasis in this course,

where we develop a model, analyze the algorithm in the model.

And what we're going to talk about in this course is that really,

we need to do the math.

And that's the kind of math that I'm going to be talking about very soon,

in just a minute.

And then there's the scientific process of running the algorithm to solve a real

problem and checking for agreement with the model.

And so, to be clear,

you can't really compare two algorithms until you've done all the rest of this.

14:11

Now there are definitely potential drawbacks still

in [COUGH] this approach;

the theory of algorithms still goes places where we can't go.

Number one is that we might not have a realistic model, and

that's actually a challenge in every scientific discipline.

Now we do have a big advantage in computer science because we

can randomize to make the model actually totally valid, and

I'll show an example in just a minute.

So if we don't have a realistic model,

maybe we can get a realistic model by randomizing.

And that's a very powerful concept that you don't get if you're doing science that

involves going to the moon or killing monkeys or something.
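The randomization idea can be sketched as follows: shuffle the input first, and then the random-order model provably applies no matter what input was given. (The quicksort here is a plain textbook version of my own, not code from the lecture.)

```python
import random

def quicksort(a, lo=0, hi=None):
    """In-place quicksort; its analysis assumes the random-order model."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot, i = a[hi], lo                 # Lomuto-style partition on the last element
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    quicksort(a, lo, i - 1)
    quicksort(a, i + 1, hi)

def sort(a):
    random.shuffle(a)                    # randomize: now the model holds for any input
    quicksort(a)

random.seed(3)
a = list(range(2000, 0, -1))             # adversarial (reverse-sorted) input
sort(a)
print(a == sorted(a))
```

Without the shuffle, the reverse-sorted input would drive this partitioning scheme to its quadratic worst case; with it, the random-order analysis is valid by construction.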

15:01

Second, the math might be too difficult.

That's also a big challenge in any scientific discipline; in

statistical physics and many other areas,

new developments in mathematics were needed in order to do the science.

And that's certainly going to be the case here, and really,

that's what analytic combinatorics is.

A calculus for the analysis of algorithms is the true motivation for this course.

Thirdly, it might be that the experiments are too difficult

to really validate the model.

15:31

And again,

that's not much of a problem for us, compared to other scientific disciplines.

No problem for us to run an algorithm a million times, and we do it a lot.

Whereas if you had to feed a million mice, you might have some restrictions or

go to a million planets or whatever else.

But it is a roadblock for

a lot of people trying to study algorithms because actually, they're trying

to study galactic algorithms that might be way too difficult to implement.

And if you can't implement it, why try to analyze it?

And it's fine for the intellectual challenge,

but again, there are people out there that are thinking

that the algorithms that you're developing are useful in practice.

And really, they should be validated scientifically

before the poor working programmer is faced with them.

So that's an outline of our approach to the analysis of algorithms.