0:00

[MUSIC]

To round off our first week of this MOOC,

I'd just like to give you at this stage a road map, a direction of travel.

What are the sorts of things you're going to be looking at in the rest of

the course?

Now up till this point, we've not really gone into any heavy probability and

statistics.

You've not really been looking at formulae, or actually calculating things.

But this was all to try and get you into the right kind of mindset,

in terms of thinking about uncertainty, and hence our need to try and

quantify that uncertainty.

So going forward, we clearly need to pick up on the probability and

statistical theory a little bit more.

As well as seeing many applications.

So moving forward, in the next week,

you'll be looking at quantifying uncertainty using probabilities.

Now you've seen a little taste of this in that Monty Hall problem at the very

beginning.

We looked at coming up with a couple of probability distributions for

winning the sports car.

But how do we actually come up with these probabilities?

How do we place a numerical value, an assessment,

a quantified assessment about how likely something is to occur?

Well, there are different ways of coming up with probabilities.

It could be a subjective guess.

It could be through experimentation.

It could be using some theoretical considerations.

So in the second week, you're going to look at how you can actually attach

numerical probabilities to particular outcomes.
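Of those three routes, the experimental one is easy to sketch in code: the relative frequency of an outcome over many repetitions approximates its probability. Here is a minimal sketch, using a die-rolling example of my own rather than one from the course:

```python
import random

random.seed(42)  # fixed seed so the simulation is reproducible

# Theoretical probability of rolling a six with a fair die.
theoretical = 1 / 6

# Experimental estimate: the relative frequency of sixes over many rolls.
rolls = 100_000
sixes = sum(1 for _ in range(rolls) if random.randint(1, 6) == 6)
experimental = sixes / rolls

print(theoretical, experimental)  # the two should be close
```

The more rolls you simulate, the closer the experimental estimate tends to get to the theoretical value.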

1:36

Also, you're going to see more of a formalization of that Monty Hall problem.

So it's all very well having, let's say, a probability distribution of outcomes

as a starting point, but what happens when you receive new information?

When you see that box or door B being opened, and

a goat being revealed, you learned something.

So how would you incorporate what you learned into your probability

assessment of different outcomes?

Well, you'll get to see this form of so-called Bayesian updating.

And see more formally how we can update our probabilistic beliefs,

as we receive new information, new data about the world.
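As a sketch of what that updating looks like numerically, here is Bayes' rule applied to the Monty Hall problem, assuming the contestant has picked door A and the host (who knows where the car is and never reveals it) then opens door B:

```python
# Prior: before any door is opened, the car is equally likely to be anywhere.
prior = {"A": 1/3, "B": 1/3, "C": 1/3}

# Likelihood of the host opening door B, given where the car is:
# car at A -> host chooses B or C at random (1/2); car at B -> never (0);
# car at C -> host is forced to open B (1).
likelihood = {"A": 1/2, "B": 0.0, "C": 1.0}

# Bayes' rule: the posterior is proportional to prior times likelihood.
unnormalised = {door: prior[door] * likelihood[door] for door in prior}
total = sum(unnormalised.values())
posterior = {door: p / total for door, p in unnormalised.items()}

print(posterior)  # A: 1/3, B: 0, C: 2/3 -- so switching to door C is better
```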

2:18

And in the previous session, when we were introducing the concept

of assumptions in models, I mentioned this normal distribution.

Well, I'd like to formally introduce you to that normal distribution, and

explain why it's so important: arguably the most important distribution in statistics.

But at the same time we'll consider more of a distribution zoo, because there's

not just the normal distribution, there are many others as well.

Each designed to model different characteristics or

different phenomena in the real world.

So we're going to have a bit of a tour of the different kinds of probability

distributions out there, as well as the relationships which exist between them.

Going on, we then look at some simple descriptive statistics.

Trying to describe the world in a statistical sense.

3:12

In a previous recording, I mentioned this sort of flash crash.

There was a dramatic depreciation in pound sterling.

So if you look at some financial time series data,

be it an exchange rate, be it a share price,

how could we summarize that sort of graph that you may see?

Well, the graph itself is a summary, an example there of data visualization.

But we also want to summarize things numerically as well.

What is the average level of something?

What is the average income, say, in a country?

What is the expected return on a particular stock?

So we'll introduce the so-called measures of central tendency: the mean,

median, and mode.

Now these are useful summary statistics, but by no means are they exhaustive.

Now if you look at two stocks, they may have the same expected return, but

the risks associated with them could vary considerably.

So we're going to need various measures of dispersion, measures of spread.

So things like standard deviations and

variances, they're going to be formally introduced too.
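The two-stocks point can be made concrete with Python's standard statistics module; the return figures below are invented purely for illustration:

```python
import statistics

# Hypothetical daily returns (%) for two stocks -- illustrative numbers only.
stock_x = [1.0, 1.1, 0.9, 1.0, 1.0]
stock_y = [5.0, -3.0, 4.0, -2.0, 1.0]

# Both stocks have the same measure of central tendency (the mean)...
print(statistics.mean(stock_x), statistics.mean(stock_y))  # both 1.0

# ...but very different measures of dispersion (the standard deviation).
print(statistics.pstdev(stock_x))  # small spread: low risk
print(statistics.pstdev(stock_y))  # much larger spread: high risk
```

Same expected return, very different risk, which is exactly why the mean alone is not an exhaustive summary.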

4:24

So having considered those simple descriptive statistics,

we need to move on to the domain of statistical inference.

Indeed data itself, where does it come from?

How do we collect data?

How much data is going to be collected electronically?

But if, let's say, we wanted to consider people's attitudes or opinions

to something, then we're going to have to go into some kind of survey design.

If you want to conduct an opinion poll to predict who's going to win some election,

say, well,

that data is going to have to be collected through an appropriate form of sampling.

So the goal is to come up with a representative sample.

Sounds easy enough, but perhaps easier said than done as we shall see.

5:09

So having considered where the data could have come from, we could then use this

to perhaps estimate some characteristics of interest.

Based on a sample of voters I may wish to estimate the level of support for

a particular political party, or maybe a Presidential candidate.

Estimates though have a lot of uncertainty attached to them.

They're effectively a best guess, and those best guesses could be wrong.

So if we wanted to make decisions based on our estimates of

certain real world phenomena, we would ideally like some

sense about just how uncertain those estimates are.

And hence how much weight we should put on those particular estimates or forecasts.

So we'll cover some issues of point estimation, trying to estimate a single value,

before moving into interval estimation:

extending your best guess into a range of likely values of something.

Now what might that something be?

It could be the level of support for a political party, or

maybe it's the estimation of something called a parameter in one of your models.

More on that to come.
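As a sketch of the move from a point estimate to an interval estimate, here is an approximate 95% confidence interval for a poll proportion, using the standard normal approximation; the poll figures are invented for illustration:

```python
import math

# Hypothetical poll: 520 of 1,000 sampled voters support the candidate.
n, successes = 1000, 520
p_hat = successes / n  # point estimate (the single 'best guess'): 0.52

# Approximate 95% confidence interval (normal approximation, z = 1.96).
se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se

print(round(lower, 3), round(upper, 3))  # roughly 0.489 to 0.551
```

So instead of reporting 52% support on its own, we report a range of likely values that acknowledges the sampling uncertainty.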

6:23

Moving on, we then consider something called hypothesis testing.

And this is where you will see more about the 'to p or not to p'.

So in the introductory recording, I mentioned briefly how people

come up with theories or claims about some aspect of the real world.

Now we are really only limited by our imaginations about

what kinds of theories and claims could be made.

And we'll see some examples at the appropriate time.

But when you have some theory, some claim, should you believe it?

Well, I'm someone who cares about evidence.

And if someone says something and

makes some claim, I'd like to see some evidence to back up that claim.

Alternatively, maybe we can find evidence to refute some claim.

So how do we do that?

Well, hypothesis testing assists us, and we'll see a few examples

about how we can make some decision based on the outcomes of our hypothesis testing.
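To give a flavour of what such a test involves, here is an exact binomial test of whether a coin is fair; the example is my own rather than one from the course:

```python
from math import comb

# Hypothetical experiment: 60 heads in 100 flips -- is the coin fair?
n, heads = 100, 60

# Under the null hypothesis of a fair coin, the number of heads is
# binomially distributed with p = 0.5.
def binom_pmf(k, n, p=0.5):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Two-sided p-value: probability, under the null, of a result at least
# as extreme as the one observed.
p_value = sum(binom_pmf(k, n) for k in range(n + 1)
              if abs(k - n / 2) >= abs(heads - n / 2))

print(round(p_value, 3))  # roughly 0.057
```

At a conventional 5% significance level this evidence falls just short of rejecting fairness, which illustrates the point that follows: the conclusion is a judgement under uncertainty, not a certainty.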

But please be conscious that really, we don't have these certainties in life.

And even when we draw a conclusion from a hypothesis test,

we need to recognize that we could be wrong.

I don't offer you certainties, we know it's an uncertain world.

What I can offer you is to try and make the optimal decisions in a probabilistic

fashion, such that sometimes you will be unlucky and conclude something

based on the evidence, but maybe it turns out to be a wrong judgement call.

But hopefully,

you'll take away from this course that you can be right far more often than you

are wrong, such that in the long run you tend to win far more often than you lose.

And then, the final week of this course is just to pique your interest

a little bit further about how you could take your probability and

statistics studies to a greater level.

Whereby we consider a variety of real-world applications,

which will draw, to varying degrees, on

aspects that we see in the earlier weeks of the course.

And we'll see a broad cross-section of applications, which I would urge you to

then consider in greater depth, and take further courses on them longer term.

So that's sort of our direction of travel.

So I very much look forward to seeing you in week two, where we proceed

to looking at quantifying uncertainty through the use of probabilities.

[MUSIC]