0:00

[MUSIC]

We round off week three with a further look at this concept of variance, and

I'd also like to extend our discussion from the previous

session on the normal distribution as well.

So, variance.

When we introduced this as S squared, a few sessions ago,

we were looking at the sample variance of a set of data.

Now what I'd like to consider here is the so called population variance, i.e.,

the variance of a theoretical probability distribution.

So let's backtrack a little bit and think back to our week two,

where we introduced some simple probability distributions.

Let's return to the concept of the score on a fair die.

So remember for that, we had our sample space, the possible values which this

variable x could take, namely those positive integers, one, two, three, four,

five, and six, and we said, if it was a fair die, those six outcomes were each

equally likely, and we developed the probability distribution, and

we assigned the probability of 1/6 to each of those six possible outcomes.

We then introduced the concept of the expectation of X.

We viewed this as an average, effectively a mean, but

this was a mean with respect to some population or theoretical distribution.

So remember, we consider the expectation of X as a probability-weighted average,

whereby we took each value of X,

multiplied it by its respective probability, and added them all together.

So there we found that the expectation of X, where X was the score on a fair die,

was equal to 3.5.

We also noted that this was never an observable value in any

single roll of the die; rather, we view this as a long-run average.
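As a quick sanity check, the probability-weighted average just described can be computed directly. A minimal Python sketch (not part of the lecture):

```python
# Expectation of the score on a fair die as a probability-weighted average:
# take each value of X, multiply it by its probability, and sum.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # fair die: each outcome equally likely

mu = sum(x * p for x, p in zip(outcomes, probs))
print(mu)  # 3.5
```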

So, distinguish the expectation of X, which we might denote by the Greek

letter mu, to indicate a population or theoretical mean, and

contrast that with the sample mean, X bar, which we've seen this week, which is

the mean just of a set of our observations drawn from some wider population.

So having introduced X bar, we then also considered the sample variance

S squared as a measure of dispersion, but again with respect to the sample.

So I think now we are in a position to work out the equivalent

concept of the variance, but at the theoretical level,

the so called population variance with respect to a probability distribution.

So remember, when I introduced S squared, I asked you

to think of it like an average.

The average squared deviation about the mean, and we had our formula,

of course the mean we're talking about here, was the sample mean X bar.

What if we want to work out the variance for a theoretical distribution?

We really want the same kind of concept, i.e., we need an average, i.e.,

an expectation, of the squared deviation about the mean.

So whereas previously our expectation was the expectation of X,

the expectation of that random variable, we still require an expectation,

but now the expectation of X minus mu, all squared, i.e.,

the expected squared deviation of X about the mean.

So just as E of X was a probability-weighted average,

the expectation of X minus mu,

all squared, is also a probability-weighted average.

It's just that now,

we don't multiply the X values by their corresponding probabilities, rather

we multiply the X minus mu squared values by their corresponding probabilities.

So let's revisit the score on a fair die and

calculate the true variance of such a score.

So we know the values of X are one, two, three, four, five and six.

We've already determined that the expectation of X,

which hereafter, we can denote by mu, was 3.5.

So for each value of X, for example, the 1, we subtract mu, so

(1 - 3.5), square that value,

and do a similar operation on the other remaining five values.

We then multiply each of these by the corresponding probabilities of occurrence,

but of course, as this was an assumed fair die,

each of those scores has the same probability of occurrence of 1/6.

So you multiply each one of these by 1/6, and

add them all together, and doing so you will get a total of 2.92,

and this represents the variance for the score on a fair die.
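The same calculation can be sketched in code: square each deviation from mu, weight it by its probability, and sum. A minimal sketch of the fair-die case described above:

```python
# Population variance of the score on a fair die: the
# probability-weighted average of squared deviations about the mean.
outcomes = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = sum(x * p for x, p in zip(outcomes, probs))  # 3.5
sigma_squared = sum((x - mu) ** 2 * p
                    for x, p in zip(outcomes, probs))
print(round(sigma_squared, 2))  # 2.92
```

The exact value is 35/12, which rounds to the 2.92 quoted in the lecture.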

If we wanted to, we could take the positive square root and

consider the standard deviation of the score on the fair die, but

do be conscious of the notation being applied.

So sigma squared will correspond to a population variance, and

its positive square root, sigma, to the population standard deviation.

Do be clear conceptually about the distinction between those,

which are derived from a theoretical probability distribution,

and their sample counterparts, the sample variance S squared and

the sample standard deviation S.

Now we're going to make much more use of these different means, and

variances, and standard deviations as we progress to

the statistical inference part of the course over the next couple of weeks.

But perhaps just a nice way to round off our week three, is to revisit the normal

distribution, because now we have perhaps a clearer understanding about what mu,

the population mean, and Sigma squared, the population variance, represent.

So we mentioned in the previous section that really there's an infinite number

of different normal distributions, each characterized by different combinations

of values for those parameters of mu and Sigma squared.

Now it would be helpful if we could perhaps have some kind of standardized

normal distribution.

One that's very easy to relate to.

Well such a distribution exists, called the standard normal distribution.

Now because this is so special, we will assign it its own special letter of Z.

So whenever you come across the letter Z in statistical courses,

think in terms of standardized variables.

Now why on Earth are these things of any great importance to us?

Well, first of all, let's define what we mean by a standardized variable.

This is one which has a mean of 0 and a variance of 1, and of course,

given that the standard deviation is the positive square root of the variance,

if the variance is 1, then by extension, so too is the standard deviation.

So in notation we might say Z, as a standard normal variable,

is distributed as a normal distribution with a mean of 0, that's the value for

mu in this special case, and a variance, sigma squared, of 1.

So why are standardized variables, of use to us?

Well, we've previously mentioned the concept of an outlier.

Remember when we were comparing means and medians, and

which one might be a preferable measure of central tendency, what we did note

was that means are very sensitive to the inclusion of any outliers.

But, as yet, we haven't really offered any sort of formal definition of

what an outlier might be, other than it's a sort of extreme observation.

9:06

If we now extended this from not just one standard deviation from the mean,

but to two standard deviations of the mean,

that's now going to capture approximately 95% of the total area,

under the curve, and if we went one standard deviation further, and

hence considered the mean, plus or minus 3 standard deviations for

a normal distribution this captures about 99.7% of the total area under the curve.

Hence, it is very unlikely to observe a normal random variable

taking a value beyond three standard deviations of the mean.

But now let's consider this in standardized terms, i.e.,

we have a standardized variable, i.e., its mean is zero, and

its standard deviation is equal to one.

So if mu is zero and sigma is equal to one on a standardized, i.e.,

a Z, scale, there are some very simple numbers we just need to remember.

So, for a symmetric continuous distribution

such as the normal, we said that

within one standard deviation of the mean we get about 68% probability.

So on a standardized basis, with the mean of zero, and a standard deviation of one,

that equates on a Z scale to being between -1 and +1.

So there should be about 68% chance of being between -1 and +1.

Now, extending it to being within two standard deviations from the mean,

well, on a standardized basis this means being between plus and minus 2,

and of course if we consider three standard deviations from the mean

that equates to being between plus and minus 3.
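Those three benchmark probabilities can be recovered from the standard normal distribution using only the standard library, since P(-k < Z < k) equals erf(k/√2). A sketch to verify the numbers quoted above:

```python
import math

def within_k_sds(k):
    # P(-k < Z < k) for Z ~ N(0, 1), via the error function:
    # Phi(k) - Phi(-k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sds(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```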

So if we convert things to a standardized variable then immediately

we can decide whether an observation is an extreme value or not.

So for example, if we looked at the returns on a stock, or

maybe movements of an exchange rate.

Let's say a stock moved by 3.6% in a particular day.

Now is this a dramatic movement or a less dramatic movement?

Well, it's quite hard to judge just by considering that return of 3.6%,

because it depends on the context of the distribution: what is its mean,

and what is its standard deviation?

We would really need to know those to judge how extreme such a movement might be.

But if we now converted such a percentage change to a standardized basis,

i.e., take the original observation, subtract the mean,

divide by the standard deviation, and express it on a standardized Z

scale then immediately we can see whether or not we have an extreme observation.

Because then we simply compare that Z value to the range between minus one and

plus one, about a 68% chance of such an event occurring,

between minus two and plus two, about a 95% chance of that occurring, and

between plus or minus three, roughly a 99.7% chance of that occurring.

So if you were told that you had an event occurring on a standardized basis of,

let's say, four or five, meaning four or five standard deviations

beyond the mean, then this would correspond to an extremely rare event, something

which we may wish to call an outlier, indeed perhaps an extreme outlier.
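To make the stock example concrete, here is a sketch of the standardization just described. The mean and standard deviation of daily returns below are hypothetical figures for illustration, not values from the lecture:

```python
# Hypothetical assumption: daily returns average 0.05% with a
# standard deviation of 1.2% (illustrative numbers only).
observation = 3.6   # the day's return, in percent
mean, sd = 0.05, 1.2

# Standardize: subtract the mean, divide by the standard deviation.
z = (observation - mean) / sd
print(round(z, 2))  # 2.96
```

Under these assumed figures the move sits nearly three standard deviations above the mean, so on the Z scale it would already count as a rare event.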

So if you're looking at comparing different variables

which are measured on very different scales, then doing this

standardization transformation, by getting them onto the same scale,

allows you to compare apples with apples,

rather than comparing apples with oranges.

So going into our week four of the course, we'll be doing more statistical

inference when we're going to be drawing on some of this theoretical knowledge.

So join me for that.

[MUSIC]