[MUSIC]

In this section, we consider the different types of inferential error which could

occur in hypothesis testing.

So remember, hypothesis testing is about making a binary decision

between two competing hypotheses.

Our so-called null hypothesis, H0, and our alternative hypothesis, H1.

So again, we'll just briefly relate this to the legal analogy we saw in

the previous section, before considering more formal statistical types of tests.

So, of our two hypotheses, H0, that the defendant is not guilty, and

the alternative H1 that the defendant is guilty.

These are mutually exclusive and collectively exhaustive.

Now we've heard that phrase mentioned previously when we were

looking at different types of random sampling.

So remember, mutually exclusive means only one of the things can happen at any one

time, and collectively exhaustive means at least one of the things must happen.

So with regards to our two statements, the defendant in a criminal trial is either

guilty or not guilty, they cannot be both, nor can they be anything else.

So, of course the defendant will know whether he or she is truly guilty or

not guilty.

The jury, who are making the decision of guilt or otherwise, do not know for sure.

They have to base their decision on some incomplete information.

Namely, their decision is purely based on the evidence presented to them at

the criminal trial.

Of course, the evidence is going to come from two sides,

the prosecution trying to convince the jury of the defendant's guilt.

There'll also be the defense's case trying to convince the jury that

the defendant is not guilty of the alleged offense.

So the jury will ultimately retire once the evidence has all been

presented in the court, and will return a verdict.

So remember, it's a binary decision.

So the jury will either return a verdict of not guilty or one of guilty.

Now, with regard to our hypothesis testing lexicon and language,

these decisions we will refer to as either not rejecting the null hypothesis or

rejecting the null hypothesis.

Note that in both cases, the decision is with respect to the null hypothesis,

H0, rather than the alternative hypothesis.

The jury is not testing whether the defendant is guilty,

they are testing whether the defendant is not guilty.

And they continue to believe in the defendant being not guilty until

the evidence becomes overwhelming, i.e., beyond a reasonable doubt, such that it

becomes so inconsistent with the null hypothesis of being not

guilty that they subsequently reject that hypothesis.

So what you see on the screen is a little sort of decision matrix,

representing all possible outcomes to this jury trial.

So remember we have our two mutually exclusive and collectively exhaustive

hypotheses, which means one of those rows is the correct row.

Either the defendant is not guilty or he or she is guilty.

However, we, as in the jury, the decision makers, do not know for

sure whether or not the defendant is guilty.

All that we can observe is the decision that the jury reaches, which we

partition again, into mutually exclusive and collectively exhaustive options.

I.e., the jury returns a verdict of not guilty, and hence,

they do not reject that null hypothesis.

Or otherwise, they will reject that null hypothesis and return a verdict of guilty.

So all we will observe is, if you like, the column we're in of this matrix,

not the row.

So you can see that there are four possible outcomes,

four different places where we could end up.

Some of the time, we draw the right conclusion,

some of the time we draw an incorrect conclusion.

For example, if the null hypothesis is true and the jury does not reject that

null hypothesis, they have reached the correct decision, hence this is good.

That means the defendant is not guilty of the crime and

the jury returns a verdict of not guilty.

So it would be socially desirable if we ever ended up in that particular

situation.

Of course, there's another type of correct decision, whereby the defendant is guilty.

And hence, the alternative hypothesis is true and

the jury returns a verdict of guilty.

Namely, they have rejected the null hypothesis of being not guilty.

So in real world terms,

this would equate to the guilty defendant being found guilty and going to prison.

Again, that's a socially desirable outcome: justice has prevailed.

Now in an ideal world,

we would always end up in one of those two correct decision boxes.

Namely, if someone is on trial and they're guilty, they go to prison, and

if they are truly innocent, they are acquitted.

But as we know, decisions can be incorrect decisions.

A jury can only base their conclusion on the available

evidence presented to them in court.

So this gives rise to two types of inferential error.

And we're going to distinguish these as so-called Type I errors,

and also Type II errors.

So what is a Type I error?

This is when the null hypothesis is true, but

the decision is made to reject that null hypothesis.

So again, keeping with this legal analogy that would equate to

someone who is innocent being found guilty of the crime.

Clearly this is socially undesirable, certainly for

the defendant because they are going to prison for something they didn't do.

Of course, does any good come out of this miscarriage of justice?

Well, if, let's say, you were the police or the prosecution,

you have secured a conviction.

A wrong conviction, maybe, but nonetheless,

you have a conviction, and you can tick the box that the case is closed.

So they may be happy.

And of course, if someone's been wrongly convicted for a crime,

the police are not going to be looking for the true perpetrator.

So he or she would have clearly gotten away with that offense.

So that would be a Type I error.

What's a Type II error?

Well, this is the situation whereby the person is guilty of the crime but

the jury returns a verdict of not guilty, i.e., the alternative hypothesis is true,

but the null hypothesis is mistakenly not rejected.

So this would equate to a guilty person walking free and

not going to prison for the crime.

So there are two distinct types of inferential errors which could occur.
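The four cells of that decision matrix can be sketched in a few lines of Python. This is only an illustration: the function name `classify_outcome` and its two boolean arguments are hypothetical names chosen here, not something from the lecture slides.

```python
def classify_outcome(h0_true: bool, reject_h0: bool) -> str:
    """Return which cell of the 2x2 decision matrix we land in.

    h0_true   -- the row: True if H0 holds (defendant is not guilty)
    reject_h0 -- the column: True if H0 is rejected (verdict of guilty)
    """
    if h0_true and reject_h0:
        # Innocent defendant found guilty.
        return "Type I error"
    if not h0_true and not reject_h0:
        # Guilty defendant walks free.
        return "Type II error"
    # Either an innocent acquittal or a guilty conviction.
    return "correct decision"


# Print all four possible outcomes of the trial.
for h0_true in (True, False):
    for reject_h0 in (False, True):
        truth = "not guilty (H0 true)" if h0_true else "guilty (H1 true)"
        verdict = "guilty (reject H0)" if reject_h0 else "not guilty (do not reject H0)"
        print(f"{truth:22} | {verdict:30} | {classify_outcome(h0_true, reject_h0)}")
```

In practice only the `reject_h0` column is observed; the `h0_true` row stays unknown, which is why the two errors can never be ruled out for certain.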

So here's a rhetorical question for you.

Which is worse, a Type I error or a Type II error?

I say rhetorical because this is arguably a subjective question to answer.

Nonetheless, it's often considered and said that it is better to let

100 guilty people walk free than to convict a single innocent person.

Now, you personally may agree or disagree with that way of thinking, but

we tend to apply this kind of mentality even in our statistical world of testing.

Namely, we tend to consider a Type I error to be more

problematic than a Type II error.

But I stress this is a subjective viewpoint.

So given two types of error could occur,

let's now start to bring in some probability to this.

So why don't we have some probabilistic assessment of how likely a Type

I error, and also a Type II error, are to occur?

So what we might really be interested in knowing is, what is the probability

of a Type I error, and similarly, what is the probability of a Type II error.
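Using the definitions of the two errors given above, these two probabilities can be written as conditional probabilities. The notation below is a standard way of writing them and is an addition here, not something shown in the lecture:

```latex
\begin{align*}
P(\text{Type I error})  &= P(\text{reject } H_0 \mid H_0 \text{ is true}), \\
P(\text{Type II error}) &= P(\text{do not reject } H_0 \mid H_1 \text{ is true}).
\end{align*}
```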

Also in our statistical world of testing, we really need arguably

a more objective equivalent of beyond a reasonable doubt.

So if we assume, as we shall, that a Type I error is more problematic,

more concerning than a Type II error, then what we tend to do is to control for

the probability of committing a Type I error.

And we do that through something called the significance level.

Now you'll see much more of this,

if indeed you decide to pursue your studies into hypothesis testing.

We'll simply take a very superficial look at this.

But nonetheless, the significance level,

we typically denote by the Greek letter alpha.

So this is a probability, and we already know that probabilities lie over

the unit interval between zero and one.

So the significance level acts as our threshold,

our maximum tolerance on the probability of committing a Type I error.

So numerically, what value should alpha take?

Well sadly, yet again, we're into somewhat subjective territory.

Nonetheless there are some common conventions which I will

make you aware of.

The default in hypothesis testing is to set a 5% significance level.

So in decimal form, this would equate to an alpha of 0.05.

Now in fact, there is a bit of a link here between significance levels and

confidence levels, which we mentioned at the tail end of week four,

when we attacked confidence intervals.

There we offered 95% confidence intervals,

also perhaps 99, or indeed 90% confidence intervals.

Well if we think of now,

the significance level as the complement of the confidence level,

then the 95% level of confidence really equates to a 5% level of significance.

And this is really the default choice practitioners tend to take.

So we now have a statistical equivalent of beyond a reasonable doubt,

the significance level alpha, with 5% as the convention.

But just as we could have 99% confidence, this would equate to 1% significance,

or alternatively, 90% confidence, which would equate to 10% significance.
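The complement relationship between the two levels is simple arithmetic, and a minimal sketch makes it explicit. The function name `significance_from_confidence` is a hypothetical choice for illustration, and whole-number percentages are used so the subtraction stays exact.

```python
def significance_from_confidence(confidence_pct: int) -> int:
    """Return the significance level (in %) that complements a confidence level (in %)."""
    if not 0 < confidence_pct < 100:
        raise ValueError("confidence level must be strictly between 0 and 100")
    return 100 - confidence_pct


# The three conventional pairings mentioned in the lecture.
for confidence in (90, 95, 99):
    alpha_pct = significance_from_confidence(confidence)
    print(f"{confidence}% confidence corresponds to {alpha_pct}% significance")
```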

So we're going to have alpha as

our chosen threshold on the probability of committing a Type I error.

Just for completeness,

we may assign let's say a beta to the probability of committing a Type II error.

However, our main focus is going to be controlling for

the risk of committing a Type I error.
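To see what controlling the Type I error risk at alpha means in the long run, here is a small simulation sketch. It rests on assumptions that go beyond this lecture: a two-sided z-test of a mean with known standard deviation is used as the decision rule, and the function name `estimate_type_i_error_rate`, the sample size, and the number of trials are all illustrative choices. Every simulated dataset is generated with the null hypothesis true, so each rejection is by construction a Type I error.

```python
import random
from statistics import NormalDist, fmean


def estimate_type_i_error_rate(alpha=0.05, n=30, trials=10_000, seed=0):
    """Estimate how often a true H0 (mean = 0) is rejected at level alpha.

    Assumption: data are normal with known standard deviation 1, and the
    decision rule is a two-sided z-test (not covered in this lecture).
    """
    rng = random.Random(seed)
    # Critical value for a two-sided test at significance level alpha.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every observation has mean 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = (sample mean - 0) / (1 / sqrt(n))
        z = fmean(sample) * n ** 0.5
        if abs(z) > critical:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


print(estimate_type_i_error_rate())
```

With these defaults the printed proportion should come out close to 0.05, which is the sense in which alpha caps the risk of a Type I error.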

So to continue, how do we actually make this binary decision

of rejecting or not rejecting a null hypothesis in a statistical world?

Well, this is now where we're heading towards the "to p, or not to p" territory,

where the p of "to p, or not to p" refers to something called a p-value.

In the next section we're going to consider what p-values are, and

indeed they are going to be related to the significance

levels, which we just mentioned.

[MUSIC]