0:01

In this video, we will continue with the HIV testing example to introduce

the concept of Bayes factors.

Earlier, we introduced the concept of priors and posteriors.

The prior odds is defined as the ratio of the prior probabilities

assigned to the hypotheses or models we're considering.

So if there are two competing hypotheses being considered,

then the prior odds of hypothesis one to hypothesis two can be defined as

O of H1 to H2, which is equal to the probability of H1 over probability of H2.

Similarly, the posterior odds is the ratio of the two posterior probabilities of these

hypotheses.

That is, PO of H1 to H2 is the probability of H1 given data

divided by the probability of H2 given data.

Using Bayes rule, we can rewrite the posterior probabilities as the probability

of the data given the hypothesis times the prior for

that hypothesis divided by the probability of data.

The probability of data in both the numerator and denominator cancels, and

we can reorganize this as the ratio of the probability of data given H1 to the

probability of data given H2, times the ratio of the prior probabilities of these hypotheses.

The first quantity, the ratio of the probabilities of data given these two

hypotheses is defined as the Bayes factor.

And the second quantity is the prior odds that we saw earlier.

In other words, the posterior odds is the product of the Bayes factor and

the prior odds for these two hypotheses.
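
Written out, the derivation just described looks roughly like this. The names PO, BF, and O follow the spoken notation; the bracket form H_1 : H_2 is an assumption about how the slide writes it.

```latex
\begin{align*}
PO[H_1 : H_2]
  &= \frac{P(H_1 \mid \text{data})}{P(H_2 \mid \text{data})} \\
  &= \frac{P(\text{data} \mid H_1)\, P(H_1) / P(\text{data})}
          {P(\text{data} \mid H_2)\, P(H_2) / P(\text{data})} \\
  &= \underbrace{\frac{P(\text{data} \mid H_1)}{P(\text{data} \mid H_2)}}_{BF[H_1 : H_2]}
     \times
     \underbrace{\frac{P(H_1)}{P(H_2)}}_{O[H_1 : H_2]}
\end{align*}
```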

The Bayes factor quantifies the evidence of data arising from hypothesis one

versus hypothesis two.

In a discrete case, this is simply the ratio of the likelihoods of the observed

data under the two hypotheses or models.

However, in a continuous case, it's the ratio of the marginal likelihoods.

In this way,

we are considering all possible values of the model parameters theta.
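
As a sketch of what the continuous case means, each marginal likelihood averages the likelihood over that model's prior on its parameters. The subscripted parameters and the prior densities p(theta | H) are notation assumed here for illustration, not taken from the video.

```latex
\[
BF[H_1 : H_2]
  = \frac{\int P(\text{data} \mid \theta_1, H_1)\, p(\theta_1 \mid H_1)\, d\theta_1}
         {\int P(\text{data} \mid \theta_2, H_2)\, p(\theta_2 \mid H_2)\, d\theta_2}
\]
```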

In this video, we will stick with the simpler discrete case.

And in upcoming videos,

we will revisit calculating Bayes factors for more complicated models.

Let's return to the HIV testing example from earlier,

where our patient had tested positive in the ELISA.

Remember that our hypotheses are: hypothesis one, the patient does not have HIV, and hypothesis two, the patient has HIV.

The prior probabilities we place on these hypotheses came from the prevalence of

HIV at the time in the general population.

We were told that the prevalence of HIV in the population was 1.48 out of 1000,

hence the prior probability assigned to hypothesis 2 is 0.00148.

And the prior assigned to hypothesis 1 is simply the complement of this.

Hence, the prior odds can be calculated as the ratio of these two values,

which comes out to approximately 674.68.
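
The prior odds can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are made up for illustration.

```python
# Prior probabilities from the prevalence quoted in the video (1.48 per 1000).
prior_h2 = 1.48 / 1000      # H2: the patient has HIV
prior_h1 = 1 - prior_h2     # H1: the patient does not have HIV

# Prior odds of H1 to H2.
prior_odds_h1_h2 = prior_h1 / prior_h2
print(round(prior_odds_h1_h2, 2))  # 674.68
```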

We also calculated posterior probabilities of these hypotheses given a positive

result.

These were approximately 0.88 and 0.12.

We'll hold on to more decimal places in our calculations to avoid rounding

errors later.

Hence, the posterior odds is approximately 7.25.

We can then calculate the Bayes factor as the ratio of the posterior

odds to prior odds, which comes out to approximately 0.0108.

Note that in this simple discrete case, the Bayes factor simplifies to

the ratio of the likelihoods of the observed data under the two hypotheses.

Remember that the true positive rate of the test was 0.93 and

the false positive rate was 0.01.

Using these two, the Bayes factor also comes out to approximately 0.0108.
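
Both routes to the Bayes factor can be compared numerically. The sketch below recomputes the posteriors with Bayes' rule from the quoted prevalence and test rates, so that no rounding creeps in; the variable names are illustrative only.

```python
# Quantities quoted in the video.
prior_h2 = 0.00148          # H2: the patient has HIV
prior_h1 = 1 - prior_h2     # H1: the patient does not have HIV
p_pos_given_h1 = 0.01       # false positive rate
p_pos_given_h2 = 0.93       # true positive rate

# Posterior probabilities given a positive ELISA, via Bayes' rule.
p_pos = p_pos_given_h1 * prior_h1 + p_pos_given_h2 * prior_h2
post_h1 = p_pos_given_h1 * prior_h1 / p_pos   # about 0.88
post_h2 = p_pos_given_h2 * prior_h2 / p_pos   # about 0.12

# Route 1: posterior odds divided by prior odds.
prior_odds = prior_h1 / prior_h2
posterior_odds = post_h1 / post_h2
bf_from_odds = posterior_odds / prior_odds

# Route 2: ratio of the likelihoods of a positive result.
bf_from_likelihoods = p_pos_given_h1 / p_pos_given_h2

print(round(posterior_odds, 2))       # 7.25
print(round(bf_from_odds, 4))         # 0.0108
print(round(bf_from_likelihoods, 4))  # 0.0108
```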

3:38

So now that we calculated the Bayes factor, the next natural question is,

what does this number mean?

A commonly used scale for interpreting Bayes factors is proposed by Jeffreys and

it's as follows.

If the Bayes factor is between one and

three, the evidence against H2 is not worth more than a bare mention.

If it's 3 to 20 the evidence is positive.

If it's 20 to 150 the evidence is strong and

if it's greater than 150 the evidence is very strong.
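
The scale can be written as a small lookup. This is a sketch under two assumptions: the boundary between "positive" and "strong" is taken to be 20 so that the bands are contiguous, and each upper boundary is treated as belonging to the lower band. The function name `interpret_bayes_factor` is hypothetical.

```python
def interpret_bayes_factor(bf):
    """Label the evidence against the hypothesis in the denominator of bf."""
    if bf < 1:
        # Values below one are not on the scale.
        return "not on the scale"
    if bf <= 3:
        return "not worth more than a bare mention"
    if bf <= 20:
        return "positive"
    if bf <= 150:
        return "strong"
    return "very strong"

for value in (2, 10, 200):
    print(value, interpret_bayes_factor(value))
```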

It might have caught your attention that the Bayes factor we calculated does

not even appear on the scale.

To obtain a Bayes factor value on the scale, we will need to change the order of our

hypotheses and calculate the Bayes factor for hypothesis two to hypothesis one,

and look for evidence against hypothesis one instead.

We can calculate this Bayes factor as the reciprocal of the Bayes factor for

hypothesis one to hypothesis two.

For our data, this comes out to approximately 93.

Hence, the evidence against hypothesis one,

which states that the patient does not have HIV, is strong.
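
The reversal is a one-line computation. The sketch below uses the test rates quoted in the video; the variable names are illustrative.

```python
# Bayes factor for H1 (no HIV) to H2 (HIV), from the likelihoods of a positive test.
bf_h1_h2 = 0.01 / 0.93

# Swapping the order of the hypotheses inverts the Bayes factor.
bf_h2_h1 = 1 / bf_h1_h2
print(round(bf_h2_h1, 2))  # 93.0
```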

This means that even though the posterior for

having HIV given a positive result was low, we would still decide,

according to the scale and based on a positive ELISA, that the patient has HIV.

An intuitive way of thinking about this is to consider not only the posteriors, but

also the priors assigned to these hypotheses.

The Bayes factor is the ratio of the posterior odds to prior odds.

While 12% is a low posterior probability for

having HIV given a positive ELISA result, this value is still much higher

than the overall prevalence of HIV in the population,

that is, the prior probability for that hypothesis.

5:22

Another commonly used scale for

interpreting Bayes factors is proposed by Kass and Raftery, and

it deals with the natural logarithm of the calculated Bayes factor.

Reporting on the log scale can be helpful for

numerical accuracy reasons when the likelihoods are very small.

Taking two times the natural logarithm of the Bayes factor we calculated earlier,

we would end up with the same decision that the evidence

against hypothesis one is strong.
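
A quick numerical check of the log-scale version, as a sketch. The video does not read out the thresholds; the cutoffs of 2, 6, and 10 on two times the natural log are taken from Kass and Raftery's published table and should be treated as an assumption here, as should the function name `kass_raftery_label` and the handling of the boundaries.

```python
import math

# Bayes factor for H2 (HIV) to H1 (no HIV), from the quoted test rates.
bf_h2_h1 = 0.93 / 0.01
two_log_bf = 2 * math.log(bf_h2_h1)

def kass_raftery_label(value):
    """Label 2 * ln(BF); thresholds assumed from Kass and Raftery's table."""
    if value < 0:
        return "not on the scale"
    if value <= 2:
        return "not worth more than a bare mention"
    if value <= 6:
        return "positive"
    if value <= 10:
        return "strong"
    return "very strong"

print(round(two_log_bf, 2), kass_raftery_label(two_log_bf))  # 9.07 strong
```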

To recap, in this video we defined prior odds, posterior odds, and Bayes factors.

We learned about scales by which we can interpret these values for

model selection.

We also re-emphasized that in Bayesian testing,

the order in which we evaluate the models or hypotheses does not matter,

since the Bayes factor of hypothesis two versus hypothesis one is

simply the reciprocal of the Bayes factor for hypothesis one versus hypothesis two.