0:01

Hi, my name is Brian Caffo. This is Mathematical Biostatistics

Bootcamp, lecture ten on T Confidence intervals.

So, in this lecture, we're going to go through group T intervals, whereas in the last lecture we did T intervals for a single mean, or you could do those intervals for a group where the observations were paired. But now we're gonna talk about instances where we have two independent groups. We'll briefly talk about a method that constructs a likelihood, and then we'll talk about what you do if you have unequal variances. Let me motivate the problem a little bit.

Suppose that we want to compare the mean blood pressure between two groups in a

randomized trial, those who received treatment to those who received placebo.

Unlike last week, where people would have had to have been matched, say, comparing

the same person before and after receiving a treatment, these groups are entirely

independent. The group that received the treatment and

the group that received the placebo. So we can't use the same procedure.

We can't take pairwise differences between measurements.

In fact, they might have different sample sizes in the two groups, and then we

definitely couldn't do it. So in this lecture, we're going to talk about

ways for investigating the differences in the population means between groups when

we have independent samples. But we'll see that the methodology works

out to be very similar to what we did last week; the motivating ideas will be nearly

identical. So let's go through some assumptions that

we're going to use for our first variation of the t interval.

So our first collection, X1 to Xnx, is a collection of IID normal random variables. They have some mean and they have some variance. And Y1 to Yny are IID normal, and they have a different mean but the same variance. Right now, we're going to assume the variance between the two groups is the same. So we might think of X as the treated group and Y as the control group. Or X is one group, and Y is another group. So let's let X bar, Y bar, Sx and Sy be the means and standard deviations for the two groups.

And our goal is to estimate, say, the difference mu X - mu Y. Or of course, you could do mu Y - mu X and look at the negative of the answer. We would like to estimate that.

But we'd like to have a confidence interval to quantify our uncertainty in

estimating that parameter. So the obvious estimator of, say, mu Y -

mu X is Y bar - X bar. I think everyone would agree that the

interval needs to be centered at that point, or that point has to be central in

the construction of the interval. But we also need to figure out some way to

create a confidence interval to incorporate our uncertainty.

Well let's think can we do something that's along the lines of estimate plus or

minus a T quantile times a standard deviation.

Well, we want a standard error of this estimator Y bar - X bar.

If you turn to the calculations, and I would hope that everyone in this class

could do this calculation at this point, under the assumptions that we've made, the

variance of Y bar - X bar works out to be sigma squared times (one / nx + one / ny). And there's a really good estimator of

that entity in this setting. In fact it's a maximum likelihood

estimator, or close to it. And that's the pooled variance estimate, Sp^2. And that works out to be (nx - one) Sx^2 + (ny - one) Sy^2, all divided by nx + ny - two, and this works out to be a good estimator of sigma squared. Let's talk about this estimator really

quickly. If you take nx - one and you divide it by

nx + ny - two you get a number that's between zero and one.

And if you take ny - one and you divide it by nx + ny - two you get one minus that

number. You can check the calculations to make

sure that I'm right about that but I'm right.
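That weighted-average claim is easy to verify numerically. Here's a minimal sketch in Python; the sample sizes and group variances are made-up illustrative values, not from the lecture's example:

```python
# Pooled variance as a weighted average of the two group variances.
# nx, ny, sx2, sy2 are illustrative values chosen just for this check.
nx, ny = 8, 21
sx2, sy2 = 4.0, 9.0

wx = (nx - 1) / (nx + ny - 2)  # weight on the X group variance
wy = (ny - 1) / (nx + ny - 2)  # weight on the Y group variance

sp2 = wx * sx2 + wy * sy2      # pooled variance Sp^2

# The two weights always sum to one, so Sp^2 is a genuine weighted average,
# and it must land between the two group variances.
print(wx + wy)
print(sp2)
```

Because the weights sum to one, the pooled variance always falls between the smaller and larger of the two group variances, which is a handy sanity check.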

So, this estimator, Sp^2, is nothing other than a weighted average of the two group variances, right? So, it's a weighted average of the variance for group X and the variance for group Y.

If nx and ny are equal, if you have the same sample size in both groups, then you

can calculate nx - one over nx + ny - two works out to be 0.5 in which case the

pooled variance estimate works out to be the arithmetic average of the two

variances. On the other hand, if the group x contains

a lot more data. Right?

Nx - one is a lot larger than ny - one. Then nx -

One over this, denominator is going to be much bigger, and you'll get a much bigger

weight on Sx^2 / Sy^2. And in that case, the weighted average

does exactly what you would hope: it takes whichever of the two groups has more measurements associated with it and weights the variance estimate from that group more heavily, which is exactly what you would hope.

There is more data, so that group's variance is going to be estimated a little bit better. So it makes sense that a good estimator would place more weight on it. And that's basically what this pooled variance estimate is; it's nothing other than an average. It's just a weighted average rather than an arithmetic average. Okay, so just to reiterate some of these

points. The pooled estimator is a mixture of the

group variances placing bigger weight on whichever one has the larger sample sizes.

If the sample sizes are the same, it's really easy.

All you have to do is average the two variances.

And then the pooled estimate is unbiased. We can show that really quickly.

If you take the expected value of Sp^2, you just use the fact that both of the individual group variance estimators are unbiased, and then you wind up with the result. I'm not going to show this, cuz it's kind

of complicated to do. But the pooled variance estimate turns out to be independent of Y bar - X bar. The reason is, if you stomach the fact that I didn't show before, that X bar is independent of Sx and Y bar is independent of Sy, well then Y bar - X bar is going to be independent of both Sx and Sy, because all of these quantities come from independent groups. And because of that, it should be independent of any function of Sx and Sy, and Sp^2 is a function of Sx and Sy. So I'm not going to dwell on this point, but take it as given that Y bar - X bar is independent of the pooled variance estimate.

And hopefully you can kind of get a sense where I might be going with this

calculation, what I'd like to do is create a T confidence interval.

And remember, what did I need to create a T confidence interval?

I needed to figure out a way to get a standard normal and divide it by the square root of a Chi-squared divided by its degrees of freedom, an independent Chi-squared. So, I'm hoping that some function of the pooled variance will be Chi-squared. And I just stated without proof that it'll

be independent of the difference in sample means.

Well, it turns out you know, another fact that I'm not going to prove, but one that

you can certainly take to the bank, is that the sum of independent Chi-squared random variables is again Chi-squared, and the degrees of freedom just add up.

So let's take nx + ny - two times the pooled variance divided by sigma squared.

Well, that works out to just be nx - one times the X group variance, divided by

sigma squared, plus ny - one times the Y group variance, divided by sigma squared,

and we know from before that this first term is Chi-squared with nx - one degrees

of freedom. The second term is Chi-squared with ny -

one degrees of freedom. And so if you believe my fact above, that

the sum of two independent Chi-squared is again Chi-squared with the degrees of

freedom added, that would mean that when we add this Chi-squared with nx - one degrees of freedom and this Chi-squared with ny - one degrees of freedom, we get a Chi-squared with nx + ny - two degrees of freedom. And of course we're happy assuming that the two Chi-squareds are independent, because the entire presumption of

everything we're talking about is that the two groups we're looking at are

independent. This is sort of independent group

analysis. We're assuming that group X and group Y

are independent. Okay.
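That additivity fact can also be checked by simulation. A rough sketch in Python; the sample sizes here match the blood pressure example coming up, but any values would do:

```python
import random

random.seed(1)

# Sum of independent chi-squared random variables: build each chi-squared
# as a sum of squared standard normals, add them, and check that the mean
# of the sum is near the combined degrees of freedom, nx + ny - 2.
nx, ny = 8, 21

def chi_squared(df):
    # One chi-squared draw with df degrees of freedom
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))

n_sims = 20000
draws = [chi_squared(nx - 1) + chi_squared(ny - 1) for _ in range(n_sims)]
mean = sum(draws) / n_sims

print(mean)  # should be close to nx + ny - 2 = 27
```

The mean of a chi-squared equals its degrees of freedom, so the simulated average landing near 27 is consistent with the sum being chi-squared with nx + ny - 2 degrees of freedom.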

So now we can construct our T

So we take Y bar - X bar, subtract off its mean, mu Y - mu X and divide by its

standard error, sigma times the square root of one / nx + one / ny. And then divide the whole thing by the square root of nx + ny - two times Sp^2 over sigma squared, which is a Chi-squared, divided by its degrees of freedom, nx + ny - two. So if you look at that, that top part is a

standard normal so the original data for the two groups are Gaussian, so that we

know that the sample means are Gaussian, so that we know the difference in the

sample means is Gaussian. And if we take a Gaussian, and subtract

off its mean and divide by its standard

normal. So the top is a standard normal.

We're stating that the top is independent of the bottom.

And then the bottom we know is the square root of a Chi-squared divided by its degrees of freedom. So the whole thing has to be a T random

variable with nx + ny - two degrees of freedom.

And then if you collect terms and work with the arithmetic a little bit,

you see that this left-hand side works out to be (Y bar - X bar) - (mu Y - mu X), the whole thing divided by Sp times the square root of one / nx + one / ny, which is basically just the statistic we'd like to use: the observed difference in means minus the population difference in means divided by the standard error, but with sigma replaced with our data estimate of sigma, so sigma replaced by Sp.

10:34

And again, notice the form of this is estimator minus true value divided by

standard error, estimated standard error again.

And then, I'm hoping that you should be able to use the same logic from the

previous lecture in how we constructed that confidence interval to just say,

okay, well, for the confidence interval for the difference in means, just turn through these same calculations, and we get Y bar minus X bar, plus or minus the

appropriate T quantile, times the standard error.

Okay let me repeat that again. The estimate plus or minus the appropriate

T quantile times the standard error. So again, the interval works out to be the

estimate plus or minus the appropriate quantile from the appropriate distribution

times the standard error. Okay. Remember a big assumption in this is

that there are equal variances in the groups.

And we'll talk about that later. I guess one thing I'd like to mention

about the equal variance assumption now, while we're on it,

is that there are actual tests for equality of variances between independent groups.

They work out to be F tests. But those tests are kind of notoriously

bad. So I think some textbooks will do things

like suggest testing equality of the variances.

If the variances are equal, then do this confidence interval and if they're

unequal, do the confidence interval that we're going to talk about in a slide or

two. But I don't like that procedure at all.

I think you should look at graphs, look at the data, and make assessments as to

whether or not the variances are equal or unequal and use that to decide.

If you really must estimate the ratio of the variances in the groups, then bootstrapping would be my suggested technique for doing it, unless maybe the sample sizes are very small.

But another safe thing to do is just always assume that the variances are

unequal. So, if you're worried about this

assumption, you just always do the conservative thing and assume that the

variances are unequal. It's little maybe above the discussion in

this class and how to get a likelihood for mu Y minus mu X.

But it turns out that getting a likelihood for mu Y - mu X divided by sigma, which is

still a single parameter, is very easy. And the reason is that this statistic, Y bar minus X bar divided by its standard error, follows a distribution which is called a non-central T distribution, and

the non-centrality parameter depends on mu Y - mu X over sigma and then something

involving the n's that we know. And so you can use this fact to create a

likelihood, not a profile likelihood or anything like that, just an honest to

goodness likelihood for mu Y - mu X over sigma.

I should say what mu Y - mu X over sigma is. I think of it as kind of an effect size type measurement: it's the difference in the means standardized relative to the within-group standard deviation. So it's the difference in the means in standard deviation units, which is a very

useful thing if you want to calibrate your difference in the means across studies.

Right? If you want to say oh, well this

difference in the means is kind of big. Well, what does big mean?

You know, in one case, it's measured in inches and the other cases, it's measured

in tons or something and, and in other cases, it's measured in centimeters.

So if you look at different experiments, the units are all different, the context

is very different, and it's impossible to compare a say, mu Y - mu X across

experiments. But there is some hope for comparing mu Y

- mu X over sigma across experiments, cuz you've gotten rid of the units, and

everything is expressed in within-group standard deviation units.

So it's a meaningful parameter and it's easy to create a likelihood.

So I'll show you how to do it, but I should say this is not a tremendously common

technique. On the other hand, the comparison of the

two groups using a standard T confidence interval with the pooled variance is an

extremely common technique. So let's go through an example of actually

constructing this interval. And by the way, this is just a special

case of what's called ANOVA estimation, just where you happen to have two groups.

Rosner has this great book called Fundamentals of Biostatistics and I got

this example from page 304 of his textbook.

And he looked at an example where they were comparing systolic blood pressure for

eight oral contraceptive users versus 21 controls.

There was some concern, I guess, in the study, over whether or not oral

contraceptive use increased the systolic blood pressure, measured in millimeters of mercury. So the X bar for the oral contraceptive

users was 132.86, the standard deviation for the oral contraceptive users was

15.34, the mean systolic blood pressure for the control subjects was 127.44 and,

and so on. So the pooled variance estimate takes the

weighted average of the two variances, and so you'll see the formula right here, it

works out to be 307.2. One of the easiest mistakes, if you happen to be doing these calculations by hand, is to forget and pool the standard deviations instead of the variances.

You should pool the variance, not the standard deviations.

So if I took out these squares, I'd get the wrong answer. I would get a pooled standard deviation instead of a pooled variance. And you'd really mess up if you treated that number as if it was a variance, because it would be on the wrong order of magnitude.

So the biggest mistake you can make is to not square those things.

One way to check, when you do this, by the way, is it's an average, right?

So in this case it's going to be an average of 15.34^2 and 18.23^2. So it has to be between those two numbers, 15.34^2 and 18.23^2. So then when you square root it, that number has to be between 15.34 and 18.23.

So if that hasn't happened, you've really screwed up.

Okay, so we've got our pooled variance, and if we square root that, we get our

pooled standard deviation. And then we need the appropriate T

quantiles. So we need the T quantile for 97.5 if we

want a 95% confidence interval. And we need 27 degrees of freedom, which

is, if I'm doing my arithmetic correctly, 821-2.

+ 21 - two, and in R, you can just get this

number as qt(0.975, 27): q standing for quantile, t standing for T distribution, 0.975 cuz we want the 97.5th percentile, and degrees of freedom equals 27. It'll just return 2.052, maybe plus some other decimal places.

And so the interval is 132.86 - 127.44, plus or minus our T quantile, times our pooled standard deviation, times the square root of one / eight + one / 21. This works out to be -9.52 to 20.36. One of the most important things to look for, because it's a difference in means, is whether or not the interval contains zero,

right? Because if the interval contains zero,
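The whole calculation above can be scripted. Here's a sketch in Python, using the numbers from the Rosner example and the 2.052 quantile quoted earlier; in R you'd use qt(0.975, 27) instead of hard-coding it:

```python
from math import sqrt

# Rosner example from the lecture: systolic blood pressure (mmHg),
# 8 oral contraceptive users versus 21 controls.
n_oc, xbar_oc, s_oc = 8, 132.86, 15.34
n_c, xbar_c, s_c = 21, 127.44, 18.23

df = n_oc + n_c - 2                                      # 8 + 21 - 2 = 27
sp2 = ((n_oc - 1) * s_oc**2 + (n_c - 1) * s_c**2) / df   # pooled variance
se = sqrt(sp2) * sqrt(1 / n_oc + 1 / n_c)                # standard error
q = 2.052                                                # qt(0.975, 27) in R

diff = xbar_oc - xbar_c                                  # OC minus controls
lo, hi = diff - q * se, diff + q * se
print(round(lo, 2), round(hi, 2))  # -9.52 20.36
```

Note the order of subtraction: this is oral contraceptive users minus controls, matching the lecture.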

then that would say a reasonable estimate for the difference in blood pressures

between the two groups is that they're identical.

And, in this case, it does contain zero. So, you know, there's evidence to say that there is no difference; in other words, this oral contraceptive use doesn't appear to be associated with an increase in blood pressure. It turns out that evaluating whether or not zero is in this interval is equivalent to a two-sided hypothesis test, and we'll talk about hypothesis tests later on. So you have to be careful in how you interpret hypothesis tests.

Right now, we might as well just say -9.5 to 20.4 is a reasonable range, in the units of the measurements, for comparing the average systolic blood pressure between oral

contraceptive users and controls. Now, by the way, another thing to keep

track of whenever you create these intervals is what order you've subtracted

things in, right? In this case, we did contraceptive users

minus controls. I think you should pick a rule and stick

with it, you know? So I always use, say, treated minus

control. And it doesn't matter.

Of course, you just get the negative of the interval if you do it the other

direction. But in interpreting it let's say this

interval was entirely above zero then you would be saying that oral contraceptive

users had an estimated high systolic blood pressure than controls.

But if you forgot and thought you had subtracted controls minus oral

contraceptives you would get the opposite interpretation.

So at any rate, my point being: just remember what order you subtract things in, because

it's an easy mistake to make. Here on the next slide I just have a

likelihood plot for the effect size using the non-central T distribution.

And I got, you know, a rough idea of the range of values to plot, by the way.

As I took my confidence interval -9.52 to 20.36.

And I just divided it by the pooled standard deviation.

And that gives me about -0.54 to 1.16. And so you know, I plotted I think from


-1.5 to positive 1.5. So you know, I got like, at least a rough

idea of the range of things to plot from looking at the interval.
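That back-of-the-envelope rescaling is just dividing the interval endpoints by the pooled standard deviation; a quick sketch:

```python
from math import sqrt

# Rough effect-size range: rescale the pooled-variance t interval endpoints
# by the pooled standard deviation, Sp = sqrt(Sp^2).
lo, hi = -9.52, 20.36   # t interval for the difference in means, from above
sp = sqrt(307.2)        # pooled standard deviation

print(round(lo / sp, 2), round(hi / sp, 2))  # -0.54 1.16
```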

This, by the way, -0.54 to 1.16, is not a valid interval for the effect size, because we haven't accounted for the uncertainty in estimating this Sp here in the denominator. Later on, we'll talk about an effective way for generating confidence intervals for nearly any statistic you can dream of, using bootstrapping. So finally, I just want to briefly mention what you do if you're not

willing to assume the variances are equal. Well, we can calculate the variance in the unequal case. Y bar - X bar is still, of course, normal. Its mean is still, of course, mu Y - mu X, but now its variance is changed. We can't factor out the sigma squared, so it works out to just be sigma squared X / nx + sigma squared Y / ny. And the statistic we'd like to use to generate our

confidence interval would be (Y bar - X bar) - (mu Y - mu X), divided by this

variance with the estimated variances plugged in, Sx^2 / nx, plus Sy^2 over ny,

all raised to the one-half power. Unfortunately, that doesn't exactly follow

a T distribution. But, there's a smart idea.

Right. And the idea was, well, it's maybe

not a T distribution. But people could simulate and kind of

figure out what its distribution looks like.

And they said, well, it looks an awful lot like a T distribution, but you know, we

can't seem to get the degrees of freedom exactly right to make it perfectly a T

distribution. And they said, well, why don't we find

like the best degrees of freedom to make this look like a T distribution.

And they said well, you know, we could have that degrees of freedom depend on the

data, on the variances for example. And we could have it be a fractional degrees of freedom, not have to be a whole number. And everyone said, that's great, that's a

great idea, why don't we do that? And, so they came up with this crazy

formula for the degrees of freedom by trying to figure out the best degrees of

freedom that makes it look like a T distribution.

And that's what you can do. You basically, evaluate the statistic with

the empirical variances plugged in, in the denominator.

And then act like it's T distributed with this kind of impossible-to-remember but easy-to-plug-into degrees of freedom formula.

And that's all you have to do. And that confidence interval works really,

really well. So our confidence interval is just going

to be Y bar - X bar plus or minus the appropriate T quantile with these crazy

degrees of freedom, 0.975 quantile for a 95% interval, times the estimated standard

error and then we have a T interval. So let's go through that really quickly

for our example. We're gonna compare our eight oral

contraceptive units to versus our 21 controls.

I just re-put all the numbers here just to remind you.

In this case if you directly plug SOC and SC into this formula from the previous

page, where you have, you know, Sx^2, in this case, would be Soc^2. And Sy^2 would be Sc^2. And nx would be noc, and ny would be nc.

Just plug those directly into that formula, and you get 15.04 degrees of

freedom. And that, if I plugged into the formula

correctly, Maybe everyone double checked me.

I think, with 10,000 people double checking me, we should get it right.

The T quantile for that turns out to be 2.13.

So you just construct the interval: difference in the means plus or minus the appropriate T quantile times the standard error, 15.34^2 / eight + 18.23^2 / 21, square

root the whole thing and you get this confidence interval right here.
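The unequal-variance calculation can be scripted too. A sketch in Python with the same example numbers; the 2.13 quantile is the value quoted above (qt(0.975, 15.04) in R):

```python
from math import sqrt

# Welch (unequal-variance) interval for the same example:
# 8 oral contraceptive users versus 21 controls.
n_oc, xbar_oc, s_oc = 8, 132.86, 15.34
n_c, xbar_c, s_c = 21, 127.44, 18.23

a = s_oc**2 / n_oc                 # Sx^2 / nx
b = s_c**2 / n_c                   # Sy^2 / ny

# The "crazy" approximate (fractional) degrees of freedom formula
df = (a + b)**2 / (a**2 / (n_oc - 1) + b**2 / (n_c - 1))

se = sqrt(a + b)                   # estimated standard error
q = 2.13                           # T quantile with df ~ 15.04, per the lecture

diff = xbar_oc - xbar_c
lo, hi = diff - q * se, diff + q * se
print(round(df, 2))                # 15.04
print(round(lo, 2), round(hi, 2))
```

As the lecture notes, the interpretation is the same as before: check whether zero falls inside the interval.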

So you interpret the confidence interval in the same way, you're obviously

interested in whether zero is in the confidence interval.

And if you want kind of a safe thing to do, you just would always do this interval

instead of assuming equal variance. Well, that was a quick lecture today.

And, I hope using the kind of thought process from this lecture and the previous

one, that you should be able to sort of create confidence intervals on a whim now.

If there's any case where you can figure out what the standard error is of a

statistic, then you'd more or less think, well, I'll get a confidence interval

by taking estimate plus or minus some quantile from some distribution times the

standard error. And maybe that quantile will usually

either be a T quantile or a standard normal quantile.

And I think you'll notice that the vast majority of confidence intervals that we

cover in this class and the vast majority of confidence intervals that you encounter

in practice will be exactly of this form. So I look forward to seeing you next time,

and next time we are going to have a light lecture.

We're going to talk about plotting.