Hi, my name is Brian Caffo. This is Mathematical Biostatistics Boot Camp, lecture ten, on t confidence intervals. In this lecture, we're going to go through group t intervals. In the last lecture we did t intervals for a single mean, or you could do those intervals for a group where the observations were paired. But now we're going to talk about instances where we have two independent groups. We'll briefly talk about a method that constructs a likelihood, and then we'll talk about what you do if you have unequal variances. Let me motivate the problem a little bit. Suppose that we want to compare the mean blood pressure between two groups in a randomized trial: those who received treatment versus those who received placebo. Unlike last week, where people would have had to have been matched, say, comparing the same person before and after receiving a treatment, these groups are entirely independent: the group that received the treatment and the group that received the placebo. So we can't use the same procedure; we can't take pairwise differences between measurements. In fact, they might have different sample sizes in the two groups, and then we definitely couldn't do it. So in this lecture, we're going to talk about ways of investigating the differences in the population means between groups when we have independent samples. But we'll see that the methodology works out to be very similar to what we did last week; the motivating ideas will be nearly identical. So let's go through some assumptions that we're going to use for our first variation of the t interval. Our first collection, X1 to X_nx, is a collection of IID normal random variables. They have some mean and they have some variance. And Y1 to Y_ny are IID normal, and they have a different mean but the same variance. Right now we're going to assume the variance between the two groups is the same. So we might think of X as the treated group and Y as the control group. Or X is one group, and Y is another group.
So let's let X bar, Y bar, Sx, and Sy be the means and standard deviations for the two groups. Our goal is to estimate, say, the difference mu X - mu Y, or of course you could do mu Y - mu X and look at the negative of the answer. We would like to estimate that, but we'd also like a confidence interval to quantify our uncertainty in estimating that parameter. So the obvious estimator of, say, mu Y - mu X is Y bar - X bar. I think everyone would agree that the interval needs to be centered at that point, or that that point has to be central in the construction of the interval. But we also need to figure out some way to create a confidence interval that incorporates our uncertainty. Well, let's think: can we do something along the lines of estimate plus or minus a t quantile times a standard error? Well, we want a standard error of this estimator Y bar - X bar. If you turn to the calculations, and I would hope that everyone in this class could do this calculation at this point, under the assumptions that we've made, the variance of Y bar - X bar works out to be sigma squared times (one / nx + one / ny). And there's a really good estimator of that quantity in this setting; in fact, it's the maximum likelihood estimator, or close to it. And that's the pooled variance estimate, S sub p squared. That works out to be (nx - one) Sx^2 + (ny - one) Sy^2, all divided by nx + ny - two, and this works out to be a good estimator of sigma squared. Let's talk about this estimator really quickly. If you take nx - one and divide it by nx + ny - two, you get a number that's between zero and one. And if you take ny - one and divide it by nx + ny - two, you get one minus that number. You can check the calculations to make sure that I'm right about that, but I'm right. So this estimator, S sub p squared, is nothing other than a weighted average of the two group variances, right? It's a weighted average of the variance for group X and the variance for group Y.
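To make the weighted-average point concrete, here is a minimal sketch in Python (the function name `pooled_variance` is mine, not from the lecture):

```python
def pooled_variance(sx2, nx, sy2, ny):
    """Pooled variance: a weighted average of the two group variances,
    with weights proportional to each group's degrees of freedom."""
    return ((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2)

# With equal sample sizes the weights are each one half, so the
# pooled estimate is just the arithmetic average of the two variances.
print(pooled_variance(4.0, 10, 6.0, 10))  # 5.0
```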
If nx and ny are equal, if you have the same sample size in both groups, then you can calculate that nx - one over nx + ny - two works out to be 0.5, in which case the pooled variance estimate works out to be the arithmetic average of the two variances. On the other hand, if group X contains a lot more data, right, nx - one is a lot larger than ny - one, then nx - one over this denominator is going to be much bigger, and you'll get a much bigger weight on Sx^2 than on Sy^2. And in that case, the weighted average does exactly what you would hope: it takes whichever of the two groups has more measurements associated with it and weights the variance estimate from that group more heavily, which is exactly what you would hope. There is more data, so that group's variance is going to be estimated a little bit better, so it makes sense that a good estimator would place more weight on it. And that's basically what this pooled variance estimate is. It's nothing other than an average; it's just a weighted average rather than an arithmetic average. Okay, so just to reiterate some of these points. The pooled estimator is a mixture of the group variances, placing bigger weight on whichever one has the larger sample size. If the sample sizes are the same, it's really easy: all you have to do is average the two variances. And the pooled estimate is unbiased. We can show that really quickly: if you take the expected value of S sub p squared, you just use the fact that both of the individual group variance estimators are unbiased, and then you wind up with the result. I'm not going to show this, because it's kind of complicated to do, but the pooled variance estimate turns out to be independent of Y bar - X bar. The reason is, if you can stomach this fact that I didn't show before, that X bar is independent of Sx and Y bar is independent of Sy.
Well, then X bar - Y bar is going to be independent of Sx and Sy, because all of the collections of things are independent. And then, because of that, it should be independent of any function of Sx and Sy, and S sub p squared is a function of Sx and Sy. So I'm not going to dwell on this point, but take it as given that Y bar - X bar is independent of the pooled variance estimate. And hopefully you can kind of get a sense of where I might be going with this calculation. What I'd like to do is create a t confidence interval. And remember, what did I need to create a t confidence interval? I needed to figure out a way to get a standard normal and divide it by the square root of a chi-squared divided by its degrees of freedom, an independent chi-squared. So I'm hoping that some function of the pooled variance will be chi-squared, and I just stated without proof that it'll be independent of the difference in sample means. Well, it turns out, you know, another fact that I'm not going to prove, but one that you can certainly take to the bank, is that the sum of independent chi-squared random variables is again chi-squared, and the degrees of freedom just add up. So let's take nx + ny - two times the pooled variance divided by sigma squared. Well, that works out to just be nx - one times the X group variance divided by sigma squared, plus ny - one times the Y group variance divided by sigma squared. And we know from before that the first term is chi-squared with nx - one degrees of freedom, and the second term is chi-squared with ny - one degrees of freedom. And so, if you believe my fact above, that the sum of two independent chi-squareds is again chi-squared with the degrees of freedom added, that would mean that when we add this chi-squared with nx - one degrees of freedom and this chi-squared with ny - one degrees of freedom, we get a chi-squared with nx + ny - two degrees of freedom.
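For reference, the chi-squared bookkeeping in this paragraph can be written out in one line, using the same symbols as the lecture:

```latex
\frac{(n_x + n_y - 2)\,S_p^2}{\sigma^2}
  = \underbrace{\frac{(n_x - 1)S_x^2}{\sigma^2}}_{\chi^2_{n_x - 1}}
  + \underbrace{\frac{(n_y - 1)S_y^2}{\sigma^2}}_{\chi^2_{n_y - 1}}
  \sim \chi^2_{n_x + n_y - 2}.
```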
And of course we're happy assuming that the two chi-squareds are independent, because the entire presumption of everything we're talking about is that the two groups we're looking at are independent. This is an independent-group analysis; we're assuming that group X and group Y are independent. Okay, so now we can construct our t statistic. We take Y bar - X bar, subtract off its mean, mu Y - mu X, and divide by its standard error, sigma times the square root of one / nx + one / ny. And then we divide the whole thing by the square root of nx + ny - two times S sub p squared over sigma squared, which is a chi-squared, divided by its degrees of freedom, nx + ny - two. So if you look at that, the top part is a standard normal: the original data for the two groups are Gaussian, so we know that the sample means are Gaussian, so we know the difference in the sample means is Gaussian. And if we take a Gaussian and subtract off its mean and divide by its standard deviation, we wind up with a standard normal. So the top is a standard normal. We're stating that the top is independent of the bottom. And the bottom, we know, is the square root of a chi-squared divided by its degrees of freedom. So the whole thing has to be a t random variable with nx + ny - two degrees of freedom. And then, if you collect terms and work with the arithmetic a little bit, you see that this left-hand side works out to be Y bar - X bar, minus (mu Y - mu X), the whole thing divided by S sub p times the square root of one / nx + one / ny, which is basically just the statistic we'd like to use: the observed difference in means minus the population difference in means divided by the standard error, but with sigma replaced by our data estimate of sigma, so sigma replaced by S sub p. Okay. So it's just like before, where we took the statistic that we normally had, replaced the unknown standard deviation with its estimate, and what would normally be a Gaussian random variable turns into a t random variable.
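Written out, the statistic being assembled here is the standard normal on top, the square root of the chi-squared over its degrees of freedom on the bottom, and the sigmas cancel:

```latex
\frac{\dfrac{(\bar Y - \bar X) - (\mu_Y - \mu_X)}{\sigma\sqrt{1/n_x + 1/n_y}}}
     {\sqrt{\dfrac{(n_x + n_y - 2)\,S_p^2/\sigma^2}{n_x + n_y - 2}}}
= \frac{(\bar Y - \bar X) - (\mu_Y - \mu_X)}{S_p\,\sqrt{1/n_x + 1/n_y}}
\sim t_{n_x + n_y - 2}.
```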
And again, notice the form of this: estimator minus true value divided by estimated standard error. And then I'm hoping that you can use the same logic from the previous lecture, in how we constructed that confidence interval, to just say, okay, well, the confidence interval for the difference in means just turns through these same calculations, and we get Y bar - X bar plus or minus the appropriate t quantile times the standard error. Let me repeat that: the estimate plus or minus the appropriate t quantile times the standard error. So again, the interval works out to be the estimate plus or minus the appropriate quantile from the appropriate distribution times the standard error. Okay. Remember, a big assumption in this is that there are equal variances in the groups, and we'll talk about that later. I guess one thing I'd like to mention about the equal variance assumption now, while we're on it, is that there are actual tests for equality of variances between independent groups; they work out to be F tests. But those tests are kind of notoriously bad. So some textbooks will suggest testing equality of the variances: if the variances are equal, do this confidence interval, and if they're unequal, do the confidence interval that we're going to talk about in a slide or two. But I don't like that procedure at all. I think you should look at graphs, look at the data, and make assessments as to whether or not the variances are equal or unequal, and use that to decide. If you really must estimate the ratio of the variances in the groups, then using bootstrapping would be my suggested technique for doing it, unless maybe the sample sizes are very small. But another safe thing to do is just always assume that the variances are unequal. So if you're worried about this assumption, you just always do the conservative thing and assume that the variances are unequal.
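Putting the pieces together, here is a minimal sketch of the equal-variance interval (the function name is mine; the t quantile is passed in rather than computed, so no distribution library is needed):

```python
import math

def t_interval_equal_var(ybar, xbar, sy2, ny, sx2, nx, t_quantile):
    """Equal-variance t interval: (Y bar - X bar) plus or minus the
    t quantile times the standard error built from the pooled variance."""
    sp2 = ((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2)
    se = math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    est = ybar - xbar
    return est - t_quantile * se, est + t_quantile * se

# Made-up numbers: two groups of 16 with variance 4, means 12 and 10;
# 2.042 is roughly the 0.975 t quantile with 30 degrees of freedom.
lo, hi = t_interval_equal_var(12.0, 10.0, 4.0, 16, 4.0, 16, 2.042)
print(round(lo, 2), round(hi, 2))  # 0.56 3.44
```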
It's maybe a little above the level of discussion in this class how to get a likelihood for mu Y - mu X. But it turns out that getting a likelihood for mu Y - mu X divided by sigma, which is still a single parameter, is very easy. And the reason is that this statistic, Y bar - X bar divided by its standard error, follows a distribution which is called a non-central t distribution, and the non-centrality parameter depends on mu Y - mu X over sigma and then something involving the n's that we know. And so you can use this fact to create a likelihood, not a profile likelihood or anything like that, just an honest-to-goodness likelihood, for mu Y - mu X over sigma. I should say what mu Y - mu X over sigma is. I think of it as kind of an effect-size type of measurement: it's the difference in the means standardized relative to the intra-group standard deviation. So it's the difference in the means in standard deviation units, which is a very useful thing if you want to calibrate your difference in the means across studies. Right? If you want to say, oh, well, this difference in the means is kind of big, well, what does big mean? In one case it's measured in inches, in other cases it's measured in tons, and in other cases it's measured in centimeters. So if you look at different experiments, the units are all different, the context is very different, and it's impossible to compare, say, mu Y - mu X across experiments. But there is some hope for comparing mu Y - mu X over sigma across experiments, because you've gotten rid of the units, and everything is expressed in intra-group standard deviation units. So it's a meaningful parameter, and it's easy to create a likelihood for it. So I'll show you how to do it, but I'd say this is not a tremendously common technique. On the other hand, the comparison of the two groups using a standard t confidence interval with the pooled variance is an extremely common technique.
So let's go through an example of actually constructing this interval. And by the way, this is just a special case of what's called ANOVA estimation, just where you happen to have two groups. Rosner has this great book called Fundamentals of Biostatistics, and I got this example from page 304 of his textbook. He looked at an example comparing systolic blood pressure for eight oral contraceptive users versus 21 controls. There was some concern, I guess, in the study over whether or not oral contraceptive use increased systolic blood pressure, measured in millimeters of mercury. So the X bar for the oral contraceptive users was 132.86, the standard deviation for the oral contraceptive users was 15.34, the mean systolic blood pressure for the control subjects was 127.44, and so on. So the pooled variance estimate takes the weighted average of the two variances, and you'll see the formula right here; it works out to be 307.18. One of the easiest mistakes to make, if you happen to be doing these calculations by hand, is to forget and pool the standard deviations instead of the variances. You should pool the variances, not the standard deviations. So if I took out these squares, I'd get the wrong answer; I would get a pooled standard deviation instead of a pooled variance. And you'd really mess up if you treated that number as if it were a variance, because it would be on the wrong order of magnitude. So the biggest mistake you can make is to not square those things. One way to check when you do this, by the way, is that it's an average, right? So in this case it's going to be an average of 15.34^2 and 18.23^2, so it has to be between those two numbers, 15.34^2 and 18.23^2. And then, when you square root it, that number has to be between 15.34 and 18.23. If that hasn't happened, you've really screwed up. Okay, so we've got our pooled variance, and if we square root that, we get our pooled standard deviation.
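A quick sketch of the example's arithmetic, including that sanity check (the variable names are mine; the control standard deviation of 18.23 appears on the slide):

```python
import math

# Rosner's example: pool the VARIANCES, not the standard deviations.
n_oc, s_oc = 8, 15.34    # oral contraceptive users
n_c, s_c = 21, 18.23     # controls

sp2 = ((n_oc - 1) * s_oc**2 + (n_c - 1) * s_c**2) / (n_oc + n_c - 2)
sp = math.sqrt(sp2)

# Sanity check from the lecture: the pooled SD must land between the
# two group standard deviations, since it is a weighted average.
assert min(s_oc, s_c) < sp < max(s_oc, s_c)
print(round(sp2, 2), round(sp, 2))  # 307.18 17.53
```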
And then we need the appropriate t quantile. We need the t quantile for 97.5 if we want a 95% confidence interval, and we need 27 degrees of freedom, which is, if I'm doing my arithmetic correctly, eight + 21 - two. In R, you can just get this number as qt, q standing for quantile and t standing for the t distribution: qt(0.975, df = 27), because we want the 97.5th percentile with 27 degrees of freedom, and it'll just return 2.052, maybe plus some other decimal places. And so the interval is 132.86 - 127.44, plus or minus our t quantile times our pooled standard deviation times the square root of one / eight + one / 21. This works out to be -9.52 to 20.36. One of the most important things to look for, because it's a difference in means, is whether or not the interval contains zero, right? Because if the interval contains zero, then that would say a reasonable estimate for the difference in blood pressures between the two groups is that they're identical. And in this case, it does contain zero. So, you know, there's evidence to say that there is no difference; in other words, oral contraceptive use doesn't appear to be presenting evidence of an associated increase in blood pressure. It turns out that evaluating whether or not zero is in this interval is equivalent to a two-sided hypothesis test, and we'll talk about hypothesis tests later on. So you have to be careful in how you interpret hypothesis tests. Right now, we might as well just say -9.5 to 20.4 is a reasonable range, accounting for uncertainty in the measurements, for comparing the average systolic blood pressure between oral contraceptive users and controls. Now, by the way, another thing to keep track of whenever you create these intervals is what order you've subtracted things in, right? In this case, we did contraceptive users minus controls. I think you should pick a rule and stick with it, you know? So I always use, say, treated minus control. And it doesn't matter.
Of course, you just get the negative of the interval if you do it in the other direction. But in interpreting it, let's say this interval was entirely above zero. Then you would be saying that oral contraceptive users had an estimated higher systolic blood pressure than controls. But if you forgot and thought you had subtracted controls minus oral contraceptive users, you would get the opposite interpretation. So, at any rate, my point is just to remember what order you subtract things in, because it's an easy mistake to make. Here on the next slide, I just have a likelihood plot for the effect size using the non-central t distribution. And I got a rough idea of the range of values to plot, by the way, as follows: I took my confidence interval, -9.52 to 20.36, and I just divided it by the pooled standard deviation, and that gives me about -0.54 to 1.16. And so, you know, I plotted, I think, from -1.5 to positive 1.5. So I got at least a rough idea of the range of things to plot from looking at the interval. This, by the way, -0.54 to 1.16, is not a valid interval for the effect size, because we haven't accounted for the uncertainty in estimating this S sub p here in the denominator. Later on, we'll talk about an effective way of generating confidence intervals for nearly any statistic you can dream of, using bootstrapping. So finally, I just want to briefly mention what you do if you're not willing to assume the variances are equal. Well, we can calculate the variance in the unequal case. Y bar - X bar is still, of course, normal. Its mean is still, of course, mu Y - mu X, but now its variance is changed. We can't factor out the sigma squared, so it works out to be sigma squared X / nx + sigma squared Y / ny. And the statistic we'd like to use to generate our confidence interval would be Y bar - X bar, minus (mu Y - mu X), divided by this standard error with the estimated variances plugged in: Sx^2 / nx + Sy^2 / ny, all raised to the one-half power.
Unfortunately, that doesn't exactly follow a t distribution. But there's a smart idea, right? The idea was, well, it's maybe not a t distribution, but people could simulate and kind of figure out what its distribution looks like. And they said, well, it looks an awful lot like a t distribution, but we can't seem to get the degrees of freedom exactly right to make it perfectly a t distribution. And they said, well, why don't we find, like, the best degrees of freedom to make this look like a t distribution? And they said, well, you know, we could have that degrees of freedom depend on the data, on the variances, for example, and we could have it be fractional, not have to be a whole number. And everyone said, that's great, that's a great idea, why don't we do that? And so they came up with this crazy formula for the degrees of freedom by trying to figure out the best degrees of freedom that makes it look like a t distribution. And that's what you do. You basically evaluate the statistic with the empirical variances plugged in in the denominator, and then act like it's t distributed with this kind of impossible-to-remember but easy-to-plug-into degrees of freedom formula. And that's all you have to do. And that confidence interval works really, really well. So our confidence interval is just going to be Y bar - X bar plus or minus the appropriate t quantile with these crazy degrees of freedom, the 0.975 quantile for a 95% interval, times the estimated standard error, and then we have a t interval. So let's go through that really quickly for our example. We're going to compare our eight oral contraceptive users versus our 21 controls. I just re-put all the numbers here just to remind you. In this case, you directly plug S_OC and S_C into the formula from the previous page, where Sx^2 would be S_OC^2 and Sy^2 would be S_C^2, and nx would be n_OC and ny would be n_C.
Just plug those directly into that formula, and you get 15.04 degrees of freedom. That is, if I plugged into the formula correctly; maybe everyone can double-check me. I think, with 10,000 people double-checking me, we should get it right. The t quantile for that turns out to be 2.13. So you just construct the interval: the difference in the means plus or minus the appropriate t quantile times the standard error, 15.34^2 / eight + 18.23^2 / 21, square root the whole thing, and you get this confidence interval right here. So you interpret the confidence interval in the same way; you're obviously interested in whether zero is in the confidence interval. And if you want kind of a safe thing to do, you would just always do this interval instead of assuming equal variances. Well, that was a quick lecture today. I hope, using the kind of thought process from this lecture and the previous one, that you should be able to sort of create confidence intervals at a whim now. If there's any case where you can figure out what the standard error of a statistic is, then you'd more or less think, well, I'll get a confidence interval by taking the estimate plus or minus some quantile from some distribution times the standard error. And that quantile will usually be either a t quantile or a standard normal quantile. And I think you'll notice that the vast majority of confidence intervals that we cover in this class, and the vast majority of confidence intervals that you encounter in practice, will be exactly of this form. So I look forward to seeing you next time, and next time we are going to have a light lecture. We're going to talk about plotting.
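A minimal sketch of that "crazy" degrees of freedom formula, the Welch-Satterthwaite approximation, applied to the same example (function name is mine; the 2.13 quantile is the value quoted in the lecture rather than computed):

```python
import math

def welch_df(sx2, nx, sy2, ny):
    """Welch-Satterthwaite approximate degrees of freedom for the
    unequal-variance two-sample t statistic."""
    num = (sx2 / nx + sy2 / ny) ** 2
    den = (sx2 / nx) ** 2 / (nx - 1) + (sy2 / ny) ** 2 / (ny - 1)
    return num / den

# Rosner's example again: 8 OC users (sd 15.34) vs 21 controls (sd 18.23).
df = welch_df(15.34**2, 8, 18.23**2, 21)
print(round(df, 2))  # 15.04

# Unequal-variance interval with the lecture's 2.13 quantile.
est = 132.86 - 127.44
se = math.sqrt(15.34**2 / 8 + 18.23**2 / 21)
print(round(est - 2.13 * se, 2), round(est + 2.13 * se, 2))  # -8.91 19.75
```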