So let's look at some additional examples to highlight some of the ideas from lecture set eight. First, I want to go back to an example we used, but talk about it in a little further detail. This was the example of a paired comparison: the pre-post study on ten women and their blood pressures before and after using oral contraceptives for three months. And again, the goal of this study was to see what, if any, changes in average blood pressure were associated with oral contraceptive use in women 16 to 49 years old. So you may recall the data looked like this. We had a sample of the women's blood pressures before they started taking the oral contraceptives, and a sample of blood pressures on oral contraceptives for the same ten women. So we had two separate samples, but both consisted of the same ten women. What we were able to do was take these two samples of data and convert them to a single sample of differences, and we ultimately summarized that sample of differences by taking its mean of 4.8 millimeters of mercury. The standard deviation of these ten differences, measuring the variation in the ten individual changes, was 4.6 millimeters of mercury. We could have summarized the before and after samples separately as well. If we did that and took the means of both, the mean blood pressure before starting oral contraceptives was 115.6 millimeters of mercury, and after three months of oral contraceptives, it was 120.4 millimeters of mercury. So if we took the after mean, 120.4, and subtracted the before mean of 115.6, we would get the 4.8 millimeters of mercury that we got by first taking the differences and averaging them. So it doesn't matter what order we do it in: we can take the means of the two samples and take their difference, or we can take the differences and average them, and we get the same average difference. However, with the standard deviations it's not the same story.
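To make the order-of-operations point concrete, here is a minimal sketch using hypothetical before/after values chosen to be consistent with the quoted summaries (before mean 115.6, after mean 120.4, mean difference 4.8); the actual study measurements are not reproduced in this transcript.

```python
# Hypothetical paired blood pressure data (mmHg), chosen to match the
# quoted summary statistics -- not the actual study measurements.
before = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
after = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

def mean(xs):
    return sum(xs) / len(xs)

# One sample of within-person differences (after minus before).
diffs = [a - b for a, b in zip(after, before)]

# Averaging the differences and differencing the averages give the
# same mean difference, so the order of operations doesn't matter.
print(mean(diffs))                            # 4.8
print(round(mean(after) - mean(before), 1))   # 4.8
```

The same interchange does not work for the standard deviations, which is exactly the point the lecture makes next.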
Here, the standard deviation of the blood pressures measured before taking oral contraceptives was 10.3 millimeters of mercury, and there is a corresponding standard deviation for the measurements taken after. And there's no easy way to combine these two standard deviations, before and after, to figure out what the resulting standard deviation of the paired differences is. So when we computed the standard error for the estimated mean difference, we used the paired approach, where we looked at that sample of differences and based our standard error on the standard deviation of the differences and the number of pairs in our sample, ten. So we took the standard deviation of those ten differences, 4.6 millimeters of mercury, and divided it by the square root of the sample size, the square root of 10. This estimated standard error is 1.5 millimeters of mercury. What if we had ignored the pairing and treated this as an unpaired study, and computed our standard error for that mean difference, which turns out to be the same value whether we do the after mean minus the before mean or average the individual differences? We showed that before. But what if we treated this as unpaired and computed the standard error of this mean difference using the formula for unpaired differences, where we use the information on the variability in each of the two samples? If we did this, we would get a much, much larger estimate for the standard error than we did when we recognized the pairing. What's going on here is that if we ignore the pairing, we're potentially double counting a lot of the uncertainty in the sample measurements if they are connected and correlated, especially if they are strongly correlated. And that's what we have in these data. So I'll show you on the next slide a graphic called a scatter plot, and we'll be working with these more in Public Health Statistics Two. Each one of the points I've plotted represents one person.
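The two standard error computations can be sketched directly from the quoted summaries. Since the transcript does not give the standard deviation of the after measurements, a hypothetical value is used below for the unpaired formula; only the paired result of roughly 1.5 mmHg comes from the quoted numbers.

```python
import math

n = 10
sd_diff = 4.6     # SD of the ten paired differences (from the transcript)
sd_before = 10.3  # SD of the before measurements (from the transcript)
sd_after = 13.0   # hypothetical: the after SD is not given in the transcript

# Paired approach: standard error based on the single sample of differences.
se_paired = sd_diff / math.sqrt(n)

# Unpaired approach: combine the variability of the two samples
# separately, ignoring the correlation between before and after.
se_unpaired = math.sqrt(sd_before**2 / n + sd_after**2 / n)

print(round(se_paired, 1))      # 1.5
print(se_unpaired > se_paired)  # True: ignoring the pairing inflates the SE
```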
What I've plotted on the x-axis is each woman's before measurement, before oral contraceptives, and on the y-axis, her after measurement. You can see that these track pretty closely. Women who had lower blood pressure measurements before using oral contraceptives, lower than the rest, tended to have lower measurements after using oral contraceptives compared to the other women, and women with higher measurements before tended to have higher measurements after. So there's a lot of shared information in these two measurements and in their variability. If we ignore the fact that some of the variation we're seeing after is the same variation as before, and fail to back that out, we end up double counting a lot of uncertainty and overestimating, in this case by a large amount, the standard error of the mean difference. And so we get a very different confidence interval if we take the mean difference of 4.8 and add and subtract 2 times 1.5 millimeters of mercury, as we did for the paired analysis, versus taking that 4.8 plus or minus 2 times 5.9 millimeters of mercury. The interval would be a lot wider if we incorrectly did not account for the pairing. So that's just an FYI: the more correlated these paired measurements are, the bigger the discrepancy between these two standard error estimates. If there were really no relationship between these women's measurements after oral contraceptives and before, these two standard error computations would be a lot more similar. That's just a heads up on what would happen if we ignored the pairing in a paired study. Let's look at another example of unpaired comparisons, where we have three groups and designate one as the reference group for our comparisons. This is a neat study, a behavioral sort of intervention, where they randomized 303 participants who came to a restaurant for a study dinner.
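The double counting can be made explicit: the variance of a difference equals the sum of the two variances minus twice their covariance, so a strong positive correlation shrinks the paired standard error well below the unpaired one. A sketch with hypothetical data consistent with the quoted summaries:

```python
import math

# Hypothetical before/after data consistent with the quoted summaries.
before = [115, 112, 107, 119, 115, 138, 126, 105, 104, 115]
after = [128, 115, 106, 128, 122, 145, 132, 109, 102, 117]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

diffs = [a - b for a, b in zip(after, before)]

# Var(after - before) = Var(after) + Var(before) - 2*Cov(after, before):
# the covariance term is what the unpaired formula fails to subtract.
lhs = var(diffs)
rhs = var(after) + var(before) - 2 * cov(after, before)
assert abs(lhs - rhs) < 1e-9

# The correlation between before and after measurements is strong here,
# so the paired SE is much smaller than the unpaired SE would be.
r = cov(after, before) / math.sqrt(var(after) * var(before))
print(round(r, 2))  # 0.95
```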
They were randomly assigned to one of three menu types: a standard menu without any calorie labeling for the different menu items, a menu that included calorie labels for the various menu items, and a menu that not only had the calorie labels for the various menu items, but also information about the recommended daily caloric intake for an average adult. So calorie labels plus information. And then food choices and intake during and after the study dinner were measured. Ultimately, what they found when they looked at calories consumed during and after the study dinner combined was that participants in the group that got the menus with calorie labels plus information about recommended daily intake consumed, on average, 250 fewer calories than those in the other two groups. Here's a neat bar chart which shows the differences in the calories ordered in the three different menu groups, the calories consumed at the meal, and the calories consumed after the meal. So this is interesting. Just looking at this, you can see that the darkly shaded bar is the group that got nothing in their menus, no calorie labels; the mid-shade is those who got calorie labels only; and the unshaded is those who got calorie labels plus information. These are just the summary statistics, the average number of calories ordered by each of the three groups. You can see that similar amounts were ordered by both groups who got at least the calorie labels, and these were less than what was ordered by the group that got nothing in their menus. But by the time we get to the actual amount of calories consumed, during and after the meal, the group that got no information and the group that got only calorie labels were neck and neck for the average consumption, and it was the group that got the calorie labels plus the information that had a notably lower mean consumption.
So if we make the group that got no labels and no information in their menus the reference group, and compute the mean differences in calories consumed during and after the meal for the other two groups: for those who got just calorie labels compared to no labels, the difference was -5.0, that is, 1,625 calories on average for the calorie-labels group compared to 1,630 calories on average for the no-labels group. And if I compute the confidence interval for that, based on the data in the study, it ranges from 226.7 fewer calories on average for those who got the labels to over 216 calories more on average. So we have a difference that's relatively close to zero and a confidence interval that spans negative and positive possibilities for the mean difference, and it includes zero. So functionally, from a statistical perspective, there's no difference in the average calories consumed between these groups, and scientifically, that difference of five calories on average is not particularly meaningful even if it were real. However, when we look at the calorie-labels-plus-information group compared to the same reference of no labels, this is the 250-calorie difference they were speaking of in the abstract: 250 fewer calories on average. The confidence interval goes from -454.7 calories on average to -45.3; it's rather wide, but this interval does not include zero. But anyway, again, in order to make these comparisons with the confidence intervals, we can designate one of the groups as the reference and compute the differences for the other groups relative to that same reference. So let's look at another study, an example with binary outcomes that we didn't look at in this lecture set. This was a randomized trial of low-dose aspirin supplementation given to women 45 years or older.
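The reference-group bookkeeping can be sketched as follows. The no-labels mean of 1,630 and the labels-only mean of 1,625 are quoted above; the labels-plus-information mean of 1,380 is inferred from the quoted 250-calorie difference rather than taken directly from the study.

```python
# Mean calories consumed (during plus after the meal) by menu group.
# The labels_plus_info mean is inferred from the quoted -250 difference.
means = {
    "no_labels": 1630.0,
    "labels_only": 1625.0,
    "labels_plus_info": 1380.0,
}

# Designate one group as the reference and compute each other group's
# mean difference relative to that same reference.
reference = "no_labels"
diffs = {
    group: m - means[reference]
    for group, m in means.items()
    if group != reference
}

print(diffs["labels_only"])       # -5.0
print(diffs["labels_plus_info"])  # -250.0
```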
This was done because another study had shown benefits for men in the same age group in terms of lowering the risk of cardiovascular disease, and they wanted to see if that was replicated in women. So they randomly assigned nearly 40,000 initially healthy women, 45 years or older, who had not had any cardiovascular events. The women were randomized to receive either 100 milligrams of aspirin on alternate days or a placebo, and then were monitored for 10 years for a first major cardiovascular event. Here are the results: among the 19,934 women randomized to the aspirin group, there were 477 cases of cardiovascular disease that developed in the follow-up period, 2.4% of the women in that group, compared with 2.6% of the women who did not receive aspirin. So there's a slightly lower risk of developing cardiovascular disease over ten years of follow-up in the group that received aspirin. And when we computed the summary statistics and confidence limits on them, regardless of whether we looked at the risk difference, the relative risk, or the odds ratio, all three agreed in terms of the direction of the difference if we did the comparison in the direction of aspirin compared to placebo. But they also agreed in terms of statistical significance. If you look at all three of these confidence intervals, they include their respective null values. For the difference, the null value is 0, and that's included in its interval. For the ratios, the respective null value is 1, and that's included in both of those intervals, as we would expect, for the relative risk and for the odds ratio. So ultimately, the scientific conclusion was that it's not worth advising women to supplement their diets with aspirin, because it has no real effect after accounting for sampling variation, and there are certainly minor but nevertheless real side effects of taking aspirin on a long-term basis.
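All three binary-outcome summaries can be computed from the two-by-two counts. The 477 events among the 19,934 aspirin-group women are quoted above; the placebo-group counts below are hypothetical values chosen only to be consistent with the quoted 2.6%.

```python
import math

# Aspirin group: events / total (from the transcript).
a, n1 = 477, 19934
# Placebo group: hypothetical counts consistent with the quoted 2.6%.
c, n2 = 522, 19942

p1, p2 = a / n1, c / n2

risk_difference = p1 - p2                          # about -0.002
relative_risk = p1 / p2                            # about 0.91
odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))     # close to the RR here

# 95% CI for the relative risk, computed on the log scale and
# exponentiated back (using plus/minus 2 standard errors).
se_log_rr = math.sqrt((1 - p1) / a + (1 - p2) / c)
lo = math.exp(math.log(relative_risk) - 2 * se_log_rr)
hi = math.exp(math.log(relative_risk) + 2 * se_log_rr)

# The interval includes the null value of 1, consistent with the
# study's conclusion of no statistically significant effect.
print(lo < 1 < hi)  # True
```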
Finally, I want to talk about, and just remind you of, issues related to doing things on the log scale versus the ratio scale, by looking at what happens to the confidence intervals, above and beyond the estimated measure of association, when we change the direction of comparison for binary outcomes. We'll use our favorite seminal study on HIV mother-to-infant transmission, where mothers were randomized to receive placebo or AZT during pregnancy, and we saw substantially fewer transmissions to children whose mothers received AZT. So if we compare things in the direction of AZT to placebo on the ratio scale, like we did in the previous section of this lecture set, we get the following. You may recall the estimated relative risk of transmission for children of mothers who were given AZT compared to placebo was 0.32, a substantial reduction, and the log of that estimated relative risk was -1.14. The odds ratio was similarly striking: the relative odds of transmission for children whose mothers were given AZT was 0.27, which on the log scale is -1.31. What if we had instead compared in the other direction, not AZT compared to placebo, but placebo compared to AZT? If we had done that, the relative risk and odds ratio would look substantially different from their counterparts in the other direction. But you'll note that on the log scale, the log values are equivalent in absolute value and just differ in sign. So if we flip the direction of the relative risk to placebo compared to AZT, it's 3.1 now, but the log of this relative risk is 1.14, which is just the opposite of the log relative risk in the other direction. And the same thing goes for the odds ratio when we flip the direction of comparison. It doesn't matter which direction of comparison we choose: when we get to the log scale, the standard error will be the same.
In terms of the setup, the only difference in the two-by-two table, if we wanted to compare placebo to AZT, is that we'd switch the order of the columns. But you'll note that doing so changes nothing about the values that go into the standard error computations for either the log relative risk or the log odds ratio; we still get the same pieces. So the standard error is invariant to the direction of comparison: it's the same whether we go AZT to placebo or placebo to AZT. So, ultimately, let's see what happens when we compare the resulting measures of association in both directions. With the risk difference, comparing AZT to placebo, we've seen that was a reduction of 0.15, or 15 percentage points, with a confidence interval of -0.22 to -0.08. We've already talked about that. If we did that in the opposite direction, placebo compared to AZT, these would just change signs: it would be a risk difference of positive 0.15, and the confidence interval would go from 0.08 to 0.22. So it's the exact same conclusion, but the presentation is in terms of the opposite values. For the relative risk and odds ratio, the same holds for the log confidence intervals: the log confidence intervals are simply the opposites of each other depending on the direction of comparison, as are the estimated log relative risks. Going in the direction of AZT to placebo, the log relative risk was -1.14, with a confidence interval of -1.74 to -0.54. Going the other direction, it's exactly the opposite: the absolute values are the same, and only the signs differ. But when we exponentiate those back, we get very different-looking results on the ratio scale. And on the ratio scale, this is what I was pointing out: the precision of these two associated relative risks looks very different, because the confidence interval is a lot wider going in the direction of placebo to AZT than it is for AZT to placebo.
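The direction-flip bookkeeping can be sketched with hypothetical two-by-two counts (the transcript quotes the summary measures but not the underlying cell counts):

```python
import math

# Hypothetical transmission counts for illustration: events / total per arm.
azt_events, azt_n = 13, 180
plc_events, plc_n = 40, 183

def log_rr_and_se(e1, n1, e2, n2):
    """Log relative risk (group 1 vs group 2) and its standard error."""
    p1, p2 = e1 / n1, e2 / n2
    log_rr = math.log(p1 / p2)
    se = math.sqrt((1 - p1) / e1 + (1 - p2) / e2)
    return log_rr, se

fwd, se_fwd = log_rr_and_se(azt_events, azt_n, plc_events, plc_n)
rev, se_rev = log_rr_and_se(plc_events, plc_n, azt_events, azt_n)

# Flipping the direction of comparison just flips the sign of the
# log estimate...
assert abs(fwd + rev) < 1e-9
# ...and leaves the standard error unchanged, so the log-scale
# confidence intervals are mirror images of equal width.
assert abs(se_fwd - se_rev) < 1e-9

# Exponentiating back, the reversed interval looks much wider on the
# ratio scale, even though the two carry identical information.
ci_fwd = (math.exp(fwd - 2 * se_fwd), math.exp(fwd + 2 * se_fwd))
ci_rev = (math.exp(rev - 2 * se_rev), math.exp(rev + 2 * se_rev))
print(ci_rev[1] - ci_rev[0] > ci_fwd[1] - ci_fwd[0])  # True
```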
But that's again because of the unequal ranges of possible values for positive versus negative associations on the ratio scale. You'll see that on the log scale, these confidence intervals are of the exact same width, and that also holds for the odds ratio. So again, logging things when we have ratios equalizes the ranges of possible values for associations where the numerator is greater than the denominator and vice versa, and our results are comparable in terms of their magnitude and interval width on the log scale, regardless of the direction we present them in. This is why these are sometimes graphically presented on a log scale that's labeled with the corresponding ratio values, so that the intervals are comparable in terms of their precision. So hopefully this fun walk down memory lane through the material in lecture set eight was helpful in reconsidering some of the ideas we've discussed.