In this final section of week four, we introduce the concept of confidence intervals. So this is all revolving around estimation. Previously, we've looked at what we would refer to as point estimation, i.e., we have a parameter of interest, the population mean µ, whose value is unknown because we don't observe the entire population. But from that population, let's suppose we select a simple random sample of size N, hence free of selection bias, and from the data which result, we calculate the sample mean x-bar. Now, the value of that x-bar represents our point estimate of µ, i.e., it's a numerical value, a best guess, if you will. But, of course, we know that that best guess has some uncertainty attached to it. From our look at the theoretical aspects of sampling distributions, all we know is that the expectation of x-bar is equal to µ. So, on average, our sample mean is correct, but the particular point estimate we happen to obtain may or may not be equal to the truth. So what we need to do now is convert our point estimate, that single numerical value, into an interval estimate. Think of this interval as a range of likely or plausible values for the parameter. An alternative name for these interval estimates is confidence intervals. So, on the screen, you will see a couple of formulae which correspond to how we would calculate a confidence interval for the population mean. Now, there are two versions. They simply differ in terms of whether or not we know the true value of the population standard deviation. One might imagine, if you do not know the value of the population mean µ, why would you know the value of another population parameter, i.e., sigma, the population standard deviation? Unlikely. Nonetheless, we may wish to assume a value for sigma, perhaps based on previous studies, judgment or any other reasonable assumption that you might make. 
So, these two formulae represent how we would construct the endpoints of our confidence interval. At the big-picture level, think of it this way: we have our best guess x-bar, a point estimate, plus or minus a margin of error. And hence, the size of that margin of error determines how wide these confidence intervals are. So you'll see that these margins of error consist of three components: in the first, Z, sigma and N; in the second, Z, S and N. Now, let's deal with that sigma and S distinction first of all. Well, we've already introduced in our basic descriptive statistics the sample standard deviation S. So, if we didn't know the value of sigma, we might think about using the sample standard deviation S, which of course we can easily calculate using our sample data, as our point estimate for sigma. So either we will assume a value for sigma or simply use the sample standard deviation S to estimate its true value. Now, for statistical reasons I don't really want to get heavily into, we'll consider these two formulae as being equivalent to each other provided we have large sample sizes. Now, what do we mean by a large sample size? Well, that in itself is somewhat of a subjective question, but let's say for our purposes that sample sizes of 50 or more are deemed to be sufficiently large. So these margins of error correspond to those three components: Z, S or sigma, and N. Now, let's deal with the N first of all, the sample size. We see that N is in the denominator of these margins of error, such that, as the sample size increases, these margins of error become smaller and hence the width of these confidence intervals reduces. Now, of course, other things equal, it is desirable to have a narrower confidence interval rather than a very wide one. You'd like to be able to say you're estimating your parameter to within a small margin of error rather than a much larger one. So we start to see trade-offs coming into play. 
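As a concrete illustration of the second formula, x-bar plus or minus Z times S over the square root of N, here is a minimal Python sketch. The sample data are purely hypothetical, invented for illustration, not taken from the lecture:

```python
import math
import statistics

# Hypothetical sample data, invented purely to illustrate the calculation
sample = [12.1, 9.8, 11.4, 10.2, 10.9, 11.7, 9.5, 10.8, 11.1, 10.4]

n = len(sample)
x_bar = statistics.mean(sample)   # point estimate of mu
s = statistics.stdev(sample)      # sample standard deviation S, estimating sigma

# Z for 95% confidence is the 97.5th percentile of the standard normal
z = statistics.NormalDist().inv_cdf(0.975)

margin_of_error = z * s / math.sqrt(n)
lower, upper = x_bar - margin_of_error, x_bar + margin_of_error
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

With Z fixed at roughly 1.96 for 95% confidence, the two endpoints are simply the sample mean shifted down and up by the margin of error.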
Namely, if we decide to invest more time, money and effort, which is a cost to us, to get a larger sample size, that cost is offset by the benefit of having a narrower confidence interval. So as you decide to increase your sample size N, the width of your confidence interval will contract. Of course, N is under our control. We decide how big or small we wish our sample to be. Now, let's deal with sigma, or its estimate S. This reflects the amount of variation which exists in the wider population. Now, of course, unlike N, we have no direct control over this. If we have a population which is very heterogeneous, i.e., there's lots of variation within it, then as sigma, and most likely its estimate S, become larger, the width of our confidence interval becomes wider, reflecting the greater uncertainty in just how representative our sample may be, and hence how precise our point estimate x-bar is likely to be for µ. So we see, as the variation within the population increases, other things equal, the width of our confidence interval becomes wider. And hence, we now come on to that third component, the Z value. So, when we refer to a confidence interval, we will associate with it a particular level of confidence. Now, what might be a sensible level of confidence? Well, we can have any level we like, but nonetheless, there are some conventions and values which are typically used. The most frequently used would be a so-called 95% confidence interval. Of course, we can deviate from that if we so choose, and other fairly common conventions might be a lower level of confidence of just 90% or a higher level of confidence of perhaps as high as 99%. So how does this relate to the Z value? Well, on the screen, you will see that the level of confidence we choose will affect the Z value which is used in these endpoint calculations. Now again, think about trade-offs. 
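To see the sample-size effect numerically, here is a small sketch; the sigma of 10 is an assumed value chosen purely for illustration:

```python
import math
import statistics

z = statistics.NormalDist().inv_cdf(0.975)   # Z for 95% confidence
sigma = 10.0                                  # assumed population standard deviation

# margin of error Z * sigma / sqrt(N) for increasing sample sizes
moes = {n: z * sigma / math.sqrt(n) for n in (50, 200, 800)}
for n, moe in moes.items():
    print(f"N = {n:3d}: margin of error = {moe:.3f}")
```

Because N sits under a square root, quadrupling the sample size only halves the margin of error, a diminishing return worth bearing in mind when weighing sampling cost against precision.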
Other things equal, I'd rather be 99% confident when estimating something than, perhaps, only 95% confident. But whenever there's a benefit, there's always some offsetting cost, and the question is, does the benefit justify the cost? Here, we're not looking at the cost in any sort of financial terms but rather in terms of the width of the confidence interval. So, other things equal, if you desire a higher level of confidence, then that equates to a larger Z value, which leads to a wider confidence interval due to the larger margin of error. So, like N, Z is also under our control, because we can dictate what level of confidence we deem to be appropriate. So we note the three parameters which affect the margin of error and hence the confidence interval width. Now, perhaps I'd just like to say a little bit more about the correct interpretation of a confidence interval and what this confidence really represents. So imagine we take one simple random sample of size N from, let's say, a normal distribution. In our previous section, when we looked at the sampling distribution of x-bar when sampling from a normal distribution, I said to think of x-bar as a random drawing from this sampling distribution, which on average is going to be equal to µ, but of course may be a bit above µ or a bit below µ. Hence, there's uncertainty in our point estimate, and hence our need to convert that point estimate into an interval estimate or confidence interval. Now, let's imagine we didn't just take one simple random sample, but we took many. Let's say we took 100 independent simple random samples, each of the same size N. So think of this now as 100 random drawings from the corresponding sampling distribution of x-bar. Inevitably, the values we get for our sample mean are going to vary across these hundred samples. So let's assume sigma is unknown, and we've taken an assumed value for it. 
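The conventional Z values for the three standard confidence levels can be read off the inverse CDF of the standard normal distribution; a quick sketch:

```python
import statistics

nd = statistics.NormalDist()
zs = {}
for conf in (0.90, 0.95, 0.99):
    # for a two-sided interval, Z is the (1 - alpha/2) quantile, where alpha = 1 - conf
    zs[conf] = nd.inv_cdf(1 - (1 - conf) / 2)
    print(f"{conf:.0%} confidence: Z = {zs[conf]:.3f}")
# prints Z = 1.645, 1.960 and 2.576 respectively
```

These three values, 1.645, 1.960 and 2.576, make the trade-off concrete: moving from 90% to 99% confidence inflates the margin of error by a factor of about 2.576/1.645, i.e., the interval becomes roughly half as wide again.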
We have the same sample size N, and we stick with a confidence level of, let's say, 95%. Hence Z, sigma and N are fixed across these 100 samples, and hence the margin of error is fixed across these 100 samples. But there is, of course, a variable component here, and that is the sample mean x-bar. So let's imagine our first sample gave us some value of x-bar, to which we then attach the corresponding margin of error. A second sample, independent of the first, is no doubt going to give us a different value for x-bar, maybe a bit above the previous one, maybe a bit below. But nonetheless, given Z, sigma and N are unchanged, the margin of error remains the same. So given we're going to have a different central point, a different value for x-bar, we will end up with different endpoints for the second confidence interval, but the width will remain fixed. Now, let's suppose we iterated this over the remaining 98 samples. These will lead to different values of x-bar, and hence we'll have, overall, 100 confidence intervals, each of the same width but centred in different positions, and hence these confidence intervals are all going to be shifted relative to one another. Now, of course, some of the time we will end up with a confidence interval which does happen to cover or span the true population mean. Some of the time, though, we will be unlucky. Through no fault of our own, we've used a simple random sampling technique free of selection bias, but maybe we are just unlucky in that we end up with a highly unrepresentative sample. So when we talk about 95% confidence intervals, think of it this way: if we had a hundred such confidence intervals derived from a hundred independent random samples, then 95% of the time we would end up with a confidence interval which happens to cover the true population mean µ. 
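This repeated-sampling story is easy to simulate. A sketch, assuming an arbitrary true mean of 50 and sigma of 10 (values invented for illustration), and using many more than 100 samples so the long-run coverage rate is clearly visible:

```python
import math
import random
import statistics

random.seed(42)
mu, sigma, n = 50.0, 10.0, 100        # assumed true parameters and sample size
z = statistics.NormalDist().inv_cdf(0.975)
moe = z * sigma / math.sqrt(n)        # margin of error, fixed across all samples

covered = 0
trials = 10_000                       # many repeated samples
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.mean(sample)   # only the interval's centre varies
    if x_bar - moe <= mu <= x_bar + moe:
        covered += 1

print(f"Coverage: {covered / trials:.1%}")   # close to 95% in the long run
```

Changing the quantile from 0.975 to 0.995 (i.e., 99% confidence) widens every interval and pushes the observed coverage rate up towards 99%, at the cost of that extra width.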
Of course, that also equates to 5% of the time, on average, that we end up with a confidence interval which doesn't cover the true population mean. So if we decide now to vary the level of confidence, let's say we go from 95% confidence up to 99% confidence, then equivalently, over repeated samples, 99% of the time we end up with an interval covering the true parameter, and only 1% of the time do we end up with a confidence interval which doesn't cover the true parameter. But remember the trade-offs. As the level of confidence increases, the Z value used in the calculation also increases, leading to a wider confidence interval. But, of course, we said, other things equal, we would rather have a narrower confidence interval than a wider one. So this is just to give you a bit of an insight into confidence intervals themselves. In due course, we'll see some other examples, for example, related to opinion polling. You might come across a point estimate, let's say a presidential candidate polling 40% according to the polls. But if you look closely at the small print when an opinion poll is published, it will mention the margin of error associated with it. And in political science, and hence opinion polling, usually a three percentage point margin of error is the norm. But there's an added complication to opinion polling, because there we're not really estimating a population mean per se, but rather a population proportion. We'll come on to more of those details as we go into week five. So, at the big-picture level, when we want to estimate a population mean, we'll use the sample mean as our descriptive statistic of choice. We know that, on average, our point estimate is correct, but there is some uncertainty due to the potential for sampling error. 
So a confidence interval will quantify the uncertainty attached to this point estimate, taking into account three characteristics: our chosen level of confidence, how variable the population is (or, if we don't know sigma, how variable our sample is), and our sample size N.