We're back again. We're still talking about simple comparative experiments. We've just been through a pretty complete description of the pooled t-test, or what people sometimes also call the two-sample t-test. In the pooled t-test, we assume that the variances in the two populations are the same, but what if they're different? What does that do to the procedure? What changes do we have to make? Well, let's address that. I'm going to do that in the context of Example 2.1 from the book. This is about nerve preservation in surgery. Accidental injury to a nerve causes a lot of post-surgical problems: pain, paralysis, numbness. Typically, surgeons identify nerves by their appearance and by their relationship to other physical structures in the body, but nerves can also be detected by local electrical stimulation. It is easy to overlook them, however. So here's an article that appeared in Nature Biotechnology in 2011 that described the use of a fluorescently labeled peptide that binds to nerves, and this assists in identification. There's a data table that I'll show you in just a minute that gives the normalized fluorescence after two hours for both nerve and muscle tissue for 12 mice. I had to read the data from a graph in this paper, so there may be some slight inaccuracies in the data. We'd like to test the hypothesis that the mean normalized fluorescence after two hours is greater for nerve tissue than for muscle tissue. So if Mu 1 is the mean normalized fluorescence for nerve tissue and Mu 2 is the mean normalized fluorescence for muscle tissue, we want to test the null hypothesis that Mu 1 is equal to Mu 2 against the alternative that Mu 1 is greater than Mu 2. So how would we do that? Well, here's the data, taken from the paper. We have nerve tissue and muscle tissue in the two columns, and there are 12 observations. If you look at the data, which is summarized on the right-hand side of the slide, the means appear to be quite different.
For the nerve tissue, it's 4228, and for the non-nerve, it's 2534, but the standard deviations look pretty different too. In fact, the standard deviation for the nerve tissue appears to be about twice as big as the standard deviation for the non-nerve tissue. Look at the normal probability plot. Both samples appear to fall along straight lines, so the assumption of normality is probably not a big concern here. But look at the slopes of those two lines; the slopes are dramatically different. That's an indication that the pooled t-test would not be appropriate in this case. So what do we do? Well, it's really pretty simple. If you're testing the null hypothesis of equal means, say against a two-sided alternative, and you can't assume that the variances are equal, then simply plug the actual observed sample variances into your t-ratio, as you see in Equation 2.31. Now, the problem is that when you do this, the reference distribution is no longer exactly t. However, it can be approximated by the t very well if we make an adjustment to the number of degrees of freedom, and that adjustment is in Equation 2.32. So calculate this adjustment and use it as the number of degrees of freedom, and that will enable you to do a two-sample t-test with different variances. If we apply that to the data from this experiment, this is the result we get. I've simply plugged in the two sample averages and the two sample variances (or standard deviations), and then computed the test statistic to be 2.7354. How many degrees of freedom? Well, we have to calculate that. So we plug in the appropriate values of S1 squared and S2 squared, and N1 and N2, which are both 12, and we calculate the number of degrees of freedom from Equation 2.32 to be 16.1955. Now, I suggest that you round the degrees of freedom down to 16. A lot of computer programs will interpolate in the t-table to find the P-value. I think that's really unnecessary.
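To make this concrete, here's a minimal Python sketch of the two equations just described: the t-ratio with the individual sample variances plugged in, and the adjusted (Satterthwaite) degrees of freedom. The function name and the small example data are my own illustration, not the lecture's mouse data.

```python
import math

def welch_t(y1, y2):
    """Two-sample t-test without assuming equal variances.

    Returns the test statistic (the t-ratio with each sample's own
    variance plugged in, as in Eq. 2.31) and the adjusted degrees of
    freedom (Eq. 2.32). Round the df down to an integer for t-tables.
    """
    n1, n2 = len(y1), len(y2)
    m1 = sum(y1) / n1
    m2 = sum(y2) / n2
    # sample variances S1^2 and S2^2 (divide by n - 1)
    s1sq = sum((y - m1) ** 2 for y in y1) / (n1 - 1)
    s2sq = sum((y - m2) ** 2 for y in y2) / (n2 - 1)
    v1, v2 = s1sq / n1, s2sq / n2          # squared standard errors
    t0 = (m1 - m2) / math.sqrt(v1 + v2)    # Eq. 2.31
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))  # Eq. 2.32
    return t0, df

# hypothetical illustration (not the nerve/muscle data):
t0, df = welch_t([10, 12, 14], [1, 2, 3])
```

With the actual 12-observation nerve and muscle samples, this computation reproduces the lecture's t-statistic of 2.7354 and adjusted degrees of freedom of 16.1955.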
If you had a t-table with an integer number of degrees of freedom, just use 16 degrees of freedom and everything will be fine. The bottom of the page shows you the Minitab output for testing the difference in means. The estimate for the difference is 1694, and the t-test value is 2.74. The observed P-value is 0.007. So there's strong evidence here that the means are different; go back and look at the data, and you'll see that the mean fluorescence for nerve tissue is greater than the mean fluorescence for non-nerve, or muscle, tissue. Let's conclude this session with a brief discussion of inference on a single mean. We've defined comparative experiments as a single factor with two levels, or a treatment with two levels. There are some experiments where there's only one population mean, and it's to be compared to a target or specified value, say Mu_naught. So there, the hypotheses are H_naught, the null, Mu is equal to Mu_naught, against an alternative that says Mu is not equal to Mu_naught. This Mu_naught is usually determined from past experience, maybe knowledge or experimentation that you've done. It may be the result of some theory or model that describes the situation, or it could be the result of some contractual specification or obligation. So how would we apply either a z-test or a t-test to this problem? Well, we have a random sample, and we're going to be able to calculate the sample average, y-bar, let's say; the sample size is n. If the variance is known, then we can use the z-test. The z-statistic is shown right here: y-bar minus Mu_naught divided by Sigma over the square root of n. If the mean of the population, Mu, is equal to the hypothesized value Mu_naught, then the distribution of Z is normal(0,1). So we can use our z-test decision rule to decide what to do about rejecting H_naught. Specifically, if the absolute value of Z_naught is greater than Z_Alpha over 2, we would reject the null hypothesis.
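The two-sided z-test decision rule just stated can be sketched in a few lines of Python. This is my own illustration (the function name and example numbers are not from the lecture); `statistics.NormalDist` supplies the standard normal quantile so we don't need a printed table.

```python
import math
from statistics import NormalDist

def z_test_two_sided(ybar, mu0, sigma, n, alpha=0.05):
    """One-sample z-test (variance known), two-sided alternative.

    Rejects H0: mu = mu0 when |Z0| > z_{alpha/2}, and also returns the
    100(1 - alpha)% confidence interval on the mean (Eq. 2.36).
    """
    se = sigma / math.sqrt(n)                    # standard error of y-bar
    z0 = (ybar - mu0) / se                       # the z-statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2) # z_{alpha/2}
    reject = abs(z0) > z_crit
    ci = (ybar - z_crit * se, ybar + z_crit * se)
    return z0, reject, ci

# hypothetical illustration: y-bar = 105, mu0 = 100, sigma = 10, n = 25
z0, reject, ci = z_test_two_sided(105, 100, 10, 25)
```

For these made-up numbers the standard error is 2, so Z0 = 2.5, which exceeds 1.96 and leads to rejection at Alpha = 0.05.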
You could also calculate a P-value, and you can also find a confidence interval on the true population mean; that's Equation 2.36. In Equation 2.36, Sigma over the square root of n is the standard error of y-bar, and Z_Alpha over 2 and minus Z_Alpha over 2 are the percentage points of the standard normal distribution that correspond to a confidence level of 100(1 minus Alpha) percent. Here's an example. A supplier submits lots of fabric to a textile manufacturer. A lot, by the way, is a batch of fabric. The customer wants to know if the lot average breaking strength exceeds 200 psi. If it does, she's perfectly willing to accept the lot. Past experience tells her that a reasonable value for the variance is around 100 psi squared. So the hypotheses to be tested are H_0, Mu is equal to 200, against H_1, Mu is greater than 200. This is a one-sided alternative hypothesis testing situation. She's formulated the problem so that she will accept the lot only if the null hypothesis can be rejected. In other words, she's asking for a strong conclusion: she wants demonstrated evidence that the mean is greater than 200. So she selects four specimens from the lot at random, and the average breaking strength observed in those four specimens is 214 psi. The value of the z-test statistic is computed as 2.80. So if she were using a type I error rate of 0.05, she would use the upper five percent point of the standard normal distribution as the boundary of the critical region. In other words, all of the risk is on that upper side, and Z of 0.05 is 1.645; you can find that from the standard normal table. So what would we conclude? We would conclude that we should reject the null hypothesis, because there's a strong indication that the breaking strength is greater than 200 psi. You could also use a P-value approach: simply find the area above 2.80 under the standard normal curve, and that turns out to be 0.00256.
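Here's the fabric example worked out in a short Python sketch, using the numbers from the lecture (n = 4, Sigma = 10, y-bar = 214, Mu_naught = 200). The one-sided P-value is the upper-tail area of the standard normal, computed here via the error function.

```python
import math

# Fabric breaking-strength example:
# H0: mu = 200 vs H1: mu > 200, sigma^2 = 100 psi^2, n = 4, y-bar = 214
ybar, mu0, sigma, n = 214.0, 200.0, 10.0, 4

z0 = (ybar - mu0) / (sigma / math.sqrt(n))        # z-test statistic
# one-sided P-value: area above z0 under the standard normal curve,
# i.e. 1 - Phi(z0), written in terms of the error function
p_value = 0.5 * (1.0 - math.erf(z0 / math.sqrt(2.0)))

print(z0)                 # 2.8
print(z0 > 1.645)         # True: reject H0 at alpha = 0.05
print(round(p_value, 5))  # 0.00256
```

Because the alternative is one-sided, the tail area is not doubled, matching the P-value of 0.00256 quoted in the lecture.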
You don't have to double that, because this is a one-sided test. So we have strong evidence that the breaking strength exceeds 200 psi. Now, what happens if the variance is unknown? What changes do we make? Very simple: we use the same test statistic, except we substitute S for Sigma. This becomes the test statistic t-zero; that's the one-sample t-test statistic. We proceed exactly as we did before, except we find the critical value from the t-distribution, not from the normal. So for a two-sided alternative, we would reject the null hypothesis if the absolute value of the test statistic exceeded t_Alpha over 2 with n minus 1 degrees of freedom; n minus 1 is the appropriate number of degrees of freedom for this t-test. In general, the degrees of freedom for a t-test will always be equal to the number of degrees of freedom associated with the variance estimate in the test statistic. That's a trick you can always use to help you find the appropriate number of degrees of freedom for a t-test. So that's the end of that. We've got one more topic to talk about in this chapter. We've talked a little bit about hypothesis testing for the cases where the variance is both known and unknown. There is material here on testing hypotheses on variances; F-tests are used to do that. We're not going to cover that, but we are going to talk about paired experiments. This is a very interesting topic, and it's a simple example of the blocking principle, which is something that we'll talk about in module 4 of the course. So that's it for this time, and we'll be back with a discussion of paired tests shortly.
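As a quick recap before we leave this session, the one-sample t-test just described (substitute S for Sigma, use n minus 1 degrees of freedom) can be sketched as follows. The function name and example data are my own illustration, not from the lecture.

```python
import math

def one_sample_t(y, mu0):
    """One-sample t-test statistic: same form as the z-statistic,
    but with the sample standard deviation S in place of sigma.
    The degrees of freedom are n - 1, matching the df of the
    variance estimate in the denominator."""
    n = len(y)
    ybar = sum(y) / n
    s_sq = sum((yi - ybar) ** 2 for yi in y) / (n - 1)  # S^2
    t0 = (ybar - mu0) / math.sqrt(s_sq / n)
    return t0, n - 1

# hypothetical illustration:
t0, df = one_sample_t([10, 12, 14], mu0=10)
```

Compare |t0| against t_Alpha over 2 with the returned degrees of freedom (or against t_Alpha for a one-sided alternative) to decide whether to reject the null hypothesis.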