Hi, my name is Brian Caffo and this is Mathematical Biostatistics Boot Camp Lecture twelve, on bootstrapping. Today we're going to talk about the bootstrap, which is an incredibly useful, handy tool in statistics that you can use in a variety of settings. Its development more or less coincided with the personal computer revolution, and it gives us a way to avoid an awful lot of mathematics in biostatistics. Before we talk about the bootstrap, we're going to talk about the jackknife, which is a precursor to the bootstrap and, exactly as its name suggests, a handy little tool. The bootstrap, on the other hand, is like an entire workshop of tools. The key idea in both the jackknife and the bootstrap is to use the data, so-called resampling of the data, to get at quantities that are difficult to get at otherwise, for example, variances and biases. Now, we don't need either the bootstrap or the jackknife for something like the sample mean, where we know all its theoretical properties. But for other, less obvious statistics, we need something that does it for us, and it'd be preferable if that something didn't require a year of mathematics just to get us to the starting point. That's the appeal of the bootstrap: you dream up a statistic, and if you want to estimate a standard error for it, you can start bootstrapping it immediately. So, let's talk a little bit about what the jackknife does before we begin with the bootstrap, because historically the jackknife came first. The first use of the jackknife was by the statistician Maurice Quenouille, who, I believe, used it to estimate bias.
Then the jackknife was really popularized and further refined by the extremely well-known statistician John Tukey, who we talked about a little bit in the lecture on plotting. Tukey had numerous inventions, including the fast discrete Fourier transform. He coined the term bit for binary digit; he was the first person to do that. He invented the box plot, and he did lots of other things. I think when you see it, you'll conclude along with me that the jackknife is a handy and incredibly clever thing for someone to think of. So, the idea behind the jackknife, and similarly behind the bootstrap, is this: you have something you don't know, like the bias of a statistic or the standard error of a statistic, and you want to use the data to get a sense of it. What the jackknife does is take one of the observations out, formulate the statistic on the remainder, and see how the statistic behaves relative to that pulled-out observation. This is very related to an idea you frequently hear of in machine learning and statistical prediction, so-called cross validation. The jackknife tends to have a different goal, in that the goal of the jackknife tends to be bias estimation or variance estimation, whereas leave-one-out cross validation is typically used as an estimate of prediction error. But the principle is very similar in that you're deleting observations. So anyway, let's just focus on the jackknife; if you take classes in machine learning or something like that, you'll talk about cross validation there. The jackknife deletes one observation and calculates whatever estimate you're thinking of based on the remaining n - 1 of them. Doing this one observation at a time gives you n leave-one-out estimates, and it uses these n estimates to do things like estimate bias and standard error.
And again, no, we don't need this for the sample mean. We know that the sample mean is unbiased under certain assumptions, and we know exactly what the standard error of the sample mean is in the standard setting. So the jackknife isn't necessary for those settings, but it may be necessary for other ones. So, let's consider the jackknife for univariate data. Let x1 to xn be a collection of univariate data points from which we want to estimate a parameter theta. Let theta hat be the estimate based on the full data set, and let theta hat sub i be the estimate of theta that you obtain using the n - 1 observations you get by deleting observation i. Then let theta bar be the average of the n leave-one-out estimates. With that notation in mind, the jackknife estimate of the bias of our statistic theta hat is just (n - 1) times (theta bar minus theta hat). Let's consider the principle of this before we get to why in the world that n - 1 is there. Theta hat is our estimate. Looking at how close it is to the average of the estimates where we deleted one observation each time gives us a sense of population-level bias. Then you might wonder, where in the world does this n - 1 come from? It's a calibration factor, chosen using statistics whose bias we actually know: for example, applied to the plug-in (divide-by-n) estimate of the variance, the n - 1 makes the jackknife give the correct answer. So again, this estimate is really driven by how far the average delete-one estimate is from the full-data estimate, and the n - 1 is the multiplier needed for that difference to be an estimate of the true bias.
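To make that concrete, here is a minimal sketch in Python (the lecture's own code is in R, but the recipe is language-agnostic). It jackknifes the plug-in, divide-by-n variance estimator, the calibration case mentioned above: subtracting the jackknife bias estimate recovers the usual unbiased (divide-by-n-minus-1) sample variance exactly.

```python
import numpy as np

def jackknife_bias(x, estimator):
    """Jackknife bias estimate: (n - 1) * (theta_bar - theta_hat)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = estimator(x)  # estimate on the full data
    # Leave-one-out estimates, deleting observation i each time.
    theta_i = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_bar = theta_i.mean()  # average of the leave-one-out estimates
    return (n - 1) * (theta_bar - theta_hat)

# The plug-in variance (divide by n) is biased downward by sigma^2 / n.
rng = np.random.default_rng(1)
x = rng.normal(size=30)
plug_in = lambda v: np.mean((v - v.mean()) ** 2)
bias = jackknife_bias(x, plug_in)
corrected = plug_in(x) - bias  # bias-corrected estimate
```

For this estimator, `corrected` matches `np.var(x, ddof=1)` to machine precision, which is exactly the calibration the n - 1 factor is built around.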
And then, the jackknife estimate of the standard error is the square root of (n - 1)/n times the sum of the squared deviations of the delete-one estimates around the average of the delete-one estimates. So it's roughly the square root of n - 1 times the variance of the delete-one estimates. Again, what's the rationale for the extra factor out front? Why not just take the standard deviation of the delete-one estimates as an estimate of the standard error of the statistic? Well, it turns out that the delete-one estimates, because each contains the majority of the data, n - 1 of the n points, tend to be quite close to one another, excessively close to one another. So their variance by itself underestimates the variability of the statistic. We need an inflation factor, and n - 1 was calibrated to be a reasonable one. The same thing is true of the bias: the delete-one statistics tend to be a little too close to one another, and unless you multiply by such a factor, you don't get a reasonable estimate. So, let's go through an example. We had 630 measurements of gray matter volume from workers from a lead manufacturing plant. The median gray matter volume wound up being about 589 cubic centimeters, and we want to estimate the bias and the standard error of the median. I'll come back to this discussion of jackknifing the median, because that's where we're going to move forward to the bootstrap. Here's the gist of the code to do this. You don't actually have to write it out by hand; I'll show you in a page how to do it with a library, and you can do it in any language, not just R. You just have to figure out how to delete observations one at a time. So, we let n be the number of observations we have, and theta hat the median of these gray matter volumes.
And then, the jackknife estimates are the medians I obtain each time I delete the i-th observation; the sapply function does exactly that. Theta bar, exactly as in the notation from the previous couple of slides, is just the mean of these delete-one jackknife estimates. Then my bias estimate is n - 1 times the difference between theta bar and theta hat, and the standard error is the square root of n - 1 times the average squared deviation of the jackknife estimates around the average of the jackknife estimates. And then, on the next page, it's a lot easier to do this [laugh] if you just use the software in the bootstrap library: out is the result of the jackknife function applied to the list of my gray matter volumes, with the median as the function I want the jackknife estimates for. Then I pick the standard error and the bias calculation out of that variable. Both methods yield an estimated bias of zero and a standard error of 9.94. And there's an odd little fact. The jackknife tends to work well for smooth functions of the data, and empirical quantiles often don't satisfy that requirement; the median is an example. In fact, the jackknife estimate of the bias for the median is always zero when the number of observations is even. So the median is an example where the jackknife isn't that good a thing to do. In general, if the estimate you're getting is a nice smooth function of the data, the jackknife will work fine. But if it's not, it tends to work pretty poorly; there's a very well-known paper by Efron, the inventor of the bootstrap, that illustrated this quite starkly. The jackknife has also been shown to be a linear approximation of the bootstrap.
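The gist of the computation just described can be sketched in Python as follows (again, the lecture's actual code is in R; and since the lead-worker data aren't reproduced here, the sample below is a simulated placeholder, so only the recipe, not the numbers, carries over):

```python
import numpy as np

def jackknife(x, estimator):
    """Jackknife bias and standard-error estimates, per the formulas above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = estimator(x)  # estimate on the full data
    # Delete-one estimates: estimator applied with observation i removed.
    theta_i = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_bar = theta_i.mean()
    bias = (n - 1) * (theta_bar - theta_hat)
    se = np.sqrt((n - 1) / n * np.sum((theta_i - theta_bar) ** 2))
    return bias, se

# Placeholder standing in for the 630 gray matter volumes (not the real data).
rng = np.random.default_rng(12)
volumes = rng.normal(loc=589, scale=50, size=630)
bias, se = jackknife(volumes, np.median)
```

On the actual lead data this procedure gives the bias of zero and standard error of 9.94 quoted above. Note that even with this placeholder sample the bias comes out (numerically) zero, because n = 630 is even; that's the odd little fact about jackknifing the median, not evidence the median is unbiased.
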
So, if you're in some setting where it's going to be difficult to program up the bootstrap, then the jackknife, which is a pretty simple thing to do, is a handy little tool to use. But just to remind you: don't use the jackknife for sample quantiles. It's a handy procedure and it works in a lot of settings, but not for sample quantiles like the median, where it's been shown to have some poor properties. And what could you possibly use instead? Well, why not the bootstrap. So, let's move on to the bootstrap, which is maybe more of a complete toolbox, but certainly a less compact tool than the jackknife, in exactly the way the tool analogy sounds. By the way, the term bootstrap comes from the idea of pulling oneself up by one's own bootstraps. And, of course, this has been discussed a lot: it's kind of an unfortunate name for a statistical procedure, because it makes it sound like the information is coming from nowhere, right? You can't pull yourself up by your own bootstraps; it's physically impossible. But there's been plenty of theoretical work that shows where the bootstrap's information is coming from and when it is applicable. Another thing I would note is that this idea of pulling oneself up by one's own bootstraps comes from the tales of Baron Munchausen. There's a great movie, The Adventures of Baron Munchausen, made by some of the people who made the Monty Python series. If you get a chance, you should, you know, in honor of this lecture, watch the Baron Munchausen movie. At any rate, that fable is where the phrase comes from, and that's where they got the name for this procedure. Now, back to the jackknife: another way to think about it is through so-called pseudo observations.
So, if you take n times theta hat minus (n - 1) times theta hat sub i, you can think of these pseudo observations as whatever observation i contributes to the estimate of theta. And notice that if theta hat is the sample mean, then the pseudo observations are exactly the data themselves. So it's this idea of taking what worked in a very neat and tidy sense for the sample mean and extending it to other statistics. The sample standard error of these pseudo observations is the jackknife standard error, and their mean is a bias-corrected estimate of the parameter you're interested in; it takes your ordinary estimate and attempts to correct its bias. I have to admit, in my thinking about the jackknife, I prefer this view in terms of the pseudo observations to the classical development of it.
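The pseudo-observation view can be sketched in Python like so (an illustration, not the lecture's R code): the pseudo values are n * theta hat - (n - 1) * theta hat sub i; for the sample mean they collapse back to the data themselves, and for any estimator their mean is the bias-corrected estimate while the standard error of their mean is the jackknife standard error.

```python
import numpy as np

def pseudo_observations(x, estimator):
    """Jackknife pseudo values: p_i = n * theta_hat - (n - 1) * theta_hat_i."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    theta_hat = estimator(x)  # full-data estimate
    theta_i = np.array([estimator(np.delete(x, i)) for i in range(n)])  # delete-one
    return n * theta_hat - (n - 1) * theta_i

rng = np.random.default_rng(7)
x = rng.normal(size=25)

# For the sample mean, the pseudo observations are exactly the data.
p_mean = pseudo_observations(x, np.mean)

# For a general estimator (here the median):
p_med = pseudo_observations(x, np.median)
corrected = p_med.mean()                       # bias-corrected estimate
se = p_med.std(ddof=1) / np.sqrt(len(x))       # jackknife standard error
```

The `se` here is algebraically identical to the square-root formula from the earlier slides, since each centered pseudo value is just -(n - 1) times the corresponding centered delete-one estimate.
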