So I mentioned that outliers are problematic for resampling methods, just as they are for classical methods. And I want to make sure you're aware of a common technique for handling outliers, which turns out to be pretty useful, I think, in a data science context, perhaps because the data is noisy enough that outliers are pretty common. This is the rank transformation, and it's dead simple. The idea is to replace each value with its ordinal position in sorted order, so that the lowest value is transformed to the value one, the next lowest is transformed to the value two, and so on. So say you had data that looks like this, where this axis is just a meaningless observation ID and the actual measurement is on this axis. There's all kinds of structure in the observations down here, pegged along zero, but these outliers up here are getting in the way of seeing that structure. So what you can do is sort the values, and then you can see what these outliers are doing. Again, there's all kinds of structure down here, but the outliers up here at the end are dominating the effect, and they're gonna cause problems for most of the statistics we might want to compute. After you do the rank transformation, you end up with this nice even spread. Note that this plot is no longer in sorted order; it's back in the original observation ID order. The highest value up here used to be orders of magnitude higher than the rest in the measurement space, but now it's just the value 100 in the rank space. You can still compute many of the same statistics over the ranks, and they still work. But you should be aware that the questions you're answering are now different.
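To make the transformation concrete, here is a minimal sketch in plain Python. The function name `rank_transform` and the sample measurements are made up for illustration, and the tie-breaking rule (tied values keep their original order) is an assumption, since the lecture doesn't say how ties should be handled.

```python
def rank_transform(values):
    """Replace each value with its 1-based position in sorted order."""
    # Observation indices ordered from lowest to highest value.
    # sorted() is stable, so tied values are ranked in their original order
    # (an assumption; averaging tied ranks is another common choice).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank  # write the rank back at the original position
    return ranks


# Hypothetical measurements: mostly near zero, plus two large outliers.
measurements = [0.3, 0.1, 5000.0, 0.2, 120000.0, 0.4]
ranks = rank_transform(measurements)
print(ranks)  # [3, 1, 5, 2, 6, 4]
```

The result stays in the original observation order, and the largest outlier simply becomes the highest rank, here 6, rather than a value orders of magnitude above everything else.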
So in our running example, the rank transformation answers a different question: do the patients who receive the new treatment tend to be among those who live longer? That's as opposed to: is the mean number of days survived higher among treated patients? As another example, if you're looking at salary data and you have Bill Gates in your sample, there's gonna be a skew in the data. Although perhaps not anymore with salary; maybe net worth is the better measure, since I'm not sure what kind of salary he's drawing these days. But if you put the data in rank order, he would perhaps still be the highest net worth individual, but he won't have that skewing effect, and then you can reason about things. Do people with certain kinds of jobs tend to be among those who have the highest net worth? You can ask that without reasoning about the exact value of that net worth, which, again, sometimes doesn't matter. If you wanna reason about properties of the top 5% of earners, those are valid questions you can ask independently of worrying about exactly how much they make. And in fact, these kinds of questions make sense in other contexts as well. For example, ranking kind of naturally takes care of cost-of-living differences between cities.
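The difference between the two questions can be sketched with a small made-up example. The survival times below are hypothetical, not data from the lecture, and the helper `pooled_ranks` is an illustrative name; ties are again broken by original order as an assumption.

```python
from statistics import mean


def pooled_ranks(values):
    """Return the 1-based rank of each value, in the original order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Hypothetical days survived; one treated patient is an extreme outlier.
treated = [30, 45, 60, 3000]
control = [50, 55, 65, 70]

# Rank all patients together, then split the ranks back into the two groups.
ranks = pooled_ranks(treated + control)
treated_ranks = ranks[:len(treated)]
control_ranks = ranks[len(treated):]

# Question 1: is the mean number of days survived higher among the treated?
raw_diff = mean(treated) - mean(control)

# Question 2: do treated patients tend to rank among those who live longer?
rank_diff = mean(treated_ranks) - mean(control_ranks)

print("difference in mean days survived:", raw_diff)
print("difference in mean rank:", rank_diff)
```

In this toy data the raw difference in means is large and positive, driven entirely by the single outlier, while the difference in mean ranks is slightly negative. The two statistics disagree because they answer different questions, which is the point to keep in mind before swapping one for the other.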