Suppose I tell you I have the basketball records for all members of a college team for a particular year. And I do, in fact. It's for the University of Michigan. And suppose I pick the names of two players out of a hat. Here we go, I picked Andre and Bob. Now suppose I ask you the likelihood that Andre scores more points than Bob in the second game of the season, given that he scored more points than Bob in the first game. You're probably going to say it's greater than 50% and less than 100%, but just what probability would you pick? Now, suppose I ask you the probability that Andre will score more points than Bob in the last 20 games of the year, given that he scored more points than Bob in the first 20 games of the year. What would you guess? First, the probability that Andre scores more than Bob in the next game, given that he did in this game. I've asked University of Michigan students to make a guess about this, and what they guessed is that it's about 0.66. That is, two times out of three Andre is going to score more than Bob in game two, given that he scored more in game one. What's the actual probability? It's actually 0.68. So people are very well calibrated for basketball. And, incidentally, they're equally well calibrated for spelling tests. I happen to have the spelling test scores for a whole year for a bunch of fifth graders, and the numbers are really very similar. So, what is people's guess about how likely it is that Andre will have a higher total for the next 20 games, given that Andre had a higher total for the last 20 games? That guess is 0.77, University of Michigan students estimate. So it goes from, in their view, a two-thirds chance to a three-quarters chance. The actual gain, however, is huge. It goes from about 0.68 to about 0.90. So people are well calibrated for both basketball and spelling, and for abilities in general, I happen to know, when predicting from one occasion to another occasion.
And they recognize that you have a better chance if you're looking at 20 occasions versus 20 occasions. But they very substantially underestimate the gain due to the law of large numbers. Let's look at personality traits. I happen to know how friendly people seemed to be to raters, psychologists who were rating people in a wide range of situations. And I also know about the honesty of kids who were tested over many, many different kinds of honesty tests. Does the kid cheat in a game? Does the kid take more than his or her fair share of cookies? So, given that Alice was friendlier than Betty on average over the last 20 occasions, what's the likelihood that she'll be friendlier than Betty on average over the next 20 occasions? University of Michigan students guessed that that was about 0.83, and in fact it's 0.82. So they got that right. What did they guess about the likelihood that Alice will be friendlier than Betty the next time she's encountered, given that she was friendlier than Betty the last time she was encountered? University of Michigan students guessed that that probability is 0.80. Now that's pretty suspicious, isn't it? They think that you can predict one occasion from another occasion about as well as you can predict the average of 20 occasions from the average of another 20 occasions. In fact, it's remarkably far off. The actual probability here is about 0.55. In other words, it's just not very likely. It's slightly more likely than 50%, but it's not very likely that Alice is going to be friendlier next time, given that she was friendlier this time. So we wildly overestimate the stability of behavior from one occasion to another. Moreover, we have very little recognition of the role of the law of large numbers, which tells us that we would have much greater predictability for the average of 20 occasions from the average of 20 other occasions than we could have from one occasion to another. I'm going to take off my statistician hat now and put on my psychologist hat.
One of the most important lessons that psychologists have to tell us is this error about personality traits. People think that if person A is friendlier than person B on one occasion, the odds are really very good that they will be on the next occasion as well. And that's wrong for every trait that's ever been examined: friendliness, honesty, aggressiveness, extraversion, conscientiousness. In a later lesson I'll talk about why we make this error of assuming there is great stability for traits. Here I'll just sketch the reason. The main driver of behavior is the situation that we find ourselves in. It's not some internal state that keeps making us behave in like fashion no matter what the circumstances. The party on Saturday night is a different kind of occasion from the committee meeting on Monday morning, and the person who is friendlier on Saturday night at that party is not particularly likely to be friendlier at the committee meeting on Monday morning. Carlos may have been more honest than Edward in one situation, but that's only a very weak basis for assuming he'll be more honest than Edward in some other situation. The mistake here is called the fundamental attribution error. Behavior of a kind that's prompted by the situation someone is in is mistakenly attributed to personality traits. We expect people to be much more of a piece than they really are, and that gets us in trouble.
It's interesting to convert these probability judgments to correlations, which we talked about in the last lesson. So I'm going to look at correlations here; that's on the Y axis. And on the bottom, we're going to look at item to item, that is to say, occasion to occasion, and total to total, that is to say, 20 occasions to another 20 occasions. For abilities, the real correlation is about 0.5 for one situation to another, which is substantial. And when you go from 20 situations to another 20 situations, that correlation is 0.95.
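The jump from 0.5 to 0.95, and the link between these correlations and the probabilities quoted earlier, can be roughly reproduced with two textbook formulas. This is only a sketch under stated assumptions: the lecture does not name either formula. It assumes the Spearman-Brown formula for averaging equally informative occasions, and it assumes the score differences between two randomly chosen people are normally distributed, so that the chance of the same person coming out ahead twice is 1/2 + arcsin(r)/π. The function names are illustrative.

```python
import math

def spearman_brown(r_single: float, k: int) -> float:
    """Correlation between two k-occasion averages, given the
    correlation r_single between two single occasions
    (assumes every occasion is equally informative)."""
    return k * r_single / (1 + (k - 1) * r_single)

def prob_same_ordering(r: float) -> float:
    """Chance that whoever was ahead the first time is ahead again,
    assuming normally distributed differences with correlation r."""
    return 0.5 + math.asin(r) / math.pi

r_one = 0.5                              # occasion-to-occasion figure from the lecture
r_twenty = spearman_brown(r_one, 20)     # comes out at about 0.95

print(round(r_twenty, 2))
print(round(prob_same_ordering(r_one), 2))
print(round(prob_same_ordering(r_twenty), 2))
```

Under these assumptions the two probabilities land close to the 0.68 and 0.90 reported for the basketball data, which suggests the probability and correlation figures are telling the same story.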
So if Andre scores more points in the first 20 games of the season, you can bet your bottom dollar he's going to score more points in the second 20 games of the season. People's guesses for one occasion are right on the money. The correlation is about 0.5 for basketball playing, for spelling, and for other abilities. And people's guesses do reflect that they understand the law of large numbers to some degree, but they way underestimate it. The correlation has become 0.95; they think it's only 0.75. The situation is very different with personality traits. From one occasion to another, the predictability is only a correlation of about 0.15. It's extremely low. Despite that fact, people really do have personality traits, and if you've got 20 observations across different situations for someone, you can predict pretty darn well about the next 20 observations for that person. But we're hopelessly miscalibrated on this. People think that the occasion-to-occasion correlation is 0.8, very, very high, when it's tiny. And they also have no recognition that the law of large numbers applies here. They don't realize that there's much difference between one occasion to another as opposed to 20 occasions to another 20 occasions. So here's the overall picture. We're accurate at the level of the individual occasion for abilities. We don't quite understand how much more predictability we get when we have more data. But for personality traits, we're wildly miscalibrated about the degree of predictability we get from observation on one occasion to observation on another occasion.
In my twenties, I went to London for ten days. And lucky me, there was blue sky every day. And I came away thinking, you know, the English are really big complainers. They're always talking about the rain. But, you know, I was there for ten days and it was gorgeous weather the whole time. But years later I spent a week in London in November. It rained every day. Oops, I was hasty before.
Now, I know perfectly well that the weather is highly variable in every place I've ever spent very much time. But I failed to realize that London is going to be like every other place I know about. That is, the weather is variable. So I failed to realize that I didn't have enough evidence about London weather, because I didn't stop and think, wait a minute, what's the variability for this dimension? A few years ago a cousin of mine from Wisconsin came to visit me in Ann Arbor. We were talking about where we go for vacations, and I was saying my wife and I really love going to Chicago. She said, really, Chicago? I saw those slums from the expressway on the way here. Well, does my cousin not know that cities are variable? Of course she does. She knows every city has some high-rise office buildings, some suburbs, some slums, but she didn't stop to think about that. She didn't stop to think about the fact that the dimension she's trying to make a judgment about is one that's highly variable, and she'd better get a large sample before she can come up with a conclusion that Chicago is a pretty slimy place.
When we get evidence about a person or about a thing, we're likely to use that evidence to create a judgment. So wow, Madison is really a good bridge player, I had no idea. She's probably good at card games in general and maybe good at games in general. Maybe she's really very smart. The sequence here is from an observation to a generalization, something we do dozens of times every day of our lives. Sometimes the generalizations are legitimate and sometimes not. And we can think about every observation of anything as being composed of true score plus error. Sometimes the error is zero, but, nevertheless, that observation is still of true score plus error. This is true for every kind of observation. Take Bill's height. We get an observation and make a judgment about how tall he is, and that observation is how tall he really is plus whatever error we have.
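The "true score plus error" idea can be made concrete with a small simulation. This is a minimal sketch, not anything from the lecture's data: the true height of 180 cm, the error spread of 4 cm, and the function names are all made-up values chosen for illustration, and the errors are assumed to be independent and normally distributed.

```python
import random
import statistics

def observe(true_score: float, error_sd: float, rng: random.Random) -> float:
    """One observation = true score + random error (assumed normal)."""
    return true_score + rng.gauss(0.0, error_sd)

def spread_of_average(true_score: float, error_sd: float,
                      n_obs: int, trials: int, rng: random.Random) -> float:
    """How much the average of n_obs observations varies from one
    batch of observations to the next (standard deviation over trials)."""
    averages = [
        statistics.fmean(observe(true_score, error_sd, rng) for _ in range(n_obs))
        for _ in range(trials)
    ]
    return statistics.pstdev(averages)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (1, 5, 20):
    # Hypothetical numbers: true height 180 cm, error spread 4 cm.
    print(n, round(spread_of_average(180.0, 4.0, n, 2000, rng), 2))
```

The spread of the average shrinks as the number of observations grows, which is why a judgment based on 20 observations deserves more trust than a judgment based on one.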
The same goes for an observation about blood pressure, which, by the way, is extremely variable. If you really want to know what your blood pressure is, you've got to take it over many occasions of as different a kind as possible. Or take Danny's spelling ability, or Alice's friendliness. So all kinds of things are variable. Observations of them contain error. Failing to remember this, we overgeneralize from small samples. We do it for lots of things, even for weather and the composition of cities, but we almost always do it for behavior that can be understood as trait related.
I want to talk now about a particularly dangerous type of statistic. In the last presidential race, one of the candidates announced that vaccinations caused autism. And his evidence for this was that he had an employee, and the employee had a child, and the child got a vaccination, and the child was autistic thereafter. That's what I call a "man who" statistic, as in "I know a man who." The evidence is completely inadequate in this case. You actually require hundreds of cases to come to a conclusion about this because autism, thank God, is a rare occurrence. You're going to have to have hundreds of people who, by the flip of a coin, get the vaccination, and others who, by the flip of a coin, don't get the vaccination. And the experiment has been done, and vaccination does not cause autism. We're constantly being offered "man who" statistics: anecdotal evidence that can't bear the weight that's being put on it. And let's admit it, sometimes we're the ones who do the offering. "Smoking doesn't cause lung cancer. Grandma smoked a carton a week and died at 102." That's a "man who" statistic. "Blatsmobiles are lousy. I have a friend who bought a used one and the transmission went, and the next thing, the rear end went, the whole exhaust system conked out. Finally sold the thing for junk." "Miami isn't safe anymore. My Uncle Max was robbed walking out of a 7-Eleven."
"And the weather's great in London. I was there for ten days once." When you hear a "man who" statistic, ask yourself, how much evidence is this, really? How much evidence would it take to establish the point being made by the anecdote? And that in turn is always going to depend on how much variability you think there is for the event in question.
So lesson number two has been concerned with how you figure out what is the case about some object or person. What's going on? Lesson number three is going to be concerned with how you figure out what is related to what. And that's the topic of correlation.