Math behind Moneyball

You will learn how to predict a team’s won-loss record from the number of runs, points, or goals scored by a team and its opponents. Then we will introduce you to multiple regression and show how multiple regression is used to evaluate baseball hitters. Excel data tables and the VLOOKUP, MATCH, and INDEX functions will be discussed.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay.

In this video, let's try to apply Bill James's great runs created formula, which

predicts the runs a team scores pretty well from batting statistics, to a player.

Okay. So let's use an example,

the great Rogers Hornsby, who hit his peak years in the 1920s.

In 1924 he hit .424, the highest batting average in a full season in the modern era.

And he actually had 32 home runs.

Okay.

Sorry, 25 home runs.

He had 42 home runs in 1922 and hit .401.

Will somebody hit .400 again?

I doubt it, the way things are going.

And then we've got Prince Fielder, who in 2009 logged 46 home runs

and in 2007 logged 50 home runs.

He hit about .300.

So who would have created more runs per game?

Okay, and then that's sort of a nice metric to compare hitters on.

And then later on we can talk about linear weights and

on-base percentage, which you've probably heard of, and where that came from.

Okay, so basically we have to look at how many outs there are per game.

And you'd say 27 because there are 9 innings in a game, but why isn't that quite right?

Because you don't always bat in the bottom of the 9th,

games go into extra innings, and games can be rained out before 9 innings.

But if you look at it, it's around 26.7 outs per game.

Okay. So you look at the runs created by

a player and divide by how many games' worth of outs he uses.

In other words, if you assume 27 outs per game,

108 outs used equals roughly 4 games.

And so what's the basic number of outs a batter creates?

And then there are extra outs, which I define as sacrifice flies plus

sacrifice bunts plus grounded into double plays plus caught stealing.

Well, start with your total at-bats.

Take away errors: when I looked at this a couple of years ago, about 1.8%

of at-bats resulted in errors.

And then take away hits, and that's your basic outs.

And so if you take the runs created for the player and divide by

the games of outs that he used, which I have here,

you can figure out how many runs per game the hitter created.

So, is Rogers Hornsby, who had these great years hitting .400 with some power,

a better hitter than Fielder, who hit more home runs but only hit .300?

And the answer is yes, if you assume that they faced equal

pitching, which is something that's difficult to analyze.

So basically, if I look here, I've got the runs created formula.

Okay. Again, I would take hits plus walks plus hit by pitch.

>> Okay. >> And then I'd multiply it by

total bases, which is 1 times singles plus 2 times doubles

plus 3 times triples plus 4 times home runs.

>> Okay.

And so cell X10 has the total bases there.

The SUMPRODUCT function takes 1, 2, 3, 4 times singles,

doubles, triples, and home runs.

So I've got the total bases there.
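For readers following along outside Excel, here is a minimal Python sketch of the same sum-product idea. The function name `total_bases` and the stat line are hypothetical, invented only for illustration; they are not taken from the spreadsheet in the video.

```python
def total_bases(singles, doubles, triples, home_runs):
    """Weight each hit type by the bases it is worth and add them up."""
    weights = (1, 2, 3, 4)
    counts = (singles, doubles, triples, home_runs)
    # Same idea as a spreadsheet sum-product: multiply pairwise, then sum.
    return sum(w * c for w, c in zip(weights, counts))


# Hypothetical stat line (not a real player's season).
print(total_bases(singles=100, doubles=30, triples=5, home_runs=20))  # 100 + 60 + 15 + 80 = 255
```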

And that gets multiplied by the number of times you reached base.

And then you divide by at-bats plus walks plus hit by pitch.

And that'll give us runs created as we saw before.
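As a rough sketch of the formula just described, the Python below multiplies times on base by total bases and divides by at-bats plus walks plus hit by pitch. The function name, parameter names, and input numbers are assumptions made for illustration; the stat line is hypothetical, not Hornsby's or Fielder's.

```python
def runs_created(hits, walks, hit_by_pitch, total_bases, at_bats):
    """Basic runs created: (times on base * total bases) / opportunities."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch
    return times_on_base * total_bases / opportunities


# Hypothetical stat line (not a real player's season).
rc = runs_created(hits=155, walks=60, hit_by_pitch=5, total_bases=255, at_bats=500)
print(round(rc, 1))  # 220 * 255 / 565, about 99.3
```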

Now how about total outs?

Again, I take at-bats minus 0.018 times at-bats, which is 0.982 times at-bats.

Take away hits and add the extra outs and divide by 26.72.
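The outs calculation can be sketched the same way. This is a minimal illustration, assuming the 1.8% error rate and 26.72 outs per game quoted in the lecture; the function name is hypothetical, and `extra_outs` is passed in as a single total so the sketch does not depend on exactly which events are counted.

```python
OUTS_PER_GAME = 26.72  # average outs per game, as quoted in the lecture
ERROR_RATE = 0.018     # share of at-bats that result in errors


def games_of_outs(at_bats, hits, extra_outs):
    """Convert the outs a hitter uses into an equivalent number of games."""
    basic_outs = (1 - ERROR_RATE) * at_bats - hits
    return (basic_outs + extra_outs) / OUTS_PER_GAME


# Hypothetical stat line (not a real player's season).
games = games_of_outs(at_bats=500, hits=155, extra_outs=20)
print(round(games, 2))  # (491 - 155 + 20) / 26.72, about 13.32
```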

So in this year, Rogers Hornsby basically used 15 games of outs.

This year, when he had that .424 batting average,

he only used 13 games of outs, and he played most of the games there.

So the runs created per game,

you just take the runs created divided by the games of outs.
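Putting the pieces together, here is a self-contained sketch of runs created per game for a single season. The `BattingLine` class, its field names, and the numbers are all hypothetical and exist only to show the arithmetic; the constants are the ones quoted in the lecture.

```python
from dataclasses import dataclass

OUTS_PER_GAME = 26.72  # average outs per game, as quoted in the lecture
ERROR_RATE = 0.018     # share of at-bats that result in errors


@dataclass
class BattingLine:
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    extra_outs: int  # total of the extra outs described in the lecture


def runs_created_per_game(line: BattingLine) -> float:
    hits = line.singles + line.doubles + line.triples + line.home_runs
    total_bases = (line.singles + 2 * line.doubles
                   + 3 * line.triples + 4 * line.home_runs)
    on_base = hits + line.walks + line.hit_by_pitch
    opportunities = line.at_bats + line.walks + line.hit_by_pitch
    runs_created = on_base * total_bases / opportunities
    # Outs used, expressed in games' worth of outs.
    outs = (1 - ERROR_RATE) * line.at_bats - hits + line.extra_outs
    games_of_outs = outs / OUTS_PER_GAME
    return runs_created / games_of_outs


# Hypothetical stat line (not a real player's season).
example = BattingLine(at_bats=500, singles=100, doubles=30, triples=5,
                      home_runs=20, walks=60, hit_by_pitch=5, extra_outs=20)
print(round(runs_created_per_game(example), 2))  # about 7.45
```

Swapping in a real season's numbers gives a figure comparable to the ones read off the spreadsheet, as long as the same definition of extra outs is used.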

So you can see his best year there, well, actually his best year was 1925, I believe.

He hit .424 in 1924, that's easy to remember.

And in 1925 he hit .403.

He had more home runs and he had the same number of doubles,

39 home runs instead of 25, so even though he didn't hit for as high an average,

that was actually his best year on runs created.

But you can see Rogers Hornsby in his best years would create 13 or 14 runs per game.

Now if we look at Fielder, do exactly the same calculations.

You can check them if you want.

Basically, he was a solid nine runs per game which is really good.

Any team that scores nine runs a game is probably going to win the World Series

these days when teams average four runs per game.

But Hornsby was much better in terms of runs created per game.

Okay, so that's a crude but

still not bad formula to evaluate how good a hitter is.

We'll try to improve on this in the next few videos

using the concept of linear weights, but

we need to learn about multiple regression first.

