Devlin's Angle

September 2004

A game of numbers

The approach of the 2004 World Series sees the publication of not one but two books on the use of statistics in baseball. By statistics, I don't mean what most fans seem to think this means, namely collecting and tabulating game stats, but the use of sophisticated mathematical techniques to examine players' performance and the effectiveness of various plays in depth, to help clubs make hiring and salary decisions, and to decide on game strategy.

Alan Schwartz, a senior writer at Baseball America, has written a book called The Numbers Game, and math professor and former MAA President Ken Ross of the University of Oregon has published A Mathematician at the Ballpark. These books come close on the heels of last year's bestseller Moneyball, by Michael Lewis, which described how the Oakland As used mathematics to turn themselves into one of the most successful teams in the league, despite being one of the poorest. The Schwartz book is a history of the use of statistics in baseball; it fills in a lot of the details that Lewis skipped over in Moneyball. Ross's book tries to explain the math itself. All the facts in this article are taken from one or more of these three books.

At this point, I need to admit up front that I am not a baseball fan. I attended my first Major League game only this year. Not that I have anything against the game. Just that, growing up in England, baseball looked to me like rounders played by men in pyjamas who seemed to wear very scratchy underpants that required constant adjustment and who had an unusual propensity for spitting.

Still, as Alan Schwartz points out in his book, many thousands of Americans got interested in math by collecting baseball statistics, including some who went on to be professors of mathematics at major universities like Harvard. I may be one of the few people in the world who did it the other way: I have become interested in baseball (to a degree) through math. And in writing about baseball, as I am now doing, I am following in the tradition set by the "father of baseball": Henry Chadwick.

Chadwick was a young English cricket reporter who became interested in baseball in 1856. (The game itself evolved from the English games of cricket and rounders in the 1830s.) Throughout a long career as a sports writer, Chadwick was an avid (and highly opinionated) promoter of the collection of baseball statistics and the computation of various measures of player and team performance, including the subsequently famous - though not particularly informative it turns out - batting average.

Moneyball

In 1997, when a former player called Billy Beane became General Manager of the Oakland Athletics, the team was one of the worst in the game, ending the year with 65 wins and 97 losses, 25 games behind the Seattle Mariners, who won their division. Beane's problem in addressing this weakness was that the As was also one of the poorest teams. While the New York Yankees spent $126 million on its twenty-five players that year, and had another $100 million to dip into if needed, Beane had just $40 million. (Exact financial comparisons are impossible, since clubs organize and report their finances in different ways, but you get some idea of the financial differences from the fact that for the period 1995-99, the Oakland As reported a loss of $44.95 million while the Yankees declared a profit of $64.5 million.) With more money, Beane might have gone shopping for some All-Star players, or else sent out his scouts to find some untapped talent in a high school or college, or perhaps overseas.

Beane did neither. His first major hire for the Athletics was a 26-year-old assistant manager named Paul DePodesta, who had majored in economics at Harvard. DePodesta had never been much of an athlete. As a reserve infielder on the university's baseball team, he "couldn't run, couldn't throw and had no power," he later told reporters, and as a wide receiver for the football team he had quickly come to realize that "the sideline was my friend." DePodesta could do one thing well, though, and that was enough: mathematics.

What happened next is the subject of Moneyball. By surfing the Internet, downloading baseball data, and using statistical software to analyze it, DePodesta managed to put together a winning team for a fraction of what his competitors spent.

The situation is reminiscent of Wall Street in the 1980s, when a group of young mathematical types began to apply their skills to the stock market. Instead of relying on experience, intuition, foresight, or other traditional "people skills," the newcomers treated the market in purely abstract terms, as an enormous equation. And they made a killing.

Baseball is particularly suited to a by-the-numbers approach, Beane and DePodesta realized, because it's so dependent on individual performances. In football, a play can only succeed if all players on a team work together - blocking, running, catching, and throwing in synch. But Barry Bonds doesn't need anyone's help to hit a home run. In baseball, every pitch, every hit, every catch or throw depends on an individual's success or failure, so it can be given a precise numerical value. Take those numbers and analyze them dispassionately and you just might have the makings of a championship team.

For example, in 2001, DePodesta tried to draft a college player called Kevin Youkilis, an overweight third baseman who could not run, throw, or field. The one thing he did have was the second highest on-base percentage in all of baseball after Barry Bonds. Similarly, no other Major League team had shown any interest in Jeremy Brown, a senior catcher at the University of Alabama. To the traditional scouts, Brown, with a soft, chubby body, simply did not look like someone who could play ball. To DePodesta, he was a player who racked up an awesome number of walks, and the math told DePodesta that walks were supremely important to winning games. The As made Brown a first-round draft pick in 2001.

By reducing baseball to a numbers game in this way, the Oakland As confounded all the old-time baseball pundits, becoming champions of the Western division of the American League in 2000, 2002, and 2003, and setting an American League record by winning 20 consecutive games in 2002.

Patterns galore

As Schwartz makes clear in his book, baseball has always been, in one way or another, a game of numbers, and keeping statistics was viewed as important from the very start. The first primitive box score was printed in the New York Morning News in 1845. Soon after that, newspapers regularly printed tables of statistics after each game.

[Since this column is read by people with little background in mathematics, at this juncture maybe I should note that statisticians talk about two kinds of statistics: little-s statistics and capital-S statistics. Little-s statistics is (or are) numbers: counting, tabulating, calculating averages, and so on. Big-S statistics is the use of (often advanced) mathematics to process all those numbers in order to make informed decisions: such as, whether to hire player A or player B and for how much, whether batting order is really important (no), whether there is really such a thing as a clutch hitter (no), or whether Joe DiMaggio's 56-game hitting streak was just a matter of luck (no, although many other hallowed records probably were), etc.]

Baseball is a natural game both to collect little-s statistics in and to apply big-S Statistics to. One reason is obvious: there are lots of things to count. Another reason may be a bit less obvious: for all the skill and artistry of the great players, there is an enormous random element to the game. From those two observations, it doesn't take much mathematical knowledge to realize that it is likely to be quite hard to separate mirages from reality.

There have been over 11 million batter-pitcher confrontations in the more than 150,000 games that have been played since the major leagues began. With so much raw data and so many things you can do with that data, coupled with a big random element, you are going to get lots of patterns. Figuring out whether they tell you anything useful is likely to be very difficult. As several experts have observed, many of the most hallowed streaks and other records that made players famous are quite likely simply the result of pure luck. Like winning the lottery, sooner or later one player or another would have done it.

For instance, if the outcome of every play in major league history had been decided purely on luck, with no skill involved, someone would have chalked up a .424 batting average (as Rogers Hornsby did), someone would have hit three home runs in a World Series game (as Babe Ruth and Reggie Jackson did), and a whole ton of players would have earned a reputation for being clutch hitters (as many did). The only record that would not have arisen through pure luck is DiMaggio's 56-game hitting streak. The best that would have happened by chance is a 46-game streak, or thereabouts.

This does not mean that there isn't a lot of skill involved in baseball. Nor does it mean that some players aren't better than others. It does suggest that there is a lot more that happens due to pure luck than most fans (or record holding players) would like to admit.

But what statistical theory taketh away with one hand, it can give back, at least in part, with the other. A good example of what can be done with advanced statistics is a 1997 study made by Harvard statistician Carl Morris of the legendary Ty Cobb's batting record. Cobb hit above .400 in three seasons: .420 in 1911, .409 in 1912, and .401 in 1922. The question then is: was Cobb a true .400 hitter? He might have been a .385 hitter who got lucky those three seasons. On the other hand, he was just below .400 in two seasons, .390 in 1913 and .389 in 1921, so maybe he was a .400 hitter who just got unlucky those years. Morris analyzed Cobb's entire record and concluded that there is an 88% chance that Cobb was a true .400 hitter for some season (though not necessarily one of the three seasons when he actually hit that level).

How do you measure how good a batter is?

Ross, in his book, explains the most common baseball statistics for evaluating batters. By far the most common, although the least informative of all, is batting average, which, according to Schwartz, a man called H. A. Dobson of Washington suggested in a letter to Chadwick, who thereafter promoted it in his writings. Batting average, AVG, for a given period, is given by dividing the number of hits, H, by the number of official at-bats, AB:

AVG = H/AB

Batting averages are now generally regarded as a poor guide to performance - not least because they do not distinguish between a single, double, triple, or home run, and take no account of how many base runners advance by virtue of a hit. They are also mathematically problematic, as shown by a curious phenomenon known as Simpson's paradox, which Ross describes in his book.

Consider the records for Major League players Derek Jeter and David Justice in 1995 and 1996.

In 1995, Jeter had 12 hits from 48 at bats for an average of .250, while Justice had 104 hits from 411 at bats for an average of .253. So in 1995, Justice looks better than Jeter.

In 1996, Jeter had 183 hits from 582 at bats for an average of .314, while Justice had 45 hits from 140 at bats for an average of .321. Again, Justice looks better than Jeter.

So who was the better hitter over the two year period combined? You might think it is Justice. After all, in each year, Justice had the higher average. But do the math.

For the two year period 1995-96 combined, Jeter had 12 + 183 = 195 hits from 48 + 582 = 630 at bats for an average of .310, while Justice had 104 + 45 = 149 hits from 411 + 140 = 551 at bats for an average of .270. So over the two year period, Jeter did much better than Justice. Curious, no? Just who was the better hitter? Batting average won't tell you.
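For readers who like to check such things by machine, the arithmetic above can be reproduced in a few lines of Python. This is only a sketch: the dictionary layout and the function names average and combined are hypothetical choices made for illustration, and the only data used are the hits and at-bats quoted above.

```python
from fractions import Fraction

# (hits, at-bats) per season, exactly as quoted in the text.
records = {
    "Jeter":   {1995: (12, 48),   1996: (183, 582)},
    "Justice": {1995: (104, 411), 1996: (45, 140)},
}


def average(hits, at_bats):
    """Batting average AVG = H/AB, kept as an exact fraction."""
    return Fraction(hits, at_bats)


def combined(player):
    """Pool hits and at-bats over all seasons, then divide once."""
    hits = sum(h for h, _ in records[player].values())
    at_bats = sum(ab for _, ab in records[player].values())
    return average(hits, at_bats)


# Season-by-season averages: Justice is ahead in both years.
for year in (1995, 1996):
    for player in records:
        h, ab = records[player][year]
        print(year, player, f"{float(average(h, ab)):.3f}")

# Pooled averages: the order reverses.
for player in records:
    print("1995-96", player, f"{float(combined(player)):.3f}")
```

The reversal arises because the two seasons carry very different weights for each player. Most of Jeter's at-bats came in 1996, his stronger year, while most of Justice's came in 1995, his weaker one, so pooling the seasons pulls the two combined averages in opposite directions.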

As I mentioned a moment ago, the most obvious deficiency of batting average is that it ignores extra-base hits, runs batted in, and bases on balls. A better statistic, which came to prominence in the 1980s, is slugging percentage, SLG. This takes into account the total number of bases, TB, given by:

TB = 1B + 2 x 2B + 3 x 3B + 4 x HR

where 1B is the number of singles, 2B the number of doubles, 3B the number of triples, and HR the number of home runs. The slugging percentage is given by the formula:

SLG = TB/AB

Although better than batting average, slugging percentage (which, as defined, is not a percentage, although the answer can easily be given as one) is problematic in that it gives too much weight to extra-base hits.

These days, arguably the most popular measure of batter effectiveness - because it has been shown to be very accurate - is the on-base percentage, OBP. This gives the proportion of actual plate appearances where the player gets on base (or scores a home run). It is given by

OBP = [H + BB + HBP]/PA

where BB is the number of bases on balls, HBP is the number of times the batter is hit by a pitched ball, and PA is the number of plate appearances, given by

PA = AB + BB + HBP + SF

where SF is the number of times the batter hits a sacrifice fly.
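The three measures defined so far can be put side by side in a short Python sketch. The class name BattingLine, its field names, and above all the season line fed into it are hypothetical: the numbers are invented purely to exercise the formulas and do not describe any real player.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    ab: int       # official at-bats (AB)
    singles: int  # 1B
    doubles: int  # 2B
    triples: int  # 3B
    hr: int       # home runs (HR)
    bb: int       # bases on balls (BB)
    hbp: int      # times hit by a pitched ball (HBP)
    sf: int       # sacrifice flies (SF)

    @property
    def hits(self):
        # H is the total of all four kinds of hit.
        return self.singles + self.doubles + self.triples + self.hr

    @property
    def total_bases(self):
        # TB = 1B + 2 x 2B + 3 x 3B + 4 x HR
        return self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.hr

    @property
    def plate_appearances(self):
        # PA = AB + BB + HBP + SF
        return self.ab + self.bb + self.hbp + self.sf

    def avg(self):
        return self.hits / self.ab

    def slg(self):
        return self.total_bases / self.ab

    def obp(self):
        return (self.hits + self.bb + self.hbp) / self.plate_appearances


# Hypothetical season line, invented for illustration only.
line = BattingLine(ab=500, singles=100, doubles=30, triples=5,
                   hr=15, bb=60, hbp=5, sf=5)
print(f"AVG {line.avg():.3f}  SLG {line.slg():.3f}  OBP {line.obp():.3f}")
```

Note that walks and hit-by-pitch appear only in OBP. A batter could raise his OBP considerably by drawing more walks without his AVG or SLG moving at all.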

OBP was introduced in Sports Illustrated in 1956, which reported that Duke Snider of Brooklyn had led the National League with a 39.94 on-base percentage. In the 1960s, baseball statistician Pete Palmer ran correlation analyses that showed OBP was far superior to batting average and slightly more important than the more widely known slugging percentage. By then, the more statistically savvy fans had realized something that few of their fellow fans, and apparently few managers, knew: avoiding outs, which OBP measures, was more important than hitting runs. One person who did realize this was a man called Eric Walker, of whom more later.

An even better measure of performance than slugging percentage or on-base percentage is their sum, known rather imaginatively as on-base plus slugging:

OPS = SLG + OBP

In the late 1970s, a self-styled baseball writer called Bill James (we'll meet him again later) discovered a remarkable statistic for measuring a batter's performance that he called the "Runs Created Formula":

RC = (H + BB) x TB/[AB + BB]

This formula, which James discovered by trial and error, turns out to be a remarkably accurate predictor of the total number of runs a team will make in a season. Consequently, the higher the RC value, the more games the team will win overall. (The formula won't tell you much about winning the World Series, since that depends on the outcome of a small number of specific games; rather, like all statistical techniques, its accuracy is over a complete season.) The value of the RC formula to the team manager is that it shows the exact relative importance of the contributions players with different talents can make to a team's overall performance. It shows, for example, that walks are a major contribution to a team's overall success, whereas batting averages, by not figuring in the formula, are largely irrelevant.
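To see how walks enter the Runs Created formula, here is a minimal Python sketch of OPS and RC as given above. The function names ops and runs_created are hypothetical, and the two sets of team totals are invented for illustration; they are chosen to be identical in hits, total bases, and at-bats and to differ only in walks.

```python
def ops(slg, obp):
    """On-base plus slugging: OPS = SLG + OBP."""
    return slg + obp


def runs_created(h, bb, tb, ab):
    """Runs Created in the basic form given in the text:
    RC = (H + BB) x TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# Two hypothetical team seasons (invented numbers) that differ only in walks.
fewer_walks = runs_created(h=1500, bb=600, tb=2400, ab=5500)
more_walks = runs_created(h=1500, bb=700, tb=2400, ab=5500)

print(f"RC with 600 walks: {fewer_walks:.0f}")
print(f"RC with 700 walks: {more_walks:.0f}")
print(f"Extra runs from 100 more walks: {more_walks - fewer_walks:.0f}")
```

With these made-up totals, RC rises from roughly 826 to roughly 852. Since a team always has more at-bats than hits, adding walks increases the factor (H + BB)/(AB + BB) and hence RC, whatever the other totals are.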

A brief history of baseball statistics

Following Chadwick, a major impetus to the collection and tabulation of statistics in baseball came with Babe Ruth's exploits in the 1920s. With Ruth, the focus shifted clearly from team performance to individual performance.

There were many errors in the collection and recording of statistics, some of which were not discovered until many decades later. For instance, Ty Cobb is credited with a .401 batting average in 1922, but the true figure is now known to be .399. This particular discrepancy was discovered at the time, but ignored to avoid annoying fans by lowering the record below the magic .400.

In 1951, the Official Encyclopedia of Baseball was published, listing (for the first time) every major league player, past or present, with the batting averages given for each hitter and the won-lost record for each pitcher.

In the late 1950s and early 1960s, George Lindsey, an Operations Research expert at the Canadian Department of Defense, applied Operations Research techniques to analyze past games and develop baseball strategies. For instance, he found that the sacrifice bunt has value only late in the game when just one run is needed, and that stealing bases is rarely worth it. He also found that when hitters faced pitchers of the opposite handedness, batting averages went up by 32 points, and that a true .300 hitter would often bat .180 over as many as seven games due purely to randomness. No one outside the OR and statistics communities took any notice.
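The last of those findings is easy to make plausible with a simple binomial calculation. The sketch below is not Lindsey's method, which the books do not spell out here; it rests on two assumptions of my own, namely that the hitter gets four at-bats per game and that each at-bat is an independent trial with a .300 chance of a hit. The function name prob_at_most is likewise hypothetical.

```python
from math import comb, floor


def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): at most k hits in n at-bats."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


TRUE_AVG = 0.300
GAMES = 7
AB_PER_GAME = 4                       # assumption, not stated in the text

at_bats = GAMES * AB_PER_GAME         # 28 at-bats over the seven games
max_hits = floor(0.180 * at_bats)     # most hits that still give .180 or worse

p_slump = prob_at_most(max_hits, at_bats, TRUE_AVG)
print(f"{max_hits} or fewer hits in {at_bats} at-bats: probability {p_slump:.3f}")
```

Under these assumptions the probability comes out at roughly 0.11, so a true .300 hitter would bat .180 or worse in about one seven-game stretch in nine through chance alone.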

A similar fate met the efforts of several other statisticians and OR practitioners.

The appearance in 1964 of the book Percentage Baseball, by Earnshaw Cook, drew more widespread attention, but still had little impact on clubs. Using a statistic called Scoring Index, Cook showed that the best ever hitter was Ty Cobb, beating out Babe Ruth, Ted Williams, and Lou Gehrig. This statistic was way ahead of its time. It's very close to the modern on-base percentage times slugging percentage.

In 1965, David Neft, a statistician for the Lou Harris polling organization with a BA in Statistics from Columbia University, persuaded Information Concepts, Inc. to commission him to create a computerized baseball encyclopedia. Neft soon realized that the existing records were so error-ridden that he would have to recreate all of baseball's statistical record from the very beginnings of the game. He hired a staff of 21 researchers to work on the task for two years, traveling all over the country looking at original game reports in newspapers. The book was published in 1969. It had 2,338 pages and weighed six-and-a-half pounds. It came out to both rave reviews and controversy - the latter because it corrected many hallowed records. Its appearance also led to the formation in 1971 of SABR, the Society for American Baseball Research.

The initial group of 16 professional statisticians who gathered in Cooperstown, New York, in 1971 to form SABR has grown today to over 7,000 members worldwide, and produces an annual journal, The Baseball Research Journal. From the start, the SABR statisticians were less interested in ranking players than in improving overall play. They made use of the latest statistical techniques, coupled these days with masses of computing power.

Very few SABR members are in professional baseball. The organization includes some sports journalists, but for the most part the members' love of baseball is purely an amateur one, albeit pursued with professional zeal. What they do bring to the game is a wealth of knowledge, ability and experience in the application of statistical techniques. (In the early days, before the advent of powerful desktop computers, they also brought access to some of the nation's most sophisticated computer systems - the systems of their employers, which were set to work on baseball statistics during the night, sometimes in secret, when the company was not using them.)

In 1977, a self-styled sports writer named Bill James published the first of what would become an annual (and initially self-published) magazine: Baseball Abstract, which ran until 1988. In it, in addition to saying some remarkably sensible things about baseball statistics, James coined the term "sabermetrics" to refer to the application of mathematical principles to the production and use of statistics in baseball, as advocated and carried out by the members of SABR.

James was particularly critical of the statistical measures advocated by Henry Chadwick, among them fielding errors, batting average, and RBI (runs batted in). Those statistics were easy to understand and to calculate, so baseball managers and coaches took to them right away. But they were often less informative than they appeared. For example, Chadwick measured a fielder's performance by his number of errors, yet to have an event recorded as an error the fielder has to do something right by being in the right place at the right time, and what he does is an "error" only when the observer makes a judgment of what another fielder might have done in the same circumstances. Chadwick also recorded a walk as a pitcher's error but gave no credit to the hitter who might have shown great judgment in deciding when to swing.

By a process of trial and error, James also produced his now famous Runs Created Formula, which we met earlier, and which is a remarkably good predictor of a team's overall success.

In 1981, a data company called STATS, Inc. was formed, securing a major contract to supply statistics for reporters covering the Oakland As, who wanted to increase their fan base. Soon after, the Chicago White Sox signed up to secure data for salary negotiations. Despite this encouraging start, four years later the company was effectively bankrupt; the baseball world was not yet ready for a computerized statistical service.

By the late 1980s, efforts by SABR members had uncovered many errors in the Baseball Encyclopedia. (According to Schwartz, many of them were introduced deliberately by the editor who took over from Neft, Joe Reichler, who wanted to revert to the older, but incorrect records that Neft's team had corrected.)

In 1989, Pete Palmer and John Thorn published their book Total Baseball, a 2,294 page volume that not only put the records straight but also gave many of the newer statistics that had been developed, including James's Runs Created.

In 1990, STATS, Inc. was still in existence - just - due to a collaboration with Bill James's Project Scoresheet, which collected game statistics through a nationwide network of amateurs. That year, the company secured a major contract with USA Today to supply the statistics for a massively revamped daily box score. It was back in business. The same year, Electronic Arts bought STATS, Inc. data for its Earl Weaver Baseball game, and ESPN used STATS to supply the statistics for its newly launched MLB coverage.

In 1991, Associated Press abandoned its own data collection organization and contracted to buy its statistics from STATS, Inc. In 1994, STATS went live on the Internet, supplying detailed numbers of games as they were being played.

What Moneyball left out

By focusing on Beane and DePodesta's efforts at the Oakland As, Lewis's account in Moneyball gives two impressions that, according to Schwartz, are misleading.

First, the Oakland As was not the first club to make a serious attempt to use statistics to build a team. The Brooklyn Dodgers had tried it in the late 1940s and early 50s. Manager Branch Rickey made the first ever hire of a full-time statistician, Allan Roth, and the two of them set out to apply mathematics to building a baseball team. For example:

After the 1947 season, in which Dixie Walker had hit .306, he was traded to Pittsburgh on the basis of Roth's statistical analyses. Two years later he was out of the majors, his form having completely gone, as Roth's figures had indicated would happen.
In 1949, Jackie Robinson, a lowly .296 hitter, was moved to the cleanup spot. Why? Because Roth's figures showed that Robinson had batted .350 with men on base. By the end of the year, Robinson was named National League MVP, batting .342 with 124 RBIs.
In May 1952, Roy Campanella was batting .325, but against Cincinnati the team played Rube Walker, a hitter in the low .200s. Why? Roth's figures showed that Campanella's lifetime batting average against the Reds' pitcher was a paltry .065.

The first manager known to have used statistics in the dugout was Earl Weaver, manager of the Orioles from 1968 to 1982. He kept up-to-date player statistics on index cards and consulted them in making play decisions.

Lewis's second false impression, according to Schwartz, is that Beane introduced the mathematical approach to the As. The honor for that, according to Schwartz, goes to an NPR sports reporter called Eric Walker, way back in 1981. Schwartz tells the whole story.

In the 1970s, Walker, a former aerospace engineer, had started doing some radio reporting for the NPR affiliate KQED in San Francisco. A chance visit to a San Francisco Giants game was all it took for him to see through his engineer's eyes what few others appeared to have noticed: the supreme importance of walks and of a batter not getting out.

Walker started to talk about his observation, and some ideas to capitalize on it, in his daily five-minute morning baseball report on NPR.

He also took his ideas to the Giants, but they never bought into them, so a few months later he took his pitch across the Bay to the Oakland As. By then, he had written up his ideas in a little book called The Sinister First Baseman. The As' legal counsel, Sandy Alderson, had heard Walker on NPR, and had just read his book, and as a result Walker was very well received by the club. The As hired him as a consultant, and started to implement his ideas. Among the decisions they made by following Walker's creed were:

In June 1984, they drafted slugger Mark McGwire tenth overall rather than two more speed-oriented players, Shane Mack and Oddibe McDowell.
In 1986, they let go slugger Dave Kingman (35 homers and 94 RBIs the previous season) because he rarely walked, having a low OBP of .258. They signed Reggie Jackson (OBP .381) as new designated hitter.
In 1987 they traded Alfredo Griffin, who drew few walks, for pitcher Bob Welch.
In 1988 they signed Don Baylor, a power hitter who frequently got on base by being hit by the ball.
In 1989, they acquired Rickey Henderson, who had a super record of both homers and walks, and Ken Phelps, another player with a high OBP, both from the Yankees.
In 1990 they acquired another good walker, Harold Baines.

By concentrating on OBP, the As became a highly successful team, winning four division titles and three American League pennants from 1988 to 1992. Then they hired Billy Beane, and started to indoctrinate him in their ways.