Saturday, August 30, 2008

'Baseball Between the Numbers'

By now you might have noticed that I've mentioned baseball here a lot. The frequency of its discussion is entirely coincidental. The local team doesn't suck this year, prompting interest; pennant races heat up a little more each passing week, prompting more; and the non-baseball books I reserved from the local library keep getting renewed by whatever glacial readers currently have them, prompting the baseball books I reserved to keep coming in one after another.

Also, a few weeks ago, I read a review of one baseball book, which inspired me to check it out. While I was reading criticism of it, one critic mentioned and praised another baseball book, which I looked into, which led to the namedropping of another book and yet another. At that point, I figured I should try to cover all the bases (pardon the expression) and just read whatever everyone apparently considered to be the new "classic" sportswriting. After all, it beats another round of Nazis having their frostbitten toes gnawed off by mice.

As to why I'd be interested in baseball books in the first place, I enjoy the game. Even if I didn't, I'd probably affect an appreciation of it, considering The Wife absolutely loathes football, and at some point you just need to watch sports for no particular reason other than to watch sports. (Thankfully, she thinks baseball's all right.) But I also especially enjoy relearning about a game I learned in childhood by applying to it sabermetrics, the attempt to understand baseball from new, mathematically quantifiable standpoints.

Unfortunately, I don't enjoy the math. This presents a problem. Sabermetrics (the name comes from SABR, the Society for American Baseball Research) are compelling new measures of events in baseball, ones that cause us to rethink the collective wisdom of over a century, but they're all math. And not just the comfy counting or average-based math that you're used to from old baseball statistics. (For instance, home runs: counting. Batting average: adding up every time someone gets a hit and dividing by his number of official at bats. PECOTA? Uh, I give up.) A great many of the metrics conceived of by saberguys the nation over require a wonky familiarity with advanced statistics. What this means, in practical terms, is that you may not have much fun reading Baseball Prospectus' Baseball Between the Numbers: Why Everything You Know About the Game Is Wrong.

The book comes highly recommended from numerous sources, and for what it tries to do, it is unquestionably a very good book. Whether you can enjoy what it's doing is another matter entirely. I can't. I am not a luddite, but I struggled through calculus and never quite comfortably grasped the statistics required for my advanced science courses. At some point, I throw my hands up in the air and just accept that the mathematics behind a thing like EQA (equivalent average, which factors in league and ballpark effects — i.e., worse pitching in the league, short fences in the player's home ballpark, both of which increase home runs and doubles — and thus represents an average that can compare reasonably with any other) just make sense and that I don't have the mental wherewithal to question the premises that informed its creation.

For the most part, I don't have to. Rival gangs of sabermetricians are out there ready to rumble with the Baseball Prospectus gang if their mathematics prove shoddy. This sounds like an exaggeration, but it's not. The sort of people who recreationally try to create new statistical measures are also the sort of people who take pleasure in examining the flaws of other people's statistical measures. Also, a lot of stat wonks are bitchy people. As unscientific as it might be to resign myself to the fact that these metrics just work, it's also just as true that everyone else in the same field has more interest than I do in disproving the worth of the metrics, especially when those metrics were created by someone else and rival their own newly created ones.

As with most things academic, credit is almost all. The authors of a new, proven proprietary metric stand a chance of selling their services to interested parties and making money. (Still others stand a chance of being hired by baseball franchises, a monetary and childhood-dream compensation that provides a strong incentive to be right. Both Bill James, basically the father of sabermetrics, and Voros McCracken — who invented Defense-Independent Pitching Statistics (DIPS) and whose name sounds like the punchline to a schoolboy roll-call joke in Greece — were hired by the Boston Red Sox.) For the most part, if a sabermetric is still in use by multiple stats sites after a couple of years, it means that it's held up under scrutiny and testing. It may not be the best means of measuring what it tries to measure, but that owes less to the notion that someone is trying to pass off poor mathematics and more to the fact that the ideal metric has yet to be devised.

Baseball Between the Numbers was definitely written by and for people who not only enjoy understanding those metrics currently devised but also enjoy the process of getting there, creating tables, testing hypotheses and expanding or contracting the amount of data analyzed. The book is divided into 27 chapters (like a ballgame: three outs, nine innings), each one asking a question. Almost all of the questions are engrossing to read:
1-2 Was Billy Martin Crazy?
4-1 What If Rickey Henderson Had Pete Incaviglia's Legs?
9-1 What Do Statistics Tell Us About Steroids?
9-3 Why Doesn't Billy Beane's Shit Work in the Playoffs?
But the process of answering them is not. Almost every one of these questions can be answered with one sentence.*

* — Answers:
1-2 No.
4-1 A relatively insignificant drop in runs scored.
9-1 Not much.
9-3 A balanced team stands a better chance of winning, but teams with multiple solid starters and good defense and good offense fare better than teams with only one or two exceptional pitchers or merely an exceptional offense — which we already knew anyway.

Statistics, tables and methodology occupy most of the space of each chapter. Simply asking the question, stating why certain statistics are probative and explaining why the answer is significant takes far less space. There are two immediate downsides to this:

1. As said above, if you either understand or really like trying to understand statistics, the whole ride must be a blast. If you love data tables, this book might also be pornography for you. However, if terms start skirting around or outright abandoning your capacity to understand, most of the chapter becomes a wasted effort, and all those tables start to look like smut in a computer language too abstruse to turn you on. You can skip to the last page or skim aggressively without missing much more than you would already have helplessly missed to begin with.

2. If you already explicitly understand why an old idea is flawed and already understand the new metric that better informs you, the process of reaching a chapter's conclusion can be a wearisome journey through familiar territory. For example, take chapter 1-1 What's the Matter with RBI? ... and Other Traditional Statistics. If you care at all about baseball and sabermetrics, you already know the answer to this. You've probably been over the answer dozens of times by now. Simply put: RBI can't tell you how good an individual is, since his "runs batted in" are dependent on other people to get on base in front of him so he can bat them in. Given that the statistic is dependent on the performance of multiple players, its traditional usage as a metric for evaluating one player, without context, is fundamentally flawed. Similarly, people who get on base stand a better chance of scoring a run, because they are already one base closer to home plate. But batting average does not count walks. Player A, who gets one hit in every four plate appearances but strikes out the other three times, will have a batting average of .250. Meanwhile, Player B, who strikes out twice in every four plate appearances but walks the other two times, will have a batting average of .000, because walks count as neither hits nor at bats. However, Player B will get on base half the time, giving him a .500 on-base percentage (OBP), compared to Player A's .250. According to old baseball stats, Player B is worthless, while Player A is pretty much a league-average batter. But the goal of any baseball team is to score more runs than its opponents, and doing that requires getting on base. Given that, Player B is actually twice as valuable at the one thing that matters here, reaching base. (We won't get into slugging percentage here.)
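For anyone who'd rather see the Player A and Player B arithmetic spelled out, here's a minimal sketch in Python. The function names and the two player records are made up for illustration, and the OBP formula is a simplified one (hits plus walks, divided by at bats plus walks) that leaves out hit-by-pitches and sacrifice flies, both of which the official formula includes.

```python
def batting_average(hits, at_bats):
    """Hits divided by official at bats (walks are not at bats)."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, at_bats):
    """Simplified OBP: ignores hit-by-pitch and sacrifice flies (an assumption)."""
    chances = at_bats + walks
    return (hits + walks) / chances if chances else 0.0


# Hypothetical players, each making four trips to the plate.
# Player A: 1 hit, 3 strikeouts, no walks -> 4 at bats.
player_a = {"hits": 1, "walks": 0, "at_bats": 4}
# Player B: 2 walks, 2 strikeouts -> only 2 at bats, since walks don't count.
player_b = {"hits": 0, "walks": 2, "at_bats": 2}

for name, p in (("Player A", player_a), ("Player B", player_b)):
    avg = batting_average(p["hits"], p["at_bats"])
    obp = on_base_percentage(p["hits"], p["walks"], p["at_bats"])
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

The only design choice worth noting is that walks are kept out of the at-bat count, which is exactly why Player B's batting average bottoms out while his on-base percentage doesn't.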

In the case of item number one, the book's capacity to inform you is limited by your capacity to be informed. In the case of item number two, however, a good deal of what the book would teach you is information you may have learned elsewhere in more accessible, more memorable and more anecdotal ways. Probably the first example that leaps to mind is Michael Lewis' excellent Moneyball.

Comparing this book to Moneyball, though, is unfair. Lewis' book sought to profile a unique figure in baseball at a time when the ideas he used were far from mainstream. In part due to that book's success, Baseball Between the Numbers (published in 2006, well after the "Moneyball" concept of exploiting inefficient metrics in player evaluation had entered public consciousness) doesn't seek to reveal new ideas but rather to give them fully articulated, tested and replicable credence. Its contents are less revelatory and more explanatory than Moneyball's precisely because these ideas entered the public consciousness enough that people began demanding more articulated mathematics to bolster their worth.

This book isn't intended to be a primer or an introductory course to sabermetric concepts but rather a fleshed-out exploration and demonstration of their validity. That's the point from which the real pleasure of this book springs: for everyone who's heard a baseball traditionalist irrationally demand a dozen concrete examples of the value of OPS (on-base percentage plus slugging), salvation comes in table after table and comparative analysis after comparative analysis. Those seeking a pithy and light introductory glance at new metrics will be disappointed. But anyone who's sat back in frustration at an inability to produce a research-paper's understanding of how speed contributes to winning ballgames — especially anybody who's felt a winning argument slipping away from a want of data — will find all sorts of salvation in each chapter.

If you decide that you're one of the latter people and pick up the book, only two remaining elements might be cause for irritation or disappointment:

1. Because this is a Baseball Prospectus book, written by Baseball Prospectus staffers, it often accidentally reads like an advertisement for Baseball Prospectus. No contributor seems to be consciously tooting his own horn or lavishing praise on his colleagues, but the constant repetition of BP's name gets progressively more annoying. It doesn't help that the writers use BP's own proprietary metrics and eschew metrics from other sabermetricians. It makes sense: they wouldn't create those metrics if they didn't believe they were the best, and obviously it doesn't make sense to celebrate others' work if they sincerely believe their own work to be superior. It just provides more instances in which the BP name gets dropped, contributing overall to the advertisement atmosphere, however deliberate or accidental that might be.

2. It's a book about stats written by statheads. As narrow-minded as it might sound, statisticians don't exactly have a long and storied track record as poets of the modern age. Bill James earned a reputation for verbal economy and interesting turns of phrase, but he's the exception that proves the rule. Aside from periodic jokes about some ballplayers' careers and aside from a few contributors' chapters — Dayn Perry's come to mind — there aren't many opportunities for smiles, wry observations, drama, suspense or humanity. The authors can't be blamed: this is a book about mathematical explication, not the triumph of the human spirit in the face of impossible odds. Still, in the hands of slightly more unwound authors, perhaps more humor and a more conversational style would have shone through. As it is, much of the book contains dry material explained via a kind of dry mentality.

Nevertheless, the book succeeds in setting out the complex data it wants to interpret and demonstrating how that interpretation is valid. It is doubtless an excellent resource for anyone who wants a detailed understanding of how sabermetrics work — or just an opportunity for a detailed argument about them. The question of whether the book will be an entertaining read is best asked of the reader, as interest in the book is something brought to it, rather than something it provides.

Rating: 3
Strongly recommended for anyone who enjoys statistics or seeks an extremely detailed argument about sabermetrics. Recommended for baseball fans determined to expand their understanding of sabermetrics, even if they can't fully grasp everything in the book. Not recommended for anyone looking for a breezy, anecdotal or character-oriented look at sabermetrics. For that, see Moneyball: The Art of Winning an Unfair Game. Also, for anyone interested in learning about how these metrics work via mockery of people who don't understand them and persist in using flawed stats like RBI, see Fire Joe Morgan.