
Sabermetrics and the Digital (Jasper Fforde-style) Analysis of the “Unquantifiable”

Baseball records are a statistician’s dream: they go back decades and include the details of thousands of games, players, teams, and plays: dropped balls, caught balls, stolen bases, hits, home runs… a single at-bat might produce a dozen or more numbers. To a sports fan, these stats can be profoundly interesting; to a non-fan, they make the Trivial Pursuit Sports category profoundly irritating. (If you’re in the latter category, stick with me. I promise I have a non-sports-related point!)

Sabermetrics is the study of baseball (or basketball, or football) stats taken to the next level: its proponents argue that a player’s value to a team may be more than the sum of his traditional statistics. Sabermetricians ask: does he get on base when it counts? What about strikeouts when it matters most? What about the balls that a shortstop should have fielded but didn’t, and that therefore never show up in traditional statistics (ahem, Derek Jeter)? Or, even more nebulously: does a player’s presence on the field positively affect his teammates, even if he’s not the strongest player himself?

The last question gets at something harder to quantify: team chemistry. To the extent that sabermetrics can quantify it at all, it can only be calculated after the fact, and the calculation often focuses on a single player. Unexpected team chemistry is a surprise, both when it goes well and when it goes badly. On September 3, the Boston Red Sox were 9 games ahead in the American League wildcard race, with 24 games to go. This was a substantial, practically insurmountable lead; Boston had a 99.6% chance of making the playoffs. They didn’t. Faced with this bizarre turn of events, sports writers have been having a field day with the Red Sox’s disastrous clubhouse chemistry.

In his October 15 post on the New York Times’ Bats Blog, “Keeping Score: Collapse of Red Sox Offers Stark Lesson in Team Chemistry,” Neil Paine writes:

From a sabermetric standpoint, this kind of human element is often seen as a confounding factor, a messy variable that cannot be accounted for (and often does not need to be). But the evolution of analytics may lead to the capacity to embrace the chemistry problem and turn it into a competitive advantage.

In professional basketball, companies study player personality types to determine the best fit for a team beyond X’s and O’s. Using souped-up versions of the classic Myers-Briggs questionnaire, the tests they administer break down a player’s mental makeup in categories ranging from coachability to team identity. This information can help decision makers know the type of environment in which the player is built to succeed — or destined to fail.

Of course, the amount of time, effort, and money put into this sort of analysis is enough to make anyone interested in literary or historical digital analysis slap their forehead. That aside, however, sabermetrics is still pretty new: the term itself dates back only to about 1980. If sabermetricians are seriously working on how to quantify something as characteristically unquantifiable as “team chemistry”… maybe literary scholars could also take a look at similar things. That is, can we look at narratives statistically? If sports commentators pull narratives out of statistics, why shouldn’t we pull all sorts of statistics out of narratives?

Granted, it seems downright farcical to suggest that, along with our text markup projects, we could use statistics like “child death (maudlin)” combined with “dead mother (broken heart)” and “disfigurement (facial)” to compare literature like Ellen Price Wood’s East Lynne vs. Charles Dickens’ Bleak House. (Did I really just go there? Maybe this is utterly absurdist.) But the thing is… to someone who can read it, a baseball game unfolds narratively, and you can follow this (flattened-out) narrative through a scorecard and statistics. Could we come up with a scorecard for literature that could eventually lend itself to a sort of literary sabermetrics?

As literary scholars, we tend to get our stats out of the way at the start of our articles: East Lynne sold more than half a million copies; Ellen Price Wood was 47 when she wrote it. But… would it be useful to know that Wood’s main plot was the eighteenth most-used plot in Victorian novels when she wrote it in 1861, and that ten years later it was the third most popular?
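To make the thought experiment a little more concrete, here is a minimal sketch of what such a plot ranking might look like in Python. Everything in it is an assumption made up for illustration: the `SCORECARD` list, the placeholder titles ("Novel A" and so on), the plot labels, and the `plot_ranks` helper are all hypothetical, and none of it is real bibliographic data. It also assumes that "most-used" means a cumulative count of every novel published up to a given year, which is only one of several reasonable definitions.

```python
from collections import Counter

# Hypothetical scorecard: every title, year, and plot label below is invented
# purely for illustration. None of it is real bibliographic data.
SCORECARD = [
    {"title": "Novel A", "year": 1855, "plot": "inheritance dispute"},
    {"title": "Novel B", "year": 1858, "plot": "inheritance dispute"},
    {"title": "Novel C", "year": 1859, "plot": "orphan makes good"},
    {"title": "Novel D", "year": 1860, "plot": "orphan makes good"},
    {"title": "Novel E", "year": 1861, "plot": "disgraced wife"},
    {"title": "Novel F", "year": 1864, "plot": "disgraced wife"},
    {"title": "Novel G", "year": 1867, "plot": "disgraced wife"},
]


def plot_ranks(scorecard, year):
    """Rank plots by how often they were used in novels published up to `year`.

    Returns a dict mapping plot label to rank, where rank 1 is the most used.
    """
    counts = Counter(
        entry["plot"] for entry in scorecard if entry["year"] <= year
    )
    # Sort by descending count, then alphabetically so ties resolve predictably.
    ordered = sorted(counts, key=lambda plot: (-counts[plot], plot))
    return {plot: position for position, plot in enumerate(ordered, start=1)}


if __name__ == "__main__":
    for year in (1861, 1871):
        rank = plot_ranks(SCORECARD, year)["disgraced wife"]
        print(f"{year}: 'disgraced wife' is plot number {rank}")
```

With this toy data, the "disgraced wife" plot moves from third place in 1861 to first place in 1871. The hard part, of course, is not the counting but deciding who gets to fill in the scorecard, and how.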

Applying this kind of quantification sounds like Jasper Fforde. (In Fforde’s world of fiction, various qualities of plot devices, characters, and backdrops are available for purchase and exchange, some even on an underground literary black market, where “head in a bag” and “a shot rang out” plot devices are both cheap.) Part of the pleasure in Fforde’s world is its absurdity. But it also relies on readers’ familiarity with literary clichés, techniques, plot devices, and characters, so that a well-read reader gets the joke when Fforde explains that the reason there are so many dour housekeepers in fiction is a poorly planned overproduction of Mrs. Danverses from du Maurier’s Rebecca. (As for his mapping...)

Maybe the attempt to quantify sports team chemistry is really a backwards attempt to re-create the real-life narrative that was the game in real time, and which statistics flattened out in the first place (comic from XKCD). Maybe we want to avoid that sort of flattening effect with literature. Part of the joke, for Fforde, is that even while he puts Heathcliff and the entire cast of Wuthering Heights in a group therapy session, these characters utterly and obviously transcend any easy categorization (except, of course, as A-1s: fully developed leads).

But statistics have their place, particularly in a world made up of binary code. Are sabermetricians missing out on the narratives that make these real-life sports games so compelling? Or... are we missing out on this sabermetrics-type thinking...?

Sports fans watch home games in their stadium while checking their stats-driven fantasy teams on their phones; they read stats pages and watch the coach’s post-game press conference. There is no reason that one needs to replace the other. But thinking about how statistics could add a revealing and pleasurable dimension to literary study may make Ffordian thought experiments worth their while.



