(Oh hey the draft of this said “baseball season is starting soon” – that is no longer correct, but should be a nice warning to anyone expecting timely posts)
I am a fan of baseball. But as a Mariners fan living on the east coast (and as an archaeologist who spends portions of the summer abroad), I don’t get a chance to watch my team play very often. Usually I am left looking at a box score: the array of runs scored in each inning, the pitchers’ stat lines, and the batting order’s series of hits, runs, RBIs, walks, strikeouts, et cetera. I want to call these kinds of statistics “narrative statistics,” because by looking at these series of counts I can get an idea of how the game progressed: whether all our runs were scored in an inning or two, who scored the runs of the game and who was on base when it happened, a general impression of how the starter fared. While I can’t watch the game, I can get an idea of what happened before I go looking for highlights and game recaps.
While I’m admittedly fairly ignorant of most of the SABR metrics that are out there now, I do enjoy analytics-heavy sports journalism. In discussions about the place of SABR metrics in baseball, everyone generally agrees that at least SOME level of statistics (or counting, in its most basic sense) is necessary to play (well, to watch and interpret) the game. These most basic statistics (the “narrative statistics”) are necessary for the description of a game. Thus, one can answer questions like “who won the game?” by counting how many runs were scored by each team.
However, sticking to just narrative statistics may mislead you if you ask more complicated questions, like “which team is better?” or “who is more likely to win this upcoming game?” Derived metrics (such as Pythagorean Expectation) provide more accurate descriptors of whether team A is better than team B and would thus generally be favored to win a game against them. What makes these metrics better isn’t the fact that they’re derived or more abstract, but that they are “modeled statistics”: they involve some model of how to interpret the phenomena they describe. Pythagorean Expectation involves a model of how a baseball team’s quality relates to the relationship between the runs it scores and the runs it allows. By adopting a model of the phenomenon of interest, researchers (and baseball fans) can start to ask probabilistic questions that allow predictions. The model’s assumptions can then be validated and updated by comparing its expectations against reality.
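To make the “modeled statistics” idea concrete, here is a minimal sketch of Bill James’s classic Pythagorean Expectation formula, which estimates a team’s winning percentage from runs scored and runs allowed (the exponent of 2 is the traditional choice; refinements with other exponents exist, and the example season totals below are made up for illustration):

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage: RS^k / (RS^k + RA^k)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# A hypothetical team that scores 750 runs and allows 700 over a season
# is modeled as slightly better than .500:
expected = pythagorean_expectation(750, 700)
print(round(expected, 3))  # roughly 0.534
```

The point is that the formula encodes an assumption (team quality is a function of run differential, not just wins), which is exactly what lets you test it: a team whose actual record badly outruns its expectation is a candidate for regression next season.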
These thoughts coincided with my reading VanPool and Leonard’s (2010) Quantitative Analysis in Archaeology, which begins with a description of the kinds of numbers we archaeologists encounter and use to describe the past. I was struck by how much of what we look at as archaeologists can be described as narrative statistics: the Journal of Field Archaeology is full of site reports that provide laundry lists of X number of cow bones or Y number of cores in Z houses excavated. By looking at these tables, we can get a fairly accurate sense of what it was like to excavate the site, but it’s less clear how these counts relate to how past human life operated (I think I just suggested middle-range theory, whoops).
Narrative statistics aren’t bad, nor are they less useful than modeled statistics – since a specific archaeological excavation or baseball game happens only once, it is imperative to collect and report the narrative statistics so that those of us who weren’t there have an idea of what happened. But I think it’s important to emphasize that the two types of statistics are better for answering different types of questions. As a zooarchaeologist, I’m painfully aware that many of our “statistics” are of the narrative variety. Much of what we report provides details about the bones that passed over our tables. Again, this is necessary for those of us who can’t look at the bones first-hand: tables of faunal compositions are like box scores, and online data repositories like OpenContext or tDAR are like baseball-reference.com, providing as much narrative detail as you’d ever want (as long as people upload to them).
But what about modeled statistics? What are the models we use in zooarchaeology? There are some examples of modeled statistics, especially when you talk about estimating body size fluctuations and sex ratios, but there’s room for improvement. Right now, a lot of work on age estimation and survivorship remains largely in the realm of narrative. While we use some models of biology to organize the order in which bones fuse and teeth develop, researchers have been largely reticent to model age-related data in a way that would allow us to ask probabilistic questions and build testable models of herd management and hunting strategies. Estimating animal abundance is another incredibly thorny issue that requires explicit modeling to address.
I am not suggesting that current practices need to be abandoned for something more complicated, more exclusive, and more opaque to those not studying your specific system. While we can get a deeper understanding of how different sites relate to one another by using modeled statistics, we still need to know what happened at each site. Describe your assemblage to your heart’s content – it’s necessary to get that information out there, and let’s face it, zooarchaeologists like reading about zooarchaeological assemblages. But currently we’re trying to find the best player from his RBI total and the best pitcher from his win-loss record. There is definite room for improvement, and we owe it to ourselves as a profession, and to other people interested in the past, to get a better grip on how to ask better questions and use data more creatively.