Created: 10 Apr 2013 | Modified: 23 Jul 2020
In this talk, I explore issues involved in connecting individual-scale cultural transmission models to archaeological data. In particular, archaeologists study a record of past human activity which is time averaged to varying degrees, observed through artifact classifications which vary in their approach to “lumping” and “splitting” variation. Cultural transmission (CT) models explain the distribution of artifact types and behaviors through patterns in social learning, cognitive biases, and social network structure. I explore how artifact class diversity measures are affected by the “coarse graining” of time averaging and archaeological classification, and whether this renders classes of behavioral-scale models indistinguishable as spatiotemporal scale increases. I conclude by exploring the role such coarse grained CT models can play in archaeological explanations.
Major point: CT models are analytical tools, not full-fledged explanatory models.
Introduce the distinction between “cultural transmission” and “social learning.” These have been used as synonyms, but there is a highly useful distinction here.
Social learning models describe the mechanisms and contexts by which informational resources and skills are acquired, preserved and extended (to use Sterelny’s terms) in a population.
Cultural transmission models describe the distributional consequences of social learning processes within populations, given their spatiotemporal and demographic history.
A lot of ink has been spilled about whether the CT models in the literature are “unrealistic.” Sperber certainly thinks so, as does Gabora. And anybody who has spent time looking at actual social learning (in humans or other animals) knows that we aren’t just imitation machines. Social learning, and by extension, cultural transmission, are not simply stochastic diffusion processes, even ones with spatial structure and perhaps bias in selecting models/targets.
But it’s also clear that at large scales, stochastic models of diffusive flow within a structured context are one of our most important mathematical tools for understanding how microscopic dynamics yields classes of macroscopic behavior, with interesting mesoscopic structure.
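As a concrete sketch of what I mean by such a model, consider unbiased copying with innovation in a well-mixed population, essentially a Wright-Fisher-style neutral model. The parameter values and the richness statistic below are purely illustrative choices, not estimates from any data set.

```python
# A minimal sketch of unbiased cultural transmission as a stochastic diffusion
# process: each individual copies the trait of a randomly chosen model, and
# occasionally invents a brand-new variant. Trait frequencies drift; class
# richness rises and falls as variants are innovated and lost.
import numpy as np

rng = np.random.default_rng(42)

def unbiased_copying(pop_size=500, steps=2000, innovation_rate=0.005):
    """Return per-step trait richness under neutral copying with innovation."""
    population = np.zeros(pop_size, dtype=int)   # everyone starts with trait 0
    next_trait = 1                               # label for the next innovation
    richness = []
    for _ in range(steps):
        # each individual copies the trait of a randomly chosen model
        teachers = rng.integers(0, pop_size, size=pop_size)
        population = population[teachers]
        # with small probability, an individual invents a new trait
        innovators = rng.random(pop_size) < innovation_rate
        n_new = int(innovators.sum())
        population[innovators] = np.arange(next_trait, next_trait + n_new)
        next_trait += n_new
        richness.append(len(np.unique(population)))
    return richness

print("traits present at the end:", unbiased_copying()[-1])
```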
Social psychologists studying social learning mechanisms can afford to ignore this apparent conflict, and depending upon the research questions involved, so can sociocultural anthropologists. But archaeologists and paleobiologists, except in rare contexts, cannot. We face empirical records which are irreducibly macroscopic in scope, aggregated along time, space, and social dimensions, and our data sets are usually small samples of analytical classes taken from fossil or artifactual remains rather than detailed behavioral observations or controlled experiments.
The relationship between detailed social learning models and their “coarse grained,” large-scale consequences is thus crucial for us. I want to urge us to think of the relationship between these types of models as similar to the way physicists relate theories at different scales. The paradigm example, of course, is relating observational (“phenomenological”) models of the behavior of solid matter, liquids, or gases to the theory of molecular motion and atomic physics. Historically, the study of “thermodynamics” and “solid state physics” generated the macro-level descriptions of matter and its behavior, and the study of “statistical mechanics” attempted to relate the “observables” of the macro-level description to probabilistic descriptions of the behavior of collections of molecules or atoms.
The direct approach to “reducing” thermodynamics to statistical mechanics is to build a detailed model of the behavior of the components with random terms, then sum over all of the atoms and molecules (and potentially their interactions) to generate predictions for the macroscopic variables we can measure and observe. We can think of this as the “canonical” approach to theory reduction.
Except that it only works in very special cases, where the lower-level model is so simple that the equations have very special properties (i.e., can be reduced to a simple sum of linear terms) which yield probability distributions we can work with. This happens, incidentally, mainly when we work with highly idealized systems: the “ideal gas,” for example, is composed of atoms spread so thinly that we can assume they never collide or interact.
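The classical ideal gas is worth writing out, because it shows how the “sum over components” actually closes in that special case: with no interactions, the partition function factorizes into single-particle terms and the familiar equation of state follows by differentiation. This is the standard textbook calculation, reproduced here only as a reminder.

```latex
% N identical, non-interacting particles in volume V: the N-particle
% partition function factorizes into single-particle terms.
Z_N = \frac{Z_1^{\,N}}{N!}, \qquad
Z_1 = \frac{V}{\lambda^{3}}, \qquad
\lambda = \frac{h}{\sqrt{2\pi m k_B T}}

% Free energy, and the macroscopic equation of state by differentiation:
F = -k_B T \ln Z_N, \qquad
P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = \frac{N k_B T}{V}
```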
Real systems, of course, are full of rich interactions. In fact, all of the meso- and macroscopic structure we study empirically has to result from interactions, because we know the world isn’t a smooth, homogeneous system. So we have to take a different approach to relating theories at different levels. In general, physicists talk about “renormalization” and “coarse graining” to relate theories at different scales. Here, I’m going to use the more intuitive term “coarse graining.”
Think of “coarse graining” as changing the “zoom” level on your camera. When you zoom out, you capture more of the scene, but at lower detail and, usually, lower resolution. When you zoom in, you restrict your view to a smaller scene, but you capture finer detail about that scene.
As archaeologists, we need to think about explanatory models as involving a “zoom level.” That zoom level is determined partly by our choices (the questions we ask, the models we construct, the scales we choose to analyze) and partly by the hard realities of a fossil, sedimentary record of past human activity.
In classic terms, we describe that record by making observations (or inductions) along three dimensions: space, time, and form (REF). Every single description we make of the record is accompanied (usually implicitly) by a specific “zoom level” along each of these dimensions. Surface collections or stratigraphic levels in an excavation become analytic assemblages, which have spatial extents and temporal durations (of deposition, of manufacture, of use…), and which are described by “types” or classes, which vary in their location along the “lumper/splitter” axis.
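To make the “zoom level” idea operational for diversity measures, the sketch below pools synthetic per-period type counts (a stand-in for dated assemblages) into progressively wider time windows and recomputes richness and Shannon diversity at each level of time averaging. The counts, window widths, and statistics are illustrative assumptions, not data.

```python
# Coarse graining along the time dimension: pool per-period artifact-type
# counts into wider windows and watch how assemblage diversity measures change.
import numpy as np

rng = np.random.default_rng(1)

n_periods, n_types = 400, 60
# synthetic per-period counts of artifact types (placeholder for real assemblages)
counts = rng.poisson(lam=rng.gamma(0.5, 2.0, size=(n_periods, n_types)))

def shannon(c):
    """Shannon diversity of a vector of class counts."""
    p = c[c > 0] / c.sum()
    return float(-(p * np.log(p)).sum())

for window in (1, 10, 50, 200):
    # time-average: sum counts within consecutive windows of the given width
    pooled = counts.reshape(n_periods // window, window, n_types).sum(axis=1)
    mean_richness = (pooled > 0).sum(axis=1).mean()
    mean_h = np.mean([shannon(row) for row in pooled])
    print(f"window={window:4d}  mean richness={mean_richness:6.1f}  mean Shannon H={mean_h:5.2f}")
```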
One major consequence of the varying “zoom levels” at which we work is that sometimes, we want to address hypotheses which are aimed at a fine zoom level, even though we often have data only from coarse levels. If the various hypotheses have unique descriptions as we “coarse grain” the model (either analytically or via simulation), then we have no real problem – just the standard abduction or “model selection” problem.
Often, however, the relationship between models is not one-to-one as we change zoom level. Physicists found this out when building models of realistic materials, finding that different materials exhibited common behavior (along certain dimensions) even though their microscopic details differed. The study of “phase transitions” and “critical phenomena” led to the discovery of universality classes.
If detailed models of social learning are what we usually want to study, we need to understand what happens when we change the “zoom level” – and in particular, whether we still end up with one-to-one mappings to macroscopic behavior, or whether there is convergence to “universality classes” as we zoom out.
There are good reasons for assuming the latter, at least with fairly simple models of population structure and social learning. Such models have much of the same mathematical structure as some of their physical counterparts, which isn’t surprising, since we construct simple models to be tractable first and realistic second.
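One way to probe that claim is by simulation: run two transmission rules that differ at the behavioral scale (unbiased copying versus a simple conformist bias), time-average the output, and ask whether a coarse summary statistic such as pooled richness still separates them as the averaging window widens. The conformist parameterization and the statistic below are illustrative assumptions, not results.

```python
# Two behavioral-scale transmission rules, compared after time averaging.
import numpy as np

rng = np.random.default_rng(7)

def simulate(conformity, pop_size=200, steps=400, innovation_rate=0.01):
    """Return per-step {trait: count} dicts under frequency-dependent copying.

    conformity = 0.0 is unbiased copying; larger values over-weight common
    variants (an illustrative parameterization of conformist bias)."""
    population = np.zeros(pop_size, dtype=int)
    next_trait = 1
    history = []
    for _ in range(steps):
        traits, counts = np.unique(population, return_counts=True)
        weights = (counts / pop_size) ** (1.0 + conformity)
        weights /= weights.sum()
        population = rng.choice(traits, size=pop_size, p=weights)
        innovators = rng.random(pop_size) < innovation_rate
        n_new = int(innovators.sum())
        population[innovators] = np.arange(next_trait, next_trait + n_new)
        next_trait += n_new
        history.append(dict(zip(*np.unique(population, return_counts=True))))
    return history

def time_averaged_richness(history, window):
    """Mean number of distinct traits per pooled window of `window` steps."""
    richness = []
    for start in range(0, len(history), window):
        pooled = set()
        for step_counts in history[start:start + window]:
            pooled.update(step_counts)
        richness.append(len(pooled))
    return float(np.mean(richness))

for label, conformity in (("unbiased", 0.0), ("conformist", 0.5)):
    history = simulate(conformity)
    summary = ", ".join(
        f"w={w}: {time_averaged_richness(history, w):5.1f}" for w in (1, 20, 100))
    print(f"{label:10s} mean pooled richness  {summary}")
```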