Zaslavsky, Kemp, Tishby & Regier (2019) #
@cite{zaslavsky-etal-2019}
Color Naming Reflects Both Perceptual Structure and Communicative Need. Topics in Cognitive Science 11(1), 207–219.
Core Contributions #
@cite{zaslavsky-etal-2019} adjudicate between two explanations of cross-linguistic color naming patterns: perceptual structure (the geometry of CIELAB space) and communicative need (how often colors must be communicated). Their key finding is that both matter.
Perceptual structure partly explains the warm–cool asymmetry. K-means clustering on CIELAB coordinates produces artificial naming systems that already show lower expected surprisal S(c) for warm colors — without any communicative pressure.
Communicative need contributes beyond perceptual structure. The salience-weighted prior (from natural image statistics) exhibits a linear −log p(c) vs S(c) relationship predicted by the CAP theorem, while the perceptually-derived KM-CAP prior does not.
The CAP theorem links need and precision. At a capacity-achieving prior, −log p(c) = S(c) + log Z. This information-theoretic identity is the paper's central theoretical contribution, formalized in Core.ChannelCapacity.cap_linear.
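As a numerical sanity check of the identity (a sketch, not the Lean formalization), one can compute a capacity-achieving prior for a small channel via Blahut-Arimoto and verify that −log p(c) − S(c) is constant across colors; the channel values below are illustrative, not WCS data. In this computation the additive constant comes out as the channel capacity C (in nats).

```python
import numpy as np

# Illustrative naming channel p(w|c): rows are colors c, columns are words w.
Q = np.array([[0.80, 0.10, 0.10],
              [0.10, 0.70, 0.20],
              [0.10, 0.20, 0.70]])

def blahut_arimoto(Q, iters=2000):
    """Iterate toward the capacity-achieving prior p(c) for channel Q."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        qbar = p @ Q                                # marginal over words
        D = (Q * np.log(Q / qbar)).sum(axis=1)      # KL(p(.|c) || qbar)
        p = p * np.exp(D)
        p /= p.sum()
    return p

p = blahut_arimoto(Q)
qbar = p @ Q
C = p @ (Q * np.log(Q / qbar)).sum(axis=1)          # channel capacity (nats)

# Expected surprisal S(c) under the listener posterior p(c|w).
posterior = (p[:, None] * Q) / qbar
S = -(Q * np.log(posterior)).sum(axis=1)

# CAP identity: -log p(c) = S(c) + C, uniformly in c.
residual = np.max(np.abs(-np.log(p) - (S + C)))
```

For a non-capacity-achieving prior the difference −log p(c) − S(c) varies with c, which is exactly what the paper's linearity test probes.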
Integration #
- Theory layer: Core.ChannelCapacity (NamingChannel, CAP, cap_linear). The RSA connection: a NamingChannel is an RSA literal speaker S₀, and the posterior is the literal listener L₀.
The 80 WCS color chips analyzed by @cite{zaslavsky-etal-2019}. These are the standard Munsell chips from the World Color Survey, excluding achromatic chips. Each chip has coordinates in CIELAB perceptual color space.
Temperature classification: warm vs cool. The warm–cool asymmetry in communicative precision is the paper's central empirical finding. Warm colors (reds, yellows) have lower S(c) than cool colors (blues, greens) across languages.
- warm : Temperature
- cool : Temperature
The paper's main empirical finding: across languages, warm colors have lower expected surprisal (= higher communicative precision) than cool colors, regardless of prior choice.
We state this as a property of a naming channel and temperature classification rather than as a concrete computation (which would require the full WCS dataset).
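The stated property can be illustrated on a toy channel (hypothetical numbers, not the WCS dataset): "warm" chips get sharper word distributions than "cool" chips, which yields lower expected surprisal S(c) for the warm class.

```python
import numpy as np

# Toy 4-chip, 4-word channel: chips 0-1 play the role of "warm",
# chips 2-3 of "cool"; cool chips are confusable between two words.
Q = np.array([
    [0.85, 0.05, 0.05, 0.05],   # warm
    [0.05, 0.85, 0.05, 0.05],   # warm
    [0.05, 0.05, 0.60, 0.30],   # cool
    [0.05, 0.05, 0.30, 0.60],   # cool
])
warm, cool = [0, 1], [2, 3]

p = np.full(4, 0.25)                        # uniform prior over chips
qbar = p @ Q                                # marginal over words
posterior = (p[:, None] * Q) / qbar         # listener p(c|w)
S = -(Q * np.log(posterior)).sum(axis=1)    # expected surprisal per chip

asymmetry = S[warm].mean() < S[cool].mean() # the asymmetry, in toy form
```

The real finding is of course about the 80 WCS chips across languages; this only shows what the abstract property asserts of a channel plus a temperature classification.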
CIELAB coordinates for a WCS chip. L* = lightness, a* = red-green, b* = yellow-blue. Euclidean distance in CIELAB approximates perceptual dissimilarity.
The irregular distribution of the 80 WCS chips in CIELAB reveals perceptual asymmetries between warm and cool colors that partly explain the communicative precision asymmetry.
Perceptual distance between two colors in CIELAB (Euclidean).
A perceptually-derived naming system: k-means clustering on CIELAB assigns each chip to the nearest centroid, creating a hard partition. The paper shows these systems also exhibit warm–cool asymmetry in S(c), demonstrating that perceptual structure alone partially accounts for the effect.
- k : ℕ — Number of clusters (= number of color terms in the language).
- assignment — Cluster assignment for each chip.
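A minimal sketch of such a system, assuming hypothetical CIELAB coordinates and plain Lloyd's-algorithm k-means (the paper's exact clustering setup may differ):

```python
import numpy as np

# Hypothetical (L*, a*, b*) coordinates for a handful of chips;
# the study itself uses the 80 WCS chips.
chips = np.array([
    [60.0,  55.0,  40.0],   # red-ish (warm)
    [65.0,  45.0,  50.0],
    [85.0,  -5.0,  80.0],   # yellow-ish (warm)
    [80.0,   5.0,  75.0],
    [50.0, -40.0,  30.0],   # green-ish (cool)
    [55.0, -45.0,  25.0],
    [40.0,  10.0, -45.0],   # blue-ish (cool)
    [45.0,  15.0, -40.0],
])

def kmeans(points, init, iters=20):
    """Lloyd's algorithm; returns the hard assignment chip -> cluster."""
    centroids = init.astype(float)
    for _ in range(iters):
        # Euclidean distance in CIELAB ~ perceptual dissimilarity
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assignment = d.argmin(axis=1)
        for j in range(len(centroids)):
            members = points[assignment == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return assignment

# Deterministic init: one seed per color family (k = 4 "terms").
assignment = kmeans(chips, init=chips[::2])
```

The hard assignment is exactly the data needed to build the deterministic naming channel described next.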
Convert a hard k-means partition to a NamingChannel. A hard partition assigns p(w|c) = 1 if w = assignment(c), else 0. This is a deterministic channel (zero conditional entropy).
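The construction can be mirrored numerically (a sketch, not the Lean definition): a hard assignment becomes a 0/1 row-stochastic matrix, and its conditional entropy H(W|C) vanishes.

```python
import numpy as np

def to_channel(assignment, k):
    """Hard partition -> deterministic naming channel p(w|c):
    p(w|c) = 1 iff w = assignment[c], else 0."""
    Q = np.zeros((len(assignment), k))
    Q[np.arange(len(assignment)), assignment] = 1.0
    return Q

Q = to_channel([0, 0, 1, 2, 1], k=3)

rows_sum_to_one = np.allclose(Q.sum(axis=1), 1.0)

# H(W|C) = -sum_{c,w} p(w|c) log p(w|c) (uniform weight over c omitted);
# every nonzero entry is 1.0, so the sum is exactly zero.
mask = Q > 0
H_cond = -(Q[mask] * np.log(Q[mask])).sum()
```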
The paper infers a universal need distribution by averaging per-language capacity-achieving priors (eq. 7): p̄(c) = 1/L Σ_l p_l(c), where each p_l is the CAP for language l's naming system p_l(w|c), found via Blahut-Arimoto.
Crucially, averaging CAPs does NOT in general preserve the CAP condition (footnote 4 of @cite{zaslavsky-etal-2019}): each p_l satisfies IsCAP for its own channel, but the averaged p̄ need not be a CAP for any single channel. The paper's key empirical finding is a dissociation:
- WCS-CAP (averaged from actual WCS+ languages): empirically approximates a CAP — −log p̄(c) vs S̄(c) is approximately linear.
- KM-CAP (averaged from k-means systems): does NOT approximate a CAP (r = 0.32) — suggesting real naming systems encode communicative structure beyond perceptual clustering.
- Salience-weighted prior (from natural image statistics, @cite{gibson-etal-2017}): exhibits both the linear CAP relation AND the warm–cool asymmetry — evidence for communicative need beyond perceptual structure.
Average a collection of per-language priors to obtain a universal need distribution (eq. 7 of @cite{zaslavsky-etal-2019}).
Equations
- Phenomena.LexicalTypology.Studies.ZaslavskyEtAl2019.averageCAP priors c = (∑ l : Fin L, priors l c) / ↑L
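Footnote 4's caveat can be checked numerically (toy channels with assumed values): average the Blahut-Arimoto CAPs of two "languages" and measure how far the average is from satisfying the CAP condition for either channel.

```python
import numpy as np

def blahut_arimoto(Q, iters=2000):
    """Capacity-achieving prior for channel Q = p(w|c) (rows: colors)."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        qbar = p @ Q
        D = (Q * np.log(Q / qbar)).sum(axis=1)
        p = p * np.exp(D)
        p /= p.sum()
    return p

def cap_gap(Q, p):
    """Spread of D(p(.|c) || p(.)) over colors; zero iff p is a CAP
    for Q (given full support)."""
    qbar = p @ Q
    D = (Q * np.log(Q / qbar)).sum(axis=1)
    return float(D.max() - D.min())

# Two toy "languages" naming the same three colors (illustrative values).
Q1 = np.array([[0.80, 0.10, 0.10],
               [0.10, 0.70, 0.20],
               [0.10, 0.20, 0.70]])
Q2 = np.array([[0.60, 0.30, 0.10],
               [0.20, 0.60, 0.20],
               [0.05, 0.15, 0.80]])

p1, p2 = blahut_arimoto(Q1), blahut_arimoto(Q2)
pbar = (p1 + p2) / 2           # eq. 7: average of per-language CAPs

gap_own = cap_gap(Q1, p1)      # ~0: p1 is (numerically) a CAP for Q1
gap_avg = cap_gap(Q1, pbar)    # clearly positive: the average is not
```

This is only the negative half of the story; the paper's point is that the empirical WCS-CAP average nevertheless behaves approximately like a CAP, while the k-means average does not.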
Any TRUE capacity-achieving prior exhibits the linear relation −log p(c) = S(c) + log Z (eq. 6 of @cite{zaslavsky-etal-2019}). This applies to each per-language CAP p_l found via Blahut-Arimoto.

However, the paper tests averaged priors (see averageCAP), not individual ones. The empirical finding that WCS-CAP approximately satisfies this relation despite averaging is evidence that the CAP condition is robust across languages. KM-CAP's failure to satisfy it (r = 0.32) shows that perceptual structure alone does not yield the same robustness.
A naming channel p(w|c) is exactly an RSA literal speaker S₀ evaluated at each world c. The posterior p(c|w) is the RSA literal listener L₀. Channel capacity channelCapacity nc = max_{p(c)} I(W;C) is the maximum informativity achievable under any world prior.
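The capacity claim can be spot-checked numerically (toy channel, not the Lean channelCapacity definition): the mutual information achieved by the Blahut-Arimoto prior upper-bounds I(W;C) for any other sampled prior.

```python
import numpy as np

def mutual_info(Q, p):
    """I(W;C) in nats for prior p over colors and channel Q = p(w|c)."""
    qbar = p @ Q
    return float((p[:, None] * Q * np.log(Q / qbar)).sum())

def blahut_arimoto(Q, iters=2000):
    """Iterate toward the capacity-achieving prior."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        qbar = p @ Q
        D = (Q * np.log(Q / qbar)).sum(axis=1)
        p = p * np.exp(D)
        p /= p.sum()
    return p

# Illustrative strictly positive naming channel (rows: colors, cols: words).
Q = np.array([[0.80, 0.10, 0.10],
              [0.10, 0.70, 0.20],
              [0.10, 0.20, 0.70]])

p_cap = blahut_arimoto(Q)
C = mutual_info(Q, p_cap)   # ~ channel capacity

# No randomly sampled prior should exceed the capacity-achieving one.
rng = np.random.default_rng(0)
beats = [mutual_info(Q, q) > C + 1e-6 for q in rng.dirichlet(np.ones(3), 200)]
```

Dirichlet sampling is a crude search, but maximality holds for every prior, so any sample exceeding C would indicate a bug rather than a counterexample.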
The paper shows that natural color naming systems operate near capacity: the salience-weighted prior exhibits the linear CAP relation with high correlation. This means color naming systems are approximately information-theoretically optimal — a prediction that RSA makes for any rational communication system.
The key difference from standard RSA: this paper analyzes the prior p(c), not the speaker/listener strategies. RSA typically takes the prior as given and derives speaker/listener behavior. The CAP framework goes one level up: it asks what prior would make the entire system optimally informative, and shows that natural priors approximate this optimum.
This "prior optimization" perspective connects to @cite{zaslavsky-hu-levy-2020}'s rate-distortion view of RSA, where the rationality parameter α trades off compression rate against distortion.