@cite{schlotterbeck-wang-2023} — Incremental RSA for Adjective Ordering #
@cite{cohn-gordon-goodman-potts-2019} @cite{degen-etal-2020}
Schlotterbeck, F. & Wang, H. (2023). An incremental RSA model for adjective ordering preferences in referential visual context. Proceedings of the Society for Computation in Linguistics (SCiL) 6, 121–132.
The Model #
The incremental sequence speaker (S1^inc) produces adjective–noun sequences word-by-word. At each step the utility is the incremental listener's posterior. The trajectory score accumulates utility across all prefixes:
S1^inc(w₁,...,wₙ | r) ∝ ∏ₖ U(w₁,...,wₖ; r)
where U(w⃗; r) = exp(β · log L0^inc(r | w⃗)) and the paper sets β = 1 in all reported simulations. With β = 1, no cost, and uniform language prior, this simplifies to:
S1^inc(w₁,...,wₙ | r) = ∏ₖ L0(r | w₁,...,wₖ)
The model uses continuous/noisy semantics (@cite{degen-etal-2020}) where each word applies with reliability v (correct application) or 1 − v (noise).
Key insight: With strictly positive noisy semantics, the prefix meaning is a product of per-word terms, and multiplication commutes. Therefore the full-sequence L0 posterior is order-independent: L0(r | w₁, w₂) = L0(r | w₂, w₁). In the paper's batch-normalized model, where S1^inc scores are normalized once over all trajectories, the ordering preference ratio S1^inc(adj₁,adj₂,n|r) / S1^inc(adj₂,adj₁,n|r) reduces entirely to the first-word L0 posterior ratio L0(r|adj₁) / L0(r|adj₂).
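Both facts can be checked numerically. The sketch below is an illustrative Python re-implementation of the noisy-semantics incremental listener, not the Lean code; the mini-scene, the `applies` table, and the reliability values are invented for the example.

```python
# Illustrative re-implementation of the noisy-semantics incremental L0
# (not the Lean formalization; scene and reliabilities are made up).
referents = ["bigBlue", "smallBlue", "smallGreen", "smallRed"]
applies = {"big": {"bigBlue"}, "blue": {"bigBlue", "smallBlue"}}
rel = {"big": 0.99, "blue": 0.95}  # per-word reliability v

def meaning(word, ref):
    # noisy semantics: v if the word truly applies, 1 - v otherwise
    return rel[word] if ref in applies[word] else 1 - rel[word]

def prefix_meaning(prefix, ref):
    p = 1.0
    for w in prefix:
        p *= meaning(w, ref)
    return p

def L0(ref, prefix):
    # posterior over referents given a word prefix (uniform prior)
    total = sum(prefix_meaning(prefix, x) for x in referents)
    return prefix_meaning(prefix, ref) / total

# 1. Order independence: multiplication commutes, so the full two-word
#    posterior does not depend on adjective order.
assert abs(L0("bigBlue", ["big", "blue"]) - L0("bigBlue", ["blue", "big"])) < 1e-12

# 2. The batch-normalized trajectory score is the product of prefix
#    posteriors; the two orders share every factor except the first, so
#    the preference ratio reduces to L0(r|adj1) / L0(r|adj2).
traj = lambda seq: L0("bigBlue", seq[:1]) * L0("bigBlue", seq[:2])
ratio = traj(["big", "blue"]) / traj(["blue", "big"])
assert abs(ratio - L0("bigBlue", ["big"]) / L0("bigBlue", ["blue"])) < 1e-12
```

With these toy numbers the ratio exceeds 1: "big" is the more informative first word in this scene, so the size-first trajectory scores higher.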
Formalization #
This uses RSAConfig's sequential infrastructure (following
@cite{cohn-gordon-goodman-potts-2019} and @cite{waldon-degen-2021}):
- `Ctx = List Word`: the prefix produced so far
- `transition ctx w = ctx ++ [w]`: append the next word
- `initial = []`: start with the empty prefix
- `meaning` uses continuous/noisy semantics (`lexContinuousQ`) with scene-dependent reliability parameters
Predictions use `trajectoryProb` for ordering preferences and `S1_at` for
first-word informativity, proved via `rsa_predict`.
Findings #
| # | Finding | Theorem |
|---|---|---|
| 1 | Prefix meaning is order-independent | prefix_meaning_swap |
| 2 | Size discriminatory → size-first preferred | size_first_when_size_discriminates |
| 3 | Equal discrimination + color reliable → color-first | color_first_when_color_reliable |
| 4 | Both orderings identify the target (A) | both_orderings_identify_target_A |
| 5 | Both orderings identify the target (B) | both_orderings_identify_target_B |
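Findings 2 and 3 can be reproduced numerically. The sketch below is an independent Python re-implementation, not the Lean code: the scenes and reliability values follow the configs described in this file (Scene A: sRel = 99/100, cRel = 95/100; Scene B: sRel = 80/100, cRel = 95/100), while the helper names and the 6-word vocabulary layout are our own.

```python
# Illustrative check of the scene-dependent ordering flip (a sketch, not
# the Lean formalization). S1 distributes over all 6 words at each step.
WORDS = ["big", "small", "blue", "green", "red", "sticker"]
APPLIES = {
    "big": {"bigBlue", "bigGreen"},
    "small": {"smallBlue", "smallGreen", "smallRed"},
    "blue": {"bigBlue", "smallBlue"},
    "green": {"bigGreen", "smallGreen"},
    "red": {"smallRed"},
    "sticker": {"bigBlue", "bigGreen", "smallBlue", "smallGreen", "smallRed"},
}

def meaning(word, ref, s_rel, c_rel):
    # size words use s_rel, color words c_rel; "sticker" applies universally
    rel = {"big": s_rel, "small": s_rel, "sticker": 1.0}.get(word, c_rel)
    return rel if ref in APPLIES[word] else 1 - rel

def L0(ref, prefix, scene, s_rel, c_rel):
    def score(x):
        p = 1.0
        for w in prefix:
            p *= meaning(w, x, s_rel, c_rel)
        return p
    return score(ref) / sum(score(x) for x in scene)

def S1(word, ref, ctx, scene, s_rel, c_rel):
    # word-level speaker, beta = 1, no cost: proportional to L0 of the
    # extended prefix, normalized over the whole vocabulary
    num = L0(ref, ctx + [word], scene, s_rel, c_rel)
    return num / sum(L0(ref, ctx + [w], scene, s_rel, c_rel) for w in WORDS)

def trajectory_prob(seq, ref, scene, s_rel, c_rel):
    # chain word-by-word S1 probabilities along the trajectory
    p, ctx = 1.0, []
    for w in seq:
        p *= S1(w, ref, ctx, scene, s_rel, c_rel)
        ctx = ctx + [w]
    return p

scene_a = ["bigBlue", "smallBlue", "smallGreen", "smallRed"]  # size discriminates
scene_b = ["bigBlue", "bigGreen", "smallBlue", "smallGreen"]  # equal discrimination
size_first, color_first = ["big", "blue", "sticker"], ["blue", "big", "sticker"]

# Scene A (sRel = 0.99, cRel = 0.95): size-first preferred.
assert (trajectory_prob(size_first, "bigBlue", scene_a, 0.99, 0.95)
        > trajectory_prob(color_first, "bigBlue", scene_a, 0.99, 0.95))
# Scene B (sRel = 0.80, cRel = 0.95): color-first preferred.
assert (trajectory_prob(size_first, "bigBlue", scene_b, 0.80, 0.95)
        < trajectory_prob(color_first, "bigBlue", scene_b, 0.80, 0.95))
```

Because the prefix meaning is a commutative product, the third-word factor is identical for both orders; the flip is decided entirely by the first two S1 steps.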
Connections #
- Noise theory: `lexContinuousQ` instantiates the unified noise channel from `RSA.Core.Noise`. See `lexContinuous_as_noiseChannel`.
- PoE structure: `prefix_meaning_product` shows that the two-word prefix meaning decomposes as a product of per-word semantics, matching @cite{degen-etal-2020}'s Product of Experts.
- Incremental RSA: Extends @cite{cohn-gordon-goodman-potts-2019} with scene-parameterized continuous semantics.
- Psychophysics: The paper's size perception noise is parameterized by Weber fractions: the just-noticeable size difference is proportional to absolute size (@cite{luce-1959}). `Core.Agent.PsychophysicalChoice` derives Weber-like intensity ratios from the Stevens power law plus JND thresholds (`stevens_jndL_intensity_ratio`). A deeper integration could derive the `sRel` reliability parameter from a `StevensScale` exponent rather than stipulating it, grounding the noise in the psychophysical theory layer.
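As a toy illustration of Weber-fraction size noise: if perceived size is Gaussian with a standard deviation proportional to true size, discrimination accuracy depends only on the size ratio, not the absolute scale. The sizes and the fraction `k` below are invented for the example; this is the general Gaussian setup the paper describes, not its fitted parameters.

```python
# Toy Weber-fraction discrimination sketch (illustrative values only):
# perceived size ~ Normal(s, (k*s)^2), so the just-noticeable difference
# grows in proportion to absolute size.
from math import erf, sqrt

def p_discriminate(s_big, s_small, k):
    """P(a noisy sample of s_big exceeds a noisy sample of s_small)."""
    mu = s_big - s_small
    sigma = sqrt((k * s_big) ** 2 + (k * s_small) ** 2)
    return 0.5 * (1 + erf(mu / (sigma * sqrt(2))))

# Same 1.4x size ratio at two absolute scales: Weber scaling makes the
# discrimination probability scale-invariant.
p_small_scale = p_discriminate(14.0, 10.0, k=0.2)
p_large_scale = p_discriminate(140.0, 100.0, k=0.2)
assert abs(p_small_scale - p_large_scale) < 1e-12
```

A flat reliability parameter like `sRel` can be read as exactly such a discrimination probability for the size contrast present in the scene.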
Simplifications #
The paper's full model includes components not formalized here:
- Gaussian+binomial perception: The paper models size via Gaussian distributions with Weber fractions and color via binomial noise ε (@cite{degen-etal-2020}). `Core.Agent.Psychophysics` formalizes the Stevens power law and the multidimensional decomposition that underlie Weber's law; a future integration could derive size reliability from this framework. We currently use a simpler noise model with flat reliability parameters `sRel` and `cRel`.
- Language model P_Lang: The paper constrains the S1 vocabulary at each step to grammatically valid continuations (noun vs. adjective). Our S1 distributes over all 6 words at each step. This does not affect the qualitative ordering predictions.
- S1^{inc_utt} vs S1^inc: The paper defines both a word-level speaker (S1^inc, used for data fitting with β = 1) and an utterance-level speaker (S1^{inc_utt}). We formalize S1^inc.
- Bias parameter b: The paper includes a prior bias b for size-first ordering (to account for language-specific defaults). We omit this.
The specific reliability values (sRel, cRel) are chosen to demonstrate the qualitative predictions — they are not taken from the paper's fitted values.
Whether a word is veridically true of a referent.
Equations
- One or more equations did not get rendered due to their size.
- wordApplies Word.sticker x✝ = true
- wordApplies x✝¹ x✝ = false
Perceptual reliability for each word type: size words use sRel,
color words use cRel, the noun "sticker" applies universally.
Equations
- reliabilityQ sRel cRel Word.big = sRel
- reliabilityQ sRel cRel Word.small = sRel
- reliabilityQ sRel cRel Word.blue = cRel
- reliabilityQ sRel cRel Word.green = cRel
- reliabilityQ sRel cRel Word.red = cRel
- reliabilityQ sRel cRel Word.sticker = 1
Noisy word meaning: returns reliability if the word truly applies, noise floor (1 − reliability) otherwise. Simplified from @cite{degen-etal-2020}'s continuous semantics.
Prefix meaning: product of noisy word meanings over a word sequence. This implements the Product of Experts model from @cite{degen-etal-2020}: each word contributes an independent noisy channel value.
Strict positivity: with reliability strictly between 0 and 1, every word–referent pair has a strictly positive noisy meaning value. This ensures the incremental L0 is well-defined (no zero denominators).
Scene A: Size-discriminatory scene. Objects: {big-blue, small-blue, small-green, small-red}. Target: big-blue. "big" uniquely identifies the target (1/4 objects are big).
Equations
- sceneAMembers Referent.bigBlue = true
- sceneAMembers Referent.smallBlue = true
- sceneAMembers Referent.smallGreen = true
- sceneAMembers Referent.smallRed = true
- sceneAMembers x✝ = false
Scene B: Equal-discrimination scene with color more reliable. Objects: {big-blue, big-green, small-blue, small-green}. Target: big-blue. Both "big" and "blue" narrow to 2/4 referents.
Equations
- sceneBMembers Referent.bigBlue = true
- sceneBMembers Referent.bigGreen = true
- sceneBMembers Referent.smallBlue = true
- sceneBMembers Referent.smallGreen = true
- sceneBMembers x✝ = false
The target referent in both scenes.
Incremental RSA for adjective ordering, parameterized by scene and
perceptual reliability. Uses RSAConfig's sequential infrastructure:
- L0 uses product-of-experts noisy semantics
- S1 uses identity scoring (β = 1, no cost)
- `trajectoryProb` chains word-by-word S1 probabilities
Scene A config: sizeRel = 99/100, colorRel = 95/100.
Scene B config: sizeRel = 80/100, colorRel = 95/100.
Size-first ordering for the big-blue target.
Color-first ordering for the big-blue target.
Prefix meaning for two words is order-independent. This follows from commutativity of ℚ multiplication: foldl(*lex) 1 [a,b] = lex(a)·lex(b) = lex(b)·lex(a) = foldl(*lex) 1 [b,a].
Prefix meaning for three words is independent of the first two words' order. Swapping the adjectives before the noun does not change the product semantics.
Two-word prefix meaning decomposes as a product of per-word noisy meanings. This is the Product of Experts (PoE) structure from @cite{degen-etal-2020}: each word contributes an independent noisy channel value.
Finding: When size has high discriminatory power (Scene A), S1^inc prefers size-first ordering.
Finding: When both properties discriminate equally but color is more reliable (Scene B), S1^inc prefers color-first ordering.
The ordering preference flips between scenes: Scene A prefers size-first, Scene B prefers color-first. This captures @cite{schlotterbeck-wang-2023}'s key prediction: the preferred ordering depends on the discriminatory structure of the scene, not a fixed ordering rule.
After hearing both adjectives, the meaning function assigns highest value to the target among Scene A members.
After hearing both adjectives, the meaning function assigns highest value to the target among Scene B members.
lexContinuousQ is an instance of the unified noise channel from
RSA.Core.Noise. The continuous lexical semantics is exactly the
noise channel with onMatch = reliability, onMismatch = 1 − reliability.
This connects @cite{schlotterbeck-wang-2023} to the @cite{degen-etal-2020} parameterization where mismatch = 1 − match.
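The correspondence is direct enough to state in two lines. This is an illustrative sketch with hypothetical Python names mirroring the docstring above, not the Lean definitions:

```python
# Noise-channel reading of continuous semantics (hypothetical names,
# not the Lean definitions).
def noise_channel(matches, on_match, on_mismatch):
    return on_match if matches else on_mismatch

def lex_continuous(word_applies, reliability):
    # continuous lexicon value = noise channel with
    # onMatch = reliability, onMismatch = 1 - reliability
    return noise_channel(word_applies, reliability, 1 - reliability)

assert lex_continuous(True, 0.75) == 0.75
assert lex_continuous(False, 0.75) == 0.25
```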
Qualitative findings from the incremental RSA adjective ordering model.
- prefix_order_independent : Finding
Prefix meaning is order-independent for any two words.
- size_first_when_size_discriminates : Finding
When size has high discriminatory power, size-first ordering is preferred: trajectoryProb(size-first) > trajectoryProb(color-first).
- color_first_when_color_reliable : Finding
When both properties discriminate equally but color is more reliable, color-first is preferred.
- both_orderings_identify_target_A : Finding
The meaning function correctly identifies the target (scene A).
- both_orderings_identify_target_B : Finding
The meaning function correctly identifies the target (scene B).
Map each finding to the model prediction that accounts for it.
All 5 findings verified.