Anderson (2021): Conversation Update for RSA #
@cite{anderson-2021}
A system for multi-turn conversation update in the Rational Speech Acts framework. The core contributions:
- Common ground as distribution: the CG is a probability distribution over worlds, substituted directly for the RSA world prior
- Learning-rate update: CG evolves via convex combination with Pragmatic Listener posteriors
- Shared vs. approximate CG: two models differing in whether participants share a single CG representation
- Observation sampling: weighted, thresholded, and difference-based strategies for cooperative speaker behavior
Key Connections #
The CG update rule CG'(w) = (1-lr)·CG(w) + lr·post(w) is algebraically
identical to @cite{luce-1959}'s linear learning rule with retention rate
1-lr and reinforcement target post. This connects RSA pragmatics to
learning theory: multi-turn conversation IS iterated learning over
distributions.
The distributional CG refines @cite{stalnaker-2002}'s classical context set: worlds with zero weight are excluded from the context, recovering set-intersection update as a special case.
BToM Connection #
Anderson's distributional CG has the type expected by BToMModel.sharedUpdate
(Shared → Action → World → Shared). Setting Shared := DistributionalCG W
instantiates BToM's discourse dynamics for the first time in linglib — the
shared state is a distribution that evolves after each utterance via the
learning-rate update.
MutualFriends Domain #
The paper illustrates predictions using the MutualFriends dataset, where worlds are individuals characterized by features (major, location) and utterances describe those features.
Equations
- worldMajor MFWorld.ina = Major.astronomy
- worldMajor MFWorld.katie = Major.astronomy
- worldMajor MFWorld.nancy = Major.german
- worldMajor MFWorld.sally = Major.german
Instances For
Equations
- worldLocation MFWorld.ina = Location.indoors
- worldLocation MFWorld.sally = Location.indoors
- worldLocation MFWorld.katie = Location.outdoors
- worldLocation MFWorld.nancy = Location.outdoors
Instances For
Utterances available to speakers. Includes a null utterance for passing when the speaker has no confident observation to share.
- studyHumanity : MFUtterance
- studyScience : MFUtterance
- likeIndoors : MFUtterance
- likeOutdoors : MFUtterance
- null : MFUtterance
Instances For
Truth-conditional semantics for MutualFriends utterances.
Equations
- One or more equations did not get rendered due to their size.
- mfMeaning MFUtterance.null x✝ = true
Instances For
Each non-null utterance partitions the world space in half.
A distributional common ground: a non-negative weight function over worlds.
This is the probabilistic counterpart of @cite{stalnaker-2002}'s context set.
Instead of a sharp membership predicate (W → Prop), a distributional CG
assigns graded plausibility (W → ℝ).
- weight : W → ℝ
Instances For
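The structure can be reconstructed as a short Lean sketch (the `weight_nonneg` field name is taken from the rendered `uniform` equation below; treat this as a reconstruction, not the library source):

```lean
import Mathlib

/-- Sketch of the distributional common ground: a non-negative
weight function over worlds (reconstruction, not the library source). -/
structure DistributionalCG (W : Type) where
  weight : W → ℝ
  weight_nonneg : ∀ w, 0 ≤ weight w
```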
Uniform distributional CG: all worlds equally plausible (empty CG).
Equations
- DistributionalCG.uniform = { weight := fun (x : W) => 1, weight_nonneg := ⋯ }
Instances For
Bridge to classical context set: a world is "in the context" iff its weight is positive. This recovers @cite{stalnaker-2002}'s set-membership view from the distributional representation.
Equations
- cg.toContextSet w = (0 < cg.weight w)
Instances For
Uniform distributional CG maps to the trivial context set.
A world with zero weight is excluded from the classical context set.
A distributional CG projects to a context set: worlds with positive weight.
Convex combination update for distributional common ground:
CG'(w) = (1 - lr) · CG(w) + lr · posterior(w)
The learning rate lr ∈ [0,1] controls how much weight is given to new
information vs. the existing CG.
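As a sketch, the update can be written as a plain function on raw weight functions (re-bundling into a `DistributionalCG` with its non-negativity proof is omitted here):

```lean
import Mathlib

variable {W : Type}

/-- Sketch of the convex-combination update on raw weight functions:
CG'(w) = (1 - lr) · CG(w) + lr · posterior(w). -/
noncomputable def updateCG (cg posterior : W → ℝ) (lr : ℝ) : W → ℝ :=
  fun w => (1 - lr) * cg w + lr * posterior w
```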
The CG update formula is a convex combination — definitionally equal
to (1 - lr) · CG(w) + lr · posterior(w).
Bridge to @cite{luce-1959} linear learning: the CG update has the same
algebraic form as LinearLearner.update with retention rate 1 - lr and
reinforcement target posterior:
CG'(w) = (1 - lr) · CG(w) + lr · posterior(w) [Anderson]
v'(a) = α · v(a) + (1 - α) · r(a) [Luce §4.C]
Setting α = 1 - lr and r = posterior makes the formulas identical.
Multi-turn conversation IS iterated learning over distributions.
With learning rate 0, the CG is unchanged (full retention).
With learning rate 1, the CG is replaced by the posterior.
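Both limiting cases are one-line ring facts over raw weights; a sketch:

```lean
import Mathlib

-- lr = 0: full retention, the CG value is unchanged.
example (cg post : ℝ) : (1 - (0:ℝ)) * cg + 0 * post = cg := by ring
-- lr = 1: full replacement by the posterior.
example (cg post : ℝ) : (1 - (1:ℝ)) * cg + 1 * post = post := by ring
```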
The state of a two-participant conversation (Figure 2).
Tracks the common ground (distributional), each participant's private
beliefs, and the learning rate for updates. In the shared CG model
(§5.1, Figure 4), both participants access the same cg. In the
approximate CG model (§5.2, Figure 6), each maintains a separate
approximation (not yet formalized).
The distributional CG serves as both RSAConfig.meaning (L0 prior) and
RSAConfig.worldPrior (L1 prior) — the CG enters the RSA model at two
levels (Figure 4). Between turns, the CG evolves via updateCG, and the
RSA model is reconstructed via mfRSAAt.
- cg : DistributionalCG W
- belA : DistributionalCG W
- belB : DistributionalCG W
- lr : ℝ
- speakerIsA : Bool
Instances For
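The field list above corresponds to a structure of roughly this shape (a reconstruction from the listed fields, with a simplified `DistributionalCG` for self-containment):

```lean
import Mathlib

structure DistributionalCG (W : Type) where
  weight : W → ℝ  -- simplified: non-negativity proof omitted

/-- Sketch of the two-participant conversation state (Figure 2),
reconstructed from the listed fields. -/
structure ConvState (W : Type) where
  cg : DistributionalCG W    -- shared common ground
  belA : DistributionalCG W  -- A's private beliefs
  belB : DistributionalCG W  -- B's private beliefs
  lr : ℝ                     -- learning rate for CG updates
  speakerIsA : Bool          -- whose turn it is to speak
```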
Initial conversation state: uniform CG, specified beliefs, A speaks first.
Weighted sampling: sample a world proportional to the speaker's belief. Biased toward truth (zero-probability worlds are never sampled) but can lead to flip-flopping when the speaker has no strong beliefs.
Thresholded sampling: filter out worlds below a confidence threshold. If no world exceeds the threshold, the speaker produces the null utterance (passes). Prevents noncommittal speakers from making random assertions.
Difference-based sampling: weight worlds by the positive difference between the speaker's belief and the current CG. Worlds already established in the CG get downweighted, favoring informative (non-redundant) contributions.
weight(w) = max(0, Bel(w) - CG(w))
This implements a pressure toward Gricean Quantity: don't repeat what's already in the common ground.
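The difference weight above has a direct one-line sketch on raw weight functions (hypothetical name `diffWeight`):

```lean
import Mathlib

variable {W : Type}

/-- Sketch of difference-based sampling weights: worlds already
established in the CG contribute nothing new. -/
noncomputable def diffWeight (bel cg : W → ℝ) : W → ℝ :=
  fun w => max 0 (bel w - cg w)
```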
A believes the person is Nancy: weight 3 on Nancy, 1 on others.
B believes the person is Katie: weight 3 on Katie, 1 on others.
A's beliefs favor Nancy over all other worlds.
B's beliefs favor Katie over all other worlds.
Under difference-based sampling, A initially prioritizes Nancy (highest positive difference from uniform CG).
RSA model for the MutualFriends domain at turn 1.
Anderson's Shared CG model (Figure 4) uses the distributional CG at BOTH L0 and L1:
L0(w|u) ∝ ⟦u⟧(w) · CG(w) -- CG enters L0 via `meaning`
S1(u|w) ∝ L0(w|u) -- speaker optimizes CG-weighted informativity
L1(w|u) ∝ S1(u|w) · CG(w) -- CG enters L1 via `worldPrior`
At turn 1, CG is uniform (weight 1 everywhere), so the CG factor drops out
of L0 and the meaning reduces to Boolean semantics: ⟦u⟧(w) · 1 = ⟦u⟧(w).
The general CG-weighted pattern is visible in mfRSA_turn2.
A speaker who knows the person is Nancy prefers "studyHumanity" over "studyScience". Nancy studies German (a humanity), so "studyScience" has L0(nancy|studyScience) = 0, while "studyHumanity" has L0(nancy|studyHumanity) = 1/2.
A speaker who knows it's Nancy prefers "likeOutdoors" over "likeIndoors". Nancy likes being outdoors.
A speaker who knows it's Ina prefers "studyScience" over "studyHumanity". Ina studies Astronomy (a science).
A speaker who knows it's Ina is indifferent between "studyScience" and
"likeIndoors": both are true of exactly 2 worlds, giving equal L0 posteriors.
This tests rsa_predict on equality goals.
The null utterance is always suboptimal: a speaker who knows it's Nancy strictly prefers any true specific utterance over saying nothing. Null is true of all 4 worlds (L0 = 1/4), while "studyHumanity" is true of only 2 (L0 = 1/2).
Symmetry: S1(studyHumanity|nancy) = S1(likeOutdoors|nancy). Both utterances partition the 4 worlds into 2 true + 2 false, so L0(nancy|studyHumanity) = L0(nancy|likeOutdoors) = 1/2, hence equal S1.
False utterances get zero S1 probability.
"studyScience" is false of Nancy (she studies German), so S1 = 0.
Tests rsa_predict on negation of strict inequality.
After hearing "studyHumanity", L1 assigns higher probability to Nancy than to Ina. Nancy studies a humanity; Ina studies a science.
After hearing "likeOutdoors", L1 favors Nancy over Sally. Nancy likes outdoors; Sally likes indoors.
After hearing "studyHumanity", L1 assigns equal probability to Nancy and Sally — both study a humanity, and S1 scores are symmetric.
After hearing "studyScience", L1 assigns equal probability to Ina and Katie — both study a science.
The null utterance conveys no information: L1 assigns equal probability to all worlds. Every world has S1(null|w) = 1/5 by the domain's symmetry (each world makes exactly 2 non-null utterances true).
CG weights after hearing "studyHumanity" at turn 1.
After L1 processes "studyHumanity" with uniform prior, the posterior
concentrates on nancy and sally (the German-studying worlds):
L1(studyHumanity) = [0, 0, 1/2, 1/2]. Updating the normalized uniform
CG [1/4, 1/4, 1/4, 1/4] via updateCG with lr=0.2 (footnote 9) gives:
CG'(w) = 0.8 · 1/4 + 0.2 · L1(w)
CG' = [1/5, 1/5, 3/10, 3/10]
The weights [2, 2, 3, 3] are proportional to [1/5, 1/5, 3/10, 3/10], which is the exact post-update CG from the paper's Figure 5, panel 1A. Since RSA normalizes, proportional weights give identical predictions.
Equations
- cgTurn2 MFWorld.ina = 2
- cgTurn2 MFWorld.katie = 2
- cgTurn2 MFWorld.nancy = 3
- cgTurn2 MFWorld.sally = 3
Instances For
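The arithmetic behind these weights checks out over ℚ; a quick norm_num verification of CG'(w) = 0.8 · 1/4 + 0.2 · L1(w):

```lean
import Mathlib

-- ina, katie: L1 = 0
example : (8/10 : ℚ) * (1/4) + (2/10) * 0 = 1/5 := by norm_num
-- nancy, sally: L1 = 1/2
example : (8/10 : ℚ) * (1/4) + (2/10) * (1/2) = 3/10 := by norm_num
```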
RSA model after hearing "studyHumanity" at turn 1.
The updated CG enters BOTH L0 and L1 (Figure 4), matching Anderson's Shared CG model:
L0(w|u) ∝ ⟦u⟧(w) · CG'(w) -- CG in L0 via `meaning`
L1(w|u) ∝ S1(u|w) · CG'(w) -- CG in L1 via `worldPrior`
This means S1 adapts to the CG: the speaker reasons about informativity relative to the current common ground. After "studyHumanity" shifts the CG toward nancy/sally (weights [2,2,3,3] ∝ [1/5, 1/5, 3/10, 3/10]), utterances that disambiguate within that subspace (e.g., "likeOutdoors") become more informative than utterances that merely re-assert the major dimension (e.g., "studyHumanity").
After CG update from "studyHumanity", L1("likeOutdoors") now favors Nancy over Katie. In turn 1, they were symmetric (both like outdoors). The updated prior (3 vs 1) breaks the tie — Nancy's higher CG weight makes her more probable. This is the key multi-turn prediction.
After CG update, "likeIndoors" favors Sally over Ina. Both like indoors, but Sally has higher prior (3 vs 1) from the CG shift.
After CG update, "studyScience" still treats Ina and Katie equally: both study a science and both have equal prior weight (1).
After CG update, "studyHumanity" still treats Nancy and Sally equally: both study a humanity and both have equal updated prior (3).
CG update breaks turn-1 symmetry: in turn 1, L1("likeOutdoors") assigned equal weight to Nancy and Katie. After the CG shift, Nancy is favored. Multi-turn conversation enriches inference.
With CG entering L0 (Figure 4), the speaker at turn 2 adapts to what's already in the common ground. After "studyHumanity" establishes the major dimension, utterances that disambiguate within the high-CG subspace become more informative. This section verifies that the CG-weighted S1 captures Anderson's cooperative contribution mechanism.
CG-adapted informativity: At turn 2, the speaker who knows it's Nancy prefers "likeOutdoors" over "studyHumanity". At turn 1, these were equal (both partition the 4-world space into 2+2). After the CG shifts toward nancy/sally (weights [2,2,3,3]), "likeOutdoors" discriminates within the high-weight subspace (L0(nancy|likeOutdoors) = 3/5) while "studyHumanity" merely re-asserts what's already established (L0(nancy|studyHumanity) = 1/2).
This is Anderson's key insight: the CG-weighted L0 makes speakers prefer new information over redundant information.
At turn 1, the same two utterances were equal (pre-CG-adaptation).
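The L0 values quoted here follow directly from the turn-2 CG weights [ina, katie, nancy, sally] = [2, 2, 3, 3]; over ℚ:

```lean
import Mathlib

-- likeOutdoors is true of katie (weight 2) and nancy (weight 3)
example : (3:ℚ) / (2 + 3) = 3/5 := by norm_num  -- L0(nancy | likeOutdoors)
-- studyHumanity is true of nancy (weight 3) and sally (weight 3)
example : (3:ℚ) / (3 + 3) = 1/2 := by norm_num  -- L0(nancy | studyHumanity)
```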
CG adaptation works differently for low-CG worlds: at turn 2, Ina (weight 2) prefers "studyScience" over "likeIndoors" because sally (weight 3) dominates the indoor partition, making L0(ina|likeIndoors) = 2/5 < L0(ina|studyScience) = 1/2. The CG shift makes the major dimension MORE informative for low-CG worlds, the opposite of the high-CG case (nancy, §12b above).
S2 endorsement: given world Nancy, the pragmatic speaker endorses "studyHumanity" over "studyScience". S2(u|w) ∝ L1(w|u), and L1(nancy|studyHumanity) > 0 = L1(nancy|studyScience).
S2 endorsement: given world Nancy, "studyHumanity" and "likeOutdoors" are equally endorsed (symmetric L1 posteriors).
RSA model for MutualFriends at an arbitrary CG (Figure 4).
This is the general form of Anderson's Shared CG model: the CG enters
as the L0 meaning weight (via meaning) AND as the L1 prior (via
worldPrior). One-shot RSA is the special case with uniform CG.
Used by conversationStep to construct the RSA model at each turn.
One step of the Shared CG conversation loop (Figure 2).
Given the current CG and an utterance:
- Build the RSA model at the current CG (mfRSAAt)
- Compute L1 posteriors: the pragmatic listener's world beliefs
- Update the CG via convex combination with the posteriors
This closes the loop: RSA inference → CG update → new RSA model. The returned CG serves as the world prior for the next turn's model.
Normalization note: The L1 posterior is normalized (sums to 1), so
the CG weights should also be normalized for the convex combination
to preserve total weight. With normalized CG, updateCG is a true
convex combination and preserves sum-to-1. Starting from
DistributionalCG.uniform (weight=1 per world) gives correct RSA
predictions but produces CG updates with a different scale than the
paper's normalized distributions.
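One step of the loop can be sketched by abstracting over the L1 computation (the `l1Posterior` argument is a stand-in for the RSA model construction via mfRSAAt, which is not reproduced here):

```lean
import Mathlib

variable {W U : Type}

/-- Sketch of one shared-CG conversation step: compute the pragmatic
listener's posterior at the current CG, then fold it into the CG via
the convex-combination update. -/
noncomputable def conversationStep
    (l1Posterior : (W → ℝ) → U → W → ℝ)  -- stand-in for the RSA model
    (cg : W → ℝ) (lr : ℝ) (u : U) : W → ℝ :=
  fun w => (1 - lr) * cg w + lr * l1Posterior cg u w
```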
The conversation step preserves CG non-negativity.
With lr = 0, the conversation step leaves the CG unchanged.
The RSA predictions above verify properties from Phenomena.Dialogue.Basic:
- Contributions informative: S1 prefers specific utterances over null (§9, s1_null_suboptimal), matching contributionsInformative = true.
- Uncertainty decreases: L1 concentrates probability after hearing an informative utterance (this section).
- CG-adapted contributions: at turn 2, S1 adapts to what's already in the CG, preferring non-redundant information (§12b).
L1 concentrates probability after an informative utterance:
L1(nancy|studyHumanity) > L1(nancy|null). The null utterance gives
uniform L1 (= 1/4), while "studyHumanity" concentrates on the 2
German-studying worlds (= 1/2). This matches
Phenomena.Dialogue.successfulInfoSharing.uncertaintyDecreases.
Informed speakers are informative: S1 assigns higher probability
to a true specific utterance than to null. This matches
Phenomena.Dialogue.successfulInfoSharing.contributionsInformative.
Anderson's distributional CG update subsumes @cite{stalnaker-2002}'s
set-intersection update as a special case: with learning rate 1 and a
posterior that assigns zero weight to worlds where the utterance is false,
the updated CG excludes exactly those worlds — recovering ContextSet.update.
This bridge connects the probabilistic conversation model to the classical
assertion framework in Core.Semantics.CommonGround.
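The special case is a one-line computation: with lr = 1, any world the posterior zeroes out gets zero updated weight, and hence falls outside the recovered context set. A sketch on raw weights:

```lean
import Mathlib

-- With lr = 1, a world with zero posterior weight is excluded from the CG.
example (cg post : ℝ) (h : post = 0) :
    (1 - (1:ℝ)) * cg + 1 * post = 0 := by rw [h]; ring
```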
Exact rational values for the turn-1 RSA computations underlying Figure 5, panel 1A. At turn 1 with uniform CG, the domain's 2×2 feature symmetry gives clean fractions: each non-null utterance is true of exactly 2 worlds (L0 = 1/2), null is true of all 4 (L0 = 1/4), and each world makes exactly 2 non-null utterances true, giving S1(true u|w) = (1/2)/(5/4) = 2/5 and S1(null|w) = (1/4)/(5/4) = 1/5.
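These fractions can be verified directly: each world makes two non-null utterances true (L0 = 1/2 each) plus null (L0 = 1/4), so the S1 normalizer is 5/4:

```lean
import Mathlib

-- S1 normalizer: two true specific utterances plus null
example : (1/2 : ℚ) + 1/2 + 1/4 = 5/4 := by norm_num
-- S1(true specific u | w) and S1(null | w)
example : (1/2 : ℚ) / (5/4) = 2/5 := by norm_num
example : (1/4 : ℚ) / (5/4) = 1/5 := by norm_num
```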
Null gives uniform L1: every world has the same S1(null|w) by the domain's symmetry, so L1(w|null) = CG(w)/Σ CG = 1/4.
The Approximate Common Ground model relaxes the shared-CG assumption: each participant maintains their own CG approximation. The speaker uses CG_S in production; the listener uses CG_L in comprehension with their private beliefs B_L as the L1 world prior.
Key difference from shared CG (Figure 4):
- Shared: L1(w|u) ∝ S1(u|w) · CG(w) — same CG for everyone
- Approx: L1(w|u) ∝ S1(u|w) · B_L(w) — listener's own beliefs
This models realistic divergence: participants with different priors hear the same utterance but reach different posteriors, causing their CG approximations to drift apart (Figure 7).
State for the Approximate CG model (§5.2, Figure 6).
- cgA : DistributionalCG W
- cgB : DistributionalCG W
- belA : DistributionalCG W
- belB : DistributionalCG W
- lr : ℝ
- speakerIsA : Bool
Instances For
Approximate comprehension RSA (Figure 6): L0 uses CG_L, but L1 uses B_L (listener's private beliefs) as the world prior.
The belief update model extends the conversation system by also updating participants' private beliefs. After comprehension, the listener updates their beliefs via the same linear rule as CG update:
Bel'(w) = (1 - lr_bel) · Bel(w) + lr_bel · posterior(w)
The speaker does not update beliefs (they already knew the info). Different learning rates for CG vs beliefs allow modeling:
- §6.2: skeptical listeners (low belief lr, high CG lr)
- §6.3: uncertainty-based rates (lr scales with entropy)
State for the belief update model (Figure 8). Extends approximate CG with separate learning rates for CG and beliefs.
- cgA : DistributionalCG W
- cgB : DistributionalCG W
- belA : DistributionalCG W
- belB : DistributionalCG W
- cgLR : ℝ
Learning rate for CG updates.
- belLR : ℝ
Learning rate for belief updates (may be lower for skeptical agents).
- speakerIsA : Bool
Instances For
Belief update is algebraically identical to CG update — both are instances of @cite{luce-1959}'s linear learning rule. The only difference is the learning rate parameter and the interpretation (private vs shared).
A speaker with uniform beliefs (no private information) assigns equal weight to all worlds under weighted sampling. Since no observation is more probable than any other, the speaker makes random assertions about worlds they don't know, causing the CG to flip-flop (Figure 12).
Solutions:
- Threshold sampling (§7.1.1): filter out low-confidence worlds; a noncommittal speaker passes (null utterance) instead of guessing.
- Uncertainty-based lr (§6.3): scale the CG update rate by the listener's uncertainty, so confident listeners resist random input.
Uniform beliefs assign equal weight to all worlds under weighted sampling — a noncommittal speaker has no basis for choosing.
Threshold sampling filters out all worlds when beliefs don't exceed the threshold. For uniform beliefs (weight 1), any θ > 1 produces zero weight everywhere — the speaker passes (Figure 13).
Threshold preserves confident worlds: weights above θ pass through.
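A sketch of the thresholded weights (hypothetical name `thresholdWeight`, on raw weight functions):

```lean
import Mathlib

variable {W : Type}

/-- Sketch of thresholded sampling: weights below the confidence
threshold θ are zeroed; if every world is zeroed, the speaker passes
with the null utterance. -/
noncomputable def thresholdWeight (bel : W → ℝ) (θ : ℝ) : W → ℝ :=
  fun w => if θ ≤ bel w then bel w else 0
```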
Under weighted sampling, a speaker whose beliefs match the CG keeps
repeating already-shared information (Figure 14). Difference-based
sampling fixes this by weighting worlds by max(0, Bel(w) - CG(w)):
worlds already established in the CG get zero weight.
Combined with thresholding, thresholded difference-based sampling gives the best behavior (Figure 15): informed speakers contribute new information, noncommittal speakers pass.
When beliefs match the CG exactly, difference sampling assigns zero weight to all worlds — nothing new to contribute.
Difference sampling assigns positive weight when belief exceeds CG — these worlds carry new information not yet in the common ground.
A's initial difference from uniform CG: Nancy has the highest positive difference (belief 3 vs CG 1), so A's first contribution should describe Nancy — matching the stochastic trace in Figure 5.