@cite{bergen-goodman-2015} #
Related models: @cite{frank-goodman-2012}, @cite{degen-etal-2020}.
Bergen, L., & Goodman, N. D. (2015). The Strategic Use of Noise in Pragmatic Reasoning. Topics in Cognitive Science, 7(2), 336–350.
Core Argument #
Standard RSA assumes perfect transmission: the utterance the speaker produces is the utterance the listener hears. This paper extends RSA with a noisy channel P_N(u_p | u_i) — the probability that the listener perceives u_p given the speaker intended u_i. This extension explains two phenomena that standard RSA cannot:
Sentence fragments (§3): "Bob" as answer to "Who went?" has no literal meaning, yet listeners correctly interpret it by reasoning that noise deleted words from a full sentence.
Prosodic exhaustivity (§4): "BOB went" (stressed) signals exhaustive knowledge (only Bob went), while "Bob went" (unstressed) is compatible with others also going. This emerges because stress reduces noise rate, and speakers strategically protect informative words.
Key Equations #
L_0(m | u_p) ∝ P(m) Σ_{u_i: m∈⟦u_i⟧} P(u_i) · P_N(u_p | u_i) (Eq. 6)
U_n(u_i | m) = Σ_{u_p} P_N(u_p|u_i) · log L_{n-1}(m|u_p) - c(u_i) (Eq. 7)
L_n(m | u_p) ∝ P(m) Σ_{u_i} S_n(u_i|m) · P_N(u_p|u_i) (Eq. 8)
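The recursion in Eqs. 6–8 can be sketched numerically. The following Python toy is not part of the Lean development — the two-utterance model, the symmetric flip rate, α = 1, and zero cost c are illustrative assumptions — but it implements the three equations term by term:

```python
import math

# Hypothetical toy model: two meanings, two utterances, symmetric noise.
M = ["m1", "m2"]
U = ["u1", "u2"]
truth = {("u1", "m1"): True, ("u1", "m2"): False,
         ("u2", "m1"): False, ("u2", "m2"): True}
eps = 0.05
P_N = {(up, ui): (1 - eps if up == ui else eps) for up in U for ui in U}
P_m = {m: 1 / len(M) for m in M}   # uniform world prior
P_u = {u: 1 / len(U) for u in U}   # uniform utterance prior
alpha = 1.0

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

# Eq. 6: L0(m | u_p) ∝ P(m) Σ_{u_i : m ∈ ⟦u_i⟧} P(u_i) · P_N(u_p | u_i)
def L0(up):
    return normalize({m: P_m[m] * sum(P_u[ui] * P_N[(up, ui)]
                                      for ui in U if truth[(ui, m)])
                      for m in M})

# Eq. 7 with c ≡ 0: U_1(u_i | m) = Σ_{u_p} P_N(u_p | u_i) · log L0(m | u_p)
def utility(ui, m):
    return sum(P_N[(up, ui)] * math.log(L0(up)[m]) for up in U)

# Speaker over *intended* utterances: S1(u_i | m) ∝ exp(α · U_1(u_i | m))
def S1(m):
    return normalize({ui: math.exp(alpha * utility(ui, m)) for ui in U})

# Eq. 8: L1(m | u_p) ∝ P(m) Σ_{u_i} S1(u_i | m) · P_N(u_p | u_i)
def L1(up):
    return normalize({m: P_m[m] * sum(S1(m)[ui] * P_N[(up, ui)] for ui in U)
                      for m in M})

print(L1("u1"))   # m1 gets most of the mass, but noise leaves some on m2
```

With ε > 0 the listener never fully rules out the untrue meaning, since the perceived utterance could have been corrupted.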
RSAConfig Encoding #
The noisy channel model is encoded in RSAConfig by folding noise into the score functions:
- meaning(u_p, m) = Eq. 6 numerator: Σ_{u_i} ⟦u_i⟧(m) · P(u_i) · P_N(u_p|u_i)
- s1Score(l0, α, _, m, u_p) = Eqs. 7-8 combined: Σ_{u_i} exp(α · Σ_{u_p'} P_N(u_p'|u_i) · log l0(u_p',m)) · P_N(u_p|u_i)
This works because the noise channel rows sum to 1, so normalization before vs after folding in noise gives the same result:
S1_noise(u_p|m) = Σ_{u_i} S1_intended(u_i|m) · P_N(u_p|u_i)
and L1(m|u_p) ∝ worldPrior(m) · S1_noise(u_p|m), matching Eq. 8.
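This commutation can be checked directly. A minimal Python sketch (the raw speaker scores and channel entries below are made-up illustrative numbers; the only property that matters is that each channel row sums to 1):

```python
# Unnormalized S1 scores exp(α·utility) for three intended utterances
# (hypothetical values for illustration).
raw = {"u1": 0.7, "u2": 0.2, "u3": 0.4}
# Noise channel P_N[ui][up]; each row sums to 1.
P_N = {"u1": {"u1": 0.9,  "u2": 0.1,  "u3": 0.0},
       "u2": {"u1": 0.0,  "u2": 0.95, "u3": 0.05},
       "u3": {"u1": 0.1,  "u2": 0.0,  "u3": 0.9}}
U = list(raw)

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

# (a) normalize first (S1_intended), then fold the channel
s1_intended = normalize(raw)
fold_after = {up: sum(s1_intended[ui] * P_N[ui][up] for ui in U) for up in U}

# (b) fold the channel into the raw scores, then normalize (what s1Score does)
fold_before = normalize({up: sum(raw[ui] * P_N[ui][up] for ui in U) for up in U})

assert all(abs(fold_after[u] - fold_before[u]) < 1e-12 for u in U)
```

The two agree because folding a row-stochastic channel preserves the total mass, so the normalizing constant is the same on both sides.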
The S1 speaker only produces full sentences — fragments arise solely from
noise corruption. This is encoded via literalMeaning: fragments have no
literal meaning, so S1's exp(α·utility) term is zero for them (matching
exp(-∞) = 0 from log(0) in the utility).
Ellipsis #
From the paper (§3):
(1) A: Who went to the movies? B: Bob
The fragment "Bob" has no literal truth conditions. The listener interprets it as "Bob went to the movies" by reasoning that noise (word deletion) corrupted a full sentence into a fragment.
Model Setup #
- Meanings M = {aliceWent, bobWent, nobodyWent}
- Utterances U = full sentences + fragments (7 total)
- Noise: per-word deletion with probability δ
- Only full sentences have literal meaning; fragments have none
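The setup above can be sketched in Python. This is an illustrative reimplementation, not the Lean model: it keeps only the subject-deletion path described below (full sentence heard intact or reduced to its subject fragment), with uniform world prior and utterance prior 1 on full sentences:

```python
# Full sentence -> its subject fragment (deletion target).
full = {"aliceWentToMovies": "alice",
        "bobWentToMovies": "bob",
        "nobodyWentToMovies": "nobody"}
meanings = ["aliceWent", "bobWent", "nobodyWent"]
# Each full sentence is true of exactly one meaning.
truth = {"aliceWentToMovies": "aliceWent",
         "bobWentToMovies": "bobWent",
         "nobodyWentToMovies": "nobodyWent"}
delta = 0.01   # per-word deletion rate

def P_N(up, ui):
    """Deletion channel: intact with prob 1 - δ, subject fragment with prob δ."""
    if ui in full:
        if up == ui:
            return 1 - delta
        if up == full[ui]:
            return delta
    return 0.0

def L0(up):
    """Eq. 6 numerator over full sentences, then normalize."""
    score = {m: (1 / 3) * sum(P_N(up, ui)        # utterancePrior = 1 on full sentences
                              for ui in full if truth[ui] == m)
             for m in meanings}
    z = sum(score.values())
    return {m: s / z for m, s in score.items()}

print(L0("bob"))   # the fragment "Bob": all mass on bobWent
```

Even though the fragment has no literal meaning, only one full sentence can be reduced to it, so L0 recovers bobWent with probability 1.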
Literal meaning: only full sentences have truth conditions. Fragments have no literal meaning — this is the key to the model.
Equations
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.literalMeaning x✝¹ x✝ = false
Prior over utterances (speaker's production probability). Only full sentences are in the speaker's production distribution.
Equations
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.utterancePrior Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.Utterance.aliceWentToMovies = 1
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.utterancePrior Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.Utterance.bobWentToMovies = 1
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.utterancePrior Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.Utterance.nobodyWentToMovies = 1
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.utterancePrior x✝ = 0
Noise channel for ellipsis: word deletion with probability δ.
P_N(u_p | u_i):
- Full sentence heard correctly: 1 - δ
- Full sentence reduced to subject fragment: δ (predicate deleted)
- Everything else: 0
This models the subject-deletion path. The paper also considers predicate deletion and multi-word deletion, but subject deletion is the relevant path for the "Bob" example.
Equations
- Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.noiseChannel δ x✝¹ x✝ = 0
Noisy L0 meaning (Eq. 6 numerator).
meaning(u_p, m) = Σ_{u_i} ⟦u_i⟧(m) · P(u_i) · P_N(u_p | u_i)
The listener considers all intended utterances u_i with meaning m, weighted by how likely noise would produce the perceived u_p.
Noise-folded S1 score (Eqs. 7-8 combined).
s1Score(l0, α, _, m, u_p) = Σ_{u_i} raw(u_i|m) · P_N(u_p | u_i)
where raw(u_i|m) = exp(α · Σ_{u_p'} P_N(u_p'|u_i) · log l0(u_p',m)) for utterances with literal meaning m, and 0 otherwise (exp(-∞) = 0).
This folds the noise channel into S1's output so that RSAConfig's S1(u_p|m) = Σ_{u_i} S1_intended(u_i|m) · P_N(u_p|u_i), matching Eq. 8. The row-sum-to-1 property of the noise channel ensures normalization is correct.
Noise rate: 1% per-word deletion. The paper's Fig. 1 shows results are robust across noise rates from 10⁻⁵ to 10⁻¹.
RSAConfig for the ellipsis model.
The noisy channel is encoded via:
- meaning: noisy L0 score (Eq. 6 numerator)
- s1Score: noise-folded S1 (Eqs. 7-8 combined)
L0 assigns higher probability to "Bob went" than "Alice went" or "Nobody went" given the fragment "Bob".
Even though "Bob" has no literal meaning, the noisy L0 infers it must have come from "Bob went to the movies" via deletion. This is the only full sentence that produces "Bob" via noise, so L0 assigns it probability 1.
L0 correctly interprets the "Nobody" fragment.
L0 correctly interprets a full sentence (sanity check).
L1 also correctly interprets the fragment "Bob" as "Bob went".
The pragmatic listener, reasoning about S1's production choices, also assigns bobWent the highest probability.
Parametric Robustness (Fig. 1, left panel) #
Fragment interpretation works for ANY noise rate δ > 0. Since "Bob" can only arise from "Bob went to the movies" via deletion, L0 assigns probability 1 regardless of δ. This is the paper's key theoretical result: "this reasoning will work even if the noise rate is arbitrarily close to 0, so long as it is positive."
We prove this by showing that the noisy meaning at "bob" is zero for all meanings except bobWent, and nonzero (= δ) for bobWent. Since L0 normalizes the meaning, L0(bobWent | bob) = δ/δ = 1.
The noisy meaning at "bob" is δ for bobWent and 0 for all others.
Only "Bob went to the movies" can produce "Bob" via noise deletion, and only with meaning bobWent. Therefore L0("bob") = δ/δ = 1.
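This derivation can be replayed for several noise rates. The sketch below (illustrative Python; `l0_bob` is a hypothetical helper that encodes the score just computed — δ·P(bobWent) for bobWent, zero for every other meaning — rather than the Lean proof):

```python
def l0_bob(delta):
    """L0 at the perceived fragment "bob" under subject-deletion noise.

    Only "Bob went to the movies" can be reduced to "bob", so the Eq. 6
    score is delta * P(bobWent) for bobWent and 0 elsewhere.
    """
    score = {"aliceWent": 0.0, "bobWent": delta * (1 / 3), "nobodyWent": 0.0}
    z = sum(score.values())
    return {m: s / z for m, s in score.items()}

# δ cancels in the normalization, so the conclusion holds at every rate.
for delta in (1e-5, 1e-3, 1e-1):
    assert l0_bob(delta)["bobWent"] == 1.0
```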
Prosody #
From the paper (§4):
(2) A: Who went to the movies? B: BOB went to the movies.
Prosodic stress reduces the noise rate on stressed words. The listener infers that stress → the speaker had reason to protect that word → the speaker has exhaustive knowledge → only Bob went.
Mechanism #
- Speaker with exhaustive knowledge (only Bob went) needs listener to correctly hear "Bob" — mishearing "Alice" would be wrong
- Therefore: stress "Bob" to reduce noise
- Speaker with non-exhaustive knowledge (Bob went, maybe others too) has less need to protect — "Alice" is also compatible
- Listener infers: stress → exhaustive knowledge
Model Setup #
- Meanings: {onlyAlice, onlyBob, both} (who went)
- Utterances: {aliceWent, ALICE_went, bobWent, BOB_went, aliceAndBobWent}
- Noise: baseline ε, reduced to ε/2 for stressed words
- The paper (§4.1): "placing stress on a word can reduce the noise rate of that word from ε to ε/n for some reduction factor n > 1." We use n = 2.
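The following Python sketch runs this model end to end. It is an illustrative reimplementation, not the Lean formalization, and bakes in several assumptions: uniform priors, α = 1, zero cost, noiseless transmission of the conjunction, and ε = 0.1 so the stress effect is clearly visible:

```python
import math

meanings = ["onlyAlice", "onlyBob", "both"]
utts = ["alice", "ALICE", "bob", "BOB", "aliceAndBob"]
# Lower-bound semantics: "Alice went" is true whenever Alice went.
true_in = {"alice": {"onlyAlice", "both"}, "ALICE": {"onlyAlice", "both"},
           "bob":   {"onlyBob", "both"},   "BOB":   {"onlyBob", "both"},
           "aliceAndBob": {"both"}}
eps = 0.1
other = {"alice": "bob", "bob": "alice", "ALICE": "BOB", "BOB": "ALICE"}

def P_N(up, ui):
    """Subject-confusion channel; stress halves the confusion rate."""
    if ui == "aliceAndBob":                  # assumed transmitted noiselessly
        return 1.0 if up == ui else 0.0
    rate = eps / 2 if ui.isupper() else eps
    if up == ui:
        return 1 - rate
    if up == other[ui]:
        return rate
    return 0.0

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def L0(up):                                  # Eq. 6
    return normalize({m: (1 / 3) * sum(P_N(up, ui)
                                       for ui in utts if m in true_in[ui])
                      for m in meanings})

def S1(m):                                   # Eq. 7 with c ≡ 0, α = 1
    raw = {}
    for ui in utts:
        if m in true_in[ui]:
            util = sum(P_N(up, ui) * math.log(L0(up)[m])
                       for up in utts if P_N(up, ui) > 0)
            raw[ui] = math.exp(util)
        else:
            raw[ui] = 0.0                    # exp(-∞) = 0: false utterances excluded
    return normalize(raw)

def L1(up):                                  # Eq. 8
    return normalize({m: (1 / 3) * sum(S1(m)[ui] * P_N(up, ui) for ui in utts)
                      for m in meanings})

stressed, unstressed = L1("BOB"), L1("bob")
print(stressed["onlyBob"], unstressed["onlyBob"])  # ≈ 0.72 vs ≈ 0.66
```

Under these assumptions the stressed utterance shifts mass toward the exhaustive reading: the exhaustive speaker gains more from protecting the subject, so hearing a protected subject is evidence of exhaustive knowledge.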
Literal meaning: lower-bound semantics. "Alice went" is true if Alice went (regardless of others).
Equations
- Phenomena.Ellipsis.Studies.BergenGoodman2015.ProsodyModel.literalMeaning x✝¹ x✝ = false
Noise channel with prosody.
The confusion is between subjects: "Alice" ↔ "Bob".
- No stress: ε chance of subject confusion
- With stress: ε/2 chance (stress reduces noise)
Equations
- Phenomena.Ellipsis.Studies.BergenGoodman2015.ProsodyModel.noiseChannel ε x✝¹ x✝ = 0
Baseline noise rate (1%).
RSAConfig for the prosody model.
Stress increases exhaustive interpretation.
"BOB went" (stressed) is strictly more likely to mean "only Bob" than "Bob went" (unstressed). This is the paper's central prosody prediction (§4).
Both stressed and unstressed assign positive probability to "both went" (it is compatible with either prosody). We show this via comparison: .both gets more probability than .onlyAlice under each utterance.
Bridge: FragmentAnswers #
The noisy channel model explains the data in Phenomena.Ellipsis.FragmentAnswers:
- fragmentSubject (question: "Who went to the movies?", fragment: "Bob") is correctly interpreted by both L0 and L1 (see EllipsisModel.l0_fragment_correct, EllipsisModel.l1_fragment_correct)
- fragmentNobody follows the same mechanism (different subject)
The model also accounts for non-question fragments (fragmentAssertion,
fragmentTopic): noise-based inference is not restricted to question-answer
pairs — any context where deletion is plausible licenses fragment use.
The model's prediction aligns with the fragment answer data.
Bridge: ProsodicExhaustivity #
The noisy channel model explains the data in
Phenomena.Focus.ProsodicExhaustivity:
- stressedSubject: "BOB went" → exhaustive reading
- unstressedSubject: "Bob went" → non-exhaustive reading
The model derives this from noise reduction: stress lowers the noise rate, and the listener infers that the speaker had reason to protect the stressed word, implying exhaustive knowledge.
The model's prediction aligns with the prosody data.
Connection to RSA.Noise #
RSA.Noise defines the fundamental noise channel operation:
noiseChannel(match, mismatch, b) = match · b + mismatch · (1 - b)
Both the ellipsis deletion channel and the prosody confusion channel are
special cases. The key insight from @cite{bergen-goodman-2015} is that noise
can be strategically exploited — a feature not shared by
@cite{degen-etal-2020}'s semantic noise model
(see Phenomena.Reference.Studies.DegenEtAl2020).
| Property | @cite{bergen-goodman-2015} | @cite{degen-etal-2020} |
|---|---|---|
| Noise location | Channel (transmission) | Semantics (perception) |
| Type | P_N(u_p \| u_i) | φ(u, o) ∈ [0,1] |
| Effect | Word corruption | Graded feature match |
| Strategic use | Yes (ellipsis, prosody) | No |
Prosodic stress increases channel discrimination between intended and confused utterances. Stressed "BOB went" has a larger gap between correct and confused perception than unstressed "Bob went".
- Stressed gap: (1 - ε/2) - ε/2 = 1 - ε
- Unstressed gap: (1 - ε) - ε = 1 - 2ε
- Difference: ε > 0
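Both reductions and the gap arithmetic can be checked with a small sketch. The `noise_channel` helper below mirrors the operation quoted above (with `b` as the match indicator); the specific rates are illustrative:

```python
def noise_channel(match, mismatch, b):
    """RSA.Noise-style two-outcome channel: match · b + mismatch · (1 - b)."""
    return match * b + mismatch * (1 - b)

delta, eps = 0.01, 0.01

# Ellipsis: full sentence heard intact (b = 1) or reduced to a fragment (b = 0).
ellipsis_row = [noise_channel(1 - delta, delta, b) for b in (1, 0)]
# Prosody: subject heard correctly (b = 1) or confused (b = 0).
unstressed_row = [noise_channel(1 - eps, eps, b) for b in (1, 0)]
stressed_row = [noise_channel(1 - eps / 2, eps / 2, b) for b in (1, 0)]

# Each channel row sums to 1, as required for the normalization argument.
assert all(abs(sum(r) - 1) < 1e-9 for r in (ellipsis_row, unstressed_row, stressed_row))

# Discrimination gaps from the arithmetic above:
stressed_gap = stressed_row[0] - stressed_row[1]        # (1 - ε/2) - ε/2 = 1 - ε
unstressed_gap = unstressed_row[0] - unstressed_row[1]  # (1 - ε) - ε = 1 - 2ε
assert abs((stressed_gap - unstressed_gap) - eps) < 1e-9  # difference is ε
```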