Linglib.Phenomena.Ellipsis.Studies.BergenGoodman2015

@cite{bergen-goodman-2015} #

Bergen, L. & Goodman, N. D. (2015). The Strategic Use of Noise in Pragmatic Reasoning. Topics in Cognitive Science 7(2), 336–350.

Related models: @cite{frank-goodman-2012}, @cite{degen-etal-2020}.

Core Argument #

Standard RSA assumes perfect transmission: the utterance the speaker produces is the utterance the listener hears. This paper extends RSA with a noisy channel P_N(u_p | u_i) — the probability that the listener perceives u_p given the speaker intended u_i. This extension explains two phenomena that standard RSA cannot:

  1. Sentence fragments (§3): "Bob" as answer to "Who went?" has no literal meaning, yet listeners correctly interpret it by reasoning that noise deleted words from a full sentence.

  2. Prosodic exhaustivity (§4): "BOB went" (stressed) signals exhaustive knowledge (only Bob went), while "Bob went" (unstressed) is compatible with others also going. This emerges because stress reduces noise rate, and speakers strategically protect informative words.

Key Equations #

L_0(m | u_p) ∝ P(m) Σ_{u_i : m ∈ ⟦u_i⟧} P(u_i) · P_N(u_p | u_i)        (Eq. 6)
U_n(u_i | m) = Σ_{u_p} P_N(u_p | u_i) · log L_{n-1}(m | u_p) - c(u_i)   (Eq. 7)
L_n(m | u_p) ∝ P(m) Σ_{u_i} S_n(u_i | m) · P_N(u_p | u_i)               (Eq. 8)

The speaker S_n soft-maximizes utility, S_n(u_i | m) ∝ exp(α · U_n(u_i | m)), which links Eq. 7 to Eq. 8.
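As a sanity check on these equations, here is a minimal numeric sketch in Python, independent of the Lean code: the two-meaning domain, uniform priors, and symmetric channel are made up purely for illustration.

```python
import math

# Illustrative sketch of Eqs. 6-8: two meanings, two utterances,
# uniform priors, symmetric noise channel with rate eps.
meanings = ["m1", "m2"]
utts = ["u1", "u2"]
P_m = {m: 0.5 for m in meanings}               # P(m)
P_u = {u: 0.5 for u in utts}                   # P(u_i)
sem = {("u1", "m1"): 1, ("u1", "m2"): 0,       # [[u_i]](m)
       ("u2", "m1"): 0, ("u2", "m2"): 1}
eps = 0.1
P_N = {(up, ui): 1 - eps if up == ui else eps  # P_N(u_p | u_i)
       for up in utts for ui in utts}

def L0(m, up):  # Eq. 6: sum over intended utterances whose meaning holds
    score = lambda mm: P_m[mm] * sum(P_u[ui] * sem[(ui, mm)] * P_N[(up, ui)]
                                     for ui in utts)
    return score(m) / sum(score(mm) for mm in meanings)

def U1(ui, m):  # Eq. 7 with zero cost
    return sum(P_N[(up, ui)] * math.log(L0(m, up)) for up in utts)

def S1(ui, m, alpha=1.0):  # speaker soft-maximizes utility
    raw = lambda u: math.exp(alpha * U1(u, m))
    return raw(ui) / sum(raw(u) for u in utts)

def L1(m, up):  # Eq. 8
    score = lambda mm: P_m[mm] * sum(S1(ui, mm) * P_N[(up, ui)] for ui in utts)
    return score(m) / sum(score(mm) for mm in meanings)

print(L0("m1", "u1"))  # ≈ 0.9: noise keeps the literal listener below certainty
print(L1("m1", "u1"))
```
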

RSAConfig Encoding #

The noisy channel model is encoded in RSAConfig by folding noise into the score functions:

S1_noise(u_p | m) = Σ_{u_i} S1_intended(u_i | m) · P_N(u_p | u_i)

and L1(m | u_p) ∝ worldPrior(m) · S1_noise(u_p | m), matching Eq. 8. This works because each row of the noise channel sums to 1, so normalizing before or after folding in the noise gives the same result.
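The normalization claim is easy to check numerically. A toy sketch (raw scores and channel values are arbitrary, not the library's):

```python
# With channel rows summing to 1, "normalize then fold" equals
# "fold then normalize".  Raw speaker scores are arbitrary positives.
utts = ["u1", "u2"]
raw = {"u1": 2.0, "u2": 0.5}                    # unnormalized S1_intended
P_N = {("u1", "u1"): 0.9, ("u2", "u1"): 0.1,    # P_N(u_p | u_i)
       ("u1", "u2"): 0.2, ("u2", "u2"): 0.8}    # each row (fixed u_i) sums to 1

Z = sum(raw.values())
norm_then_fold = {up: sum(raw[ui] / Z * P_N[(up, ui)] for ui in utts)
                  for up in utts}
folded = {up: sum(raw[ui] * P_N[(up, ui)] for ui in utts) for up in utts}
Zf = sum(folded.values())                       # equals Z, since rows sum to 1
fold_then_norm = {up: folded[up] / Zf for up in utts}

assert all(abs(norm_then_fold[u] - fold_then_norm[u]) < 1e-12 for u in utts)
```
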

The S1 speaker only produces full sentences — fragments arise solely from noise corruption. This is encoded via literalMeaning: fragments have no literal meaning, so S1's exp(α·utility) term is zero for them (the utility contains log(0) = -∞, and exp(-∞) = 0).

Ellipsis #

From the paper (§3):

(1) A: Who went to the movies? B: Bob

The fragment "Bob" has no literal truth conditions. The listener interprets it as "Bob went to the movies" by reasoning that noise (word deletion) corrupted a full sentence into a fragment.

Model Setup #

Meanings: who went to the movies.

Utterances: full sentences and fragments. Fragments arise from noise deleting words from full sentences.

Literal meaning: only full sentences have truth conditions. Fragments have no literal meaning — this is the key to the model.

Noise channel for ellipsis: word deletion with probability δ.

P_N(u_p | u_i):

• Full sentence heard correctly: 1 - δ
• Full sentence reduced to subject fragment: δ (predicate deleted)
• Everything else: 0

This models only the path that deletes the predicate and leaves the subject fragment. The paper's channel is more general (it also allows multi-word deletion and deletion of other constituents), but predicate deletion is the relevant path for the "Bob" example.
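A minimal sketch of this channel in Python (the utterance strings and the `frag` mapping are illustrative, not the library's encoding):

```python
# Deletion channel: a full sentence is heard intact with 1 - delta, or loses
# its predicate (leaving the subject fragment) with delta.
delta = 0.01
full = ["bob went", "alice went", "nobody went"]
frag = {"bob went": "bob", "alice went": "alice", "nobody went": "nobody"}
utts_p = full + list(frag.values())   # perceivable utterances

def P_N(up, ui):
    # ui is always a full sentence: the speaker never produces fragments.
    if up == ui:
        return 1 - delta              # heard intact
    if up == frag[ui]:
        return delta                  # predicate deleted, subject survives
    return 0.0

# Rows still sum to 1, and only "bob went" can surface as "bob":
assert all(abs(sum(P_N(up, ui) for up in utts_p) - 1) < 1e-12 for ui in full)
assert P_N("bob", "bob went") == delta and P_N("bob", "alice went") == 0.0
```
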

Noisy L0 meaning (Eq. 6 numerator).

meaning(u_p, m) = Σ_{u_i} ⟦u_i⟧(m) · P(u_i) · P_N(u_p | u_i)

The listener considers all intended utterances u_i with meaning m, weighted by how likely noise would produce the perceived u_p.

noncomputable def Phenomena.Ellipsis.Studies.BergenGoodman2015.EllipsisModel.noisyS1Score
    (δ : ℝ) (l0 : UtteranceMeaning) (α : ℝ) : Unit → (m : Meaning) → (u_p : Utterance) → ℝ

Noise-folded S1 score (Eqs. 7-8 combined).

s1Score(l0, α, δ, m, u_p) = Σ_{u_i} raw(u_i | m) · P_N(u_p | u_i)

where raw(u_i | m) = exp(α · Σ_{u_p'} P_N(u_p' | u_i) · log l0(u_p', m)) for utterances with literal meaning m, and 0 otherwise (exp(-∞) = 0).

This folds the noise channel into S1's output so that RSAConfig's S1(u_p | m) = Σ_{u_i} S1_intended(u_i | m) · P_N(u_p | u_i), matching Eq. 8. The row-sum-to-1 property of the noise channel ensures normalization is correct.

Noise rate: 1% per-word deletion. The paper's Fig. 1 shows results are robust across noise rates from 10⁻⁵ to 10⁻¹.

RSAConfig for the ellipsis model.

The noisy channel is encoded via:

• meaning: noisy L0 score (Eq. 6 numerator)
• s1Score: noise-folded S1 (Eqs. 7-8 combined)
L0 assigns higher probability to "Bob went" than "Alice went" or "Nobody went" given the fragment "Bob".

Even though "Bob" has no literal meaning, the noisy L0 infers it must have come from "Bob went to the movies" via deletion. This is the only full sentence that produces "Bob" via noise, so L0 assigns it probability 1.

L1 also correctly interprets the fragment "Bob" as "Bob went": the pragmatic listener, reasoning about S1's production choices, likewise assigns bobWent the highest probability.

Parametric Robustness (Fig. 1, left panel) #

Fragment interpretation works for ANY noise rate δ > 0. Since "Bob" can only arise from "Bob went to the movies" via deletion, L0 assigns probability 1 regardless of δ. This is the paper's key theoretical result: "this reasoning will work even if the noise rate is arbitrarily close to 0, so long as it is positive."

We prove this by showing that the noisy meaning at "bob" is δ for bobWent and 0 for all other meanings: only "Bob went to the movies" can produce "Bob" via noise deletion, and only with meaning bobWent. Since L0 normalizes the meaning, L0(bobWent | "bob") = δ/δ = 1.
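The δ/δ = 1 argument can be replayed numerically. A sketch with hypothetical names (uniform priors cancel in the normalization, so they are omitted):

```python
# Robustness check: for any deletion rate delta > 0, the noisy L0 assigns
# the fragment "bob" to bobWent with probability 1, because "bob went" is
# the only sentence that can produce "bob".
meanings = ["bobWent", "aliceWent", "nobodyWent"]
source = {"bobWent": "bob went", "aliceWent": "alice went",
          "nobodyWent": "nobody went"}  # the one sentence true of each meaning

def L0_at_bob(delta):
    # Eq. 6 numerator at u_p = "bob": delta for bobWent, 0 otherwise.
    score = {m: (delta if source[m] == "bob went" else 0.0) for m in meanings}
    Z = sum(score.values())
    return {m: s / Z for m, s in score.items()}

for delta in [1e-5, 1e-3, 1e-1]:
    assert L0_at_bob(delta)["bobWent"] == 1.0   # independent of delta
```
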

Prosody #

From the paper (§4):

(2) A: Who went to the movies? B: BOB went to the movies.

Prosodic stress reduces the noise rate on stressed words. The listener infers that stress → the speaker had reason to protect that word → the speaker has exhaustive knowledge → only Bob went.

Mechanism #

1. A speaker with exhaustive knowledge (only Bob went) needs the listener to correctly hear "Bob"; a misheard "Alice" would be wrong.
2. Therefore: stress "Bob" to reduce noise.
3. A speaker with non-exhaustive knowledge (Bob went, maybe others too) has less need to protect the word, since "Alice" is also compatible.
4. The listener infers: stress → exhaustive knowledge.
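This mechanism can be reproduced end to end with a small Python sketch of Eqs. 6-8 (three meanings, four utterances; the ε value, α, and the confusion structure are illustrative choices, not the library's):

```python
import math

# Prosody sketch: stress halves the subject-confusion rate, and the pragmatic
# listener rewards stressed "BOB went" with a more exhaustive reading.
meanings = ["onlyBob", "onlyAlice", "both"]
utts = ["bob went", "BOB went", "alice went", "ALICE went"]
true_in = {  # lower-bound semantics: "X went" is true whenever X went
    "bob went": {"onlyBob", "both"}, "BOB went": {"onlyBob", "both"},
    "alice went": {"onlyAlice", "both"}, "ALICE went": {"onlyAlice", "both"},
}
eps = 0.2    # illustrative noise rate

def P_N(up, ui):
    # Subject confusion within a stress level; stress halves the rate.
    rate = eps / 2 if ui[0].isupper() else eps
    other = {"bob went": "alice went", "alice went": "bob went",
             "BOB went": "ALICE went", "ALICE went": "BOB went"}[ui]
    if up == ui:
        return 1 - rate
    return rate if up == other else 0.0

def L0(m, up):  # Eq. 6 with uniform priors (they cancel)
    score = lambda mm: sum(P_N(up, ui) for ui in utts if mm in true_in[ui])
    return score(m) / sum(score(mm) for mm in meanings)

def S1(ui, m, alpha=1.0):  # softmax of Eq. 7, zero cost; false utterances get 0
    def raw(u):
        if m not in true_in[u]:
            return 0.0
        return math.exp(alpha * sum(P_N(up, u) * math.log(L0(m, up))
                                    for up in utts))
    return raw(ui) / sum(raw(u) for u in utts)

def L1(m, up):  # Eq. 8 with uniform meaning prior
    score = lambda mm: sum(S1(ui, mm) * P_N(up, ui) for ui in utts)
    return score(m) / sum(score(mm) for mm in meanings)

print(L1("onlyBob", "BOB went"))  # stressed: stronger exhaustive reading
print(L1("onlyBob", "bob went"))
```

With these toy numbers the stressed utterance already yields a noticeably higher probability for the exhaustive meaning, even though stress carries no cost in this sketch: protecting the subject is simply more valuable to the exhaustive speaker.
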

Model Setup #

Meanings: who went to the movies (exhaustive vs non-exhaustive).

Utterances: with and without prosodic stress (CAPS = stress).

Literal meaning: lower-bound semantics. "Alice went" is true if Alice went (regardless of others).

Noise channel with prosody.

The confusion is between subjects: "Alice" ↔ "Bob".

• No stress: ε chance of subject confusion
• With stress: ε/2 chance (stress reduces noise)
Noisy L0 meaning (Eq. 6 numerator) for the prosody model.

noncomputable def Phenomena.Ellipsis.Studies.BergenGoodman2015.ProsodyModel.noisyS1Score
    (ε : ℝ) (l0 : UtteranceMeaning) (α : ℝ) : Unit → (m : Meaning) → (u_p : Utterance) → ℝ

Noise-folded S1 score (Eqs. 7-8) for the prosody model.

RSAConfig for the prosody model.

Stress increases exhaustive interpretation.

"BOB went" (stressed) is strictly more likely to mean "only Bob" than "Bob went" (unstressed). This is the paper's central prosody prediction (§4).

Under both the stressed and unstressed utterances, the listener assigns positive probability to "both went" (it is compatible with either prosody). We show this via comparison: .both gets more probability than .onlyAlice under each utterance.

Bridge: FragmentAnswers #

The noisy channel model explains the data in Phenomena.Ellipsis.FragmentAnswers.

The model also accounts for non-question fragments (fragmentAssertion, fragmentTopic): noise-based inference is not restricted to question-answer pairs — any context where deletion is plausible licenses fragment use.

Bridge: ProsodicExhaustivity #

The noisy channel model explains the data in Phenomena.Focus.ProsodicExhaustivity.

The model derives this from noise reduction: stress lowers the noise rate, and the listener infers that the speaker had reason to protect the stressed word, implying exhaustive knowledge.

Connection to RSA.Noise #

RSA.Noise defines the fundamental noise channel operation:

noiseChannel(match, mismatch, b) = match · b + mismatch · (1 - b)

Both the ellipsis deletion channel and the prosody confusion channel are special cases. The key insight from @cite{bergen-goodman-2015} is that noise can be strategically exploited — a feature not shared by @cite{degen-etal-2020}'s semantic noise model (see Phenomena.Reference.Studies.DegenEtAl2020).
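A short sketch of how both channels instantiate this operation (the parameter values are illustrative):

```python
def noise_channel(match, mismatch, b):
    # RSA.Noise's two-outcome operation; b indicates "perceived == intended".
    return match * b + mismatch * (1 - b)

delta, eps = 0.01, 0.2   # illustrative rates
# Ellipsis deletion channel: intact with 1 - delta, fragment with delta.
assert noise_channel(1 - delta, delta, 1) == 1 - delta
assert noise_channel(1 - delta, delta, 0) == delta
# Prosody confusion channel: eps unstressed, eps / 2 stressed.
assert noise_channel(1 - eps, eps, 0) == eps
assert noise_channel(1 - eps / 2, eps / 2, 0) == eps / 2
```
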

| Property | @cite{bergen-goodman-2015} | @cite{degen-etal-2020} |
|---|---|---|
| Noise location | Channel (transmission) | Semantics (perception) |
| Type | P_N(u_p \| u_i) | φ(u, o) ∈ [0,1] |
| Effect | Word corruption | Graded feature match |
| Strategic use | Yes (ellipsis, prosody) | No |

Prosodic stress increases channel discrimination between intended and confused utterances. Stressed "BOB went" has a larger gap between correct and confused perception than unstressed "Bob went".

• Stressed gap: (1 - ε/2) - ε/2 = 1 - ε
• Unstressed gap: (1 - ε) - ε = 1 - 2ε
• Difference: (1 - ε) - (1 - 2ε) = ε > 0
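A quick arithmetic check of these gaps (the ε value is arbitrary):

```python
# Channel discrimination gaps; holds for any 0 < eps < 1/2.
eps = 0.2
stressed_gap = (1 - eps / 2) - eps / 2    # = 1 - eps
unstressed_gap = (1 - eps) - eps          # = 1 - 2*eps
assert abs(stressed_gap - (1 - eps)) < 1e-12
assert abs(unstressed_gap - (1 - 2 * eps)) < 1e-12
assert abs((stressed_gap - unstressed_gap) - eps) < 1e-12  # stress widens by eps
```
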