Linglib.Theories.Pragmatics.RSA.Extensions.NoncooperativeCommunication

Noncooperative Communication: Unified Argumentative RSA #

@cite{barnett-griffiths-hawkins-2022} @cite{cummins-2025} @cite{cummins-franke-2021} @cite{merin-1999} @cite{sperber-2010} @cite{goodman-stuhlmuller-2013} @cite{yoon-etal-2020}

Unifies @cite{cummins-franke-2021}'s argumentative strength framework and @cite{barnett-griffiths-hawkins-2022}'s persuasive RSA into a single parameterized model, following @cite{cummins-2025}'s analysis of noncooperative communication.

Core Unification #

Both models instantiate the same weighted-utility architecture:

U(u; w, G) = U_epistemic(u; w) + β · U_goal(u; G)

Model           | U_goal          | β
Standard RSA    | (none)          | 0
Barnett et al.  | ln P_L0(w*∣u)   | fitted (β̂ = 2.26)
C&F (semantic)  | argStr(u, G)    | implicit (speaker maximizes argStr)
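The additive architecture is easy to sketch numerically. The following is a minimal Float sketch with hypothetical utility values; `weightedUtility` is an illustrative name, not a definition from this module.

```lean
-- Hedged sketch: U(u; w, G) = U_epistemic(u; w) + β · U_goal(u; G),
-- with made-up utility values.
def weightedUtility (uEpi uGoal β : Float) : Float :=
  uEpi + β * uGoal

-- β = 0: the goal term drops out, recovering standard RSA.
#eval weightedUtility 0.7 0.3 0.0 == 0.7   -- true

-- At Barnett et al.'s fitted β̂ = 2.26, the goal term dominates the sum.
#eval weightedUtility 0.7 0.3 2.26
```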

The parameter β controls the cooperativity spectrum: β = 0 is fully cooperative (standard RSA), and increasing β > 0 weights the speaker's goal more heavily relative to informativity.

Epistemic Vigilance #

Following @cite{sperber-2010}, the hearer's interpretation is a trust-weighted mixture of pragmatic and literal posteriors:

P_vigilant(w∣u) = τ · P_L1(w∣u) + (1−τ) · P_L0(w∣u)

where τ ∈ [0,1] is the hearer's trust in speaker cooperativity.
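A minimal Float sketch of the mixture, with hypothetical posterior values (`vigilant` is an illustrative name, not the module's `vigilantPosterior`):

```lean
-- P_vigilant(w|u) = τ · P_L1(w|u) + (1−τ) · P_L0(w|u)
def vigilant (τ l1Post l0Post : Float) : Float :=
  τ * l1Post + (1 - τ) * l0Post

-- τ = 1 recovers the pragmatic posterior; τ = 0 the literal one.
#eval vigilant 1.0 0.8 0.4 == 0.8   -- true
#eval vigilant 0.0 0.8 0.4 == 0.4   -- true

-- Intermediate trust lands strictly between L0 and L1.
#eval (vigilant 0.5 0.8 0.4 > 0.4) && (vigilant 0.5 0.8 0.4 < 0.8)   -- true
```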

Meaning-Level Taxonomy #

@cite{cummins-2025} identifies four levels at which falsehood can occur: assertion, implicature, presupposition, and typicality departure. Both C&F and Barnett involve misleading at the typicality/implicature level while maintaining truthful assertions — the argumentative speaker exploits pragmatic expectations without violating Quality.

Speaker orientation on the cooperativity spectrum.

  • cooperative: β=0, speaker maximizes hearer's accurate belief (standard RSA)
  • argumentative: β>0, speaker has a goal G and balances informativity and persuasion (@cite{cummins-franke-2021}, @cite{barnett-griffiths-hawkins-2022}, @cite{macuch-silva-etal-2024})

The distinction is continuous: β parameterizes the spectrum.

      Classify speaker orientation from the goal weight λ ∈ [0,1]

        goalOrientedUtility and combinedWeighted are the same function up to a positive scaling: the additive weights (1, β) rescale to the convex weights (1/(1+β), β/(1+β)).

        Both representations agree at β=0: cooperative RSA.

        theorem RSA.NoncooperativeCommunication.bayesFactor_monotone_in_posterior (bf₁ bf₂ pG : ℝ) (hG : 0 < pG) (hG1 : pG < 1) (hbf₁ : 0 < bf₁) (_hbf₂ : 0 < bf₂) (hord : bf₂ < bf₁) :
        bf₂ * pG / (bf₂ * pG + (1 - pG)) < bf₁ * pG / (bf₁ * pG + (1 - pG))

        The ordinal bridge: an utterance with higher Bayes factor (C&F's measure) also induces a higher posterior for the goal (Barnett's measure).

        P(G|u) = bf · P(G) / (bf · P(G) + P(¬G))

        This function is strictly increasing in bf when P(G) ∈ (0,1).
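This monotonicity can be checked on hypothetical numbers; the def below is an illustrative Float sketch, not the formalized statement.

```lean
-- P(G|u) = bf · P(G) / (bf · P(G) + P(¬G))
def posteriorFromBF (bf pG : Float) : Float :=
  bf * pG / (bf * pG + (1 - pG))

-- A higher Bayes factor yields a strictly higher posterior for the goal
-- (here: 0.75 vs 2/3 at a flat prior of 0.5).
#eval posteriorFromBF 3.0 0.5 > posteriorFromBF 2.0 0.5   -- true
```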

        theorem RSA.NoncooperativeCommunication.positive_argStr_iff_posterior_above_prior (pG pNotG prior : ℝ) (hG : 0 < pG) (hNotG : 0 < pNotG) (hPrior : 0 < prior) (hPrior1 : prior < 1) :
        ArgumentativeStrength.hasPositiveArgStr pG pNotG ↔ pG * prior / (pG * prior + pNotG * (1 - prior)) > prior

        C&F's positive argumentative strength (bayesFactor > 1) iff the utterance shifts the posterior above the prior.

        bayesFactor(u,G) > 1 iff P(G|u) > P(G)

        This connects C&F's ordinal condition hasPositiveArgStr to the Bayesian posterior shift that Barnett's model operates on.
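A numeric sketch of the biconditional, with hypothetical likelihoods (names are illustrative, not this module's definitions):

```lean
-- bayesFactor(u,G) = P(u|G) / P(u|¬G)
def bayesFactor (pU_G pU_notG : Float) : Float := pU_G / pU_notG

def goalPosterior (pU_G pU_notG prior : Float) : Float :=
  pU_G * prior / (pU_G * prior + pU_notG * (1 - prior))

-- bf > 1 and posterior-above-prior agree in both directions.
#eval (bayesFactor 0.6 0.3 > 1) == (goalPosterior 0.6 0.3 0.4 > 0.4)   -- true
#eval (bayesFactor 0.2 0.4 > 1) == (goalPosterior 0.2 0.4 0.4 > 0.4)   -- true
```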

        Level of meaning at which falsehood can occur.

        @cite{cummins-2025} identifies four levels, ordered by speaker blameworthiness: assertion > implicature > presupposition > typicality.

        Both C&F and Barnett involve misleading at the typicality/implicature level while maintaining truthful assertions — the argumentative speaker exploits pragmatic expectations without violating Quality at the assertion level.

            Epistemic vigilance: the hearer's trust in speaker cooperativity.

            Following @cite{sperber-2010}, as discussed in @cite{cummins-2025} §4:

            1. Hearer first interprets as if speaker is cooperative (stance of trust)
            2. Then weighs the pragmatic interpretation by trust level τ
            3. Falls back toward literal interpretation as trust decreases

            The vigilant posterior is: P_V(w|u) = τ · P_L1(w|u) + (1−τ) · P_L0(w|u)

            This is CombinedUtility.combined(τ, L0_posterior, L1_posterior).

                Full trust: standard pragmatic interpretation

                  No trust: literal interpretation only

                    Vigilant listener posterior: trust-weighted mixture of L1 and L0.

                    When the hearer suspects the speaker is argumentative (low τ), they discount pragmatic enrichment and rely more on literal meaning.

                      At full trust, vigilant listener = pragmatic listener L1

                      At zero trust, vigilant listener = literal listener L0

                      Vigilant posterior IS CombinedUtility.combined(τ, L0, L1). The epistemic vigilance machinery reuses the same interpolation framework as the speaker's utility tradeoff.

                      theorem RSA.NoncooperativeCommunication.vigilant_convex (ev : EpistemicVigilance) (l1Post l0Post : ℝ) :
                      min l0Post l1Post ≤ vigilantPosterior ev l1Post l0Post ∧ vigilantPosterior ev l1Post l0Post ≤ max l0Post l1Post

                      Vigilant posterior is a convex combination when τ ∈ [0,1]

                      The unified noncooperative RSA model has two parameters:

                      • goalWeight (speaker side): convex weight on goal utility ∈ [0,1]
                      • τ (hearer side): trust level (1 = full trust, 0 = literal only)

                      Both sides use combined (convex interpolation), making the symmetry explicit:

                      • Speaker: combined goalWeight uEpi uGoal
                      • Hearer: combined τ l0Post l1Post (via vigilantPosterior)

                      Standard RSA is the special case goalWeight=0, τ=1. @cite{barnett-griffiths-hawkins-2022} is goalWeight=226/326 (≈ 0.693, from β=2.26), τ=1. A suspicious hearer facing an argumentative speaker would have high goalWeight, low τ.

                      • goalWeight : ℝ

                        Speaker's goal-orientation weight ∈ [0,1]

                      • τ : ℝ

                        Hearer's trust level

                      • goalWeight_nonneg : 0 ≤ self.goalWeight
                      • goalWeight_le_one : self.goalWeight ≤ 1
                      • τ_nonneg : 0 ≤ self.τ
                      • τ_le_one : self.τ ≤ 1

                          Standard cooperative RSA: no goal bias, full trust

                            @cite{barnett-griffiths-hawkins-2022} fitted model: goalWeight = β/(1+β) = 226/326 ≈ 0.693, pragmatic group with full trust.

                            Original paper parameterization: β̂ = 2.26 (additive form). Convex reparameterization: goalWeight = 2.26/3.26 = 226/326.

                              def RSA.NoncooperativeCommunication.fullModel (params : NoncooperativeRSAParams) (uEpi uGoal l1Post l0Post : ℝ) : ℝ × ℝ

                              In the unified model, BOTH sides use combined (convex interpolation):

                              • Speaker: combined goalWeight uEpi uGoal
                              • Hearer: vigilantPosterior(τ, L1, L0) = combined τ L0 L1

                              Full model: speaker chooses u to maximize combined(goalWeight, U_epi, U_goal), listener computes combined(τ, L0(w|u), L1(w|u)).

                                theorem RSA.NoncooperativeCommunication.fullModel_standard (uEpi uGoal l1Post l0Post : ℝ) :
                                fullModel standardRSA uEpi uGoal l1Post l0Post = (uEpi, l1Post)

                                The unified model at standard parameters reduces to (U_epi, L1) — the standard RSA speaker utility and pragmatic listener.

                                Barnett et al.'s Eq. 6 (additive: U + β·V) is a scaled version of the convex combined form used in fullModel:

                                goalOrientedUtility uEpi uGoal β = (1+β) · combined(β/(1+β), uEpi, uGoal)

                                Since scaling by (1+β) > 0 preserves ranking, the additive and convex parameterizations are strategically equivalent.
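The scaling identity can be checked numerically; `goalOriented` and `convexCombined` below are illustrative Float sketches of the two parameterizations, with hypothetical utilities.

```lean
-- Additive form: U + β·V.  Convex form: (1−w)·U + w·V with w = β/(1+β).
def goalOriented (uEpi uGoal β : Float) : Float := uEpi + β * uGoal
def convexCombined (w a b : Float) : Float := (1 - w) * a + w * b

-- goalOriented U V β = (1+β) · convexCombined (β/(1+β)) U V,
-- so the two forms induce the same utterance ranking.
#eval (goalOriented 0.7 0.3 2.26
        - (1 + 2.26) * convexCombined (2.26 / (1 + 2.26)) 0.7 0.3).abs < 1e-9   -- true
```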

                                theorem RSA.NoncooperativeCommunication.vigilant_mono_trust (l1Post l0Post : ℝ) (ev1 ev2 : EpistemicVigilance) (hord : ev1.trustLevel < ev2.trustLevel) (hne : l0Post < l1Post) :
                                vigilantPosterior ev1 l1Post l0Post < vigilantPosterior ev2 l1Post l0Post

                                The vigilant posterior is monotone in trust: more trust pulls the posterior toward L1. When L1 is misleading, more trust = more misled.

                                This is the hearer-side consequence of higher_lambda_when_B_dominates: the vigilant posterior is combined τ L0 L1, so increasing τ gives more weight to L1 (the B component).

                                theorem RSA.NoncooperativeCommunication.vigilant_mono_trust_sym (l1Post l0Post : ℝ) (ev1 ev2 : EpistemicVigilance) (hord : ev1.trustLevel < ev2.trustLevel) (hne : l1Post < l0Post) :
                                vigilantPosterior ev2 l1Post l0Post < vigilantPosterior ev1 l1Post l0Post

                                Symmetric: when L1 < L0, more trust DECREASES the posterior (toward L1).

                                theorem RSA.NoncooperativeCommunication.pragmatic_vulnerability (l1Post l0Post : ℝ) (ev : EpistemicVigilance) (hτ0 : 0 < ev.trustLevel) (hτ1 : ev.trustLevel < 1) (h_diverge : l0Post < l1Post) :
                                l0Post < vigilantPosterior ev l1Post l0Post ∧ vigilantPosterior ev l1Post l0Post < l1Post

                                Pragmatic vulnerability (@cite{cummins-2025} §4): pragmatic inference is exploitable precisely because it is rational.

                                When L1 diverges from L0 (l0 < l1), the fully-pragmatic listener (τ=1) is maximally exposed to the divergence. Epistemic vigilance (0 < τ < 1) strictly pulls the posterior back toward the immune L0:

                                The weak evidence effect (@cite{barnett-griffiths-hawkins-2022}, weak_evidence_effect_s4) is a concrete instance: L0 correctly identifies stick 4 as evidence for "longer", but L1 at β=2 overshoots in the wrong direction. Reducing τ from 1 would pull the posterior back toward L0's correct assessment.

                                Linguistic content: An argumentative speaker (goalWeight > 0) exploits L1's reasoning about speaker intentions. L0, which interprets literally without modeling the speaker, cannot be manipulated this way. Vigilance (epistemic caution about speaker cooperativity) is the rational defense.

                                The converse also holds: if L1 = L0, no speaker parameterization can exploit the difference, since vigilantPosterior degenerates to L0 = L1 at any τ.

                                theorem RSA.NoncooperativeCommunication.pragmatic_vulnerability_sym (l1Post l0Post : ℝ) (ev : EpistemicVigilance) (hτ0 : 0 < ev.trustLevel) (hτ1 : ev.trustLevel < 1) (h_diverge : l1Post < l0Post) :
                                l1Post < vigilantPosterior ev l1Post l0Post ∧ vigilantPosterior ev l1Post l0Post < l0Post

                                Symmetric vulnerability: when L1 undershoots L0 (l1 < l0), vigilance pulls the posterior upward from L1 toward L0.

                                When L1 = L0, pragmatic inference adds nothing exploitable: the vigilant posterior equals L0 = L1 at any trust level τ. No speaker parameterization can create a gap to exploit. This is the formal converse of pragmatic_vulnerability.

                                theorem RSA.NoncooperativeCommunication.vigilant_deviation_exact (ev : EpistemicVigilance) (l1Post l0Post : ℝ) :
                                vigilantPosterior ev l1Post l0Post - l0Post = ev.trustLevel * (l1Post - l0Post)

                                Exact signed deviation from L0: the vigilant posterior deviates from L0 by exactly τ · (L1 - L0).

                                τ is literally the fraction of L1's divergence from L0 that the listener absorbs. At τ=0: deviation = 0 (immune). At τ=1: deviation = L1-L0 (fully exposed).
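The identity is a one-liner to check on hypothetical numbers (illustrative names; Float arithmetic, so equality up to rounding):

```lean
def vigilantMix (τ l1Post l0Post : Float) : Float :=
  τ * l1Post + (1 - τ) * l0Post

-- deviation from L0  =  τ · (L1 − L0):
-- at τ = 0.3 the listener absorbs exactly 30% of the L1–L0 gap.
#eval ((vigilantMix 0.3 0.8 0.4 - 0.4) - 0.3 * (0.8 - 0.4)).abs < 1e-9   -- true
```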

                                theorem RSA.NoncooperativeCommunication.vulnerability_gap_exact (ev : EpistemicVigilance) (l1Post l0Post : ℝ) :
                                l1Post - vigilantPosterior ev l1Post l0Post = (1 - ev.trustLevel) * (l1Post - l0Post)

                                Exact gap between L1 and the vigilant posterior: the amount of L1's divergence that vigilance closes is (1-τ) · (L1 - L0).

                                At τ=0: gap = L1-L0 (fully closed). At τ=1: gap = 0 (nothing closed).

                                theorem RSA.NoncooperativeCommunication.exploitability_scales_as_tau_sq (ev : EpistemicVigilance) (l1Post l0Post : ℝ) :
                                (vigilantPosterior ev l1Post l0Post - l0Post) ^ 2 = ev.trustLevel ^ 2 * (l1Post - l0Post) ^ 2

                                Squared exploitability scales as τ²: the squared deviation of the vigilant posterior from L0 is exactly τ² times the squared L1-L0 gap.

                                This gives the precise risk calculus for trust: doubling τ quadruples the squared deviation from L0.

                                theorem RSA.NoncooperativeCommunication.vigilant_error_decomposition (ev : EpistemicVigilance) (truth l1Post l0Post : ℝ) :
                                vigilantPosterior ev l1Post l0Post - truth = ev.trustLevel * (l1Post - truth) + (1 - ev.trustLevel) * (l0Post - truth)

                                Error decomposition: the vigilant posterior's deviation from truth decomposes as a τ-weighted sum of L1's error and L0's error.

                                (vigilant - truth) = τ · (L1 - truth) + (1-τ) · (L0 - truth)

                                This is the fundamental equation of epistemic vigilance: the listener's error is a convex combination of the two listeners' errors.

                                theorem RSA.NoncooperativeCommunication.vigilant_error_when_l0_correct (ev : EpistemicVigilance) (l1Post truth : ℝ) :
                                (vigilantPosterior ev l1Post truth - truth) ^ 2 = ev.trustLevel ^ 2 * (l1Post - truth) ^ 2

                                When L0 is perfectly calibrated (l0 = truth), the squared error of the vigilant posterior reduces to τ² · (L1 - truth)².

                                This is the strongest quantitative vulnerability result: if the literal listener is correct (as in Barnett's weak evidence domain where L0 correctly assesses stick 4), then vigilance reduces squared error by exactly a factor of τ². At τ=1 you absorb all of L1's error; at τ=0 you absorb none.

                                theorem RSA.NoncooperativeCommunication.optimal_vigilance (truth l0Post l1Post : ℝ) (hne : l1Post ≠ l0Post) :
                                have τ_opt := (truth - l0Post) / (l1Post - l0Post); τ_opt * l1Post + (1 - τ_opt) * l0Post = truth

                                Zero-error vigilance: there exists a unique τ that makes the vigilant posterior exactly equal truth, namely τ* = (truth - L0) / (L1 - L0).

                                When truth is between L0 and L1, τ* ∈ [0,1] (optimal_vigilance_in_range), so it corresponds to a valid trust level.

                                The optimal τ* has a natural interpretation: it is the relative position of truth within the [L0, L1] interval. If truth is at L0 (literal listener is perfect), τ*=0. If truth is at L1 (pragmatic listener is perfect), τ*=1. If truth is at the midpoint, τ*=1/2.
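A numeric sketch of τ* (illustrative names; hypothetical values with truth inside [L0, L1]):

```lean
-- τ* = (truth − L0) / (L1 − L0)
def optimalTrust (truth l0Post l1Post : Float) : Float :=
  (truth - l0Post) / (l1Post - l0Post)

def vigilantAt (τ l1Post l0Post : Float) : Float :=
  τ * l1Post + (1 - τ) * l0Post

-- L0 = 0.4, L1 = 0.8, truth = 0.5 gives τ* ≈ 0.25, and the vigilant
-- posterior at τ* recovers truth (up to Float rounding).
#eval (vigilantAt (optimalTrust 0.5 0.4 0.8) 0.8 0.4 - 0.5).abs < 1e-9   -- true
```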

                                theorem RSA.NoncooperativeCommunication.optimal_vigilance_in_range (truth l0Post l1Post : ℝ) (hne : l0Post < l1Post) (hlo : l0Post ≤ truth) (hhi : truth ≤ l1Post) :
                                0 ≤ (truth - l0Post) / (l1Post - l0Post) ∧ (truth - l0Post) / (l1Post - l0Post) ≤ 1

                                The optimal τ* is in [0,1] when truth is between L0 and L1.

                                def RSA.NoncooperativeCommunication.backfire_generalization (n : ℕ) (hn : 2 ≤ n) (l0_goal l1_goal : Fin n → ℝ) (prior : ℝ) : Prop

                                Backfire generalization conjecture: the weak evidence effect and scalar implicature are structurally identical phenomena.

                                Whenever a pragmatic listener expects the speaker to use the strongest available utterance, observing a non-maximal one triggers a negative inference that can reverse the literal evidence.

                                Instances:

                                • Scalar implicature: hearing "some" → infer speaker couldn't say "all" → conclude ¬all
                                • Weak evidence effect: seeing stick 4 → infer speaker lacked stick 5 → conclude probably not "longer"
                                • Polite understatement: "not terrible" → infer speaker couldn't honestly say "good" → conclude mediocre

                                Abstract pattern: Let U = {u₁,..., uₙ} be utterances ordered by strength, and let L0(goal | uᵢ) be monotone in i. If the speaker is goal-oriented (prefers stronger utterances when goal is true), then for some non-maximal uᵢ:

                                L0(goal | uᵢ) > prior AND L1(goal | uᵢ, β) < prior

                                The pragmatic listener's inference — "if the speaker had stronger evidence, they would have used it" — reverses the literal evidence.

                                Formal statement: Given L0 posteriors monotone in utterance strength with at least two values above the prior, and L1 posteriors derived from a goal-oriented speaker (β > 0), there exists a non-maximal utterance where L0 says "goal is likely" but L1 says "goal is unlikely."

                                Evidence: weak_evidence_effect_s4 (BarnettEtAl2022) demonstrates this at β=2. A general proof requires formalizing L1 Bayesian inversion over abstract strength-ordered RSA scenarios.

                                What's missing for a proof: The L1 computation (Bayesian inversion of the speaker model) must be formalized generically over ordered utterance sets. The key step is showing that the speaker's monotone preference concentrates probability mass on maximal utterances, making the L1 posterior decrease for non-maximal ones. This is essentially the derivation of quantity implicatures from RSA, generalized beyond scales.

                                  theorem RSA.NoncooperativeCommunication.barnett_backfire_instance :
                                  backfire_generalization 5
                                    (fun (i : Fin 5) => match i with
                                      | ⟨0, isLt⟩ => 1 / 6
                                      | ⟨1, isLt⟩ => 1 / 3
                                      | ⟨2, isLt⟩ => 1 / 3
                                      | ⟨3, isLt⟩ => 1 / 2
                                      | ⟨4, isLt⟩ => 2 / 3
                                      | ⟨n.succ.succ.succ.succ.succ, h⟩ => absurd h)
                                    (fun (i : Fin 5) => match i with
                                      | ⟨0, isLt⟩ => 1 / 6
                                      | ⟨1, isLt⟩ => 1 / 3
                                      | ⟨2, isLt⟩ => 1 / 3
                                      | ⟨3, isLt⟩ => 7 / 20
                                      | ⟨4, isLt⟩ => 2 / 3
                                      | ⟨n.succ.succ.succ.succ.succ, h⟩ => absurd h)
                                    (2 / 5)

                                  The Barnett stick domain instantiates backfire_generalization.

                                  Sticks {s1,...,s5} ordered by length. L0(longer | sᵢ) is monotone (l0Longer_monotone). Sticks s4 and s5 both have L0 above the prior (s4_positive_under_l0, s5_strongest_evidence). At β=2, stick 4 backfires (weak_evidence_effect_s4).
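The witness can be checked directly on the posterior tables from the theorem statement (Float sketch; the lists transcribe the Fin 5 functions above, prior = 2/5):

```lean
-- L0 and L1 posteriors for "longer" after sticks s1..s5 (index 3 = stick 4).
def l0Table : List Float := [1/6, 1/3, 1/3, 1/2, 2/3]
def l1Table : List Float := [1/6, 1/3, 1/3, 7/20, 2/3]
def priorG : Float := 2/5

-- Stick 4 backfires: above the prior under L0, below it under L1 (β = 2).
#eval (l0Table[3]! > priorG) && (l1Table[3]! < priorG)   -- true
```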

                                  TODO: Prove via Fin 5 ↔ Stick correspondence and the existing BarnettEtAl2022 theorems.