Linglib.Phenomena.ScalarImplicatures.Studies.BarnettEtAl2022

@cite{barnett-griffiths-hawkins-2022}: A Pragmatic Account of the Weak Evidence Effect #

@cite{barnett-griffiths-hawkins-2022}

Extends RSA with a persuasive speaker who has a goal state w* that may differ from the true world state w. The speaker's utility combines epistemic and persuasive components (Eq. 6):

U(u; w, w*) = U_epi(u; w) + β · U_pers(u; w*)

where U_epi = ln P_L0(w|u) and U_pers = ln P_L0(w*|u). The parameter β controls persuasive bias (β=0 recovers standard RSA).
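As a rough sketch of how this utility shapes the speaker's choices, assume the usual RSA softmax speaker with a rationality parameter α. The softmax form is the standard RSA convention and is an assumption here, not something stated at this point in the source. Exponentiating the additive utility turns it into a product of literal-listener probabilities:

```latex
\begin{align*}
S(u \mid w, w^*) &\propto \exp\bigl(\alpha\, U(u; w, w^*)\bigr) \\
  &= \exp\bigl(\alpha \ln P_{L_0}(w \mid u) + \alpha\beta \ln P_{L_0}(w^* \mid u)\bigr) \\
  &= P_{L_0}(w \mid u)^{\alpha} \cdot P_{L_0}(w^* \mid u)^{\alpha\beta}
\end{align*}
```

Setting β = 0 makes the second factor equal to 1, leaving the standard RSA speaker P_L0(w|u)^α.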

Key Result: The Weak Evidence Effect #

When β > 0, weak positive evidence can backfire: a pragmatic listener who expects the speaker to show the strongest available evidence infers from the absence of strong evidence that none exists, and so shifts beliefs in the direction opposite to the evidence shown.

Stick Contest Domain #

The paper's experiment uses 5 sticks from {1,...,9} (C(9,5)=126 worlds, midpoint 5). We formalize a simplified Stick Contest (3 sticks from {1,...,5}, 10 worlds, midpoint 3) that preserves the key structural properties: the prior favors ¬longer (P(longer)=0.4), and sticks have monotonically increasing L0(longer|·) values. The simplification enables verified interval arithmetic via rsa_predict.
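The simplified domain is small enough to enumerate directly. The sketch below is in Python rather than Lean and is purely illustrative. It assumes a uniform prior over worlds and reads "longer" as "the sample mean is strictly above the midpoint"; both assumptions are consistent with P(longer) = 0.4 but are not spelled out in the text above. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

STICKS = range(1, 6)   # stick lengths 1..5
MIDPOINT = 3

# A world is a set of 3 distinct sticks; C(5,3) = 10 worlds.
worlds = [frozenset(c) for c in combinations(STICKS, 3)]

def is_longer(world):
    """Assumption: 'longer' means the sample mean strictly exceeds the midpoint."""
    return Fraction(sum(world), len(world)) > MIDPOINT

def l0_longer(stick):
    """L0(longer | stick): share of the worlds containing `stick` that are 'longer'."""
    compatible = [w for w in worlds if stick in w]
    return Fraction(sum(is_longer(w) for w in compatible), len(compatible))

# Uniform prior over worlds (assumption).
prior_longer = Fraction(sum(is_longer(w) for w in worlds), len(worlds))
l0 = {s: l0_longer(s) for s in STICKS}

for s in STICKS:
    print(f"L0(longer | s{s}) = {l0[s]}")
```

Under these assumptions sticks 2 and 3 both come out at 1/3, so the ordering is non-decreasing rather than strictly increasing.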

RSAConfig Design #

The paper's Eq. 8 gives:

S(u|w, w*=longer, β) ∝ L0(longer|u)^β · 𝟙[u ∈ w]

Since the paper fixes α=1 and treats αβ as a single parameter, RSAConfig's α plays the role of β. The s1Score uses precomputed L0(longer|u) values squared (β=2), gated by stick availability.
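A minimal Python sketch of the gated score and its normalization follows. It is not the Lean definition: the function names are hypothetical, and the L0(longer|u) table assumes a uniform prior over the 10 worlds, with "longer" read as mean above 3.

```python
from fractions import Fraction

# L0(longer | stick) for the simplified domain (assumed values, see lead-in).
L0_LONGER = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 2),
    5: Fraction(2, 3),
}
BETA = 2  # integer exponent keeps the arithmetic exact

def s1_score(stick, world, beta=BETA):
    """Unnormalized score: L0(longer|u)^beta, gated by stick availability."""
    return L0_LONGER[stick] ** beta if stick in world else Fraction(0)

def s1(stick, world, beta=BETA):
    """Speaker probability: the score normalized over all sticks."""
    total = sum(s1_score(u, world, beta) for u in L0_LONGER)
    return s1_score(stick, world, beta) / total

world = frozenset({3, 4, 5})
print({u: s1(u, world) for u in sorted(world)})
```

In the world {3, 4, 5} the speaker shows stick 5 more often than stick 4, which is the preference the pragmatic listener later reasons about.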

Findings #

| # | Finding | Theorem |
|---|---------|---------|
| 1 | L0: stick 5 is positive evidence for "longer" | l0_s5_positive |
| 2 | L0: stick 5 is the strongest evidence | l0_s5_strongest |
| 3 | L0: stick 1 is evidence against "longer" | l0_s1_negative |
| 4 | WEE: stick 4 backfires under L1 (β=2) | weak_evidence_effect |
| 5 | Strong evidence works: stick 5 favors "longer" under L1 | strong_evidence_works |
| 6 | L0(longer\|·) is monotone in stick length | |
| 7 | Stick 4 has positive argStr yet backfires | argStr_positive_but_backfires |
| 8 | Model predicts the observed interaction effect | model_predicts_interaction |
| 9 | Pragmatic group shows backfire in experiment | pragmatic_backfire |
| 10 | RSA speaker-dependent model fits best | rsa_speaker_dep_best_waic |

Stick lengths 1–5

      Worlds: sets of 3 distinct sticks from {1,...,5}. C(5,3) = 10 worlds.

          Whether a stick is available in a given world.

            S1 score as ℚ: L0(longer|u)^β · 𝟙[u ∈ w], at β=2. The squared L0 values are precomputed as literal fractions so that the reifier extracts concrete ℚ values from the ℚ→ℝ cast without needing to reduce function calls.

              @cite{barnett-griffiths-hawkins-2022} RSA with persuasive speaker.

              The s1Score implements Eq. 8: S(u|w, longer, β) ∝ L0(longer|u)^β · 𝟙[u ∈ w]. The score is precomputed as s1ScoreQ (ℚ) and cast to ℝ, so that rsa_predict's reifier can extract the rational value directly.

                L0(longer|s5) > L0(¬longer|s5): stick 5 is positive evidence for "longer". 4 of 6 worlds containing s5 are longer, vs 2 not-longer.

                L0(longer|s5) > L0(longer|s4): stick 5 provides stronger evidence than s4.

                L0(¬longer|s1) > L0(longer|s1): stick 1 is evidence against "longer". Only 1 of 6 worlds containing s1 is longer.

                L0(longer|·) is monotonically increasing in stick length. This structural property ensures the simplified domain faithfully mirrors the paper's full domain (Appendix Theorem 2).

                Weak evidence effect: at β=2, showing stick 4 (positive literal evidence) decreases the pragmatic listener's belief in "longer" below the prior.

                The listener reasons: "If the true average were high, the speaker would have had stronger sticks (like 5) available and would have shown them instead. The fact that they showed a 4 implies they lacked stronger evidence."

                L1 assigns more posterior mass to ¬longer than longer worlds after seeing s4.
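The backfire can be reproduced numerically with a short Python sketch. It is illustrative only and is not the rsa_predict proof. It assumes a uniform prior over worlds, reads "longer" as mean above 3, and uses hypothetical function names.

```python
from fractions import Fraction
from itertools import combinations

STICKS = range(1, 6)
WORLDS = [frozenset(c) for c in combinations(STICKS, 3)]
PRIOR_LONGER = Fraction(2, 5)

def is_longer(world):
    # Assumption: "longer" means the sample mean exceeds the midpoint 3.
    return sum(world) > 3 * len(world)

def l0_longer(stick):
    compatible = [w for w in WORLDS if stick in w]
    return Fraction(sum(is_longer(w) for w in compatible), len(compatible))

def s1(stick, world, beta):
    # Speaker: L0(longer|u)^beta gated by availability, normalized over sticks.
    def score(u):
        return l0_longer(u) ** beta if u in world else Fraction(0)
    return score(stick) / sum(score(u) for u in STICKS)

def l1_longer(stick, beta):
    # Pragmatic listener; the uniform world prior cancels in the normalization.
    weights = {w: s1(stick, w, beta) for w in WORLDS}
    longer_mass = sum(p for w, p in weights.items() if is_longer(w))
    return longer_mass / sum(weights.values())

for stick in (4, 5):
    print(f"s{stick}: L0 = {float(l0_longer(stick)):.3f}, "
          f"L1 (beta=2) = {float(l1_longer(stick, beta=2)):.3f}")
```

Under these assumptions L1(longer|s4) is roughly 0.35, below the 0.4 prior, while L1(longer|s5) is roughly 0.61. With beta=0 the speaker is indifferent among available sticks and L1(longer|s4) returns to the literal value 1/2.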

                Strong evidence does NOT backfire: stick 5 increases belief at β=2.

                The strongest available evidence is always effective because it cannot be "explained away" by the absence of something better.

                At β=1, the persuasive utility equals combinedWeighted(1,1,...).

                The paper's Eq. 6 (additive: U_epi + β·U_pers) equals (1+β) · combined(β/(1+β), U_epi, U_pers).
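The identity can be checked in one line, assuming combined(λ, a, b) denotes the convex combination (1 − λ)·a + λ·b. That reading is an assumption about the library function, chosen because it is the one under which the stated equality holds:

```latex
\begin{align*}
(1+\beta)\cdot\mathrm{combined}\!\left(\tfrac{\beta}{1+\beta},\, U_{\mathrm{epi}},\, U_{\mathrm{pers}}\right)
  &= (1+\beta)\left[\tfrac{1}{1+\beta}\, U_{\mathrm{epi}} + \tfrac{\beta}{1+\beta}\, U_{\mathrm{pers}}\right] \\
  &= U_{\mathrm{epi}} + \beta\, U_{\mathrm{pers}}
\end{align*}
```

The additive and convex forms therefore differ only by the positive factor (1 + β), which requires β ≠ −1 and is automatic for β ≥ 0.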

                Connection to ArgumentativeStrength: stick 4 has positive argumentative strength for the goal "longer" (L0(longer|s4) = 1/2 > 2/5 = P(longer)).

                Stick 3 does NOT have positive argumentative strength (L0(longer|s3) = 1/3 < 2/5 = P(longer)).

                The weak evidence effect shows that argumentatively positive evidence can still backfire under a pragmatic listener model. This is the core insight connecting @cite{barnett-griffiths-hawkins-2022} to @cite{cummins-franke-2021}'s work on argumentative strength.

                Stick 4 has positive argStr at L0 (1/2 > 2/5), yet L1 assigns more mass to ¬longer than longer after seeing s4.
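The sign checks above can be sketched in Python by treating argumentative strength as a log-likelihood ratio, which by Bayes' rule equals the shift from prior log-odds to posterior log-odds. Whether ArgumentativeStrength is defined exactly this way in the library is an assumption; the names and the L0 table (uniform prior, "longer" as mean above 3) are illustrative.

```python
from fractions import Fraction
from math import log

PRIOR_LONGER = Fraction(2, 5)
# L0(longer | stick) for the simplified domain (assumed values, see lead-in).
L0_LONGER = {
    1: Fraction(1, 6),
    2: Fraction(1, 3),
    3: Fraction(1, 3),
    4: Fraction(1, 2),
    5: Fraction(2, 3),
}

def log_odds(p):
    return log(float(p / (1 - p)))

def arg_strength(stick):
    """Log-likelihood ratio for 'longer': posterior log-odds minus prior log-odds."""
    return log_odds(L0_LONGER[stick]) - log_odds(PRIOR_LONGER)

for stick in sorted(L0_LONGER):
    print(f"s{stick}: argStr = {arg_strength(stick):+.3f}")
```

The sign is positive exactly when L0(longer|s) exceeds P(longer), so sticks 4 and 5 come out positive and sticks 1 to 3 negative.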

                Listener type inferred from speaker expectation phase

                    Evidence strength conditions (distance from midpoint 5")

                        Stick Contest design parameters

                        • nSticks :
                        • minLength :
                        • maxLength :
                        • midpoint :
                        • nParticipants :

                            The actual experimental parameters

                              Behavioral result for a listener group

                                  Pragmatic group: weak evidence backfires (mean below 50). 95% CI: [32.3, 37.3] (paper p. 175).

                                    Literal group: no weak evidence effect (mean at 50). CIs not reported in paper.

                                      Pragmatic group shows backfire: mean significantly below 50 (midpoint)

                                      Model families compared

                                          Model variant (how individual differences are handled)

                                              Model comparison result from Table 1

                                                  Table 1 data

                                                    The RSA speaker-dependent model has the best (highest) log-likelihood

                                                    The RSA speaker-dependent model has the best (lowest) WAIC

                                                    Fitted parameters for the best model (RSA speaker-dependent). MAP values from Appendix Fig S5; CV values from main paper Fig 3B.

                                                    • betaMAP :
                                                    • betaCV :
                                                    • responseOffsetMAP :
                                                    • responseOffsetCV :
                                                    • pragmaticMixWeight :
                                                    • literalMixWeight :

                                                          β > 0 provides strong support for non-zero persuasive bias

                                                          Pragmatic group is best explained by J1 (pragmatic listener model)

                                                          Literal group is best explained by J0 (literal listener model)

                                                          The RSA model predicts the qualitative pattern underlying the observed interaction between listener type and evidence strength (t(718) = 5.2, p < 0.001). The literal model (L0) assigns s4 positive argumentative strength, predicting no backfire. The pragmatic model (L1) shows backfire. The experiment confirms exactly this divergence: pragmatic participants' mean (34.7) falls below neutral (50), while literal participants' mean (50.1) does not.