Linglib.Phenomena.Reference.Studies.HawkinsGweonGoodman2021

================================================================ PART I: EMPIRICAL DATA ================================================================

Experimental Design

Experiment 1: Speaker Production

Experiment 2: Listener Comprehension

Key Empirical Findings

1. Speakers increase informativity with occlusions (Exp 1)

2. Scripted utterances cause more errors (Exp 2)

3. Listeners adapt over time

4. Speaker informativity predicts listener accuracy

Visual perspective state in director-matcher task


Trial type in Experiment 1

• occlusionPresent : Bool
• distractorPresent : Bool

All trial types in 2×2 design

Mean words produced in each condition

Feature mention rates by condition (Exp 1, Figure 4B)

Occlusion increases feature mention rates (distractor-absent)

Speaker condition in Experiment 2

Informativity: how well the utterance fits the target vs. the distractor

• targetFit :
• distractorFit :

Scripted utterances: roughly equal fit (by design)

Unscripted utterances: much better target fit

The paper identifies these key qualitative predictions:

1. Speakers hedge against known unknowns: increase informativity with occlusions
2. Division of labor depends on expectations: optimal effort = f(partner's expected effort)
3. Listeners adapt to speaker behavior: update beliefs about the speaker's effort over time
4. Intermediate weights are optimal: when perspective-taking is costly, partial weighting is best

All key predictions from the paper

Critical item from @cite{keysar-etal-2003} replication

The 8 critical items used in Experiment 2

================================================================ PART II: RSA MODEL ================================================================

Two RSAConfig instances formalize the reference game:

Utterance semantics derive from predicate modification (Part III): each feature word is an intersective adjective, composed via predMod.

The 3 visible objects in the example display:

• target: shape=0, color=0, texture=0
• d1: shape=1, color=0, texture=0 (shares color+texture with target)
• d2: shape=2, color=1, texture=1 (differs on all features)

The 4 objects in the asymmetric display (3 visible + 1 behind the occlusion)

Utterance: which features to mention (2³ = 8 possible utterances)

Does the utterance apply to an entity with a given feature-match profile? For each feature the utterance mentions, the entity must match the target.
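
The display and utterance space above can be sketched in Python (a minimal mirror of the formalization; the names are ours, not Linglib's):

```python
from itertools import combinations

# Feature vectors (shape, color, texture) for the example display in Part II.
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
FEATURES = (0, 1, 2)  # 0 = shape, 1 = color, 2 = texture

def utterances():
    """All 2^3 = 8 subsets of features a speaker may mention."""
    return [frozenset(c) for r in range(4) for c in combinations(FEATURES, r)]

def applies(utt, obj):
    """The utterance applies to an object iff the object matches the target
    on every mentioned feature."""
    return all(obj[f] == TARGET[f] for f in utt)
```

Mentioning the shape alone already excludes both visible distractors, while mentioning only the color does not exclude d1.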

Egocentric RSA: reference game among the 3 visible objects. Belief-based scoring (S1 score = L0^α), α = 2, uniform priors.
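
A numeric sketch of this configuration, assuming the 3-object display from Part II (names are illustrative, not Linglib's):

```python
from fractions import Fraction

# Sketch of the egocentric configuration: uniform prior over visible objects,
# belief-based S1 scoring with alpha = 2.
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]
ALPHA = 2

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def L0_ego(utt):
    """Literal listener: uniform over the visible objects the utterance fits."""
    sat = [o for o in VISIBLE if applies(utt, o)]
    return {o: Fraction(1, len(sat)) for o in sat}

def S1_score(utt):
    """Belief-based S1 score for referring to the target: L0(target | u)^alpha."""
    return L0_ego(utt).get(TARGET, Fraction(0)) ** ALPHA
```

`S1_score` comes out identical for the shape-only and full descriptions, matching the indifference noted in Part IV.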

Asymmetric RSA: reference game with a hidden object behind the occlusion. Latent = (matchShape, matchColor, matchTexture) for the hidden object. Prior: each feature independently matches the target with probability 1/4, encoded as unnormalized weights (1 for match, 3 for non-match).
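
The latent prior and the asymmetric literal listener can be sketched as follows (our names; per-feature weights 1 : 3 as stated above):

```python
from fractions import Fraction
from itertools import product

# Sketch of the asymmetric configuration: one hidden object with latent match
# profile l = (matchShape, matchColor, matchTexture).
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def latent_prior():
    """Weights 1 (match) : 3 (non-match) per feature; 4^3 = 64 normalizes."""
    for l in product([True, False], repeat=3):
        w = 1
        for m in l:
            w *= 1 if m else 3
        yield l, Fraction(w, 64)

def L0_asym_target(utt, l):
    """L0(target | u) when the hidden profile is l: uniform over the satisfying
    visible objects, plus the hidden object if it matches every mentioned feature."""
    n = sum(applies(utt, o) for o in VISIBLE)
    hidden_sat = all(l[f] for f in utt)
    return Fraction(1, n + hidden_sat)
```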

================================================================ PART III: COMPOSITIONAL GROUNDING ================================================================

The utterance semantics derive from predicate modification (H&K Ch. 4):

⟦α β⟧ = λx. ⟦α⟧(x) ∧ ⟦β⟧(x)

Each feature mention (shape, color, texture) is an intersective adjective that denotes a characteristic function of type e → t.

This is exactly Semantics.Montague.Modification.predMod applied iteratively.

Compositional utterance denotation via intersective predicate modification: each mentioned feature contributes an intersective adjective, composed left-to-right via predMod.

Grounding theorem: egoMeaning equals the compositional derivation. The ad-hoc semantics match Montague intersective predicate modification.
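
The compositional route can be mirrored in Python (a sketch; `feature_adj`, `pred_mod`, and `utt_denotation` are our illustrative analogues of the Lean definitions, not Linglib's API):

```python
from itertools import combinations

# Intersective predicate modification: [[a b]] = fun x => [[a]](x) and [[b]](x).
TARGET = (0, 0, 0)

def feature_adj(f):
    """One feature word as a characteristic function of type e -> t."""
    return lambda x: x[f] == TARGET[f]

def pred_mod(p, q):
    """Intersective modification: pointwise conjunction of two predicates."""
    return lambda x: p(x) and q(x)

def utt_denotation(utt):
    """Compose the mentioned features' adjectives left-to-right via pred_mod,
    starting from the trivially true predicate."""
    den = lambda x: True
    for f in sorted(utt):
        den = pred_mod(den, feature_adj(f))
    return den

def applies(utt, obj):
    """The flat, ad-hoc semantics: match every mentioned feature."""
    return all(obj[f] == TARGET[f] for f in utt)
```

The analogue of the grounding theorem is then a finite check that `utt_denotation` and `applies` agree on every utterance and object.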

================================================================ PART IV: PREDICTIONS VIA rsa_predict ================================================================

Core RSA predictions verified via rsa_predict. The egocentric model captures the no-occlusion case; the asymmetric model captures occlusion.

In the egocentric model, the listener is equally confident about the target whether hearing shape-only or the full description. Both uniquely identify the target among the visible objects, so additional features add nothing.

S1 is indifferent between shape-only and the full description for the target (both have L0 = 1 among visible objects).

Paper Prediction 1: the full description produces a higher L1 posterior for the target than shape-only under asymmetry. Hidden objects can match individual features (P(match_shape) = 1/4), so more specific utterances are more reliably informative.

Shape+color also beats shape-only: each additional feature narrows the set of possible hidden distractors.

When the hidden object matches the target's shape (but not color or texture), S1 prefers the full description over shape-only. Shape-only fails to distinguish the target from the hidden object; the full description succeeds.

When the hidden object matches no features, S1 is indifferent: both shape-only and the full description have L0 = 1 for the target.

Even under asymmetry, L1 correctly identifies the target over d1 (which differs in shape).
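
The two S1 claims above can be spot-checked numerically (a sketch with α = 2; all names are ours):

```python
from fractions import Fraction

# When the hidden object matches the target's shape only, shape-only leaves two
# consistent referents while the full description leaves one.
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]
ALPHA = 2

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def L0_target(utt, l):
    """Literal listener's posterior on the target given hidden profile l."""
    n = sum(applies(utt, o) for o in VISIBLE)
    return Fraction(1, n + all(l[f] for f in utt))

def s1_score(utt, l):
    return L0_target(utt, l) ** ALPHA

shape, full = frozenset({0}), frozenset({0, 1, 2})
match_shape_only = (True, False, False)
no_match = (False, False, False)
```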

================================================================ PART V: EXTENSIONS (Mixture Model & Resource-Rational Analysis) ================================================================

The mixture model (Eq. 5) and resource-rational optimization (Eqs. 10–11) sit outside the standard RSA loop. These are paper-specific extensions, defined in ℝ and grounded in RSAConfig.L0.

Key equations from the paper (Eqs. 2, 5, and 7–11) are formalized below.

The mixture operates in log-space (over utilities, not probabilities). This means the mixture speaker uses a weighted geometric mean of L0 values, not an arithmetic mean: exp(w_S · E[log L0^asym] + (1−w_S) · log L0^ego).

Parameters: α = 2, cost(u) = 0.03 (uniform, so it cancels in S1 normalization).

Egocentric L0 success rate: P_L0^ego(target | u). Grounded directly in cfgEgo.L0.

Asymmetric L0 success rate: E_l[P_L0^asym(target | u, l)]. Marginalizes the literal listener's success over hidden-object profiles, weighted by the latent prior.
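
Both success rates can be computed exactly with rational arithmetic (a sketch of egoInfR/asymInfR under the stated 1/4-per-feature prior; the Python names are ours):

```python
from fractions import Fraction
from itertools import product

TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def ego_inf(utt):
    """P_L0^ego(target | u): uniform over the satisfying visible objects."""
    return Fraction(1, sum(applies(utt, o) for o in VISIBLE))

def asym_inf(utt):
    """E_l[P_L0^asym(target | u, l)] under the per-feature 1/4 match prior."""
    total = Fraction(0)
    for l in product([True, False], repeat=3):
        p = Fraction(1)
        for m in l:
            p *= Fraction(1, 4) if m else Fraction(3, 4)
        hidden = all(l[f] for f in utt)
        total += p * Fraction(1, sum(applies(utt, o) for o in VISIBLE) + hidden)
    return total
```

In this sketch the shape-only utterance drops from ego_inf = 1 to asym_inf = 7/8, while each extra feature shrinks the loss: 31/32 for shape+color, 127/128 for the full description.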

Expected log-L0 under the asymmetric model (Eq. 2, utility component): E_h[log P_L0(target | u, C ∪ {h})]. The log sits inside the expectation, so by Jensen's inequality asymLogInfR(u) ≤ log(asymInfR(u)).
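
A numeric check of the Jensen bound (a sketch; names are ours):

```python
import math
from itertools import product

# E_l[log L0] <= log E_l[L0], strictly whenever L0 varies across hidden profiles.
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def l0_dist(utt):
    """Pairs (prior weight, L0(target | u, l)) over hidden profiles l."""
    for l in product([True, False], repeat=3):
        p = 1.0
        for m in l:
            p *= 0.25 if m else 0.75
        n = sum(applies(utt, o) for o in VISIBLE)
        yield p, 1.0 / (n + all(l[f] for f in utt))

def asym_inf(utt):
    return sum(p * v for p, v in l0_dist(utt))

def asym_log_inf(utt):
    return sum(p * math.log(v) for p, v in l0_dist(utt))
```

For the empty utterance L0 is constant across profiles, so the bound holds with equality; for shape-only it is strict.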

Mixture speaker utility (Eq. 5): U^mix(u; w_S) = w_S · E_h[log P_L0^asym(target|u,h)] + (1−w_S) · log P_L0^ego(target|u). The uniform cost (0.03) is omitted: it cancels in S1 normalization.

Mixture S1 score: P_S1^mix(u | target, w_S) ∝ exp(α · U^mix(u; w_S)). Paper Eq. 1 with the mixture utility from Eq. 5.
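
Eq. 5 and the mixture S1 score can be sketched together (α = 2, uniform cost dropped; names are ours):

```python
import math
from itertools import combinations, product

TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]
UTTS = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
ALPHA = 2

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def ego_log_l0(utt):
    return math.log(1.0 / sum(applies(utt, o) for o in VISIBLE))

def asym_exp_log_l0(utt):
    total = 0.0
    for l in product([True, False], repeat=3):
        p = 1.0
        for m in l:
            p *= 0.25 if m else 0.75
        n = sum(applies(utt, o) for o in VISIBLE)
        total += p * math.log(1.0 / (n + all(l[f] for f in utt)))
    return total

def u_mix(utt, w_s):
    """Eq. 5: w_S * E_l[log L0^asym] + (1 - w_S) * log L0^ego."""
    return w_s * asym_exp_log_l0(utt) + (1 - w_s) * ego_log_l0(utt)

def s1_mix(utt, w_s):
    """P_S1^mix(u | target, w_S) proportional to exp(alpha * U^mix)."""
    z = sum(math.exp(ALPHA * u_mix(u, w_s)) for u in UTTS)
    return math.exp(ALPHA * u_mix(utt, w_s)) / z
```

At w_S = 1 the speaker favors the full description; at w_S = 0 it is indifferent between shape-only and the full description, since both have egocentric L0 = 1.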

The full model marginalizes over the listener perspective-taking weight w_L.

The simplified model (Eqs 2–5) treats w_L as fixed at 1. The full model (Eqs 7–9) has the speaker consider a range of listener weights, and the resource-rational analysis (Eq. 10) measures accuracy averaged over w_L.

**Mixture L0** (Eq. 8): P_{L_0}^{mix}(target|u, l, w_L) = w_L · P_{L_0}^{asym}(target|u, l) + (1−w_L) · P_{L_0}^{ego}(target|u). At w_L = 0, the listener ignores hidden objects. At w_L = 1, the listener accounts for all potential hidden distractors.

**Marginalized S1** (Eq. 9): the speaker's utility integrates over w_L, discretized to 5 grid points {0, 1/4, 1/2, 3/4, 1} with uniform weight.

**Accuracy** (Eq. 10): since listener accuracy is linear in w_L, E_{uniform w_L}[accuracy] = (egoInfR + asymInfR) / 2.

Mixture L0 accuracy: probability that the mixture listener at weight w_L correctly identifies the target, given hidden-object profile l (Eq. 8).

Asymmetric speaker utility at a specific listener weight (Eq. 7): U^asym(u; w_L) = Σ_l (P(l)/Z) · log P_L0^mix(target|u, l, w_L)

Mixed speaker utility at a specific (w_S, w_L) (Eq. 8).

w_L-marginalized speaker utility (Eq. 9, inside the exp). Discretized: 5 uniform grid points at w_L ∈ {0, 1/4, 1/2, 3/4, 1}.

Listener accuracy averaged over uniform w_L (for Eq. 10). Since accuracy(u, w_L) = w_L·asymInfR(u) + (1−w_L)·egoInfR(u) is linear in w_L, the expectation under uniform P(w_L) is the midpoint.
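
The midpoint identity can be verified exactly on the 5-point grid (a sketch with rational arithmetic; names are ours):

```python
from fractions import Fraction
from itertools import combinations, product

TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def ego_inf(utt):
    return Fraction(1, sum(applies(utt, o) for o in VISIBLE))

def asym_inf(utt):
    total = Fraction(0)
    for l in product([True, False], repeat=3):
        p = Fraction(1)
        for m in l:
            p *= Fraction(1, 4) if m else Fraction(3, 4)
        total += p * Fraction(1, sum(applies(utt, o) for o in VISIBLE) + all(l[f] for f in utt))
    return total

def accuracy(utt, w_l):
    """accuracy(u, w_L) = w_L * asymInfR(u) + (1 - w_L) * egoInfR(u)."""
    return w_l * asym_inf(utt) + (1 - w_l) * ego_inf(utt)

GRID = [Fraction(i, 4) for i in range(5)]  # {0, 1/4, 1/2, 3/4, 1}

def grid_avg_accuracy(utt):
    return sum(accuracy(utt, w) for w in GRID) / len(GRID)
```

Because accuracy is affine in w_L and the grid is symmetric around 1/2, the grid average equals the midpoint exactly, for every utterance.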

Full expected accuracy (Eq. 10) with w_L marginalization. Uses the w_L-marginalized S1 for speaker production and the w_L-averaged listener accuracy for evaluation.

Full resource-rational utility (Eqs 10–11): U_RR(w_S) = ExpAccuracy_full(w_S) − β · w_S

At w_S = 0, the simplified mixture utility reduces to the egocentric log-L0.

At w_S = 1, the simplified mixture utility reduces to the asymmetric expected log-L0.

Paper prediction (β = 0): when perspective-taking is free, full PT (w_S = 1) achieves higher expected accuracy than no PT (w_S = 0). The asymmetric speaker produces more specific utterances, improving listener accuracy. (Paper Figure 2, rightmost point of the β = 0 curve.)

Paper prediction (high β): when perspective-taking is costly, the cost term β · w_S dominates, making w_S = 0 preferable to w_S = 1. (Paper Figure 2, β = 0.5 curve.)

Interior optimum limitation: the paper's central result (§2.4, Figure 2) is that at moderate cost (β = 0.2), an intermediate weight w*_S ≈ 0.36 outperforms both extremes.

Our 3+1-object reference game is too simple to produce this effect. Shape alone uniquely identifies the target among the visible objects (egoInfR .s = 1), so the egocentric baseline accuracy is ≈97%. The marginal accuracy gain from perspective-taking is ≈0.3%, far below the β = 0.2 cost. The interior optimum requires a richer display where egocentric accuracy is substantially lower, creating a larger incentive for specific utterances that disambiguate from hidden objects.

Verified: rrUtilityFull 0 2 β > rrUtilityFull 1 2 β for all tested β ≥ 1/50 (even with the full w_L-marginalized model).
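
A simplified end-to-end sketch of this tradeoff (our names; expected accuracy here pairs the simplified mixture S1 with the w_L-averaged listener accuracy, so the absolute numbers differ from the full Lean model, but the qualitative orderings match):

```python
import math
from itertools import combinations, product

# U_RR(w_S) = ExpAccuracy(w_S) - beta * w_S in the simplified model.
TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]
UTTS = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
ALPHA = 2

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def hidden_profiles():
    for l in product([True, False], repeat=3):
        p = 1.0
        for m in l:
            p *= 0.25 if m else 0.75
        yield p, l

def n_vis(utt):
    return sum(applies(utt, o) for o in VISIBLE)

def ego_inf(utt):
    return 1.0 / n_vis(utt)

def asym_inf(utt):
    return sum(p / (n_vis(utt) + all(l[f] for f in utt)) for p, l in hidden_profiles())

def u_mix(utt, w_s):
    asym_log = sum(p * math.log(1.0 / (n_vis(utt) + all(l[f] for f in utt)))
                   for p, l in hidden_profiles())
    return w_s * asym_log + (1 - w_s) * math.log(ego_inf(utt))

def exp_accuracy(w_s):
    """Average the w_L-averaged listener accuracy under the mixture S1."""
    scores = {u: math.exp(ALPHA * u_mix(u, w_s)) for u in UTTS}
    z = sum(scores.values())
    return sum(scores[u] / z * (ego_inf(u) + asym_inf(u)) / 2 for u in UTTS)

def rr_utility(w_s, beta):
    return exp_accuracy(w_s) - beta * w_s
```

Even this sketch reproduces the pattern above: w_S = 1 wins on raw accuracy, but the accuracy gain is well under 2%, so any β ≥ 1/50 flips the preference to w_S = 0.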

Listener's belief about the speaker's perspective-taking weight. Over time, listeners update their expectation of w_S based on observed utterance informativity.

• wS_expectation :
• observations :

Initial uniform belief: E[w_S] = 1/2

Update beliefs after observing utterance informativity. Short/uninformative utterances → lower w_S estimate; long/informative utterances → higher w_S estimate.
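
The documentation does not pin down the exact update rule, so the following is one hypothetical instantiation: a grid posterior over w_S whose likelihood is the mixture S1 from Part V (all names are ours):

```python
import math
from itertools import combinations, product

TARGET, D1, D2 = (0, 0, 0), (1, 0, 0), (2, 1, 1)
VISIBLE = [TARGET, D1, D2]
UTTS = [frozenset(c) for r in range(4) for c in combinations(range(3), r)]
ALPHA = 2
W_GRID = [0.0, 0.25, 0.5, 0.75, 1.0]

def applies(utt, obj):
    return all(obj[f] == TARGET[f] for f in utt)

def u_mix(utt, w_s):
    n = sum(applies(utt, o) for o in VISIBLE)
    asym_log = 0.0
    for l in product([True, False], repeat=3):
        p = 1.0
        for m in l:
            p *= 0.25 if m else 0.75
        asym_log += p * math.log(1.0 / (n + all(l[f] for f in utt)))
    return w_s * asym_log + (1 - w_s) * math.log(1.0 / n)

def s1_mix(utt, w_s):
    """Likelihood of producing utt at speaker weight w_s (mixture S1)."""
    z = sum(math.exp(ALPHA * u_mix(u, w_s)) for u in UTTS)
    return math.exp(ALPHA * u_mix(utt, w_s)) / z

def posterior_mean_ws(observed_utts):
    """Start from a uniform belief (E[w_S] = 1/2), then do a Bayesian grid
    update on each observed utterance."""
    post = [1.0 / len(W_GRID)] * len(W_GRID)
    for utt in observed_utts:
        post = [p * s1_mix(utt, w) for p, w in zip(post, W_GRID)]
        z = sum(post)
        post = [p / z for p in post]
    return sum(w * p for w, p in zip(W_GRID, post))
```

Under this rule, observing the short (shape-only) utterance pulls E[w_S] below 1/2 and observing the full description pushes it above, matching the qualitative direction stated above.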

After seeing short utterances, the listener expects lower w_S

Paper prediction (@cite{hawkins-gweon-goodman-2021} §2.4.1): Listeners infer low speaker effort from under-informative utterances.

Optimal listener weight: compensate for low speaker effort. When the speaker uses a low w_S, the listener should increase their own perspective-taking to compensate.

Paper prediction (@cite{hawkins-gweon-goodman-2021} §2.4.1): Listener increases effort when speaker decreases theirs.