Linglib.Phenomena.Imperatives.Studies.SumersEtAl2023

Mushroom Foraging Cover Story #

Features: 3 colors × 3 textures

Each feature has a reward value in {-2, -1, 0, +1, +2}. A mushroom's reward is the sum of its color reward and its texture reward.
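A minimal Lean sketch of this reward scheme (the names here are illustrative, not the library's actual identifiers):

```lean
-- Illustrative sketch: feature rewards range over {-2, ..., +2};
-- a mushroom's reward is its color reward plus its texture reward.
def rewardValues : List Int := [-2, -1, 0, 1, 2]

def mushroomReward (colorReward textureReward : Int) : Int :=
  colorReward + textureReward

#eval mushroomReward 2 1  -- green (+2) spotted (+1) mushroom: 3
```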

Feature types in the experimental domain

      Possible reward values for features (coarse: data-level)

          Experiment 1: 1DP vs 2DP Contexts #

          Manipulated context complexity:

Key prediction: in 1DP contexts, truthful and relevant responses converge; in 2DP contexts, they can diverge, revealing the tradeoff between truthfulness and relevance.

          Context complexity conditions

              Experiment 1: Sample size and conditions

              • nParticipants :

                Number of participants

              • participantsPerCondition :

                Participants per condition (between-subjects)

              • trialsPerCondition :

                Trials per participant per condition

              • totalTrials :

                Total trials

                  Experiment 1: Human response rates by condition.

                  From Figure 3a and statistical analysis:

                  • 1DP: ~91% relevant responses
                  • 2DP: ~60% relevant responses
                  • relevanceRate1DP :

                    Rate of relevance-maximizing responses in 1DP condition

                  • relevanceRate2DP :

                    Rate of relevance-maximizing responses in 2DP condition

                  • se1DP :

                    Standard error for 1DP

                  • se2DP :

                    Standard error for 2DP

                      Statistical test: Significant effect of context.

                      Mixed-effects logistic regression: β = -2.13, z = -5.95, p < 0.001

                      • beta :

                        Coefficient for 2DP vs 1DP effect

                      • zStat :

                        z-statistic

                      • pValueLessThan :

                        p < 0.001

                          Experiment 2: Instruction Manipulation #

                          Between-subjects manipulation of speaker goals:

                          Instruction conditions

                              Experiment 2: Design parameters

                              • nParticipants :

                                Total participants

                              • participantsUnbiased :

                                Participants per condition

                              • participantsTruthBiased :
                              • participantsRelevanceBiased :
                              • trialsPerParticipant :

                                Trials per participant

                                  Experiment 2: Human response rates by instruction condition.

                                  From Figure 3b:

                                  • Unbiased: ~55% relevant
                                  • Truth-biased: ~35% relevant
                                  • Relevance-biased: ~85% relevant
                                  • relevanceRateUnbiased :

                                    Relevance rate in unbiased condition

                                  • relevanceRateTruthBiased :

                                    Relevance rate in truth-biased condition

                                  • relevanceRateRelevanceBiased :

                                    Relevance rate in relevance-biased condition

                                  • seUnbiased :

                                    Standard errors

                                  • seTruthBiased :
                                  • seRelevanceBiased :
                                      Statistical tests for Experiment 2.

                                      Pairwise comparisons with Bonferroni correction:

                                      • Relevance-biased > Unbiased: χ² = 18.4, p < 0.001
                                      • Unbiased > Truth-biased: χ² = 8.2, p = 0.004
                                      • chiSqRelevanceVsUnbiased :

                                        Chi-squared: relevance vs unbiased

                                      • chiSqUnbiasedVsTruth :

                                        Chi-squared: unbiased vs truth

                                      • allSignificant : Bool

                                        Both p < 0.05 after correction

                                          Model Comparison #

                                          The paper compares several models against human data:

                                          1. Combined model (truthfulness + relevance)
                                          2. Truthfulness-only
                                          3. Relevance-only
                                          4. Literal speaker

                                          Best-fit λ parameters by condition:

                                          MLE parameter estimates for λ by condition

                                          • lamUnbiased :

                                            λ for unbiased condition

                                          • lamTruthBiased :

                                            λ for truth-biased condition

                                          • lamRelevanceBiased :

                                            λ for relevance-biased condition

                                              Model fit statistics (log-likelihood)

                                              • llCombined :

                                                Combined model log-likelihood

                                              • llTruthOnly :

                                                Truthfulness-only log-likelihood

                                              • llRelevanceOnly :

                                                Relevance-only log-likelihood

                                              • llLiteral :

                                                Literal speaker log-likelihood

                                                  Summary of Key Empirical Patterns #

                                                  1. Tradeoff exists: Speakers don't maximize either truthfulness or relevance alone
                                                  2. Context-sensitive: More truthful in complex (2DP) contexts
                                                  3. Instruction-sensitive: λ shifts with explicit goal manipulation
                                                  4. Gradedness: Responses show graded preferences, not categorical choices

                                                  Example Trial #

                                                  World: Green = +2, Spotted = +1, other features have various values

                                                  Context (2DP):

True utterance: "Green is +2"
Relevant utterance: "Spotted is +1" (if it uniquely identifies the best mushroom)

                                                  Example trial structure

                                                  • worldDescription : String

                                                    World state description

                                                  • actions : List String

                                                    Available actions

                                                  • truthfulUtterance : String

                                                    Most truthful utterance (about highest feature)

                                                  • relevantUtterance : String

                                                    Most relevant utterance (identifies best action)

                                                  • diverges : Bool

                                                    Whether they diverge

                                                      Signaling Bandits: RSA Model #

                                                      @cite{frank-goodman-2012} @cite{sumers-hawkins-2023}

                                                      Unlike Lewis signaling games where world state = correct action, signaling bandits separate abstract knowledge (feature values) from concrete decisions (which action to take).

                                                      Features that characterize actions (e.g., colors, textures)

                                                          Feature values in the experimental range

                                                              All feature values

                                                                All features

                                                                  World state: mapping from features to values.

                                                                  In the mushroom experiment, this defines how valuable each feature is. Example: {Green -> +2, Red -> 0, Blue -> -2, Spotted -> +1, Solid -> 0, Striped -> -1}
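This mapping can be sketched in Lean as follows (an illustrative stand-in, not the library's actual definitions):

```lean
-- Illustrative: a world state as a function from features to values.
inductive Feature
  | green | red | blue | spotted | solid | striped

-- The canonical world state from the experiment.
def canonicalWorld : Feature → Int
  | .green   =>  2
  | .red     =>  0
  | .blue    => -2
  | .spotted =>  1
  | .solid   =>  0
  | .striped => -1
```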

                                                                    Get the rational value of a feature in a world

                                                                      Action (mushroom) characterized by features it has

• hasFeature : Feature → Bool

                                                                        Which features this action has (e.g., a green spotted mushroom)

                                                                      • name : String

                                                                        Human-readable name

                                                                        Reward for taking an action in a world state.

R(a, w) = Σ_f [a has f] · w(f)

                                                                        Linear combination of feature values for features the action has.
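A self-contained Lean sketch of this sum, with hypothetical `Feature`, `Action`, and `reward` names standing in for the library's:

```lean
inductive Feature
  | green | red | blue | spotted | solid | striped

def allFeatures : List Feature :=
  [.green, .red, .blue, .spotted, .solid, .striped]

-- An action (mushroom) is characterized by which features it has.
structure Action where
  hasFeature : Feature → Bool

-- R(a, w): sum the world's value over the features the action has.
def reward (a : Action) (w : Feature → Int) : Int :=
  (allFeatures.filter a.hasFeature).foldl (fun acc f => acc + w f) 0
```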

                                                                          Decision context: subset of available actions

                                                                            Utterance: claim about a feature's value.

Example: "Spots are +1" = {feature := .spotted, value := .pos1}

All possible utterances (30 = 6 features × 5 values)

                                                                                    Model parameters for Sumers et al. speaker model

                                                                                    • βS :

                                                                                      Speaker rationality (soft-max temperature)

                                                                                    • βL :

                                                                                      Listener rationality

                                                                                    • lam :

                                                                                      Tradeoff: 0 = pure truthfulness, 1 = pure relevance

                                                                                    • costWeight :

                                                                                      Cost weight

                                                                                          Default parameters (matches Exp 1 Unbiased MLE)

                                                                                            Speaker Utilities #

                                                                                            Three components:

                                                                                            1. Truthfulness (Eq. 5): epistemic preference for true utterances
                                                                                            2. Relevance (Eq. 8): decision-theoretic preference for action-improving utterances
                                                                                            3. Cost: production/processing effort

                                                                                            Truthfulness utility (Eq. 5).

U_T(u|w) = +1 if [u] is true, -1 if [u] is false

Note: This is a soft constraint via βS, not a hard filter.
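A sketch of this utility over an abstract feature type (illustrative names, not the library's definition):

```lean
-- Eq. 5 sketch: an utterance claims that feature f has value v;
-- utility is +1 if the claim holds in world w, -1 otherwise.
def truthUtility {F : Type} (w : F → Int) (f : F) (v : Int) : Int :=
  if w f = v then 1 else -1
```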

                                                                                              Utterance cost.

                                                                                              Default: 0 for all utterances. Can be extended for valence bias (positive utterances preferred).

                                                                                                Valence-based cost (from Exp 1 residual analysis).

                                                                                                Negative-valued utterances have higher cost (require more processing).
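One way to sketch such a cost in Lean (the name `valenceCost` and the penalty parameter are hypothetical):

```lean
-- Illustrative: utterances claiming a negative feature value incur
-- an extra fixed cost; all other utterances are free.
def valenceCost (penalty : Float) (claimedValue : Int) : Float :=
  if claimedValue < 0 then penalty else 0
```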

                                                                                                  Combined utility (Eq. 9).

U_C(u|w,A) = λ·U_R(u|w,A) + (1-λ)·U_T(u|w) - C(u)

                                                                                                  Convex combination of relevance and truthfulness, minus cost. Note: Relevance utility requires the full listener model, which depends on the removed RSA.Eval infrastructure. We define the combined utility in terms of the abstract combined function from CombinedUtility, with relevance as a parameter.
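With relevance passed in as a parameter, the combination itself is a one-liner; a Float sketch with illustrative names:

```lean
-- Eq. 9 sketch: U_C = λ·U_R + (1 - λ)·U_T - C,
-- with the relevance utility uR supplied by the caller.
def combinedUtilitySketch (lam uR uT cost : Float) : Float :=
  lam * uR + (1 - lam) * uT - cost
```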

                                                                                                    Experimental Domain: Mushroom Foraging #

                                                                                                    The experiments use a mushroom foraging cover story:

                                                                                                    Create a mushroom with one color and one texture

                                                                                                      Canonical world state from the experiment.

Green = +2, Red = 0, Blue = -2; Spotted = +1, Solid = 0, Striped = -1

                                                                                                        Example context from Figure 6B: three mushrooms

                                                                                                          True utterance in canonical world

                                                                                                            False but relevant utterance

                                                                                                              True but irrelevant utterance (feature not in context)

                                                                                                                Key Theoretical Results #

These connect to the deeper theorems in Comparisons/RelevanceTheories.lean.

                                                                                                                Combined model reduces to truthfulness when lambda = 0.

                                                                                                                U_C(u|w,A) = U_T(u|w) when lambda = 0. Delegates to CombinedUtility.combined_at_zero.

                                                                                                                Combined model reduces to relevance when lambda = 1.

                                                                                                                U_C(u|w,A) = U_R(u|w,A) when lambda = 1. Delegates to CombinedUtility.combined_at_one.
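The two endpoint reductions can be checked on a toy stand-in for CombinedUtility.combined (a sketch assuming Mathlib; the names are illustrative, not the library's):

```lean
import Mathlib

-- Toy stand-in for the combined utility: λ·U_R + (1 - λ)·U_T - C
def combinedToy (lam uR uT cost : ℚ) : ℚ :=
  lam * uR + (1 - lam) * uT - cost

-- λ = 0: collapses to truthfulness (minus cost).
theorem combinedToy_at_zero (uR uT cost : ℚ) :
    combinedToy 0 uR uT cost = uT - cost := by
  unfold combinedToy; ring

-- λ = 1: collapses to relevance (minus cost).
theorem combinedToy_at_one (uR uT cost : ℚ) :
    combinedToy 1 uR uT cost = uR - cost := by
  unfold combinedToy; ring
```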

                                                                                                                Truthfulness and relevance are independent objectives.

                                                                                                                In Lewis signaling games, they are perfectly correlated (knowing the world = knowing the best action). In signaling bandits, they can diverge:

                                                                                                                • True but irrelevant: "Green is +2" when no green actions in context
                                                                                                                • False but relevant: "Spots are +2" when spots are actually +1

• Witness 1 (true but irrelevant): "Green is +2" is true in the canonical world, but no green mushrooms appear in the example context.
• Witness 2 (false but relevant): "Spots are +2" is false (spots are +1), but would steer the listener toward the spotted mushroom (the best action).

                                                                                                                Empirical Predictions from Experiments #

                                                                                                                The paper reports MLE parameters and response patterns.

                                                                                                                Experiment 1: Free choice paradigm.

                                                                                                                Participants chose from 30 utterances. MLE parameters:

                                                                                                                • Truth-biased: lambda = 0.35
                                                                                                                • Unbiased: lambda = 0.55
                                                                                                                • Relevance-biased: lambda = 0.85
                                                                                                                • truthBiased_lam :
                                                                                                                • unbiased_lam :
                                                                                                                • relevanceBiased_lam :
                                                                                                                  Experiment 2: Forced choice (endorsement) paradigm.

                                                                                                                  Participants endorsed specific utterances. MLE parameters:

                                                                                                                  • Truth-biased: lambda = 0.15
                                                                                                                  • Unbiased: lambda = 0.75
                                                                                                                  • Relevance-biased: lambda = 0.90
                                                                                                                  • truthBiased_lam :
                                                                                                                  • unbiased_lam :
                                                                                                                  • relevanceBiased_lam :
                                                                                                                    Unbiased participants jointly optimize truthfulness and relevance.

                                                                                                                    Neither lambda = 0 (pure truth) nor lambda = 1 (pure relevance) fits the data. Participants make a graded tradeoff.

                                                                                                                    Connections to Other Frameworks #

                                                                                                                    Sumers et al. bridges several research traditions:

1. Standard RSA: Pure epistemic utility. Recovered when lambda = 0 and the listener faces the identity decision problem.

                                                                                                                    2. Game-theoretic pragmatics (Benz, Parikh): Decision-theoretic relevance. Recovered when lambda = 1.

                                                                                                                    3. Relevance Theory (Sperber & Wilson): Relevance as primary. Empirically challenged: participants value truthfulness independently.

                                                                                                                    4. QUD models (Roberts): Question under discussion. QUDs can be derived from decision problems (Theorem 2).

                                                                                                                    See Comparisons/RelevanceTheories.lean for the formal connections:

                                                                                                                    Standard RSA is a special case: when lambda = 0 and cost = 0, the combined utility equals truthfulness utility alone.

                                                                                                                    This recovers standard RSA's epistemic speaker, which soft-maximizes truthfulness (informativity). The identity-DP connection (Theorem 1 of Sumers et al.) is proved in combined_pure_truthfulness above.

Relevance Theory predicts lambda = 1, which is empirically falsified.

                                                                                                                    Summary #

                                                                                                                    Unified speaker model combining truthfulness and relevance:

U_C(u|w,A) = λ·U_R(u|w,A) + (1-λ)·U_T(u|w) - C(u)

                                                                                                                    Empirical findings:

                                                                                                                    1. Participants use both truthfulness and relevance (0 < lambda < 1)
                                                                                                                    2. Neither objective strictly dominates
                                                                                                                    3. The tradeoff is graded, not binary

                                                                                                                    Theoretical implications:

theorem Phenomena.Imperatives.Studies.SumersEtAl2023.sumers_uses_combined (lam uT uR costWeight cost : ℝ) :
    combinedUtility lam uT uR costWeight cost = RSA.CombinedUtility.combined lam uT uR (costWeight * cost)

                                                                                                                    Sumers et al.'s combinedUtility is CombinedUtility.combined(lambda, U_T, U_R, cost).

                                                                                                                    This makes the shared combined theorems (combined_at_zero, combined_at_one, combined_convex, combined_mono_A/B) directly applicable.

                                                                                                                    The integrated model of truthfulness and relevance
