Linglib.Phenomena.Modality.Studies.HerbstrittFranke2019

@cite{herbstritt-franke-2019} #

Complex probability expressions & higher-order uncertainty: Compositional semantics, probabilistic pragmatics & experimental data. Cognition 186 (2019) 50–71.

Overview #

Simple probability expressions (probably, certainly, possibly) convey first-order uncertainty about the probability of an event. Complex/nested expressions (certainly likely, probably possible, might be unlikely) convey higher-order uncertainty — uncertainty about uncertainty — with compositional threshold semantics following @cite{fagin-halpern-1994}.

The paper presents an RSA model extending @cite{goodman-stuhlmuller-2013} to an urn scenario with 10 balls, where both the proportion of red balls (first-order) and the number of balls observed (higher-order) are manipulated. The key innovation over @cite{goodman-stuhlmuller-2013}: Hellinger distance replaces KL divergence as the speaker utility measure, because KL-divergence assigns infinite disutility to "true enough" messages (see Core.Divergence for the theoretical analysis).

Model (Eq. 12–18) #

P_rat.bel(s|o, a) ∝ Hypergeometric(o|a, s, 10) · P_prior(s)     (Eq. 12)
⟦probably(p)⟧ = {s ∈ S | s/10 > θ_probably}                     (Eq. 13)
P_LL(s|m) ∝ δ_{s∈⟦m⟧} · P_prior(s)                             (Eq. 15)
EU(m, o, a) = −HD[P_rat.bel(·|o, a), P_LL(·|m)]                 (Eq. 16)
P_S(m|o, a) ∝ exp(λ · EU(m, o, a))                              (Eq. 17)
P_PL(s, o, a|m) ∝ P_S(m|o, a) · Hyp(o|a, s, 10) · P(a) · P(s)  (Eq. 18)
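As an illustrative numeric sketch of Eqs. 12–17 (standalone Python, not the Lean code; the meaning sets use the cutoffs derived from Table 6, and λ = 4.873 is the inferred posterior mean):

```python
from fractions import Fraction
from math import comb, exp, sqrt

N = 10  # balls in the urn; states = number of red balls

# Meanings from the inferred thresholds (Table 6): { s | s/10 > theta }
MEANINGS = {"possibly": set(range(3, N + 1)),
            "probably": set(range(6, N + 1)),
            "certainly": {N}}

def hyper(obs, access, state):
    # Hypergeometric likelihood P(obs | access, state), Eq. 12
    if obs > access:
        return Fraction(0)
    return Fraction(comb(state, obs) * comb(N - state, access - obs),
                    comb(N, access))

def belief(obs, access):
    # P_rat.bel(s | o, a): normalized likelihood under a uniform prior (Eq. 12)
    w = [hyper(obs, access, s) for s in range(N + 1)]
    z = sum(w)
    return [x / z for x in w]

def literal_listener(m):
    # P_LL(s | m): uniform prior conditioned on [[m]] (Eq. 15)
    sem = MEANINGS[m]
    return [Fraction(1, len(sem)) if s in sem else Fraction(0)
            for s in range(N + 1)]

def hellinger(p, q):
    # HD(p, q) = sqrt(1 - Bhattacharyya coefficient); bounded by 1
    bc = sum(sqrt(a * b) for a, b in zip(p, q))
    return sqrt(max(0.0, 1.0 - bc))

def speaker(obs, access, lam=4.873):
    # P_S(m | o, a) ∝ exp(λ · (−HD)): softmax speaker (Eqs. 16–17)
    bel = belief(obs, access)
    sc = {m: exp(-lam * hellinger(bel, literal_listener(m))) for m in MEANINGS}
    z = sum(sc.values())
    return {m: v / z for m, v in sc.items()}
```

A speaker who observed all 10 balls red prefers "certainly"; one who observed 6 of 10 prefers "probably", since the Hellinger distance from her belief to the literal listener's posterior is smallest there.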

rsa_predict Stress Test #

This file uses Hellinger distance in the speaker's utility (Eq. 16), which requires Real.sqrt. The rsa_predict tactic's interval arithmetic engine (Core.Interval.ReflectInterval.RExpr) supports exp, log, rpow (ℕ exponent), and basic arithmetic — but NOT sqrt. All predictions therefore use sorry, documenting a concrete gap in rsa_predict's coverage. To close these, RExpr would need a .rsqrt constructor with matching QInterval.sqrt evaluation and soundness proof.
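To illustrate the missing piece, here is a standalone Python sketch of a sound rational interval square root (hypothetical names, not the library's API): since sqrt is monotone on [0, ∞), the image of [lo, hi] is [√lo, √hi], and outward rounding to a rational grid preserves soundness. This is the shape a QInterval.sqrt evaluation and its soundness proof would take.

```python
from fractions import Fraction
from math import isqrt

def sqrt_lower(q: Fraction, prec: int = 10**12) -> Fraction:
    # Largest grid point k/prec with (k/prec)^2 <= q: isqrt floors, so
    # k <= sqrt(q) * prec, giving a sound rational lower bound.
    k = isqrt(q.numerator * prec * prec // q.denominator)
    return Fraction(k, prec)

def sqrt_upper(q: Fraction, prec: int = 10**12) -> Fraction:
    # Start from the sound lower bound and bump upward until it dominates.
    k = isqrt(q.numerator * prec * prec // q.denominator)
    while Fraction(k, prec) ** 2 < q:
        k += 1
    return Fraction(k, prec)

def interval_sqrt(lo: Fraction, hi: Fraction):
    # Monotonicity: sqrt maps [lo, hi] onto [sqrt lo, sqrt hi];
    # round outward so the result interval still encloses the true image.
    assert 0 <= lo <= hi
    return sqrt_lower(lo), sqrt_upper(hi)
```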

Inferred Parameters (Table 6) #

θ_possibly  = 0.247  [0.200, 0.299]
θ_probably  = 0.549  [0.500, 0.594]
θ_certainly = 0.949  [0.904, 1.000]
λ           = 4.873  [4.583, 5.174]
Urn state: number of red balls in the urn (0..10).

The proportion of red balls: s/10. This is the first-order probability of drawing a red ball, and the measure function for simple expressions.

Observation count: how many red balls the speaker observed out of access balls drawn. For access level a, valid observations are 0..a; the observation model assigns probability 0 to obs > a.

Unnormalized observation likelihood (ℝ-valued for RSAConfig).

P(obs | access, state) = C(state, obs) · C(10−state, access−obs) / C(10, access)

Returns 0 when obs > access (impossible observation). Uses Nat.choose, which returns 0 for impossible combinations (k > n), making explicit guards unnecessary except for the Nat subtraction issue: truncated subtraction gives wrong results when obs > access.

Instantiates Core.Distributions.hypergeometric for N = 10.
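The necessity of the guard can be made concrete in a small standalone Python sketch (nat_sub mimics Lean's truncated Nat subtraction; names are illustrative):

```python
from fractions import Fraction
from math import comb

def nat_sub(a, b):
    # Lean-style truncated Nat subtraction: a - b = 0 when b > a
    return max(a - b, 0)

def obs_likelihood(obs, access, state, N=10):
    # Without this guard, obs > access would make nat_sub(access, obs) = 0,
    # and C(N - state, 0) = 1 would yield a spurious nonzero weight.
    if obs > access:
        return Fraction(0)
    return Fraction(comb(state, obs) * comb(nat_sub(N, state), nat_sub(access, obs)),
                    comb(N, access))
```

Like Nat.choose, Python's math.comb returns 0 when k > n, so no further guards are needed.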

Speaker's posterior belief over urn states given observation count.

P_rat.bel(s|o, a) = P(o|a, s) / Σ_{s'} P(o|a, s')

With uniform prior, this is the normalized hypergeometric (Eq. 12). Used in the Hellinger distance computation (Eq. 16).

ℚ-valued speaker belief for decidable verification.
theorem Phenomena.Modality.Studies.HerbstrittFranke2019.obsPrior_eq_hypergeometric (access : ℕ) (s : UrnState) (obs : Obs) (h : obs ≤ access) :
    obsPrior access s obs = (Core.Distributions.hypergeometric 10 (↑s) access obs)

The observation model is an instance of the general hypergeometric from Core.Distributions (N = 10).

Semantic threshold for "possibly" (posterior mean from Table 6). HDI: [0.200, 0.299].

Semantic threshold for "probably" (posterior mean from Table 6). HDI: [0.500, 0.594].

Note: this differs from the LaBToM threshold (0.70) used in Attitudes.EpistemicThreshold.EpistemicEntry.likely_. The discrepancy may reflect differences between the production task here and the Theory-of-Mind task in @cite{ying-zhi-xuan-wong-mansinghka-tenenbaum-2025}, or genuine differences between "probably" and "likely".

Semantic threshold for "certainly" (posterior mean from Table 6). HDI: [0.904, 1.000].

The inferred threshold ordering matches the theoretical prediction: certainly > probably > possibly.

The five simple expressions from Experiments 2 and 3.

Simple expression meaning using the paper's inferred thresholds.

Eq. 13: ⟦X(RED)⟧ = {s ∈ S | s/10 > θ_X} (positive expressions)
Eq. 14: ⟦X not(RED)⟧ = {s ∈ S | s/10 < 1 − θ_X} (negated expressions)

With the inferred thresholds, the cutoffs are:

• certainly: s/10 > 0.949 → s = 10 only
• probably: s/10 > 0.549 → s ≥ 6
• possibly: s/10 > 0.247 → s ≥ 3
• probably not: s/10 < 0.451 → s ≤ 4
• certainly not: s/10 < 0.051 → s = 0 only
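A quick mechanical check of the cutoffs, and of the nesting of the three positive meanings, as a standalone Python sketch with the thresholds as exact rationals:

```python
from fractions import Fraction

# Inferred posterior means from Table 6, as exact rationals
THETAS = {"possibly": Fraction(247, 1000),
          "probably": Fraction(549, 1000),
          "certainly": Fraction(949, 1000)}

def meaning(theta):
    # Eq. 13: [[X(RED)]] = { s | s/10 > theta }
    return {s for s in range(11) if Fraction(s, 10) > theta}

def neg_meaning(theta):
    # Eq. 14: [[X not(RED)]] = { s | s/10 < 1 - theta }
    return {s for s in range(11) if Fraction(s, 10) < 1 - theta}
```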

Expression meanings are nested: certainly ⊂ probably ⊂ possibly.

Posterior probability that an urn-state proposition φ holds, given the speaker's observation. ℚ-valued for decidable verification.

posteriorProb(access, obs, φ) = Σ_{s∈⟦φ⟧} P_rat.bel(s | obs, access)

The three inner expressions from Experiment 3 (Eq. 22).

"likely" and "unlikely" use the same threshold as "probably"/"probably not" from the simple model. Footnote 18: "θ_probably from the simpler model should be mapped onto θ_likely in Table 9 because the latter represents the threshold of the inner expressions likely/probably."

Inner expression meaning (over urn states, Eq. 22).

The four outer modifiers from Experiment 3.

Outer modifier threshold (Table 9, complex model).

The paper infers separate thresholds for outer modifiers in the complex expression model (Experiment 2, Table 9). These are distinct from the inner expression thresholds (θ_possibly, θ_probably, θ_certainly from Table 6) and were inferred jointly with the complex model parameters.

Note: "is X" (bare copula) is treated as a simple expression in the paper's model, not as a complex expression with an outer modifier. We include it here for completeness, with θ_is = θ_likely (the inner threshold), which makes "is likely" equivalent to "likely" when the posterior probability of the inner proposition already exceeds θ_likely.

Complex expression Y(X(RED)) (Eq. 23).

⟦Y(X(RED))⟧(⟨o, a⟩) ⟺ Σ_{s∈⟦X(RED)⟧} P_rat.bel(s|o, a) > θ_Y

The inner expression creates a proposition about urn states, and the outer expression checks whether the speaker's posterior probability of that proposition exceeds the outer threshold. This follows @cite{fagin-halpern-1994}'s nested probability semantics.

Connection to nestedThreshold: the theory-layer operator Semantics.Modality.EpistemicProbability.nestedThreshold captures the same pattern for homogeneous world types. Here the types are heterogeneous: the inner proposition is over UrnState while the outer evaluation is over (Obs, Access) pairs. This prevents a direct type-level instantiation, but the compositional structure is the same: threshold comparison of a posterior probability over a derived proposition.

> vs ≥: this file uses strict > (matching Eqs. 13 and 23 in the paper), while nestedThreshold uses ≥ (matching Fagin & Halpern's convention). For H&F's thresholds, the two are extensionally equivalent: see strict_threshold_equiv_ge below.
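The two-layer evaluation can be sketched in standalone Python (uniform prior, obs ≤ access assumed; the outer threshold is an illustrative placeholder, since Table 9's numeric values are not reproduced above):

```python
from fractions import Fraction
from math import comb

N = 10

def belief(obs, access):
    # P_rat.bel(s | o, a): normalized hypergeometric, uniform prior (Eq. 12)
    w = [Fraction(comb(s, obs) * comb(N - s, access - obs), comb(N, access))
         for s in range(N + 1)]
    z = sum(w)
    return [x / z for x in w]

def complex_true(inner, theta_outer, obs, access):
    # Eq. 23: [[Y(X(RED))]](o, a)  iff  sum_{s in [[X(RED)]]} P_rat.bel(s|o,a) > theta_Y
    bel = belief(obs, access)
    return sum(bel[s] for s in inner) > theta_outer

LIKELY = set(range(6, N + 1))   # inner [[likely(RED)]] under theta = 0.549
THETA_OUTER = Fraction(1, 2)    # placeholder outer threshold (hypothetical)
```

Even partial access can verify the outer layer: observing 3 red out of 3 drawn already pushes the posterior probability of ⟦likely(RED)⟧ above one half.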

For the paper's specific thresholds, strict > and non-strict ≥ give the same extension on UrnState. This is because no proportion s/10 (for s ∈ {0,...,10}) exactly equals any of the thresholds (247/1000, 549/1000, 949/1000).

This justifies using > (matching the paper's Eq. 13) even though the theory-layer nestedThreshold uses ≥ (Fagin & Halpern's convention).
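A decidable-style check of this equivalence claim, as a standalone Python sketch over exact rationals:

```python
from fractions import Fraction

# No proportion s/10 (s = 0..10) hits any inferred threshold exactly,
# so > and >= carve out the same sets of urn states.
THETAS = [Fraction(247, 1000), Fraction(549, 1000), Fraction(949, 1000)]

hits = [(s, t) for s in range(11) for t in THETAS if Fraction(s, 10) == t]
same_extension = all(
    {s for s in range(11) if Fraction(s, 10) > t}
    == {s for s in range(11) if Fraction(s, 10) >= t}
    for t in THETAS
)
```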

@cite{herbstritt-franke-2019} RSA model for simple probability expressions.

Eq. 15: P_LL(s|m) ∝ ⟦m⟧(s) (literal listener)
Eq. 16: EU(m, o, a) = −HD[P_rat.bel(·|o, a), P_LL(·|m)] (Hellinger utility)
Eq. 17: P_S(m|o, a) ∝ exp(λ · EU(m, o, a)) (softmax speaker)
Eq. 18: P_PL(s, o, a|m) ∝ P_S · Hyp(o|a, s) · P(a) · P(s) (pragmatic listener)

The speaker utility uses Hellinger distance (not KL divergence), imported from Core.Divergence.negHellingerDist. This is necessary because KL divergence assigns infinite disutility to "true enough" messages: a speaker who is 95% sure of RED can never say "certainly" under KL, but can under Hellinger (see Core.Divergence §5 for the theoretical comparison).

The model is parametric in access (number of balls the speaker draws). Each access level yields a separate RSAConfig, with L1 marginalizing over possible observation counts (0..10, with prior = 0 for obs > access).

Flat priors (P_prior(s) uniform) are used for simplicity; the paper infers beta-binomial priors (α_s ≈ 3.25, β_s ≈ 3.05) jointly with the thresholds.
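Putting Eqs. 15–18 together as a standalone Python sketch (flat priors, illustrative access levels, λ and cutoffs from the tables above), with the pragmatic listener marginalizing over observation counts and access:

```python
from math import comb, exp, sqrt

N = 10
MEANINGS = {"possibly": set(range(3, N + 1)),
            "probably": set(range(6, N + 1)),
            "certainly": {N}}

def hyper(o, a, s):
    # hypergeometric likelihood P(o | a, s); 0 for impossible observations
    return 0.0 if o > a else comb(s, o) * comb(N - s, a - o) / comb(N, a)

def belief(o, a):
    # P_rat.bel(s | o, a), uniform prior (Eq. 12)
    w = [hyper(o, a, s) for s in range(N + 1)]
    z = sum(w)
    return [x / z for x in w]

def l0(m):
    # P_LL(s | m): uniform over [[m]] (Eq. 15)
    sem = MEANINGS[m]
    return [(1 / len(sem) if s in sem else 0.0) for s in range(N + 1)]

def hd(p, q):
    return sqrt(max(0.0, 1.0 - sum(sqrt(x * y) for x, y in zip(p, q))))

def speaker(o, a, lam=4.873):
    # softmax speaker with Hellinger utility (Eqs. 16-17)
    sc = {m: exp(-lam * hd(belief(o, a), l0(m))) for m in MEANINGS}
    z = sum(sc.values())
    return {m: v / z for m, v in sc.items()}

def listener(m, access_levels=(1, 5, 10)):
    # Eq. 18: joint over (s, o, a) with flat priors on a and s,
    # marginalized here down to urn states
    joint = {}
    for a in access_levels:
        for s in range(N + 1):
            for o in range(a + 1):
                joint[(s, o, a)] = speaker(o, a)[m] * hyper(o, a, s)
    z = sum(joint.values())
    marg = [0.0] * (N + 1)
    for (s, o, a), v in joint.items():
        marg[s] += v / z
    return marg
```

Hearing "certainly", this listener concentrates its marginal belief on high urn states, since only high-access, all-red observations make the speaker prefer that message.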

Model Predictions #

The model's qualitative predictions are documented here but not proved via rsa_predict, because the Hellinger-distance S1 score creates three nested Finset.sum expansions (11 terms each), making reification too slow for the generic reifier. The model's correctness is instead established by the computable ℚ-level verifications above (belief formation, threshold semantics, complex expressions).

Production (S1):

Interpretation (L1):

Inferred semantic threshold parameters from the simple expression model (Experiment 1 data, Table 6).

These are posterior means with 95% highest density intervals, inferred via Bayesian parameter estimation in JAGS.

Model–data correlation for the simple expression model (Table 7). Mean Pearson's r between posterior predictive and experimental data.

All simple model correlations are substantial (r > 0.65).

Inferred outer modifier thresholds from the complex expression model (Experiment 2 data, Table 9). The inner thresholds (θ_possible, θ_likely) overlap with those from Table 6, confirming cross-experiment stability.

Footnote 18: "θ_probably from the simpler model should be mapped onto θ_likely in Table 9 because the latter represents the threshold of the inner expressions likely/probably."

Model–data correlation for the complex expression model (Table 10). Correlations are lower than for the simple model but still substantial, with the best fit in the observation dimension.

Architectural Comparison with @cite{goodman-stuhlmuller-2013} #

This model is a direct extension of the @cite{goodman-stuhlmuller-2013} architecture formalized in ScalarImplicatures/Studies/GoodmanStuhlmuller2013.lean.

Component           G&S 2013 (N=3)                 H&F 2019 (N=10)
State space         {0,1,2,3} objects              {0,...,10} red balls
Observation model   hypergeometric                 hypergeometric
Utterances          quantifiers/numerals           probability expressions
Meaning             Boolean (some, all)            threshold (probably, certainly)
Utility             KL divergence                  Hellinger distance
RSAConfig.s1Score   exp(α * Σ bel * log(l0))       exp(α * negHellingerDist bel l0)
rsa_predict         all 11 findings proved         too slow (3 nested Σ₁₁)
Higher-order        access modulates implicature   access modulates expression choice

The key structural difference is the speaker utility: G&S score messages by expected log-probability of the literal listener (KL-based), while H&F use negative Hellinger distance.

Both models use Core.Distributions.hypergeometric for the observation model and share the same RSAConfig pattern (access-parametric, Latent = Obs).

theorem Phenomena.Modality.Studies.HerbstrittFranke2019.belief_uses_general_hypergeometric (access : ℕ) (s : UrnState) (obs : Obs) (h : obs ≤ access) :
    obsPrior access s obs = (Core.Distributions.hypergeometric 10 (↑s) access obs)

The hypergeometric observation model generalizes across both papers.

Hellinger vs KL: Why the Divergence Measure Matters #

The choice of divergence measure is not a free parameter: it determines which messages the speaker can consider. See Core.Divergence §5 for the full theoretical analysis.

Example: consider a speaker who observes 9/10 red balls (access = 10). Her belief is a point mass at s = 9, while ⟦certainly⟧ = {10}, so the literal listener's posterior for "certainly" puts no mass where her belief does: the KL divergence is infinite, but the Hellinger distance is bounded (at most 1).

@cite{herbstritt-franke-2019} argues that "certainly" is pragmatically appropriate when the speaker is nearly but not absolutely sure, and that Hellinger distance correctly captures this by making "certainly" available as a message with bounded disutility.
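The contrast can be computed directly; a standalone Python sketch of the example above (uniform prior, access = 10, obs = 9):

```python
from math import comb, log, sqrt

N = 10

def belief(o, a):
    # normalized hypergeometric posterior under a uniform prior (Eq. 12)
    w = [0.0 if o > a else comb(s, o) * comb(N - s, a - o) / comb(N, a)
         for s in range(N + 1)]
    z = sum(w)
    return [x / z for x in w]

def l0_certainly():
    # literal listener for "certainly": all mass on s = 10
    return [0.0] * N + [1.0]

def kl(p, q):
    # KL(p || q): infinite as soon as p puts mass where q has none
    return sum(pi * (log(pi / qi) if qi > 0 else float("inf"))
               for pi, qi in zip(p, q) if pi > 0)

def hellinger(p, q):
    # bounded by 1 even for disjoint supports
    return sqrt(max(0.0, 1.0 - sum(sqrt(a * b) for a, b in zip(p, q))))

bel = belief(9, 10)  # observed 9/10 red: point mass at s = 9
```

Under KL the speaker's softmax weight for "certainly" is exp(−λ · ∞) = 0, so the message is categorically excluded; under Hellinger it is exp(−λ · 1) > 0, so the message stays available at a cost.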

The paper notes that the compositional model consistently overpredicts the frequency of "might be possible" in production data (Section 6). This is because the compositional semantics assigns "might be possible" a very weak truth condition (posterior probability of "possible" exceeds θ_might), making it true in almost all conditions.

The explanation: participants may give "might be possible" a modal concord reading, collapsing the two modals to a single "possible". See Phenomena.Modality.ModalConcord for the general phenomenon. Modal concord occurs when two modals of the same logical type combine to express a single modal meaning (e.g., "must necessarily" ≈ "must").

In the context of @cite{herbstritt-franke-2019}, the candidates for modal concord are "probably likely" and "might be possible", both of which combine modals of similar strength.