
Linglib.Comparisons.RSAandPDS

RSA's Graded φ = Probability of Boolean Property #

RSA implementations often use φ : U → W → ℚ as a graded meaning function. This graded value emerges from:

  1. A Boolean literal semantics L0 : U → W → Bool
  2. A distribution over worlds P(W)

The graded value is the expectation φ(u) = E_{w' ~ P(W)}[L0(u, w')], i.e., the probability that u is true under the distribution over worlds.

More precisely, for a single world w, we often have φ(u, w) ∈ {0, 1}. The graded nature comes from uncertainty over which world is actual.
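As a sketch (hypothetical helper names, not this module's API), the graded value can be computed directly as the expected truth value of the Boolean semantics under a world prior:

```python
from fractions import Fraction

def graded_phi(literal, u, world_prior):
    """phi(u) = E_{w ~ P(W)}[literal(u, w)]: the probability that u is true."""
    return sum(p for w, p in world_prior.items() if literal(u, w))

# Toy scenario: worlds are heights; "tall" is Boolean at each world.
world_prior = {5: Fraction(1, 4), 6: Fraction(1, 2), 7: Fraction(1, 4)}
literal = lambda u, w: w >= 6 if u == "tall" else True

print(graded_phi(literal, "tall", world_prior))  # 3/4
```

At any single world the value is 0 or 1; the intermediate value 3/4 comes only from uncertainty about which world is actual.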

A Boolean RSA scenario: literal semantics is Boolean (true/false).

This is the "basic" RSA setup before introducing graded meanings.

Instances For

    The "graded" meaning of an utterance emerges as the probability that it's true across worlds.

    This is exactly PDS's probProp.

    Equations
    Instances For

      At a specific world, the Boolean meaning is just L0.

      Equations
      Instances For

        Key theorem: Graded meaning is probability of Boolean meaning.

        This formalizes that RSA's graded φ emerges from Boolean L0 + world uncertainty.

        L0 = Observe #

        RSA's literal listener L0 conditions on the utterance being true. This is exactly the monadic observe operation.

        L0(w | u) ∝ P(w) · 1[L0(u,w)]

        In the probability monad:

        L0_posterior u := do
          w ← prior
          observe (literal u w)
          pure w
        

        L0 as conditioning: the posterior over worlds given utterance u.

        For each world, the weight is prior(w) if L0(u,w) = true, else 0. This is exactly how observe works in the probability monad.
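A minimal sketch of this filtering step (hypothetical names, using a dict of unnormalized weights as the "distribution"):

```python
from fractions import Fraction

def l0_posterior(world_prior, literal, u):
    """Unnormalized L0 posterior: prior(w) * 1[literal(u, w)]."""
    return {w: (p if literal(u, w) else Fraction(0))
            for w, p in world_prior.items()}

world_prior = {"rain": Fraction(1, 3), "sun": Fraction(2, 3)}
# "wet" is literally false in the sun world, true elsewhere
literal = lambda u, w: not (u == "wet" and w == "sun")

post = l0_posterior(world_prior, literal, "wet")
# false worlds get weight 0; true worlds keep their prior weight
```

This makes the two theorems below concrete: `sun` is zeroed out, while `rain` retains exactly its prior weight.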

        Equations
        Instances For
          theorem Comparisons.RSAandPDS.l0_filters_worlds (rsa : BooleanRSA) (u : rsa.Utterance) (w : rsa.World) :
          rsa.L0 u w = false → l0Posterior rsa u w = 0

          L0 filtering: only worlds where the utterance is literally true survive.

          theorem Comparisons.RSAandPDS.l0_preserves_prior (rsa : BooleanRSA) (u : rsa.Utterance) (w : rsa.World) :
          rsa.L0 u w = true → l0Posterior rsa u w = rsa.worldPrior w

          L0 preserves prior for true worlds.

          Threshold Uncertainty = Graded Truth (Lassiter & Goodman) #

          The key result: if you have (1) a Boolean threshold semantics and (2) uncertainty about the threshold (a prior over thresholds),

          then the "graded" truth value emerges as the probability that the measure exceeds the threshold.

          This is @cite{lassiter-goodman-2017}'s central insight, formalized here.

          A threshold-based adjective: true iff measure exceeds threshold.

          Instances For

            Boolean semantics at a specific threshold.

            Equations
            Instances For

              Graded semantics: probability that entity exceeds threshold.

              This is exactly probProp applied to threshold uncertainty.

              Equations
              Instances For

                Key theorem: Graded semantics IS probability of Boolean semantics.

                This formalizes Lassiter & Goodman's insight that graded truth values emerge from Boolean semantics + parameter uncertainty.

                theorem Comparisons.RSAandPDS.ThresholdAdjective.graded_eq_boolean_certain (adj : ThresholdAdjective) (θ₀ : adj.Threshold) (x : adj.Entity) (h_point : ∀ (θ : adj.Threshold), adj.thresholdPrior θ = if θ = θ₀ then 1 else 0) :

                With certainty about threshold, graded = Boolean.

                If the prior is a point mass at θ₀, the graded value is just the Boolean value.
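A sketch of this collapse (hypothetical names): the graded value is the prior probability mass on thresholds below the measure, and a point-mass prior recovers the Boolean value.

```python
from fractions import Fraction

def graded(threshold_prior, degree):
    """P_theta(degree > theta): probability the measure exceeds the threshold."""
    return sum(p for th, p in threshold_prior.items() if degree > th)

height = 5  # hypothetical measure of some entity

# Uncertain threshold: a genuinely graded value strictly between 0 and 1.
uncertain = {3: Fraction(1, 2), 6: Fraction(1, 2)}
assert graded(uncertain, height) == Fraction(1, 2)

# Point mass at theta_0: the graded value collapses to the Boolean value.
assert graded({3: Fraction(1)}, height) == 1
assert graded({6: Fraction(1)}, height) == 0
```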

                Constructing RSAConfig from Boolean Semantics #

                The BooleanRSA.toRSAScenario definition that converted Boolean RSA to the old RSAScenario type has been removed. Boolean RSA should be constructed using RSAConfig from RSA.Core.Config instead.

                RSA as Monadic Program #

                The RSA recursion L0 → S1 → L1 can be expressed as a composition of monadic operations. This makes the computational structure explicit.

                The Operations #

                | RSA Level | Monadic Operation | Description |
                | --- | --- | --- |
                | L0 | observe | Filter worlds by literal truth |
                | S1 | choose | Sample utterance by utility |
                | L1 | observe + invert | Condition on speaker's choice |

                Insight #

                S1 is NOT just conditioning - it's choosing from a softmax distribution. This requires the choose operation, not just observe.

                S1(u | w) ∝ exp(α · utility(u, w))
                

                This is choose (λ u => exp(α * utility u w)).
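A sketch of `choose` as softmax sampling (hypothetical names and toy utilities): normalizing exponentiated utilities yields a distribution over utterances, not a filter.

```python
import math

def choose(weight, options):
    """Normalize nonnegative weights into a distribution.
    With weight(u) = exp(alpha * utility(u)) this is exactly softmax."""
    scores = {u: weight(u) for u in options}
    z = sum(scores.values())
    return {u: s / z for u, s in scores.items()}

alpha = 1.0
utility = {"short": 0.0, "long": -1.0}  # hypothetical utilities at a fixed world
s1 = choose(lambda u: math.exp(alpha * utility[u]), ["short", "long"])

assert abs(sum(s1.values()) - 1.0) < 1e-12
assert s1["short"] > s1["long"]  # higher utility => higher choice probability
```

Note that both utterances receive nonzero probability: unlike `observe`, `choose` reweights rather than filters.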

                RSA scenario with explicit monadic structure.

                This packages the data needed to write RSA as a monadic program.

                Instances For

                  L0: Literal Listener #

                  L0 conditions on the utterance being literally true.

                  L0 u := do
                    w ← worldPrior
                    observe (literal u w)
                    pure w
                  

                  The posterior is: L0(w | u) ∝ P(w) · 1[literal(u,w)]

                  L0 posterior weight for a world given utterance

                  Equations
                  Instances For
                    theorem Comparisons.RSAandPDS.MonadicRSA.L0_filters (rsa : MonadicRSA) (u : rsa.Utterance) (w : rsa.World) :
                    rsa.literal u w = false → rsa.L0_weight u w = 0

                    L0 is conditioning: zero weight for worlds where utterance is false

                    S1: Pragmatic Speaker #

                    S1 chooses an utterance to maximize informativity minus cost.

                    S1 w := do
                      u ← choose (λ u' => exp(α * (log(L0 u' w) - cost u')))
                      pure u
                    

                    The distribution is: S1(u | w) ∝ exp(α · utility(u, w))

                    Note: This is choose, not observe - it's sampling, not filtering.

                    Utility of utterance at world (log-probability minus cost)

                    Equations
                    Instances For

                      S1 weight (pre-softmax) for utterance at world

                      Equations
                      Instances For

                        L1: Pragmatic Listener #

                        L1 inverts S1: given utterance, infer world.

                        L1 u := do
                          w ← worldPrior
                          observe (S1 w chose u) -- condition on the speaker's choice of u at world w
                          pure w
                        

                        The posterior is: L1(w | u) ∝ S1(u | w) · P(w)

                        The key insight: L1 conditions on the probability that S1 would choose u, not just whether it's possible.
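A sketch of this soft conditioning (hypothetical names): each world is weighted by the prior times the speaker's probability of producing u there, then renormalized.

```python
from fractions import Fraction

def l1_posterior(world_prior, s1, u):
    """L1(w | u) ∝ P(w) * S1(u | w): soft conditioning on the speaker's choice."""
    weights = {w: p * s1[w].get(u, Fraction(0)) for w, p in world_prior.items()}
    z = sum(weights.values())
    return {w: wt / z for w, wt in weights.items()}

world_prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
s1 = {  # hypothetical speaker distributions, one per world
    "A": {"u1": Fraction(4, 5), "u2": Fraction(1, 5)},
    "B": {"u1": Fraction(1, 2), "u2": Fraction(1, 2)},
}

post = l1_posterior(world_prior, s1, "u1")
assert post["A"] == Fraction(8, 13)  # A favored: S1 is likelier to say u1 there
```

Both worlds make "u1" possible, yet the posterior is not uniform: L1 is sensitive to how probable the choice was, not merely whether it was possible.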

                        L1 posterior weight for world given utterance

                        Equations
                        Instances For

                          The Monadic Structure #

                          The RSA recursion has this structure:

                          program := do
                            w ← worldPrior -- Sample world
                            u ← S1_at w -- Speaker chooses utterance (choose, not observe)
                            w' ← L1_given u -- Listener infers world (observe + Bayes)
                            pure (w, u, w')
                          

                          The key asymmetry:

                          - S1 uses choose: it samples an utterance from a softmax distribution.
                          - L1 uses observe: it conditions the world prior on the speaker's choice.

                          This is why S1 and L1 have different computational signatures in PDS.

                          The computational asymmetry: S1 chooses, L1 conditions.

                          S1's output is a distribution over utterances (one per world). L1's output is a distribution over worlds (one per utterance).

                          Structural Correspondence: RSA ↔ Monadic RSA #

                          The RSA L0 computes:

                          scores w = worldPrior w * φ i l u w * speakerCredence a w
                          

                          For Boolean semantics (φ ∈ {0,1}), this is exactly:

                          L0 u := do
                            w ← worldPrior
                            observe (φ u w = 1) -- filter by truth
                            pure w
                          

                          The Correspondence Table #

                          | RSA | Monadic Operation | PDS Concept |
                          | --- | --- | --- |
                          | L0 scores w = prior w * φ u w | observe (φ u w) | Conditioning |
                          | S1 scores u = φ u w * utility^α | choose (utility) | Softmax sampling |
                          | L1 scores w = Σ priors * S1(u\|w) | observe + marginalize | Bayes' rule |

                          Key Structural Insights #

                          1. L0 is multiplication by an indicator: prior w * (if φ then 1 else 0). This IS the probability monad's observe operation.

                          2. S1 is weighted sampling: The softmax exp(α * utility) is choose. Unlike L0, S1 doesn't filter—it reweights.

                          3. L1 is Bayesian inversion: Sum over latents = marginalization in monad.

                          theorem Comparisons.RSAandPDS.L0_is_observe (rsa : BooleanRSA) (u : rsa.Utterance) (w : rsa.World) :
                          l0Posterior rsa u w = rsa.worldPrior w * if rsa.L0 u w = true then 1 else 0

                          L0 as conditioning: filtering worlds by literal truth.

                          This is the fundamental correspondence: L0's score formula implements the probability monad's observe operation.

                          theorem Comparisons.RSAandPDS.L0_false_zero (rsa : BooleanRSA) (u : rsa.Utterance) (w : rsa.World) :
                          rsa.L0 u w = false → l0Posterior rsa u w = 0

                          L0 filters: false literals get zero weight.

                          theorem Comparisons.RSAandPDS.L0_true_prior (rsa : BooleanRSA) (u : rsa.Utterance) (w : rsa.World) :
                          rsa.L0 u w = true → l0Posterior rsa u w = rsa.worldPrior w

                          L0 preserves: true literals keep the prior weight.

                          S1 Structure #

                          S1 in RSA:

                          scores u = φ(u,w) * L0_projected(w|u)^α / (1 + cost(u))^α
                          

                          This is softmax choice: sample u proportional to utility. The choose operation in the monad.

                          theorem Comparisons.RSAandPDS.S1_utility_weighted (rsa : MonadicRSA) (w : rsa.World) (u : rsa.Utterance) :
                          rsa.S1_weight w u = rsa.α * rsa.utility u w

                          S1 assigns weight based on utility, not truth filtering.

                          Unlike L0 which zeros out false worlds, S1 can assign non-zero weight to any utterance (modulated by utility).

                          L1 Structure #

                          L1 in RSA:

                          scores w = Σ_{i,l,a,q} priors(w,i,l,a,q) * S1(u | w,i,l,a,q)
                          

                          This is:

                          1. Joint prior over (World × Interp × Lexicon × BeliefState × Goal)
                          2. Condition on S1 choosing utterance u
                          3. Marginalize to get posterior over World

                          In monadic terms:

                          L1 u := do
                            (w, i, l, a, q) ← jointPrior
                            weight ← S1_prob w i l a q u -- soft conditioning
                            pure (w, weight)
                          

                          L1 sums over latent variables.

                          The summation structure in RSA L1 is marginalization.
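A sketch of this marginalization (hypothetical names, two latent coordinates for brevity instead of the full five): condition on S1 choosing u, then sum out the latent lexicon to get a posterior over worlds alone.

```python
from fractions import Fraction

def l1_marginal(joint_prior, s1_prob, u):
    """Condition on S1 choosing u, then marginalize latents down to worlds."""
    post = {}
    for (w, lex), p in joint_prior.items():
        post[w] = post.get(w, Fraction(0)) + p * s1_prob[(w, lex)][u]
    z = sum(post.values())
    return {w: wt / z for w, wt in post.items()}

quarter = Fraction(1, 4)
joint_prior = {("A", "l1"): quarter, ("A", "l2"): quarter,
               ("B", "l1"): quarter, ("B", "l2"): quarter}
s1_prob = {  # hypothetical S1(u | world, lexicon)
    ("A", "l1"): {"u": Fraction(1)},    ("A", "l2"): {"u": Fraction(1)},
    ("B", "l1"): {"u": Fraction(1, 2)}, ("B", "l2"): {"u": Fraction(0)},
}

post = l1_marginal(joint_prior, s1_prob, "u")
assert post["A"] == Fraction(4, 5)
```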

                          Benefits of the Monadic RSA View #

                          1. Compositionality #

                          Complex RSA variants (lexical uncertainty, QUD, nested belief) are just different monadic programs using the same primitives:

                          -- Standard RSA
                          standard := observe ∘ literal
                          
                          -- Lexical Uncertainty (Bergen et al.)
                          lexUncertainty := do
                            lex ← choose lexiconPrior
                            observe (literal_lex lex u w)
                          
                          -- QUD-sensitive (Kao et al.)
                          qudSensitive := do
                            g ← choose goalPrior
                            observe (relevant g (literal u w))
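As a sketch of this compositionality (hypothetical names): the lexical-uncertainty variant is just a `choose` over lexica composed with the same `observe`, and after marginalizing the lexicon it reduces to a single reweighted filter.

```python
from fractions import Fraction

def lex_uncertain_l0(world_prior, lexicon_prior, literal, u):
    """Choose a lexicon, observe literal truth, marginalize the lexicon."""
    weights = {
        w: pw * sum(pl for lex, pl in lexicon_prior.items() if literal(lex, u, w))
        for w, pw in world_prior.items()
    }
    z = sum(weights.values())
    return {w: wt / z for w, wt in weights.items()}

world_prior = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
lexicon_prior = {"strict": Fraction(1, 2), "loose": Fraction(1, 2)}
# hypothetical lexicon-relative semantics: "u" is true everywhere under the
# loose lexicon, but only in world A under the strict one
literal = lambda lex, u, w: lex == "loose" or w == "A"

post = lex_uncertain_l0(world_prior, lexicon_prior, literal, "u")
assert post["A"] == Fraction(2, 3)
```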
                          

                          2. δ-Rules #

                          Grove & White's δ-rules are program transformations that preserve meaning. For example:

                          δ-observe: observe true; k = k
                          δ-fail: observe false; k = fail
                          δ-bind: pure v >>= k = k v
                          

                          These let us simplify RSA computations while proving correctness.
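The three δ-rules can be checked mechanically on a toy weighted-list representation of the probability monad (a sketch, not Grove & White's formalization): a computation is a list of (value, weight) pairs, `fail` is the empty list, and `bind` multiplies weights.

```python
def pure(v):
    return [(v, 1)]

def bind(m, k):
    return [(v2, w1 * w2) for v1, w1 in m for v2, w2 in k(v1)]

def observe(b):
    return [((), 1)] if b else []  # fail = the empty weighted list

k = lambda _: [("a", 1), ("b", 2)]

# delta-observe: observe true; k  =  k
assert bind(observe(True), k) == k(())
# delta-fail: observe false; k  =  fail
assert bind(observe(False), k) == []
# delta-bind: pure v >>= k  =  k v
assert bind(pure(7), lambda v: pure(v + 1)) == pure(8)
```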

                          3. Separation of Concerns #

                          Different implementations of P (exact enumeration, Monte Carlo, variational) all satisfy the same monad laws.

                          4. Connection to PPLs #

                          The monadic RSA programs can compile to probabilistic programming languages such as WebPPL, where RSA models are standardly implemented.

                          The monad laws ensure the compilation preserves semantics.

                          What This Module Shows #

                          Main Results #

                          1. RSA's graded φ is probProp: The graded meaning function in RSA is exactly the probability of the Boolean meaning being true.

                          2. L0 is conditioning: RSA's literal listener L0 corresponds to the monadic observe operation that filters worlds by truth.

                          3. Threshold → Graded: Lassiter & Goodman's result that threshold semantics + uncertainty yields graded semantics is a special case of probProp.

                          Implications #

                          Connection to Literature #

                          | RSA Concept | PDS Concept |
                          | --- | --- |
                          | L0(w \| u) | observe (L0 u w) |
                          | φ(u,w) | probProp L0 |
                          | S1 softmax | bind + choose |
                          | L1 posterior | Bayesian update |

                          What This Does NOT Show (Yet) #