RSA's Graded φ = Probability of Boolean Property #
RSA implementations often use φ : U → W → ℚ as a graded meaning function. This graded value emerges from:
- A Boolean literal semantics L0 : U → W → Bool
- A distribution over worlds P(W)
The graded value is the expectation of the Boolean value under the world prior: φ(u) = E_w[L0(u, w)], i.e., the probability that u is literally true.
More precisely, for a single world w, we often have φ(u, w) ∈ {0, 1}. The graded nature comes from uncertainty over which world is actual.
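To make the emergence concrete, here is a minimal self-contained Lean 4 sketch using a toy two-world model and Float weights (the names W, prior, l0, and the numbers are illustrative, not the module's actual definitions):

```lean
-- Toy model: two worlds and one utterance ("it's raining").
-- All names and numbers here are illustrative.
inductive W
  | rain | dry

def prior : W → Float
  | .rain => 0.3
  | .dry  => 0.7

-- Boolean literal semantics: true only in the rain world.
def l0 : W → Bool
  | .rain => true
  | .dry  => false

-- Graded meaning = expected truth value = total mass of the true worlds.
def graded : Float :=
  [W.rain, W.dry].foldl (fun acc w => acc + prior w * (if l0 w then 1 else 0)) 0

#eval graded  -- 0.3: Boolean semantics + world uncertainty yields a graded value
```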
A Boolean RSA scenario: literal semantics is Boolean (true/false).
This is the "basic" RSA setup before introducing graded meanings.
- Utterance : Type
- World : Type
- uttDecEq : DecidableEq self.Utterance
- worldDecEq : DecidableEq self.World
Boolean literal semantics
Prior over worlds
Instances For
The "graded" meaning of an utterance emerges as the probability that it's true across worlds.
This is exactly PDS's probProp.
Equations
- rsa.gradedMeaning u = Semantics.Dynamic.Probabilistic.probProp rsa.worldPrior fun (w : rsa.World) => rsa.L0 u w
Instances For
At a specific world, the Boolean meaning is just L0.
Equations
- rsa.booleanMeaningAt u w = rsa.L0 u w
Instances For
Key theorem: Graded meaning is probability of Boolean meaning.
This formalizes that RSA's graded φ emerges from Boolean L0 + world uncertainty.
L0 = Observe #
RSA's literal listener L0 conditions on the utterance being true.
This is exactly the monadic observe operation.
L0(w | u) ∝ P(w) · 1[L0(u,w)]
In the probability monad:
L0_posterior u := do
w ← prior
observe (literal u w)
pure w
L0 as conditioning: the posterior over worlds given utterance u.
For each world, the weight is prior(w) if L0(u,w) = true, else 0.
This is exactly how observe works in the probability monad.
Equations
- Comparisons.RSAandPDS.l0Posterior rsa u w = rsa.worldPrior w * if rsa.L0 u w = true then 1 else 0
Instances For
L0 filtering: only worlds where the utterance is literally true survive.
L0 preserves prior for true worlds.
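The filtering behavior can be checked on a toy model (illustrative names and numbers, not the module's l0Posterior itself):

```lean
-- Two worlds; the utterance is literally true only in .rain.
inductive W
  | rain | dry

def prior : W → Float
  | .rain => 0.3
  | .dry  => 0.7

def literal : W → Bool
  | .rain => true
  | .dry  => false

-- observe as multiplication by an indicator:
-- prior mass survives only where the utterance is true.
def l0Weight (w : W) : Float :=
  prior w * (if literal w then 1 else 0)

#eval (l0Weight W.rain, l0Weight W.dry)  -- (0.3, 0.0): the false world is filtered out
```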
Threshold Uncertainty = Graded Truth (Lassiter & Goodman) #
The key result: if you have
- Boolean semantics with a threshold parameter: tall_θ(x) = height(x) > θ
- Uncertainty over the threshold: P(θ)
Then the "graded" truth value emerges:
tall(x) = E_θ[tall_θ(x)] = P(height(x) > θ)
This is Lassiter & Goodman's (2017) central insight, formalized here.
A threshold-based adjective: true iff measure exceeds threshold.
- Entity : Type
- Threshold : Type
- thresholdDecEq : DecidableEq self.Threshold
The measure function (e.g., height)
The threshold values
Prior over thresholds
Instances For
Boolean semantics at a specific threshold.
Equations
- adj.booleanSemantics θ x = decide (adj.measure x > adj.thresholds θ)
Instances For
Graded semantics: probability that entity exceeds threshold.
This is exactly probProp applied to threshold uncertainty.
Equations
- adj.gradedSemantics x = Semantics.Dynamic.Probabilistic.probProp adj.thresholdPrior fun (θ : adj.Threshold) => adj.booleanSemantics θ x
Instances For
Key theorem: Graded semantics IS probability of Boolean semantics.
This formalizes Lassiter & Goodman's insight that graded truth values emerge from Boolean semantics + parameter uncertainty.
With certainty about threshold, graded = Boolean.
If the prior is a point mass at θ₀, the graded value is just the Boolean value.
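A worked toy instance of the theorem (the height and the candidate thresholds are hypothetical):

```lean
-- "tall" for an entity of height 180, with a uniform prior over
-- three candidate thresholds. Numbers are illustrative.
def height : Float := 180

def thresholds : List Float := [170, 180, 190]

def booleanTall (θ : Float) : Bool := height > θ

-- Graded truth value: probability that the measure exceeds the threshold.
def gradedTall : Float :=
  thresholds.foldl (fun acc θ => acc + (1 / 3) * (if booleanTall θ then 1 else 0)) 0

#eval gradedTall  -- ≈ 0.333: tall_θ holds only for θ = 170
```

If the prior instead puts all its mass on a single θ₀, the fold reduces to the Boolean value at θ₀, matching the point-mass lemma above.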
Constructing RSAConfig from Boolean Semantics #
The BooleanRSA.toRSAScenario definition that converted Boolean RSA to the old
RSAScenario type has been removed. Boolean RSA should be constructed using
RSAConfig from RSA.Core.Config instead.
RSA as Monadic Program #
The RSA recursion L0 → S1 → L1 can be expressed as a composition of monadic operations. This makes the computational structure explicit.
The Operations #
| RSA Level | Monadic Operation | Description |
|---|---|---|
| L0 | observe | Filter worlds by literal truth |
| S1 | choose | Sample utterance by utility |
| L1 | observe + invert | Condition on speaker's choice |
Insight #
S1 is NOT just conditioning - it's choosing from a softmax distribution.
This requires the choose operation, not just observe.
S1(u | w) ∝ exp(α · utility(u, w))
This is choose (λ u => exp(α * utility u w)).
L0: Literal Listener #
L0 conditions on the utterance being literally true.
L0 u := do
w ← worldPrior
observe (literal u w)
pure w
The posterior is: L0(w | u) ∝ P(w) · 1[literal(u,w)]
L0 is conditioning: zero weight for worlds where utterance is false
S1: Pragmatic Speaker #
S1 chooses an utterance to maximize informativity minus cost.
S1 w := do
u ← choose (λ u => exp(α * (log(L0 u w) - cost u)))
pure u
The distribution is: S1(u | w) ∝ exp(α · utility(u, w))
Note: This is choose, not observe - it's sampling, not filtering.
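A sketch of the softmax choice rule over two utterances (the utterances, utilities, and α are invented for illustration):

```lean
-- Softmax speaker: weights are exp(α · utility), then normalized.
-- Utterances, utilities, and α are all hypothetical.
def utterances : List String := ["tall", "short"]

def utility : String → Float
  | "tall" => 0.0    -- e.g. log L0-informativity minus cost
  | _      => -1.0

def α : Float := 4.0

def unnorm (u : String) : Float := Float.exp (α * utility u)

def z : Float := (utterances.map unnorm).foldl (· + ·) 0

-- Every utterance keeps nonzero mass: choose reweights, it never filters.
def s1 (u : String) : Float := unnorm u / z

#eval (s1 "tall", s1 "short")  -- ≈ (0.982, 0.018)
```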
L1: Pragmatic Listener #
L1 inverts S1: given utterance, infer world.
L1 u := do
w ← worldPrior
-- Condition on "speaker would choose u at world w"
observe (S1 w chose u) -- Condition on speaker choice
pure w
The posterior is: L1(w | u) ∝ S1(u | w) · P(w)
The key insight: L1 conditions on the probability that S1 would choose u, not just whether it's possible.
L1 posterior weight for world given utterance
Equations
- rsa.L1_weight u w = rsa.worldPrior w * rsa.S1_weight w u
Instances For
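A toy inversion with two worlds (the speaker probabilities are invented for illustration):

```lean
-- L1_weight-style computation: prior times speaker probability, then normalize.
inductive W
  | w1 | w2

def prior : W → Float
  | .w1 => 0.5
  | .w2 => 0.5

-- S1(u | w) for the fixed utterance u = "tall" (hypothetical values).
def s1Tall : W → Float
  | .w1 => 0.8
  | .w2 => 0.2

def l1Weight (w : W) : Float := prior w * s1Tall w

-- Posterior: P(w1 | "tall") = 0.4 / (0.4 + 0.1) = 0.8
#eval l1Weight W.w1 / (l1Weight W.w1 + l1Weight W.w2)
```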
The Monadic Structure #
The RSA recursion has this structure:
program := do
w ← worldPrior -- Sample world
u ← S1_at w -- Speaker chooses utterance (choose, not observe)
w' ← L1_given u -- Listener infers world (observe + Bayes)
pure (w, u, w')
The key asymmetry:
- Speaker (S1): uses `choose`, actively sampling from a softmax distribution
- Listener (L0, L1): uses `observe`, passively conditioning on evidence
This is why S1 and L1 have different computational signatures in PDS.
The computational asymmetry: S1 chooses, L1 conditions.
S1's output is a distribution over utterances (one per world). L1's output is a distribution over worlds (one per utterance).
Structural Correspondence: RSA ↔ Monadic RSA #
The RSA L0 computes:
scores w = worldPrior w * φ i l u w * speakerCredence a w
For Boolean semantics (φ ∈ {0,1}), this is exactly:
L0 u := do
w ← worldPrior
observe (φ u w = 1) -- filter by truth
pure w
The Correspondence Table #
| RSA | Monadic Operation | PDS Concept |
|---|---|---|
| L0 scores w = prior w * φ u w | observe (φ u w) | Conditioning |
| S1 scores u = φ u w * utility^α | choose (utility) | Softmax sampling |
| L1 scores w = Σ priors * S1(u\|w) | observe + marginalize | Bayes' rule |
Key Structural Insights #
1. L0 is multiplication by an indicator: `prior w * (if φ then 1 else 0)`. This is the probability monad's `observe` operation.
2. S1 is weighted sampling: the softmax `exp(α * utility)` is `choose`. Unlike L0, S1 doesn't filter; it reweights.
3. L1 is Bayesian inversion: the sum over latent variables is marginalization in the monad.
L0 as conditioning: filtering worlds by literal truth.
This is the fundamental correspondence: L0's score formula implements the probability monad's observe operation.
L0 filters: false literals get zero weight.
L0 preserves: true literals keep the prior weight.
S1 Structure #
S1 in RSA:
scores u = φ(u,w) * L0_projected(w|u)^α / (1 + cost(u))^α
This is softmax choice: sample u proportional to utility.
The choose operation in the monad.
S1 assigns weight based on utility, not truth filtering.
Unlike L0 which zeros out false worlds, S1 can assign non-zero weight to any utterance (modulated by utility).
L1 Structure #
L1 in RSA:
scores w = Σ_{i,l,a,q} priors(w,i,l,a,q) * S1(u | w,i,l,a,q)
This is:
- Joint prior over (World × Interp × Lexicon × BeliefState × Goal)
- Condition on S1 choosing utterance u
- Marginalize to get posterior over World
In monadic terms:
L1 u := do
(w, i, l, a, q) ← jointPrior
weight ← S1_prob w i l a q u -- soft conditioning
pure (w, weight)
L1 sums over latent variables.
The summation structure in RSA L1 is marginalization.
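The sum-over-latents step in isolation, for a single world and a two-valued latent lexicon (the probabilities are invented):

```lean
-- Marginalizing a latent lexicon out of the L1 weight for one world.
def lexPrior : List (String × Float) := [("strict", 0.5), ("loose", 0.5)]

-- S1(u | w, lex) at a fixed utterance and world (hypothetical numbers).
def s1Prob : String → Float
  | "strict" => 0.9
  | _        => 0.4

-- Posterior weight for the world: the lexicon is summed out.
def l1WorldWeight : Float :=
  lexPrior.foldl (fun acc (lex, p) => acc + p * s1Prob lex) 0

#eval l1WorldWeight  -- 0.5 * 0.9 + 0.5 * 0.4 = 0.65
```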
Benefits of the Monadic RSA View #
1. Compositionality #
Complex RSA variants (lexical uncertainty, QUD, nested belief) are just different monadic programs using the same primitives:
-- Standard RSA
standard := observe ∘ literal
-- Lexical Uncertainty (Bergen et al.)
lexUncertainty := do
lex ← choose lexiconPrior
observe (literal_lex lex u w)
-- QUD-sensitive (Kao et al.)
qudSensitive := do
g ← choose goalPrior
observe (relevant g (literal u w))
2. δ-Rules #
Grove & White's δ-rules are program transformations that preserve meaning. For example:
δ-observe: observe true; k = k
δ-fail: observe false; k = fail
δ-bind: pure v >>= k = k v
These let us simplify RSA computations while proving correctness.
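The δ-rules can be checked against a minimal weighted-list monad; this is a sketch of the idea, not the PDS monad itself:

```lean
-- Weighted-list representation of a (sub)distribution.
abbrev P (α : Type) := List (α × Float)

def pure' (a : α) : P α := [(a, 1)]

-- observe: keep the continuation iff the condition holds.
def observe (b : Bool) : P Unit := if b then [((), 1)] else []

def bind' : P α → (α → P β) → P β
  | [], _ => []
  | (a, p) :: rest, k => ((k a).map fun (b, q) => (b, p * q)) ++ bind' rest k

-- δ-observe: observing true is a no-op.
#eval bind' (observe true) (fun _ => pure' 42)   -- the continuation survives with weight 1
-- δ-fail: observing false annihilates the computation.
#eval bind' (observe false) (fun _ => pure' 42)  -- the empty (failed) computation
```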
3. Separation of Concerns #
- Semantics: What does the program mean? (Monadic structure)
- Inference: How do we compute it? (Implementation of P)
Different implementations of P (exact enumeration, Monte Carlo,
variational) all satisfy the same monad laws.
4. Connection to PPLs #
The monadic RSA programs can compile to probabilistic programming languages:
- WebPPL (Goodman & Stuhlmüller)
- Stan (Grove & White)
- Pyro, NumPyro
The monad laws ensure the compilation preserves semantics.
What This Module Shows #
Main Results #
1. RSA's graded φ is `probProp`: the graded meaning function in RSA is exactly the probability of the Boolean meaning being true.
2. L0 is conditioning: RSA's literal listener L0 corresponds to the monadic `observe` operation that filters worlds by truth.
3. Threshold → Graded: Lassiter & Goodman's result that threshold semantics plus uncertainty yields graded semantics is a special case of `probProp`.
Implications #
- RSA is not introducing new semantic machinery
- Graded truth values emerge from Boolean semantics + uncertainty
- The PDS monad provides the right abstraction for RSA computations
Connection to Literature #
| RSA Concept | PDS Concept |
|---|---|
| L0(w \| u) | observe (L0 u w) |
| φ(u,w) | probProp L0 |
| S1 softmax | choose (softmax sampling) |
| L1 posterior | Bayesian update |
What This Does NOT Show (Yet) #
- Full correspondence between RSA recursion and PDS monadic programs
- Optimality of RSA strategies in PDS terms
- Connection to Grove & White's discourse dynamics