The prejacent proposition
Instances For
Local exhMW: exhaustification using only local alternatives.
Equations
- RSA.Compositional.localExhMW node = Exhaustification.exhMW node.toAltSet node.prejacent_prop
Instances For
LU-RSA: What It Can and Cannot Do #
@cite{potts-etal-2016} @cite{bergen-levy-goodman-2016}
The LU Approach #
LU-RSA models uncertainty over the lexicon:
- Lexicon 1: "some" means "at least one" (weak)
- Lexicon 2: "some" means "at least one but not all" (strong)
The listener marginalizes over lexica: L(w | m) ∝ P(w) × Σ_L P(L) × S₁(m | w, L)
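The marginalization above can be sketched numerically. The following is a minimal illustration in core Lean 4 (no Mathlib), not the file's implementation: the namespace `LUListenerSketch`, the two-world scenario, the uniform priors, and every definition name are assumptions introduced only for this example.

```lean
namespace LUListenerSketch

-- Hypothetical toy scenario: one agent, two worlds, two utterances.
inductive World where
  | some_not_all | all
  deriving DecidableEq, Repr

inductive Utt where
  | some_ | all
  deriving DecidableEq, Repr

inductive Lexicon where
  | weak | strong
  deriving DecidableEq, Repr

def worlds : List World := [.some_not_all, .all]
def utts : List Utt := [.some_, .all]
def lexica : List Lexicon := [.weak, .strong]

/-- Truth conditions under each lexicon; only "some" is refined. -/
def meaning : Lexicon → Utt → World → Bool
  | .weak,   .some_, _             => true
  | .strong, .some_, .some_not_all => true
  | .strong, .some_, .all          => false
  | _,       .all,   .all          => true
  | _,       .all,   .some_not_all => false

def total (xs : List Float) : Float := xs.foldl (· + ·) 0.0

/-- Literal listener L₀(w | m, L), assuming a uniform world prior. -/
def l0 (l : Lexicon) (m : Utt) (w : World) : Float :=
  let z := total (worlds.map fun w' => if meaning l m w' then (1.0 : Float) else 0.0)
  if meaning l m w then 1.0 / z else 0.0

/-- Speaker S₁(m | w, L) ∝ L₀(w | m, L)^α. -/
def s1 (α : Float) (l : Lexicon) (w : World) (m : Utt) : Float :=
  let z := total (utts.map fun m' => Float.pow (l0 l m' w) α)
  Float.pow (l0 l m w) α / z

/-- Unnormalized L(w | m) = P(w) × Σ_L P(L) × S₁(m | w, L), with uniform priors. -/
def listenerScore (α : Float) (m : Utt) (w : World) : Float :=
  0.5 * total (lexica.map fun l => 0.5 * s1 α l w m)

/-- Normalized pragmatic listener. -/
def listener (α : Float) (m : Utt) (w : World) : Float :=
  listenerScore α m w / total (worlds.map (listenerScore α m))

-- Posterior over `worlds` after hearing "some" with α = 1.
#eval worlds.map (listener 1.0 .some_)

end LUListenerSketch
```

The lexicon enters only through `meaning`, so the sum over `lexica` sits at the very top of the computation. This is the "global" locus of marginalization that the rest of this section contrasts with node-by-node reasoning.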
What LU Can Do #
For simple cases like "Exactly one player hit some of his shots":
- With strong lexicon, "some" means "some but not all"
- This composes with "exactly one" to give the local reading
- Potts et al. report a Pearson correlation of .96 with human judgments
LU does capture simple embedded implicatures.
What LU Cannot Do: The Global Lexicon Problem #
LU applies one lexicon to all occurrences.
Consider: "Every student read some book and some paper"
- Two occurrences of "some"
- EXH-based theory: can exhaustify each independently
- LU-based theory: both must have the same meaning
With LU:
- Lexicon 1 (weak): both "some"s mean "at least one"
- Lexicon 2 (strong): both "some"s mean "some but not all"
LU cannot express: "every student read some-but-not-all books AND at-least-some papers" (independent exhaustification).
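The gap can be made concrete with a small sketch in core Lean 4. It is an illustration under simplifying assumptions, not part of the file: `GlobalLexiconSketch`, `Amount`, `Lex`, `luReading`, and `mixedReading` are hypothetical names, and a "world" is reduced to how many books and how many papers were read.

```lean
namespace GlobalLexiconSketch

/-- How much of a category (books or papers) was read. -/
inductive Amount where
  | some_not_all | all
  deriving DecidableEq, Repr

inductive Lex where
  | weak | strong
  deriving DecidableEq, Repr

/-- "some" under a given lexicon. -/
def someMeaning : Lex → Amount → Bool
  | .weak,   _             => true
  | .strong, .some_not_all => true
  | .strong, .all          => false

/-- LU: a single lexicon interprets both occurrences of "some". -/
def luReading (l : Lex) (books papers : Amount) : Bool :=
  someMeaning l books && someMeaning l papers

/-- Position-wise enrichment: each occurrence gets its own choice. -/
def mixedReading (l₁ l₂ : Lex) (books papers : Amount) : Bool :=
  someMeaning l₁ books && someMeaning l₂ papers

/-- No single lexicon reproduces "some-but-not-all books, at-least-some papers". -/
theorem lu_misses_mixed (l : Lex) :
    ∃ b p : Amount, luReading l b p ≠ mixedReading .strong .weak b p := by
  cases l
  · exact ⟨.all, .some_not_all, by decide⟩
  · exact ⟨.some_not_all, .all, by decide⟩

end GlobalLexiconSketch
```

The witness differs by lexicon: the weak lexicon is too permissive about books, while the strong lexicon is too restrictive about papers.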
Scope vs Lexicon #
For "Every student read some book":
EXH analysis (scope-sensitive):
- Global: EXH [∀x. some(x)] → "not everyone read all"
- Local: ∀x. [EXH some(x)] → "everyone read some-but-not-all"
- The same word "some" (meaning: at least one), different EXH positions
LU analysis (scope-blind):
- Weak lexicon → "everyone read at-least-one"
- Strong lexicon → "everyone read some-but-not-all"
- Different meanings for "some", no notion of where enrichment applies
For single scalar items, these are observationally equivalent. For multiple scalar items or nested structures, they diverge.
| Approach | What varies | Multiple scalars? | Scope interactions? |
|---|---|---|---|
| Standard RSA | Nothing | No | No |
| LU-RSA | Lexicon globally | Same for all | No |
| Compositional RSA | Where EXH applies | Independent | Yes |
LU treats implicature as a lexical property (of words), while EXH treats it as a structural property (of positions).
LU meaning: what "some" means under each lexicon
Equations
- RSA.Compositional.someLUMeaning RSA.Compositional.SomeLexicon.weak x✝ = true
- RSA.Compositional.someLUMeaning RSA.Compositional.SomeLexicon.strong RSA.Compositional.StudentReading.some_not_all = true
- RSA.Compositional.someLUMeaning RSA.Compositional.SomeLexicon.strong RSA.Compositional.StudentReading.all = false
Instances For
The Single-Scalar Equivalence (Potts et al.'s Success Case) #
For "every student read some book" with a single scalar item, LU with strong lexicon gives the same result as local EXH.
LU would give:
- Lexicon 1 (weak "some"): every student read at-least-one → SS, SA, AS, AA
- Lexicon 2 (strong "some"): every student read some-but-not-all → SS only
This matches @cite{potts-levy-2015}, who show strong empirical fits.
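The single-scalar equivalence can be checked on the four worlds directly. The sketch below is self-contained core Lean 4 and only mirrors the shape of the declarations listed in this file. The namespace, `luEvery`, and `localExh` are assumptions made for illustration, and the file's own `lu_strong_equals_local` may be stated differently.

```lean
namespace LUSingleScalarSketch

inductive SomeLexicon where
  | weak | strong

inductive StudentReading where
  | some_not_all | all

/-- Worlds for two students (Alice, Bob): S = some but not all, A = all. -/
inductive World where
  | SS | SA | AS | AA

def someLUMeaning : SomeLexicon → StudentReading → Bool
  | .weak,   _             => true
  | .strong, .some_not_all => true
  | .strong, .all          => false

def aliceReading : World → StudentReading
  | .SS | .SA => .some_not_all
  | .AS | .AA => .all

def bobReading : World → StudentReading
  | .SS | .AS => .some_not_all
  | .SA | .AA => .all

/-- "Every student read some book" with "some" interpreted by lexicon `l`. -/
def luEvery (l : SomeLexicon) (w : World) : Bool :=
  someLUMeaning l (aliceReading w) && someLUMeaning l (bobReading w)

/-- The local-EXH reading, stated directly as a truth table. -/
def localExh : World → Bool
  | .SS => true
  | _   => false

/-- The strong lexicon yields exactly the local reading. -/
theorem lu_strong_matches_local (w : World) :
    luEvery .strong w = localExh w := by
  cases w <;> rfl

/-- The weak lexicon is true at all four worlds. -/
theorem lu_weak_allows_all (w : World) : luEvery .weak w = true := by
  cases w <;> rfl

end LUSingleScalarSketch
```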
Where LU Fails: The Principled Derivation Problem #
No principled lexicon selection: With flat priors, why prefer strong? Potts et al. rely on neo-Gricean constraints on refinement sets.
Global lexicon: All occurrences of "some" get the same meaning. EXH can apply at different positions independently.
No scope interaction: LU cannot model EXH interacting with quantifier scope, negation scope, etc.
The theorem lu_strong_equals_local below shows the equivalence for single-scalar cases. The limitation emerges with multiple scalars.
LU-RSA scenario for "Every student read some book"
- lexicon : SomeLexicon
Which lexicon is being used
- world : Core.EmbeddedSI.EmbeddedSIWorld
The world state
Instances For
LU meaning: depends on lexicon choice
Equations
- RSA.Compositional.luMeaning { lexicon := RSA.Compositional.SomeLexicon.weak, world := world } = true
- RSA.Compositional.luMeaning { lexicon := RSA.Compositional.SomeLexicon.strong, world := RSA.Core.EmbeddedSI.EmbeddedSIWorld.SS } = true
- RSA.Compositional.luMeaning { lexicon := RSA.Compositional.SomeLexicon.strong, world := RSA.Core.EmbeddedSI.EmbeddedSIWorld.SA } = false
- RSA.Compositional.luMeaning { lexicon := RSA.Compositional.SomeLexicon.strong, world := RSA.Core.EmbeddedSI.EmbeddedSIWorld.AS } = false
- RSA.Compositional.luMeaning { lexicon := RSA.Compositional.SomeLexicon.strong, world := RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA } = false
Instances For
LU with weak lexicon allows all worlds (like literal L0)
LU with strong lexicon gives local-EXH-like result
The Single-Scalar Success #
For single-scalar cases, LU achieves the same result as local EXH:
- lu_strong_equals_local proves this formally
- This explains the excellent fits to human data in @cite{potts-levy-2015}
The Compositional Question: Where Does Marginalization Happen? #
LU approach (Potts et al.):
- Uncertainty over complete lexica (global)
- Compose meanings using standard semantics
- Marginalize over lexica at the top level: L(w | m) ∝ P(w) × Σ_L P(L) × S₁(m | w, L)
Compositional RSA/EXH approach:
- Apply RSA/EXH reasoning at intermediate compositional nodes
- Local alternatives matter at each node
- Compose the already-exhaustified meanings
The question: Does top-level marginalization over lexica give the same results as node-by-node pragmatic reasoning?
How Multiple Uncertainties Compose #
In LU, when you have multiple scalar items, each can be refined. The space of lexica is the product of refinement choices.
A single lexicon L is used for the whole utterance. The listener reasons about which L the speaker is using globally.
The structural question is whether pragmatic reasoning should happen:
- Globally: marginalize over complete lexica at the end (LU)
- Locally: apply RSA at each compositional node, then compose
For simple cases these may coincide. The divergence appears when:
- Multiple scalar items interact
- Quantifier scope affects which alternatives are relevant
- Contextual factors differ at different structural positions
The architectural difference:
- LU: P(L) is a distribution over complete lexica
- Compositional: each node has its own local alternatives
This affects how multiple sources of uncertainty interact.
- lexicon : SomeLexicon
A lexicon assigns meanings to all scalar items
- globalMarg : Bool
Marginalization happens at the top
Instances For
The Structural Difference #
What LU gets right:
- Single-scalar embedded implicatures
- Probabilistic weighting over interpretations
- Integration with RSA framework
- Strong empirical fits for tested cases
The architectural question:
- LU: global marginalization over lexica
- Compositional RSA: local reasoning at each node
For the cases Potts et al. test (single scalar under quantifier), these approaches are empirically equivalent. The question is whether they diverge for more complex compositional structures.
Open question (cf. @cite{franke-bergen-2020} "Global Intentions"): Does the locus of pragmatic reasoning (global vs. local) matter for predicting human behavior in complex embedded contexts?
Compositional RSA #
EXH is a structural operator, not a lexical one. Modeling where in the derivation EXH applies requires a compositional approach:
- Build a derivation tree with local alternatives at each node
- Apply RSA/EXH at each node with its local alternatives
- Compose the results via standard semantics
For "Every student read some book" #
Step 1: At the "some" node (per student)
- Local alternatives: {some, all}
- Local EXH: some → some-but-not-all
Step 2: Compose with "every"
- Result: every x [some-but-not-all(x)]
- This is the local reading
Why This Works #
The @cite{franke-2011} limit theorem is parametric in the alternative set: IBR(alternatives) → exhMW(alternatives)
Instantiating with local alternatives at each node yields local exhMW at each node. Composition gives the local reading.
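The two steps can be sketched as follows in core Lean 4. This is a simplified illustration rather than the file's `localExhMW`. Here `exhSome` conjoins the prejacent with the negation of its one stronger local alternative, which is assumed to agree with minimal-worlds exhaustification for the two-element alternative set {some, all}. The namespace and the names `exhSome` and `composedLocal` are hypothetical.

```lean
namespace LocalExhSketch

inductive StudentReading where
  | some_not_all | all
  deriving DecidableEq, Repr

/-- Local alternatives at the "some" node. -/
inductive QuantAlt where
  | some_ | all
  deriving DecidableEq, Repr

def meaning : QuantAlt → StudentReading → Bool
  | .some_, _             => true
  | .all,   .some_not_all => false
  | .all,   .all          => true

/-- Step 1: exhaustify "some" against its local alternative "all". -/
def exhSome (r : StudentReading) : Bool :=
  meaning .some_ r && !meaning .all r

inductive World where
  | SS | SA | AS | AA
  deriving DecidableEq, Repr

def aliceReading : World → StudentReading
  | .SS | .SA => .some_not_all
  | .AS | .AA => .all

def bobReading : World → StudentReading
  | .SS | .AS => .some_not_all
  | .SA | .AA => .all

/-- Step 2: compose the already-exhaustified node meaning with "every". -/
def composedLocal (w : World) : Bool :=
  exhSome (aliceReading w) && exhSome (bobReading w)

/-- The composed meaning is the local reading: true at SS only. -/
theorem composedLocal_iff_SS (w : World) :
    composedLocal w = true ↔ w = .SS := by
  cases w <;> decide

end LocalExhSketch
```

Exhaustification happens before composition with "every", so the strengthened meaning is what the universal quantifier ranges over.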
Inner alternatives: some vs all
Instances For
Inner meaning for a single student
Equations
- RSA.Compositional.singleStudentMeaning RSA.Compositional.QuantAlt.some_ x✝ = true
- RSA.Compositional.singleStudentMeaning RSA.Compositional.QuantAlt.all RSA.Compositional.StudentReading.some_not_all = false
- RSA.Compositional.singleStudentMeaning RSA.Compositional.QuantAlt.all RSA.Compositional.StudentReading.all = true
Instances For
Local node for a single student's reading behavior
Equations
- One or more equations did not get rendered due to their size.
Instances For
Map aggregate world to per-student readings
Equations
- RSA.Compositional.aliceReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.SS = RSA.Compositional.StudentReading.some_not_all
- RSA.Compositional.aliceReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.SA = RSA.Compositional.StudentReading.some_not_all
- RSA.Compositional.aliceReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.AS = RSA.Compositional.StudentReading.all
- RSA.Compositional.aliceReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA = RSA.Compositional.StudentReading.all
Instances For
Equations
- RSA.Compositional.bobReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.SS = RSA.Compositional.StudentReading.some_not_all
- RSA.Compositional.bobReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.SA = RSA.Compositional.StudentReading.all
- RSA.Compositional.bobReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.AS = RSA.Compositional.StudentReading.some_not_all
- RSA.Compositional.bobReading RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA = RSA.Compositional.StudentReading.all
Instances For
The composed local interpretation: "Every student read some-but-not-all" = Alice read some-but-not-all AND Bob read some-but-not-all
Equations
- RSA.Compositional.composedLocalInterp RSA.Core.EmbeddedSI.EmbeddedSIWorld.SS = true
- RSA.Compositional.composedLocalInterp RSA.Core.EmbeddedSI.EmbeddedSIWorld.SA = false
- RSA.Compositional.composedLocalInterp RSA.Core.EmbeddedSI.EmbeddedSIWorld.AS = false
- RSA.Compositional.composedLocalInterp RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA = false
Instances For
Composed local interpretation matches localExhMeaning
Local reading is strictly stronger than global
LU vs Compositional RSA #
LU-RSA (Potts, Lassiter, Levy, Frank):
- Mechanism: Uncertainty over lexicon (word meanings)
- For embedded SI: Choose between weak/strong "some"
- Problem: Both lexicons give the same scope relations
- Weak "some" + global EXH = global reading
- Strong "some" = local-like, but no scope interaction
- LU is scope-blind. It varies meanings, not structure.
Compositional RSA:
- Mechanism: Apply EXH at each node with local alternatives
- For embedded SI: EXH at "some" node before composing with "every"
- Result: True local reading via compositional structure
- The Franke limit theorem is parametric; instantiating with local alternatives yields local EXH.
The Expressivity Hierarchy #
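Standard RSA ⊂ LU-RSA (≈ Standard RSA in structural expressivity) ⊂ Compositional RSA ≈ EXH
| Model | Scope sensitivity | Available readings |
|---|---|---|
| Standard RSA | scope-blind | global only |
| LU-RSA | scope-blind | global only |
| Compositional RSA ≈ EXH | scope-sensitive | global and local |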
LU-RSA and standard RSA have the same structural expressivity. LU adds uncertainty over word meanings, not scope.
LU-RSA is scope-blind: it cannot distinguish global from local based on structural scope position.
Proof: LU's two lexica (weak, strong) correspond to (no SI, local-like SI), but the choice is lexical, not structural. There is no principled way to derive the local reading.
The expressivity gain: Compositional RSA can express local readings that LU-RSA cannot derive in a principled way.
Why Standard RSA Cannot Derive Local Readings #
No sentence-level alternative can distinguish SS from SA using global worlds. This limitation motivates either:
- Stipulating EXH as a grammatical primitive (the standard view)
- Developing "local RSA" that operates inside composition
The Setup #
For "every student read some book":
- Worlds: SS (both some-not-all), SA (Alice some, Bob all), AS, AA
- Alternatives: nested Aristotelians {none, some, all} × {none, some, all}
The Problem #
RSA excludes a world w when hearing utterance u if: ∃ u' ∈ Alt(u). ⟦u'⟧(w) ∧ informativity(u') > informativity(u)
To exclude SA when hearing "every...some", we'd need an alternative u' where: ⟦u'⟧(SA) = true AND ⟦u'⟧(SS) = false (or u' more informative)
No such sentence exists in the nested Aristotelians.
Consequences #
If RSA cannot distinguish SS from SA at the sentence level, then either:
- EXH is a separate grammatical mechanism (stipulation)
- RSA must operate sub-sententially with "local" alternatives
Option 2 is "RSA all the way down" -- deriving EXH-like behavior from RSA applied at each compositional node.
The 9 nested Aristotelian sentences as our alternative set
- NN : NestedAristotelian
- NS : NestedAristotelian
- NA : NestedAristotelian
- SN : NestedAristotelian
- SS : NestedAristotelian
- SA : NestedAristotelian
- AN : NestedAristotelian
- AS : NestedAristotelian
- AA : NestedAristotelian
Instances For
Literal (global) meaning of nested Aristotelians. Using the same world type as RSAExhExpressivity.
Equations
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.NN x✝ = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.NS x✝ = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.NA x✝ = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SN x✝ = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SS x✝ = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SA RSA.Core.EmbeddedSI.EmbeddedSIWorld.SA = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SA RSA.Core.EmbeddedSI.EmbeddedSIWorld.AS = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SA RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.SA RSA.Core.EmbeddedSI.EmbeddedSIWorld.SS = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.AN x✝ = false
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.AS x✝ = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.AA RSA.Core.EmbeddedSI.EmbeddedSIWorld.AA = true
- RSA.Compositional.nestedMeaning RSA.Compositional.NestedAristotelian.AA x✝ = false
Instances For
Key observation: no sentence-level meaning is true at SS but false at SA.
Every nested Aristotelian sentence that is true at SS is also true at SA. The converse fails ("some...all" is true at SA but not at SS), but not in a way that helps RSA exclude SA.
SA "dominates" SS: if some students did X in SS, then some students did X in SA (with potentially more).
Sentences true at SA but not SS don't help RSA exclude SA.
"some...all" is true at SA and false at SS. But this does not help: RSA excludes worlds where stronger alternatives are true. "some...all" is weaker than "some...some" for the inner quantifier, so it does not trigger exclusion.
The only sentence that could help exclude SA would be one that:
- Is true at SS
- Is false at SA
- Is stronger than "some...some"
No such sentence exists.
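The truth-conditional part of this claim can be checked mechanically against the `nestedMeaning` table listed above. The sketch below re-declares the types in a hypothetical namespace `NestedSketch` so that it compiles on its own in core Lean 4. It covers only the first two conditions (true at SS, false at SA) and says nothing about relative strength.

```lean
namespace NestedSketch

inductive World where
  | SS | SA | AS | AA
  deriving DecidableEq, Repr

inductive NestedAristotelian where
  | NN | NS | NA | SN | SS | SA | AN | AS | AA
  deriving DecidableEq, Repr

/-- Same truth table as the `nestedMeaning` equations in this file. -/
def nestedMeaning : NestedAristotelian → World → Bool
  | .SS, _   => true
  | .AS, _   => true
  | .SA, .SS => false
  | .SA, _   => true
  | .AA, .AA => true
  | _,   _   => false

/-- No sentence is true at world SS and false at world SA. -/
theorem true_at_SS_implies_true_at_SA (s : NestedAristotelian) :
    nestedMeaning s .SS = true → nestedMeaning s .SA = true := by
  cases s <;> decide

/-- The converse fails: "some...all" is true at SA but false at SS. -/
theorem someAll_separates :
    nestedMeaning .SA .SA = true ∧ nestedMeaning .SA .SS = false := by
  decide

end NestedSketch
```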
The RSA exclusion principle: RSA excludes world w when hearing u if there exists a stronger alternative u' true at w.
For "some...some" (SS sentence), the only stronger alternative is "all...all". "all...all" is false at both SS and SA. Therefore, RSA cannot exclude SA when hearing "some...some".
The core theorem: Standard RSA with sentence-level alternatives cannot derive the local reading {SS}.
Proof: RSA's posterior after hearing "some...some" includes all worlds where "some...some" is true and no strictly stronger alternative is true.
- "some...some" is true at: SS, SA, AS, AA (literally: everyone read some)
- "all...all" is the strongest alternative, true only at AA
- RSA excludes AA (stronger alternative was available)
- RSA keeps: SS, SA, AS -- this is the global reading, not local
To get the local reading {SS}, we would need to exclude SA and AS. No sentence-level alternative can do this.
The Expressivity Gap #
Standard RSA posterior for "some...some":
{w : ⟦some...some⟧(w) ∧ ¬⟦all...all⟧(w)} = {SS, SA, AS}
Local EXH reading:
{SS}
Gap: {SA, AS} -- worlds RSA cannot exclude without sub-sentential reasoning
This gap is why linguists posit EXH as a grammatical primitive. "Local RSA" -- RSA applied at each compositional node -- can derive the local reading without stipulating EXH. EXH then emerges as the α → ∞ limit of local RSA.
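The gap can also be computed as a set difference over the four worlds. The sketch below takes the two readings exactly as stated in this section and does not re-derive them from an RSA model. The namespace and all definition names are assumptions for illustration (core Lean 4, no Mathlib).

```lean
namespace GapSketch

inductive World where
  | SS | SA | AS | AA
  deriving DecidableEq, Repr

/-- "some...some" is true at every world in this scenario. -/
def someSome : World → Bool := fun _ => true

/-- "all...all" is true only at AA. -/
def allAll : World → Bool
  | .AA => true
  | _   => false

/-- Global reading as stated above: ⟦some...some⟧ ∧ ¬⟦all...all⟧. -/
def globalPosterior (w : World) : Bool := someSome w && !allAll w

/-- Local EXH reading. -/
def localReading : World → Bool
  | .SS => true
  | _   => false

/-- Worlds kept by the global reading but excluded by the local one. -/
def gap (w : World) : Bool := globalPosterior w && !localReading w

theorem gap_iff (w : World) : gap w = true ↔ (w = .SA ∨ w = .AS) := by
  cases w <;> decide

end GapSketch
```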
@cite{franke-bergen-2020}: Global Intentions Model #
Franke & Bergen's "Global Intentions" (GI) model provides machinery for reasoning over where EXH applies.
The GI Architecture #
Grammar generates parses: For each utterance, the grammar generates multiple readings depending on where EXH is inserted: lit (no EXH), M (matrix), O (outer), I (inner), and the combinations MO, MI, OI, MOI.
Speaker chooses (utterance, parse) jointly:
P_S(m, p | t; α) ∝ [P(t | ⟦m⟧^p)]^α
The speaker picks both what to say and how to mean it.
Listener infers (world, parse) jointly:
P_L(t, p | m; α) ∝ P(t) × P_S(m, p | t; α)
Finding #
GI outperforms simpler models (vanilla RSA, LU) because it can use strong readings like ⟦SS⟧^M = {██-} that uniquely identify states. However, GI assumes EXH is a grammatical primitive.
The "RSA All The Way Down" Reinterpretation #
Franke & Bergen's "parse" variable can be reinterpreted as "where RSA reasoning applies":
| Franke & Bergen (GI) | RSA All The Way Down |
|---|---|
| parse p ∈ {lit, M, O, I, ...} | config c ∈ {RSA application sites} |
| ⟦m⟧^p = EXH at positions | ⟦m⟧^c = RSA(α) at positions |
| Speaker chooses parse | Speaker chooses where to be pragmatic |
The Equivalence #
In the α → ∞ limit:
- RSA at node → EXH at node (@cite{franke-2011} limit theorem)
- RSA at multiple nodes → EXH at multiple nodes (composition)
- Reasoning over "where RSA applies" → reasoning over "where EXH is"
Franke & Bergen's GI model coincides with "RSA all the way down" in the limit.
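For orientation, the elementary soft-max fact behind any α → ∞ argument of this kind can be written out. This is a generic worked equation, not the @cite{franke-2011} limit theorem itself. The scores s_i, their maximum s^*, and the number k of maximizers are notation introduced here.

```latex
% s_i: score of option i (e.g. a log-informativity);
% s^* = \max_j s_j, attained by exactly k options.
\[
\frac{e^{\alpha s_i}}{\sum_j e^{\alpha s_j}}
  = \frac{e^{-\alpha (s^* - s_i)}}{\sum_j e^{-\alpha (s^* - s_j)}}
  \;\xrightarrow[\alpha \to \infty]{}\;
  \begin{cases}
    1/k & \text{if } s_i = s^*, \\
    0   & \text{otherwise.}
  \end{cases}
\]
```

Every non-maximal term in the denominator decays to zero, so the soft choice collapses onto the maximizers. In this sense a high-α RSA choice at a node behaves like a categorical operator.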
The Derivation Direction #
| Approach | Starting Point | Pragmatic Reasoning |
|---|---|---|
| Franke & Bergen (GI) | EXH is primitive | RSA reasons over EXH positions |
| RSA All The Way Down | RSA is primitive | EXH emerges in α → ∞ limit |
The compositional structure is the same. The difference is foundational:
- GI: Grammar + Pragmatics (EXH given, RSA selects)
- RSA All The Way Down (RSAATWD): Pragmatics only (EXH derived from RSA)
Consequences #
EXH is not a separate grammatical mechanism, but emergent behavior of rational communication at high precision.
GI's empirical success (Bayes factor analysis) is evidence for the compositional structure, not for EXH-as-primitive.
There is one mechanism (RSA) operating at different α values, rather than two mechanisms (grammar-EXH + pragmatics-RSA).
Formal Correspondence #
The GI model's joint distribution over (utterance, parse) corresponds to compositional RSA's joint distribution over (utterance, enrichment-sites).
Convert RSA config to equivalent parse (in the α → ∞ limit)
Equations
- { matrixRSA := false, outerRSA := false, innerRSA := false }.toParse = RSA.Compositional.Parse.lit
- { matrixRSA := true, outerRSA := false, innerRSA := false }.toParse = RSA.Compositional.Parse.M
- { matrixRSA := false, outerRSA := true, innerRSA := false }.toParse = RSA.Compositional.Parse.O
- { matrixRSA := false, outerRSA := false, innerRSA := true }.toParse = RSA.Compositional.Parse.I
- { matrixRSA := true, outerRSA := true, innerRSA := false }.toParse = RSA.Compositional.Parse.MO
- { matrixRSA := true, outerRSA := false, innerRSA := true }.toParse = RSA.Compositional.Parse.MI
- { matrixRSA := false, outerRSA := true, innerRSA := true }.toParse = RSA.Compositional.Parse.OI
- { matrixRSA := true, outerRSA := true, innerRSA := true }.toParse = RSA.Compositional.Parse.MOI
Instances For
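The eight equations above describe a one-to-one correspondence between configurations and parses. A minimal self-contained sketch in core Lean 4 makes this explicit by adding an inverse map. The namespace `ParseSketch`, the inverse `Parse.toConfig`, and both round-trip theorems are assumptions introduced for illustration, and the file's own declarations may be organized differently.

```lean
namespace ParseSketch

inductive Parse where
  | lit | M | O | I | MO | MI | OI | MOI
  deriving DecidableEq, Repr

/-- One Boolean per potential RSA application site. -/
structure RSAConfig where
  matrixRSA : Bool
  outerRSA  : Bool
  innerRSA  : Bool
  deriving DecidableEq, Repr

/-- Same table as the `toParse` equations listed above. -/
def RSAConfig.toParse : RSAConfig → Parse
  | ⟨false, false, false⟩ => .lit
  | ⟨true,  false, false⟩ => .M
  | ⟨false, true,  false⟩ => .O
  | ⟨false, false, true ⟩ => .I
  | ⟨true,  true,  false⟩ => .MO
  | ⟨true,  false, true ⟩ => .MI
  | ⟨false, true,  true ⟩ => .OI
  | ⟨true,  true,  true ⟩ => .MOI

/-- Hypothetical inverse: read a parse back as a set of RSA sites. -/
def Parse.toConfig : Parse → RSAConfig
  | .lit => ⟨false, false, false⟩
  | .M   => ⟨true,  false, false⟩
  | .O   => ⟨false, true,  false⟩
  | .I   => ⟨false, false, true ⟩
  | .MO  => ⟨true,  true,  false⟩
  | .MI  => ⟨true,  false, true ⟩
  | .OI  => ⟨false, true,  true ⟩
  | .MOI => ⟨true,  true,  true ⟩

theorem toConfig_toParse (c : RSAConfig) : c.toParse.toConfig = c := by
  cases c with
  | mk m o i => cases m <;> cases o <;> cases i <;> rfl

theorem toParse_toConfig (p : Parse) : p.toConfig.toParse = p := by
  cases p <;> rfl

end ParseSketch
```

Because the two maps are mutually inverse, a distribution over configurations can be pushed forward to a distribution over parses without merging or losing any mass.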
The Global Intentions model: speaker chooses (utterance, parse)
- utterance : Utt
- parse : Parse
Instances For
RSA-All-The-Way-Down: speaker chooses (utterance, config)
- utterance : Utt
- config : RSAConfig
Instances For
In the α → ∞ limit, RSAATWD choice maps to GI choice
Instances For
The Limit Theorem (Conceptual) #
theorem rsaatwd_limit_is_gi :
∀ (scenario : ...) (m : Utt) (c : RSAConfig) (t : World),
lim_{α→∞} P_RSAATWD(m, c | t; α) = P_GI(m, c.toParse | t)
Franke & Bergen's machinery for reasoning over parses emerges naturally from "RSA at every compositional node" in the limit.
Implications #
GI's empirical success (they show GI >> LI >> LU >> RSA in Bayes factors) is evidence for compositional pragmatic reasoning, interpretable as:
- Franke & Bergen's interpretation: Grammar generates EXH, RSA selects among readings
- Our interpretation: RSA applies compositionally, EXH-like behavior emerges
GI wins because ⟦SS⟧^M uniquely identifies state ██-. This reading requires matrix exhaustification.
- Franke & Bergen: Matrix EXH is a grammatical option
- Us: High-α RSA at the matrix level produces the same strengthening
Parse selection = RSA site selection: Franke & Bergen's listener inferring the intended parse is equivalent to inferring where the speaker applied pragmatic reasoning.
Franke & Bergen's key empirical finding: the GI model assigns high probability to 'SS' for state ██- because ⟦SS⟧^M = {██-}.
In RSAATWD terms: with high-α RSA at matrix level, the speaker strongly prefers 'SS' for this state because it's uniquely identifying.
Algebraic Structure of Compositional RSA #
Does the joint distribution over (utterance, config) decompose in a principled algebraic way?
The Product Hypothesis #
If pragmatic reasoning at different nodes is independent, then:
P_S(m, c | t; α) ∝ ∏_{node n ∈ c} P_RSA(choice_n | local_state_n; α)
This would mean:
- Each node contributes independently to the overall probability
- Informativity MULTIPLIES across nodes
- The joint inference FACTORS into local inferences
Informativity Multiplication #
For a sentence with multiple scalar items, if informativity multiplies:
informativity(m, c) = ∏_{n ∈ c} informativity_n(choice_n)
Then the softmax over (m, c) decomposes:
P_S(m, c | t; α) ∝ exp(α × log informativity(m, c))
= exp(α × Σ_n log informativity_n(choice_n))
= ∏_n exp(α × log informativity_n(choice_n))
∝ ∏_n P_RSA_n(choice_n | local_state_n; α)
This is the algebraic signature of compositional RSA.
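The step from proportionality to an exact product depends on the normalizing constant factoring as well. The worked equation below spells this out under one explicit assumption that the text above leaves implicit: the available configurations form a full Cartesian product of per-node choice sets. The symbols C_n, f_n, Z, and Z_n are notation introduced here.

```latex
% Assumption: C = C_1 \times \dots \times C_N, and
% f_n(c_n) = \mathrm{informativity}_n(c_n)^{\alpha} is the unnormalized local score.
\begin{align*}
Z &= \sum_{c \in C} \prod_{n=1}^{N} f_n(c_n)
   = \prod_{n=1}^{N} \sum_{c_n \in C_n} f_n(c_n)
   = \prod_{n=1}^{N} Z_n, \\
P_S(m, c \mid t; \alpha)
  &= \frac{1}{Z} \prod_{n=1}^{N} f_n(c_n)
   = \prod_{n=1}^{N} \frac{f_n(c_n)}{Z_n}
   = \prod_{n=1}^{N} P_{\mathrm{RSA},n}(c_n \mid \mathrm{local\_state}_n; \alpha).
\end{align*}
```

If some combinations of choices are unavailable, so that C is a proper subset of the product, the sum no longer distributes over the product and the factorization holds only approximately, if at all.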
Connection to GradedMonad (RSAFree) #
The RSAFree graded monad (see GradedMonad.lean) has exactly this structure:
-- RSAFree is a graded monad where the grade tracks alternatives
-- Binding (seq) composes RSA computations
-- Informativity multiplies under seq
seq : RSAFree W A → (A → RSAFree W B) → RSAFree W B
The monad laws ensure:
- Associativity: (m >>= f) >>= g = m >>= (λx. f x >>= g)
- Identity: return x >>= f = f x
These correspond to:
- Composition is associative: parsing order doesn't matter
- Literal meaning is identity: no RSA = base meaning
The Decomposition Theorem (Conceptual) #
theorem informativity_multiplicative :
∀ (m₁ : RSAFree W A) (m₂ : A → RSAFree W B) (a : A) (b : B),
informativity (seq m₁ m₂) (a, b) =
informativity m₁ a × informativity (m₂ a) b
This says: the informativity of a composed utterance (with choices at multiple levels) is the product of the local informativities.
The joint distribution therefore factors.
Why Decomposition Matters #
Computational tractability: If the distribution factors, inference is polynomial rather than exponential in the number of nodes.
Theoretical parsimony: Local RSA at each node, composed via standard semantic composition. No special global mechanism.
Empirical predictions: Local RSA predicts independence effects that global reasoning would not.
Connection to grammar: The algebraic structure mirrors compositional semantics. RSA "rides along" with semantic composition.
The Monoid of Informativities #
Informativity values form a multiplicative monoid:
- Identity: informativity(tautology) = 1
- Multiplication: informativity(φ ∧ ψ) ≤ informativity(φ) × informativity(ψ) (with equality when φ and ψ are independent)
This monoid structure is what makes compositional RSA work:
- Each node contributes a factor
- Composition multiplies factors
- High-α RSA selects the maximum product = maximum informativity
The multiplicative monoid structure on informativities
Instances For
Unit informativity (tautology)
Equations
- RSA.Compositional.Informativity.one = { value := 1, pos := RSA.Compositional.Informativity.one._proof_1, le_one := RSA.Compositional.Informativity.one._proof_2 }
Instances For
Informativity values multiply like rationals
Informativity multiplication is associative (on values)
Value of unit informativity is 1
Informativity multiplication has unit (on values)
The Factorization Picture #
Sentence: "Every student read some book"
│
┌────┴────┐
│ Matrix │ ← RSA here? (config.matrixRSA)
│ EXH? │
└────┬────┘
│
┌──────────┴──────────┐
│ │
┌────┴────┐ ┌────┴────┐
│ "every" │ │ VP │
│ student │ │ │
└─────────┘ └────┬────┘
│
┌─────────┴─────────┐
│ │
┌────┴────┐ ┌────┴────┐
│ "read" │ │ "some" │ ← RSA here?
└─────────┘ │ book │ (config.innerRSA)
└─────────┘
P_S(m, c | t; α) = P_RSA_matrix(EXH? |...; α) × P_RSA_inner(EXH? |...; α)
The distribution factors along the tree structure. Each node's RSA decision is (conditionally) independent.
Connection to Franke & Bergen #
Their finding that GI >> LI >> LU >> RSA can be reinterpreted:
- RSA → LU: Adding lexical uncertainty helps, but doesn't factor right
- LU → LI: Adding local lexical choice helps (starts to factor)
- LI → GI: Adding matrix EXH completes the factorization
GI wins because it has the right algebraic structure: the full product over all compositional nodes.
Summary: The Two Perspectives #
@cite{franke-bergen-2020} #
- Primitives: Grammar (generates EXH parses) + Pragmatics (RSA selects)
- Architecture: P_S(m, p | t) where p is a grammatically-given parse
- EXH status: Primitive grammatical operator
- RSA role: Selects among grammatically-generated readings
RSA All The Way Down (This File) #
- Primitives: Pragmatics only (RSA at each compositional node)
- Architecture: P_S(m, c | t) where c is where RSA applies
- EXH status: Emergent behavior of RSA as α → ∞
- RSA role: Fundamental mechanism; EXH is its limit
The Mathematical Connection #
The joint inference machinery is identical:
P_S(m, x | t; α) ∝ [P(t | meaning(m, x))]^α
P_L(t, x | m; α) ∝ P(t) × P_S(m, x | t; α)
where x = parse (Franke & Bergen) or x = RSA-config (RSAATWD).
In the α → ∞ limit, these are the same model, because local RSA → local EXH at each node.
Franke & Bergen's empirical validation of GI is also validation of compositional RSA, without requiring EXH as a grammatical primitive.