RSAConfigData — Computable RSA Configuration #
A computable (ℚ-valued) parallel to RSAConfig (ℝ-valued). Models define
RSAConfigData and derive RSAConfig via .toRSAConfig.
Design #
RSAConfig uses a freeform s1Score lambda, which is flexible but
noncomputable. RSAConfigData replaces this with S1ScoreSpec, an
enumeration of the scoring patterns actually used across all implementations.
Each S1ScoreSpec variant has:
- A computable ℚ implementation (for
native_decideverification) - A noncomputable ℝ expansion (for
toRSAConfig) - A soundness bridge (the ℝ expansion matches the original
s1Score)
S1ScoreSpec Variants #
| Variant | Formula | Papers |
|---|---|---|
beliefBased | L0(w\|u)^α | @cite{frank-goodman-2012}, @cite{beller-gerstenberg-2025} |
beliefAction | if L0=0 then 0 else exp(α·(log L0 - cost u)) | @cite{qing-franke-2015} σ_b |
actionBased | exp(α·(L0(w\|u) - cost u)) | @cite{qing-franke-2015} σ_a |
weightedBeliefAction | if L0=0 then 0 else exp(α·(γ·log L0 + bonus u)) | @cite{hawkins-etal-2025} |
beliefWeighted | if qualOk then exp(α·Σ_s b(l,s)·log L0(u,s)) else 0 | @cite{goodman-stuhlmuller-2013} |
QUD projection is orthogonal to scoring: set RSAConfigData.qudProject to apply
L0(w|u) → Σ_{w'∼w} L0(w'|u) before S1 scoring. Used by @cite{kao-etal-2014-hyperbole}.
Utility term component for combinedUtility scoring.
Each term contributes an additive component to the S1 utility:
score = exp(α · Σ_i term_i).
- logInformativity
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(weight : L → ℚ)
: S1UtilityTerm U W L
weight(l) · log L0(u, w) — log-informativity at the true world.
- expectedValue
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(weight : L → ℚ)
(value : W → ℚ)
: S1UtilityTerm U W L
weight(l) · Σ_w' L0(u,w') · value(w') — expected value under L0.
- constant
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(fn : L → U → ℚ)
: S1UtilityTerm U W L
fn(l, u) — per-(latent, utterance) constant (cost, bonus, etc.).
Instances For
Does this term use log(L0), requiring L0 > 0?
Instances For
S1 scoring specification. Each variant captures a standard RSA scoring pattern. The type parameters are:
U: utterance typeW: world typeL: latent variable type
- beliefBased
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
: S1ScoreSpec U W L
score = L0(w|u)^α. Standard belief-based informativity. Used by @cite{frank-goodman-2012}, @cite{beller-gerstenberg-2025}.
- beliefAction
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(cost : U → ℚ)
: S1ScoreSpec U W L
score = if L0(w|u) = 0 then 0 else exp(α · (log L0(w|u) - cost u)). Belief-oriented with utterance cost (gated). Used by @cite{qing-franke-2015} σ_b.
- actionBased
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(cost : U → ℚ)
: S1ScoreSpec U W L
score = exp(α · (L0(w|u) - cost u)). Action-oriented: raw L0 probability, no log. Used by @cite{qing-franke-2015} σ_a.
- weightedBeliefAction
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(infWeight : ℚ)
(bonus : U → ℚ)
: S1ScoreSpec U W L
score = if L0(w|u) = 0 then 0 else exp(α · (infWeight · log L0(w|u) + bonus u)). Weighted belief-action: informativity weight γ on log L0, plus a per-utterance bonus that can encode action relevance, cost, or any u-dependent term. Subsumes
beliefAction:beliefAction(cost) = weightedBeliefAction 1 (fun u => -cost u). Used by @cite{hawkins-etal-2025} (PRIOR-PQ). - beliefWeighted
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(belief : L → W → ℚ)
(quality : L → U → Bool)
: S1ScoreSpec U W L
score = if quality(l, u) then exp(α · Σ_w belief(l, w) · log L0(u, w)) else 0. Belief-weighted expected log-informativity, gated by quality. Used by @cite{goodman-stuhlmuller-2013}.
- combinedUtility
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(terms : List (S1UtilityTerm U W L))
(logActive : L → Bool := fun (x : L) => true)
: S1ScoreSpec U W L
score = if (logActive(l) ∧ L0=0) then 0 else exp(α · Σ_i term_i(l, u, w, L0)). Arbitrary sum of utility terms. Subsumes
weightedBeliefAction. Used by @cite{yoon-etal-2020} (politeness).The
logActivegate controls when L0=0 forces score=0. When the informativity weight is 0 (pure social speaker), settinglogActive l = falseallows utterances incompatible with the true state to receive positive scores. Default: always gate (safe for models where all latent values have non-zero informativity weight).
Instances For
Utility term component for S2ScoreSpec.utilityMaximizing scoring.
Each term contributes an additive component to the S2 utility,
computed w.r.t. L1 marginals (not L0).
- logStateMarginal
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(weight : ℚ)
: S2UtilityTerm U W L
ω · ln P_L1(w|u) — log-informativity at the true world, w.r.t. L1.
- expectedValue
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(weight : ℚ)
(value : W → ℚ)
: S2UtilityTerm U W L
ω · Σ_w' P_L1(w'|u) · V(w') — expected value under L1 state marginals.
- logLatentMarginal
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(weight : ℚ)
(target : L)
: S2UtilityTerm U W L
ω · ln P_L1(l̂|u) — log probability of target latent under L1.
- constant
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(fn : U → ℚ)
: S2UtilityTerm U W L
f(u) — per-utterance constant (cost, etc.).
Instances For
S2 scoring specification. Determines how the S2 speaker scores utterances.
endorsement: S2(u|w) ∝ L1(w|u) — same as L1 endorsement (no extra utility).utilityMaximizing: S2(u|w) ∝ exp(α · Σ terms) — full utility model.
- endorsement
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
: S2ScoreSpec U W L
S2 score = L1(w|u). Simple endorsement (no presentational utility).
- utilityMaximizing
{U : Type u_1}
{W : Type u_2}
{L : Type u_3}
(α : ℚ)
(terms : List (S2UtilityTerm U W L))
: S2ScoreSpec U W L
S2 score = exp(α · Σ_i term_i). Utility-maximizing with L1-derived terms.
Instances For
Computable RSA configuration with ℚ-valued data fields.
Mirrors RSAConfig but all fields are computable, enabling
native_decide verification. The toRSAConfig lift casts ℚ → ℝ
and expands s1Spec into the appropriate s1Score lambda.
- Latent : Type
Latent variable type (default Unit).
Fintype instance for Latent.
- latentDecEq : DecidableEq self.Latent
DecidableEq instance for Latent.
Literal semantics φ(l, u, w) ≥ 0, ℚ-valued.
Meaning values are non-negative.
- s1Spec : S1ScoreSpec U W self.Latent
S1 scoring pattern.
Optional QUD projection: sums L0 over equivalence classes before S1 scoring. When
some project, L0(w|u) becomes Σ_{w'∼w} L0(w'|u).- α : ℚ
Rationality parameter (positive rational).
Rationality is positive.
- s2Spec : Option (S2ScoreSpec U W self.Latent)
Optional S2 scoring specification. When
some, enables S2 speaker layer. - worldPrior : W → ℚ
Prior over worlds (unnormalized, ℚ-valued).
World prior is non-negative.
Prior over latent variables (unnormalized, ℚ-valued).
Latent prior is non-negative.
Instances For
Evaluate a utility term in ℝ, given L0 policy and current (l, u, w).
Equations
- (RSA.S1UtilityTerm.logInformativity weight).evalR l0 l u w = ↑(weight l) * Real.log (l0 u w)
- (RSA.S1UtilityTerm.expectedValue weight value).evalR l0 l u w = ↑(weight l) * ∑ w' : W, l0 u w' * ↑(value w')
- (RSA.S1UtilityTerm.constant fn).evalR l0 l u w = ↑(fn l u)
Instances For
Evaluate an S2 utility term in ℝ, given L1 state and latent marginals.
Equations
- (RSA.S2UtilityTerm.logStateMarginal ω).evalR l1 l1_latent w u = ↑ω * Real.log (l1 u w)
- (RSA.S2UtilityTerm.expectedValue ω value).evalR l1 l1_latent w u = ↑ω * ∑ w' : W, l1 u w' * ↑(value w')
- (RSA.S2UtilityTerm.logLatentMarginal ω target).evalR l1 l1_latent w u = ↑ω * Real.log (l1_latent u target)
- (RSA.S2UtilityTerm.constant fn).evalR l1 l1_latent w u = ↑(fn u)
Instances For
QUD projection: sum L0 over the equivalence class of w under latent l. {w' | project w' l = project w l}
Equations
- RSA.qudProjectR project l l0 u w = {w' : W | project w' l = project w l}.sum (l0 u)
Instances For
Effective L0 after optional QUD projection. When qudProj = none, returns
L0 unchanged. When qudProj = some project, sums L0 over the equivalence
class {w' | project w' l = project w l}.
Equations
- RSA.effectiveL0 none l l0 = l0
- RSA.effectiveL0 (some project) l l0 = fun (u : U) (w : W) => RSA.qudProjectR project l l0 u w
Instances For
Expand S1ScoreSpec to the ℝ-valued s1Score function expected by RSAConfig. Each variant produces the corresponding formula using ℝ operations.
Equations
- One or more equations did not get rendered due to their size.
- RSA.S1ScoreSpec.beliefBased.toS1Score = fun (l0 : U → W → ℝ) (α : ℝ) (_l : L) (w : W) (u : U) => (l0 u w).rpow α
- (RSA.S1ScoreSpec.beliefAction cost).toS1Score = fun (l0 : U → W → ℝ) (α : ℝ) (_l : L) (w : W) (u : U) => if l0 u w = 0 then 0 else Real.exp (α * (Real.log (l0 u w) - ↑(cost u)))
- (RSA.S1ScoreSpec.actionBased cost).toS1Score = fun (l0 : U → W → ℝ) (α : ℝ) (_l : L) (w : W) (u : U) => Real.exp (α * (l0 u w - ↑(cost u)))
Instances For
Non-negativity of s1Score for each ScoreSpec variant.
Lift RSAConfigData to RSAConfig by casting ℚ → ℝ and expanding s1Spec.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Decompose d.toRSAConfig = cfg into per-field hypotheses.
Used by the rsa_predict tactic's Tier 2 bridge: after building an
RSAConfigData with matching ℚ data, this theorem reduces the config
equality to field-by-field proofs that the tactic constructs individually.
Requires d.Latent and cfg.Latent to be definitionally equal (always
true in practice since both are the same concrete type).
Fields use HEq where their types depend on Latent (meaning, s1Score,
latentPrior). Proof fields (meaning_nonneg, s1Score_nonneg, α_pos,
worldPrior_nonneg, latentPrior_nonneg) are handled by proof irrelevance.
The Fintype Latent instance is handled by Subsingleton.elim.
S2 utility for comparison-based prediction.
For endorsement models, returns the L1 posterior (S2 score ∝ L1). For utility-maximizing models, returns the raw utility sum U_S2(w, u) = Σ_i term_i(L1, L1_latent, w, u). Since S2 score ∝ exp(α₂·U) and exp is monotone, U₁ > U₂ iff S2(u₁|w) > S2(u₂|w) (when α₂ > 0).
This enables rsa_predict to handle S2 predictions via:
d.S2Utility w u₁ > d.S2Utility w u₂
Equations
- One or more equations did not get rendered due to their size.