Combined utility: weighted interpolation of two utility components.
U_combined = (1-λ)·U_A + λ·U_B - cost
This is the standard form used across multiple RSA papers:
- Sumers: U_A = truthfulness, U_B = relevance
- PRIOR-PQ: U_A = informativity, U_B = action-relevance
- Yoon: U_A = informativity, U_B = social utility
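As a concrete reading of the formula above, here is a minimal Lean sketch. It assumes rational-valued utilities and an explicit `cost` argument; the namespace `CombinedSketch` and the local definition are hypothetical stand-ins, not the library's own declaration.

```lean
import Mathlib

namespace CombinedSketch

/-- Hypothetical local copy of the two-component utility over ℚ
    (the carrier type and argument order are assumptions). -/
def combined (lam utilA utilB cost : ℚ) : ℚ :=
  (1 - lam) * utilA + lam * utilB - cost

-- λ = 1/4, U_A = 4, U_B = 8, cost = 1:  (3/4)·4 + (1/4)·8 - 1 = 4.
example : combined (1/4) 4 8 1 = 4 := by
  norm_num [combined]

end CombinedSketch
```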
Combined utility equals U_A when λ = 0 (and cost = 0)
Combined utility equals U_B when λ = 1 (and cost = 0)
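The two endpoint facts can be checked directly from the formula. The sketch below uses a hypothetical local definition over ℚ and passes zero cost explicitly, since by the formula a nonzero cost would shift both endpoints by the same amount.

```lean
import Mathlib

namespace EndpointSketch

/-- Hypothetical local copy of the two-component utility. -/
def combined (lam utilA utilB cost : ℚ) : ℚ :=
  (1 - lam) * utilA + lam * utilB - cost

-- λ = 0 puts all weight on U_A.
example (utilA utilB : ℚ) : combined 0 utilA utilB 0 = utilA := by
  unfold combined
  ring

-- λ = 1 puts all weight on U_B.
example (utilA utilB : ℚ) : combined 1 utilA utilB 0 = utilB := by
  unfold combined
  ring

end EndpointSketch
```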
Compare two conditions by their λ values
Combined utility with three components (for richer models).
U = w_A · U_A + w_B · U_B + w_C · U_C - cost
Used when there are three competing objectives.
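A minimal sketch of the three-component form, assuming rational weights and utilities. The name `combined3` and its argument order are hypothetical. The second example illustrates that setting w_C = 0 leaves a two-component weighted sum.

```lean
import Mathlib

namespace ThreeComponentSketch

/-- Hypothetical three-component utility:
    w_A · U_A + w_B · U_B + w_C · U_C - cost. -/
def combined3 (wA wB wC utilA utilB utilC cost : ℚ) : ℚ :=
  wA * utilA + wB * utilB + wC * utilC - cost

-- Weights 1, 2, 3 on utilities 1, 1, 1 with cost 1:  1 + 2 + 3 - 1 = 5.
example : combined3 1 2 3 1 1 1 1 = 5 := by
  norm_num [combined3]

-- With w_C = 0 the third objective drops out.
example (wA wB utilA utilB utilC cost : ℚ) :
    combined3 wA wB 0 utilA utilB utilC cost
      = wA * utilA + wB * utilB - cost := by
  unfold combined3
  ring

end ThreeComponentSketch
```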
Goal-oriented speaker utility: U_epi + β · U_goal.
This parameterization naturally models argumentative/persuasive speakers:
- @cite{barnett-griffiths-hawkins-2022}: U_goal = ln P_L0(w*|u), β controls persuasive bias
- @cite{cummins-franke-2021}: U_goal = argStr(u, G), β → ∞ for pure argStr speaker
Equivalent to combinedWeighted(1, β, U_epi, U_goal). The parameter β controls the cooperativity spectrum:
- β = 0: fully cooperative (standard RSA)
- 0 < β < ∞: partially argumentative
- β → ∞: purely argumentative
Equations
- RSA.CombinedUtility.goalOrientedUtility uEpi uGoal β = uEpi + β * uGoal
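To make the cooperativity spectrum concrete, the sketch below evaluates the equation at three values of β. It uses a local copy of the definition (in a hypothetical namespace) and assumes rational arguments; the numbers are illustrative only.

```lean
import Mathlib

namespace GoalOrientedSketch

/-- Local copy of the equation above, over ℚ (carrier type assumed). -/
def goalOrientedUtility (uEpi uGoal β : ℚ) : ℚ :=
  uEpi + β * uGoal

-- β = 0: only the epistemic term remains.
example : goalOrientedUtility 3 2 0 = 3 := by
  norm_num [goalOrientedUtility]

-- β = 1/2: the goal term contributes half its value.
example : goalOrientedUtility 3 2 (1/2) = 4 := by
  norm_num [goalOrientedUtility]

-- β = 10: the goal term dominates.
example : goalOrientedUtility 3 2 10 = 23 := by
  norm_num [goalOrientedUtility]

end GoalOrientedSketch
```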
Goal-oriented utility = combinedWeighted(1, β,...)
At β=0, goal-oriented utility reduces to pure epistemic (cooperative RSA)
Higher β increases utility of goal-supporting utterances (U_goal > 0)
When U_goal < 0, utility decreases as β increases: the speaker is penalized for utterances that argue against the goal.
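The three properties of β stated above can be sketched as follows. The statements and proofs are an illustrative reconstruction over ℚ using a local copy of the definition; they are not the library's own theorem statements, and the namespace is hypothetical.

```lean
import Mathlib

namespace BetaMonotoneSketch

/-- Local copy of the goal-oriented utility (carrier type assumed). -/
def goalOrientedUtility (uEpi uGoal β : ℚ) : ℚ :=
  uEpi + β * uGoal

-- β = 0 reduces to the purely epistemic utility.
example (uEpi uGoal : ℚ) : goalOrientedUtility uEpi uGoal 0 = uEpi := by
  unfold goalOrientedUtility
  ring

-- If U_goal > 0, a larger β gives strictly higher utility.
example (uEpi uGoal β₁ β₂ : ℚ) (hGoal : 0 < uGoal) (hβ : β₁ < β₂) :
    goalOrientedUtility uEpi uGoal β₁ < goalOrientedUtility uEpi uGoal β₂ := by
  unfold goalOrientedUtility
  nlinarith [mul_pos (sub_pos.mpr hβ) hGoal]

-- If U_goal < 0, a larger β gives strictly lower utility.
example (uEpi uGoal β₁ β₂ : ℚ) (hGoal : uGoal < 0) (hβ : β₁ < β₂) :
    goalOrientedUtility uEpi uGoal β₂ < goalOrientedUtility uEpi uGoal β₁ := by
  unfold goalOrientedUtility
  nlinarith [mul_pos (sub_pos.mpr hβ) (neg_pos.mpr hGoal)]

end BetaMonotoneSketch
```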
Convert additive bias parameter β ∈ [0,∞) to convex weight λ ∈ [0,1).
β/(1+β) maps [0,∞) → [0,1): β=0 ↦ 0, β=1 ↦ 1/2, and β/(1+β) → 1 as β → ∞ (the value 1 is never attained).
This bridges goalOrientedUtility (additive: U + β·V) and combined
(convex: (1-λ)·U + λ·V).
Equations
- RSA.CombinedUtility.betaToLam β = β / (1 + β)
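A few sample values and the range claim can be checked with a local copy of the equation. This is a sketch assuming rational β; the namespace is hypothetical.

```lean
import Mathlib

namespace BetaToLamSketch

/-- Local copy of the equation above, over ℚ (carrier type assumed). -/
def betaToLam (β : ℚ) : ℚ := β / (1 + β)

example : betaToLam 0 = 0 := by norm_num [betaToLam]
example : betaToLam 1 = 1/2 := by norm_num [betaToLam]
example : betaToLam 3 = 3/4 := by norm_num [betaToLam]

-- For β ≥ 0 the result is nonnegative ...
example (β : ℚ) (hβ : 0 ≤ β) : 0 ≤ betaToLam β := by
  unfold betaToLam
  exact div_nonneg hβ (by linarith)

-- ... and stays strictly below 1.
example (β : ℚ) (hβ : 0 ≤ β) : betaToLam β < 1 := by
  unfold betaToLam
  have hpos : (0 : ℚ) < 1 + β := by linarith
  rw [div_lt_one hpos]
  linarith

end BetaToLamSketch
```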
Convert convex weight λ ∈ [0,1) back to additive bias parameter β.
λ/(1-λ) maps [0,1) → [0,∞): λ=0 ↦ 0, λ=1/2 ↦ 1; it inverts betaToLam and grows without bound as λ → 1.
Equations
- RSA.CombinedUtility.lamToBeta lam = lam / (1 - lam)
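The two conversions undo each other. The sketch below checks this on sample values only, using local copies of both equations over ℚ; it is a numeric spot check under those assumptions, not a general proof.

```lean
import Mathlib

namespace RoundTripSketch

/-- Local copies of the two conversions (carrier type assumed). -/
def betaToLam (β : ℚ) : ℚ := β / (1 + β)
def lamToBeta (lam : ℚ) : ℚ := lam / (1 - lam)

example : lamToBeta 0 = 0 := by norm_num [lamToBeta]
example : lamToBeta (1/2) = 1 := by norm_num [lamToBeta]

-- β = 3 ↦ λ = 3/4 ↦ β = 3.
example : lamToBeta (betaToLam 3) = 3 := by
  norm_num [lamToBeta, betaToLam]

-- λ = 1/4 ↦ β = 1/3 ↦ λ = 1/4.
example : betaToLam (lamToBeta (1/4)) = 1/4 := by
  norm_num [lamToBeta, betaToLam]

end RoundTripSketch
```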
The key bridge: goalOrientedUtility = (1+β) · combined(β/(1+β),...).
U_epi + β·U_goal = (1+β) · ((1 - β/(1+β))·U_epi + β/(1+β)·U_goal)
Scaling by (1+β) > 0 preserves utterance rankings, so the additive and convex forms are strategically equivalent.
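The bridge identity can be verified algebraically. The following is one possible proof over ℚ, assuming β ≥ 0 and zero cost, with local copies of the definitions in a hypothetical namespace; the library's own proof may differ.

```lean
import Mathlib

namespace BridgeSketch

/-- Local copies of the definitions (carrier type and cost argument assumed). -/
def combined (lam utilA utilB cost : ℚ) : ℚ :=
  (1 - lam) * utilA + lam * utilB - cost
def goalOrientedUtility (uEpi uGoal β : ℚ) : ℚ := uEpi + β * uGoal
def betaToLam (β : ℚ) : ℚ := β / (1 + β)

-- U_epi + β·U_goal = (1+β) · combined(β/(1+β), U_epi, U_goal), at zero cost.
example (uEpi uGoal β : ℚ) (hβ : 0 ≤ β) :
    goalOrientedUtility uEpi uGoal β
      = (1 + β) * combined (betaToLam β) uEpi uGoal 0 := by
  have hne : (1 : ℚ) + β ≠ 0 := (by linarith : (0 : ℚ) < 1 + β).ne'
  -- Multiplying λ = β/(1+β) by (1+β) recovers β.
  have hx : β / (1 + β) * (1 + β) = β := by
    rw [div_mul_eq_mul_div, mul_div_assoc, div_self hne, mul_one]
  unfold goalOrientedUtility combined betaToLam
  linear_combination (uEpi - uGoal) * hx

end BridgeSketch
```

The only fact about β used here is that 1 + β is nonzero, which follows from β ≥ 0.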
Utterance ranking equivalence: for β ≥ 0, goalOrientedUtility and combined rank any two utility pairs the same way (scaling by (1+β) > 0 preserves ordering).
If U_epi + β·U_goal > U_epi' + β·U_goal', then combined(β/(1+β), U_epi, U_goal) > combined(β/(1+β), U_epi', U_goal').
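The ranking claim follows by cancelling the positive factor (1+β) from both sides. The sketch below is a self-contained reconstruction over ℚ at zero cost; the lemma name `bridge`, the namespace, and the local definitions are hypothetical.

```lean
import Mathlib

namespace RankingSketch

/-- Local copies of the definitions (carrier type and cost argument assumed). -/
def combined (lam utilA utilB cost : ℚ) : ℚ :=
  (1 - lam) * utilA + lam * utilB - cost
def betaToLam (β : ℚ) : ℚ := β / (1 + β)

/-- Hypothetical helper: the bridge identity, oriented for rewriting. -/
theorem bridge (uEpi uGoal β : ℚ) (hβ : 0 ≤ β) :
    (1 + β) * combined (betaToLam β) uEpi uGoal 0 = uEpi + β * uGoal := by
  have hne : (1 : ℚ) + β ≠ 0 := (by linarith : (0 : ℚ) < 1 + β).ne'
  have hx : β / (1 + β) * (1 + β) = β := by
    rw [div_mul_eq_mul_div, mul_div_assoc, div_self hne, mul_one]
  unfold combined betaToLam
  linear_combination (uGoal - uEpi) * hx

-- A strict preference in the additive form carries over to the convex form.
example (uEpi uGoal uEpi' uGoal' β : ℚ) (hβ : 0 ≤ β)
    (h : uEpi' + β * uGoal' < uEpi + β * uGoal) :
    combined (betaToLam β) uEpi' uGoal' 0
      < combined (betaToLam β) uEpi uGoal 0 := by
  have hpos : (0 : ℚ) < 1 + β := by linarith
  -- Rewrite both scaled sides with the bridge, then cancel (1+β).
  have key : (1 + β) * combined (betaToLam β) uEpi' uGoal' 0
      < (1 + β) * combined (betaToLam β) uEpi uGoal 0 := by
    rw [bridge uEpi' uGoal' β hβ, bridge uEpi uGoal β hβ]
    exact h
  exact lt_of_mul_lt_mul_left key hpos.le

end RankingSketch
```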