Core Decision Theory #
@cite{van-rooy-2003} @cite{blackwell-1953}
Theory-neutral decision-theoretic infrastructure: decision problems, expected utility, maximin, and mention-some/mention-all classification.
Promoted from Theories.Semantics.Questions.DecisionTheory so that any module
(RSA, causal decision theory, explanation models) can use decision problems
without pulling in question-semantic types.
Design: Fintype + Finset #
Functions that sum over the full universe use [Fintype W] with ∑ w : W.
Functions that operate over action sets or world subsets use Finset.
questionUtility and expectedVSI take Finset (Finset W) (cells as sets).
Question W = List (W → Bool) remains for ordering-sensitive operations
(questionMaximin, isMentionSome).
- @cite{van-rooy-2003}. Questioning to Resolve Decision Problems. Linguistics and Philosophy 26.
- @cite{blackwell-1953}. Equivalent Comparisons of Experiments. Annals of Mathematical Statistics 24.
Decision Problems #
A uniform prior over a Fintype of worlds
Equations
- Core.DecisionTheory.DecisionProblem.uniformPrior = if Fintype.card W = 0 then fun (x : W) => 0 else fun (x : W) => 1 / ↑(Fintype.card W)
Instances For
Create a decision problem with uniform prior
Equations
- Core.DecisionTheory.DecisionProblem.withUniformPrior utility = { utility := utility, prior := Core.DecisionTheory.DecisionProblem.uniformPrior }
Instances For
Expected Utility #
Expected utility of action a given beliefs.
Equations
- Core.DecisionTheory.expectedUtility dp a = ∑ w : W, dp.prior w * dp.utility w a
Instances For
Value of a decision problem: max EU over actions, or 0 if the action set is empty.
Equations
- Core.DecisionTheory.dpValue dp actions = if h : actions.Nonempty then actions.sup' h (Core.DecisionTheory.expectedUtility dp) else 0
Instances For
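As a minimal standalone sketch of these two definitions (the local names DP, eu, and dpVal, and the ℚ-valued utilities, are assumptions made for illustration; they mirror the equations above but are not the module's own declarations):

```lean
import Mathlib

-- Hypothetical standalone mirror of the definitions above; `DP`, `eu`,
-- and `dpVal` are local names, and ℚ-valued utilities are an assumption.
structure DP (W A : Type) where
  utility : W → A → ℚ
  prior   : W → ℚ

-- EU(a) = Σ_w P(w) · U(w, a), summing over the full universe of worlds.
def eu {W A : Type} [Fintype W] (d : DP W A) (a : A) : ℚ :=
  ∑ w : W, d.prior w * d.utility w a

-- V(D): best expected utility over the action set, or 0 when it is empty.
def dpVal {W A : Type} [Fintype W] (d : DP W A) (actions : Finset A) : ℚ :=
  if h : actions.Nonempty then actions.sup' h (eu d) else 0
```

The dif-branch on actions.Nonempty is what lets sup' be used safely: Finset.sup' requires a nonemptiness witness, so the empty case is handled separately with the conventional value 0.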
Conditional expected utility of action a given cell membership.
Equations
Instances For
Value of the decision problem after learning a cell (best EU among the available actions)
Equations
- Core.DecisionTheory.valueAfterLearning dp actions cell = if h : actions.Nonempty then actions.sup' h (Core.DecisionTheory.conditionalEU dp cell) else 0
Instances For
UV(C) = V(D|C) - V(D), the utility value of learning proposition C.
Equations
- Core.DecisionTheory.utilityValue dp actions cell = Core.DecisionTheory.valueAfterLearning dp actions cell - Core.DecisionTheory.dpValue dp actions
Instances For
Probability of a cell in the partition
Equations
- Core.DecisionTheory.cellProbability dp cell = cell.sum dp.prior
Instances For
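The conditional-value pipeline can be sketched standalone. Note that condEU below is one plausible renormalized definition (the module's own equation is not rendered above), and all names and the ℚ value type are assumptions:

```lean
import Mathlib

-- Hypothetical local names; `prior`/`utility` stand in for the decision
-- problem's fields, and the renormalized form of `condEU` is an assumption.
variable {W A : Type}

-- P(C): prior mass of a cell.
def cellProb (prior : W → ℚ) (cell : Finset W) : ℚ :=
  cell.sum prior

-- EU(a | C): expected utility of `a` with the prior restricted to `cell`
-- and renormalized (division by 0 yields 0 in Lean's ℚ).
def condEU (prior : W → ℚ) (utility : W → A → ℚ)
    (cell : Finset W) (a : A) : ℚ :=
  (∑ w ∈ cell, prior w * utility w a) / cellProb prior cell

-- V(D | C): best conditional EU among the actions, or 0 if there are none.
def valueAfter (prior : W → ℚ) (utility : W → A → ℚ)
    (actions : Finset A) (cell : Finset W) : ℚ :=
  if h : actions.Nonempty then
    actions.sup' h (condEU prior utility cell)
  else 0

-- UV(C) = V(D | C) − V(D) then falls out as a one-line difference.
```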
Maximin #
S(a) = min_w U(w, a), security level of action a.
Equations
- Core.DecisionTheory.securityLevel dp worlds a = if h : worlds.Nonempty then worlds.inf' h fun (w : W) => dp.utility w a else 0
Instances For
MV = max_a min_w U(w, a), maximin value.
Equations
- Core.DecisionTheory.maximinValue dp worlds actions = if h : actions.Nonempty then actions.sup' h fun (a : A) => Core.DecisionTheory.securityLevel dp worlds a else 0
Instances For
Conditional security level: worst case within cell C
Equations
- Core.DecisionTheory.conditionalSecurityLevel dp worlds a c = Core.DecisionTheory.securityLevel dp ({w ∈ worlds | c w = true}) a
Instances For
Maximin value after learning C
Equations
- Core.DecisionTheory.maximinAfterLearning dp worlds actions c = Core.DecisionTheory.maximinValue dp ({w ∈ worlds | c w = true}) actions
Instances For
Maximin utility value of learning C
Equations
- Core.DecisionTheory.maximinUtilityValue dp worlds actions c = Core.DecisionTheory.maximinAfterLearning dp worlds actions c - Core.DecisionTheory.maximinValue dp worlds actions
Instances For
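A standalone sketch of the maximin layer (local names secLevel and maximinVal, and the ℚ value type, are illustrative assumptions mirroring the equations above):

```lean
import Mathlib

-- Hypothetical local mirrors of `securityLevel` and `maximinValue`.
variable {W A : Type}

-- S(a) = min over the given worlds of U(w, a); 0 on an empty world set.
def secLevel (utility : W → A → ℚ) (worlds : Finset W) (a : A) : ℚ :=
  if h : worlds.Nonempty then worlds.inf' h (fun w => utility w a) else 0

-- MV = max over actions of their security levels; 0 on an empty action set.
def maximinVal (utility : W → A → ℚ)
    (worlds : Finset W) (actions : Finset A) : ℚ :=
  if h : actions.Nonempty then actions.sup' h (secLevel utility worlds) else 0
```

The conditional variants above simply apply these to the filtered world set {w ∈ worlds | c w = true}.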
Mention-Some / Mention-All #
Theory-neutral question type: a list of characteristic functions (cells).
Used by ordering-sensitive operations (questionMaximin, isMentionSome).
Equations
- Core.DecisionTheory.Question W = List (W → Bool)
Instances For
Convert a list of predicates to a Finset (Finset W) of cells.
Equations
Instances For
C resolves the decision problem if some action dominates after learning C.
Per @cite{van-rooy-2003}: C resolves DP iff after learning C, there exists an action a ∈ A that weakly dominates all other actions on every world in C.
Equations
Instances For
Set of answers that resolve the decision problem
Equations
- Core.DecisionTheory.resolvingAnswers dp worlds actions q = List.filter (fun (cell : W → Bool) => Core.DecisionTheory.resolves dp worlds actions cell) q
Instances For
A question has a mention-some reading if multiple non-disjoint cells resolve the DP.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Mention-all reading: the complete partition is needed to resolve the DP
Equations
- Core.DecisionTheory.isMentionAll dp worlds actions q = !Core.DecisionTheory.isMentionSome dp worlds actions q
Instances For
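The classification can be sketched over the list-based question type. This simplified isMentionSome' only counts resolving answers and omits the non-disjointness check mentioned above, and resolves is abstracted to an arbitrary predicate rather than the module's own definition:

```lean
-- Simplified sketch: a question gets a mention-some reading here as soon as
-- at least two answers resolve the problem; the module's real check also
-- requires the resolving cells to be non-disjoint.
abbrev Question (W : Type) := List (W → Bool)

def isMentionSome' {W : Type} (resolves : (W → Bool) → Bool)
    (q : Question W) : Bool :=
  decide (2 ≤ (q.filter resolves).length)

-- Mention-all is then the Boolean complement, as in the equation above.
def isMentionAll' {W : Type} (resolves : (W → Bool) → Bool)
    (q : Question W) : Bool :=
  !isMentionSome' resolves q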
Question Utility #
EUV(Q) = Σ_{q ∈ Q} P(q) · UV(q), the expected utility value of question Q.
Equations
- Core.DecisionTheory.questionUtility dp actions cells = ∑ cell ∈ cells, Core.DecisionTheory.cellProbability dp cell * Core.DecisionTheory.utilityValue dp actions cell
Instances For
MV(Q) = min_{q ∈ Q} MV(q), the maximin question value.
Equations
- One or more equations did not get rendered due to their size.
- Core.DecisionTheory.questionMaximin dp worlds actions [] = 0
Instances For
Value of Sample Information #
Optimal action: the one with highest expected utility.
Noncomputable because it extracts a witness from an existential
(Finset.exists_max_image via Classical.choice).
Equations
Instances For
VSI(C) = V(D|C) - EU(a⁰|C): the value of sample information from learning proposition C, where a⁰ is the current optimal action.
Unlike UV(C) = V(D|C) - V(D), VSI is always non-negative: learning C before choosing can never hurt relative to the current best action applied within C. @cite{van-rooy-2003}, p. 742.
Equations
- One or more equations did not get rendered due to their size.
Instances For
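The non-negativity claim is immediate from the definitions, since the baseline action a⁰ is itself a member of the action set:

```latex
VSI(C) \;=\; V(D \mid C) - EU(a^{0} \mid C)
       \;=\; \max_{a \in A} EU(a \mid C) \;-\; EU(a^{0} \mid C)
       \;\ge\; 0 \qquad \text{because } a^{0} \in A .
```

By contrast, UV(C) = V(D|C) - V(D) compares against the unconditional value, so bad news that lowers V(D|C) below V(D) makes UV(C) negative.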
EVSI(Q) = Σ P(C) · VSI(C): the expected value of sample information from asking question Q. @cite{van-rooy-2003}, p. 742.
Equations
- Core.DecisionTheory.expectedVSI dp actions cells = ∑ cell ∈ cells, Core.DecisionTheory.cellProbability dp cell * Core.DecisionTheory.valueSampleInfo dp actions cell
Instances For
EUV = EVSI #
EUV(Q) = EVSI(Q): the expected utility value of a question equals its expected value of sample information.
@cite{van-rooy-2003}, p. 742: "The expected utility value of a question is equal to its expected value of sample information."
The two hypotheses are the law of total expectation (Σ P(C)·EU(a|C) = EU(a)
for all actions) and cell probability normalization (Σ P(C) = 1).
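Under those two hypotheses the identity is a short calculation:

```latex
\begin{aligned}
EUV(Q)  &= \sum_{C \in Q} P(C)\bigl(V(D \mid C) - V(D)\bigr)
         = \sum_{C \in Q} P(C)\,V(D \mid C) - V(D)
         && \text{(normalization)} \\
EVSI(Q) &= \sum_{C \in Q} P(C)\bigl(V(D \mid C) - EU(a^{0} \mid C)\bigr)
         = \sum_{C \in Q} P(C)\,V(D \mid C) - EU(a^{0})
         && \text{(total expectation)}
\end{aligned}
```

Since a⁰ is the optimal action, EU(a⁰) = V(D), so the two right-hand sides coincide.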
Maximin Monotonicity #
Security level and maximin value are anti-monotone in the world set: restricting to a subset can only improve worst-case guarantees. These lemmas support the Blackwell maximin forward theorem.
Security level is ≤ the utility of any world in the finset.
Security level on a subset ≥ security level on the superset.
min over fewer elements ≥ min over more elements.
Maximin value on a subset ≥ maximin value on the superset.
Since securityLevel increases on subsets (fewer worst cases), maximinValue
(max over actions of security levels) also increases.
Maximin utility value is monotone under cell containment: learning a more specific proposition (subset of worlds) gives higher MUV.
Maximin value of information is non-negative for nonempty cells.
When the cell is nonempty, worlds.filter c ⊆ worlds, and the maximin over a
subset considers fewer worst cases, so maximinAfterLearning ≥ maximinValue.
The question maximin value is ≤ MUV of each cell in the question.
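The chain of lemmas amounts to the following two inequalities for nonempty S ⊆ T (nonemptiness keeps both definitions in their min/max branch rather than the 0 default):

```latex
S \subseteq T
\;\Longrightarrow\;
\min_{w \in T} U(w, a) \;\le\; \min_{w \in S} U(w, a) \quad \text{for every } a
\;\Longrightarrow\;
\max_{a \in A} \min_{w \in T} U(w, a) \;\le\; \max_{a \in A} \min_{w \in S} U(w, a).
```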
Special Decision Problems #
An epistemic DP where the agent wants to know the exact world state.
Equations
Instances For
A complete-information DP where only exact-state knowledge is useful.
Equations
Instances For
A mention-some DP: any satisfier resolves the problem.
Equations
Instances For
Binary Question Value Decomposition #
For a binary partition [P, ¬P], the probability-weighted sum of conditional DP values equals Van Rooy's question utility plus the baseline DP value. This is the structural identity connecting "the value of asking a yes/no question" to the decision-theoretic question framework of @cite{van-rooy-2003}.
Cell probabilities of a binary partition [P, ¬P] sum to 1 when the prior is a proper distribution.
Binary question value decomposition: for a binary partition {P, ¬P},
Σ P(cell) · V(D|cell) = EUV({P,¬P}, D) + V(D)
where the LHS is the probability-weighted sum of conditional DP values,
EUV is questionUtility, and V(D) is dpValue.
Answer and Question Orderings #
@cite{van-rooy-2003}'s relevance-based orderings on answers and questions.
C >_Q D: answer C is strictly more relevant than D given question Q.
Equations
- Core.DecisionTheory.moreRelevantAnswer dp actions c d = decide (Core.DecisionTheory.utilityValue dp actions c > Core.DecisionTheory.utilityValue dp actions d)
Instances For
Q > Q': question Q is strictly better than Q'.
Equations
- Core.DecisionTheory.betterQuestion dp actions q q' = decide (Core.DecisionTheory.questionUtility dp actions q > Core.DecisionTheory.questionUtility dp actions q')