Linglib.Core.Agent.DecisionTheory

Core Decision Theory #

@cite{van-rooy-2003} @cite{blackwell-1953}

Theory-neutral decision-theoretic infrastructure: decision problems, expected utility, maximin, and mention-some/mention-all classification.

Promoted from Theories.Semantics.Questions.DecisionTheory so that any module (RSA, causal decision theory, explanation models) can use decision problems without pulling in question-semantic types.

Design: Fintype + Finset #

Functions that sum over the full universe use [Fintype W] with ∑ w : W. Functions that operate over action sets or world subsets use Finset. questionUtility and expectedVSI take Finset (Finset W) (cells as sets). Question W = List (W → Bool) remains for ordering-sensitive operations (questionMaximin, isMentionSome).

Decision Problems #

structure Core.DecisionTheory.DecisionProblem (W : Type u_1) (A : Type u_2) :
Type (max u_1 u_2)

A decision problem D = (W, A, U, π) with utility and prior.

  • utility : W → A → ℚ

    Utility of action a in world w

  • prior : W → ℚ

    Prior beliefs over worlds (should sum to 1 for a proper probability distribution)

Instances For

    A uniform prior over a Fintype of worlds

    Equations
    Instances For
      def Core.DecisionTheory.DecisionProblem.withUniformPrior {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (utility : W → A → ℚ) :
      DecisionProblem W A

      Create a decision problem with uniform prior

      Equations
      Instances For
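
      As a usage sketch (the `Weather` and `Act` types and their payoffs are illustrative, not part of the library; `deriving Fintype` is provided by Mathlib):

      ```lean
      -- Hypothetical example types for a two-world, two-action problem.
      inductive Weather | rain | sun
        deriving DecidableEq, Fintype

      inductive Act | umbrella | bare
        deriving DecidableEq

      open Core.DecisionTheory in
      /-- An umbrella problem with a uniform prior (½ on each world). -/
      def umbrellaDP : DecisionProblem Weather Act :=
        DecisionProblem.withUniformPrior fun w a =>
          match w, a with
          | .rain, .umbrella => 1    -- stayed dry
          | .rain, .bare     => -1   -- got soaked
          | .sun,  .umbrella => 0    -- carried it for nothing
          | .sun,  .bare     => 1    -- best outcome
      ```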

        Expected Utility #

        def Core.DecisionTheory.expectedUtility {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (a : A) :
        ℚ

        Expected utility of action a given beliefs.

        Equations
        Instances For
          def Core.DecisionTheory.dpValue {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) :
          ℚ

          Value of a decision problem: the maximum expected utility over actions, or 0 if the action set is empty.

          Equations
          Instances For
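
          A worked sketch (illustrative; `Bool` worlds encode "is it raining?" and `Bool` actions encode "take the umbrella?"):

          ```lean
          open Core.DecisionTheory in
          def euDP : DecisionProblem Bool Bool :=
            DecisionProblem.withUniformPrior fun w a =>
              match w, a with
              | true,  true  => 1
              | true,  false => -1
              | false, true  => 0
              | false, false => 1

          -- With the uniform prior (½ each):
          --   EU(true)  = ½·1 + ½·0    = ½
          --   EU(false) = ½·(-1) + ½·1 = 0
          -- so the value over {true, false} is ½.
          #eval Core.DecisionTheory.expectedUtility euDP true
          #eval Core.DecisionTheory.dpValue euDP ({true, false} : Finset Bool)
          ```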
            def Core.DecisionTheory.conditionalEU {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (cell : Finset W) (a : A) :
            ℚ

            Conditional expected utility of action a given cell membership.

            Equations
            Instances For
              def Core.DecisionTheory.valueAfterLearning {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
              ℚ

              Value of the decision problem after learning cell (best conditional EU among the actions)

              Equations
              Instances For
                def Core.DecisionTheory.utilityValue {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
                ℚ

                UV(C) = V(D|C) - V(D), the utility value of learning proposition C.

                Equations
                Instances For
                  def Core.DecisionTheory.cellProbability {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (cell : Finset W) :
                  ℚ

                  Probability of a cell in the partition

                  Equations
                  Instances For

                    Maximin #

                    def Core.DecisionTheory.securityLevel {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (a : A) :
                    ℚ

                    S(a) = min_w U(w, a), security level of action a.

                    Equations
                    Instances For
                      def Core.DecisionTheory.maximinValue {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) :
                      ℚ

                      MV = max_a min_w U(w, a), maximin value.

                      Equations
                      Instances For
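
                      An illustrative contrast with expected utility (the payoffs below are hypothetical; the prior is ignored by maximin, only worst cases matter):

                      ```lean
                      open Core.DecisionTheory in
                      def riskDP : DecisionProblem Bool Bool :=
                        { utility := fun w a => if a then (if w then 10 else -4) else 0
                          prior   := fun _ => 1/2 }

                      -- Action true is "risky", false is "safe".
                      -- EU(risky) = ½·10 + ½·(-4) = 3 beats EU(safe) = 0, but
                      -- S(risky) = min 10 (-4) = -4 while S(safe) = 0, so maximin
                      -- prefers the safe action: the maximin value is 0.
                      #eval Core.DecisionTheory.maximinValue riskDP ({true, false} : Finset Bool) ({true, false} : Finset Bool)
                      ```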
                        def Core.DecisionTheory.conditionalSecurityLevel {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (a : A) (c : W → Bool) :
                        ℚ

                        Conditional security level: the worst-case utility within cell C

                        Equations
                        Instances For
                          def Core.DecisionTheory.maximinAfterLearning {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                          ℚ

                          Maximin value after learning C

                          Equations
                          Instances For
                            def Core.DecisionTheory.maximinUtilityValue {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                            ℚ

                            Maximin utility value of learning C

                            Equations
                            Instances For

                              Mention-Some / Mention-All #

                              @[reducible, inline]
                              abbrev Core.DecisionTheory.Question (W : Type u_1) :
                              Type u_1

                              Theory-neutral question type: a list of characteristic functions (cells). Used by ordering-sensitive operations (questionMaximin, isMentionSome).

                              Equations
                              Instances For
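
                                For instance, the polar question "is it raining?" over `Bool` worlds (illustrative):

                                ```lean
                                open Core.DecisionTheory in
                                -- Two cells: the rain worlds and the non-rain worlds.
                                def rainQ : Question Bool := [fun w => w, fun w => !w]
                                ```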

                                Convert a list of predicates to a Finset (Finset W) of cells.

                                Equations
                                Instances For
                                    def Core.DecisionTheory.resolves {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                                    Bool

                                    C resolves the decision problem if some action dominates after learning C.

                                  Per @cite{van-rooy-2003}: C resolves DP iff after learning C, there exists an action a ∈ A that weakly dominates all other actions on every world in C.

                                  Equations
                                  Instances For
                                      def Core.DecisionTheory.resolvingAnswers {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                      List (W → Bool)

                                    Set of answers that resolve the decision problem

                                    Equations
                                    Instances For
                                        def Core.DecisionTheory.isMentionSome {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                        Bool

                                        A question has a mention-some reading if multiple non-disjoint cells resolve the DP.

                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For
                                          def Core.DecisionTheory.isMentionAll {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                          Bool

                                          Mention-all reading: the complete partition is needed to resolve the DP

                                        Equations
                                        Instances For

                                          Question Utility #

                                            def Core.DecisionTheory.questionUtility {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) :
                                            ℚ

                                            EUV(Q) = Σ_{q ∈ Q} P(q) · UV(q), the expected utility value of question Q.

                                          Equations
                                          Instances For
                                              def Core.DecisionTheory.questionMaximin {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                              ℚ

                                              MV(Q) = min_{q ∈ Q} MV(q), the maximin question value.

                                            Equations
                                            Instances For

                                              Value of Sample Information #

                                                noncomputable def Core.DecisionTheory.optimalAction {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) :
                                                A

                                              Optimal action: the one with highest expected utility.

                                              Noncomputable because it extracts a witness from an existential (Finset.exists_max_image via Classical.choice).

                                              Equations
                                              Instances For
                                                  noncomputable def Core.DecisionTheory.valueSampleInfo {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
                                                  ℚ

                                                VSI(C) = V(D|C) - EU(a⁰|C): the value of sample information from learning proposition C, where a⁰ is the current optimal action.

                                                Unlike UV(C) = V(D|C) - V(D), VSI is always non-negative: learning C before choosing can never hurt relative to the current best action applied within C. @cite{van-rooy-2003}, p. 742.

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
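
                                                  Writing a⁰ for the current optimal action (so that V(D) = EU(a⁰)), the two notions line up as:

                                                  ```latex
                                                  \mathrm{UV}(C)  \;=\; V(D \mid C) - V(D),
                                                  \qquad
                                                  \mathrm{VSI}(C) \;=\; V(D \mid C) - \mathrm{EU}(a^{0} \mid C).
                                                  ```

                                                  Since V(D|C) = max_a EU(a|C) ≥ EU(a⁰|C), VSI(C) ≥ 0 always, whereas UV(C) is negative whenever learning C lowers the attainable value below the unconditional V(D).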
                                                    noncomputable def Core.DecisionTheory.expectedVSI {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) :
                                                    ℚ

                                                  EVSI(Q) = Σ P(C) · VSI(C): the expected value of sample information from asking question Q. @cite{van-rooy-2003}, p. 742.

                                                  Equations
                                                  Instances For

                                                    EUV = EVSI #

                                                    theorem Core.DecisionTheory.euv_eq_evsi {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) (hLTE : ∀ (a : A), ∑ cell ∈ cells, cellProbability dp cell * conditionalEU dp cell a = expectedUtility dp a) (hSum : ∑ cell ∈ cells, cellProbability dp cell = 1) :
                                                    questionUtility dp actions cells = expectedVSI dp actions cells

                                                    EUV(Q) = EVSI(Q): the expected utility value of a question equals its expected value of sample information.

                                                    @cite{van-rooy-2003}, p. 742: "The expected utility value of a question is equal to its expected value of sample information."

                                                    The two hypotheses are the law of total expectation (Σ P(C)·EU(a|C) = EU(a) for all actions) and cell probability normalization (Σ P(C) = 1).
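
                                                    Under those two hypotheses the identity follows by unfolding both sides (a sketch of the algebra, writing a⁰ for the optimal action so that V(D) = EU(a⁰)):

                                                    ```latex
                                                    \mathrm{EUV}(Q)
                                                      = \sum_{C \in Q} P(C)\bigl[V(D \mid C) - V(D)\bigr]
                                                      = \sum_{C \in Q} P(C)\,V(D \mid C) \;-\; V(D)
                                                      \quad \text{(using } \textstyle\sum_{C} P(C) = 1\text{)}
                                                    \\[4pt]
                                                    \mathrm{EVSI}(Q)
                                                      = \sum_{C \in Q} P(C)\bigl[V(D \mid C) - \mathrm{EU}(a^{0} \mid C)\bigr]
                                                      = \sum_{C \in Q} P(C)\,V(D \mid C) \;-\; \mathrm{EU}(a^{0})
                                                      \quad \text{(total expectation)}
                                                    ```

                                                    and since V(D) = EU(a⁰), the two right-hand sides coincide.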

                                                    Maximin Monotonicity #

                                                    Security level and maximin value are anti-monotone in the world set: restricting to a subset can only improve worst-case guarantees. These lemmas support the Blackwell maximin forward theorem.

                                                    theorem Core.DecisionTheory.securityLevel_le_utility {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (a : A) {w : W} (hw : w ∈ worlds) :
                                                    securityLevel dp worlds a ≤ dp.utility w a

                                                    Security level is ≤ the utility of any world in the finset.

                                                    theorem Core.DecisionTheory.securityLevel_subset_ge {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (S₁ S₂ : Finset W) (a : A) (hne : S₁.Nonempty) (hsub : S₁ ⊆ S₂) :
                                                    securityLevel dp S₁ a ≥ securityLevel dp S₂ a

                                                    Security level on a subset ≥ security level on the superset.

                                                    min over fewer elements ≥ min over more elements.

                                                    theorem Core.DecisionTheory.maximinValue_subset_ge {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (S₁ S₂ : Finset W) (actions : Finset A) (hne : S₁.Nonempty) (hsub : S₁ ⊆ S₂) :
                                                    maximinValue dp S₁ actions ≥ maximinValue dp S₂ actions

                                                    Maximin value on a subset ≥ maximin value on the superset.

                                                    Since securityLevel increases on subsets (fewer worst cases), maximinValue (max over actions of security levels) also increases.

                                                    theorem Core.DecisionTheory.maximinUtilityValue_monotone_cell {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c1 c2 : W → Bool) (hSub : ∀ (w : W), c1 w = true → c2 w = true) (hNe : {w ∈ worlds | c1 w = true}.Nonempty) :
                                                    maximinUtilityValue dp worlds actions c1 ≥ maximinUtilityValue dp worlds actions c2

                                                    Maximin utility value is monotone under cell containment: learning a more specific proposition (subset of worlds) gives higher MUV.

                                                    theorem Core.DecisionTheory.maximinUtilityValue_nonneg {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) (hNonempty : {w ∈ worlds | c w = true}.Nonempty) :
                                                    maximinUtilityValue dp worlds actions c ≥ 0

                                                    Maximin value of information is non-negative for nonempty cells.

                                                    When the cell is nonempty, worlds.filter c ⊆ worlds, and the maximin over a subset considers fewer worst cases, so maximinAfterLearning ≥ maximinValue.

                                                    theorem Core.DecisionTheory.le_foldl_min {α : Type u_1} (f : α → ℚ) (xs : List α) (init b : ℚ) (hinit : b ≤ init) (hxs : ∀ x ∈ xs, b ≤ f x) :
                                                    b ≤ List.foldl (fun (m : ℚ) (x : α) => min m (f x)) init xs

                                                    A lower bound of all values is ≤ foldl min.

                                                    theorem Core.DecisionTheory.questionMaximin_le_muv {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) (cell : W → Bool) (hcell : cell ∈ q) :
                                                    questionMaximin dp worlds actions q ≤ maximinUtilityValue dp worlds actions cell

                                                    The question maximin value is ≤ MUV of each cell in the question.

                                                    Special Decision Problems #

                                                    def Core.DecisionTheory.epistemicDP {W : Type u_1} {A : Type u_2} [DecidableEq W] (target : W) :

                                                    An epistemic DP where the agent wants to know the exact world state.

                                                    Equations
                                                    Instances For

                                                      A complete-information DP where only exact-state knowledge is useful.

                                                      Equations
                                                      Instances For
                                                        def Core.DecisionTheory.mentionSomeDP {W : Type u_1} (satisfies : W → Bool) :

                                                        A mention-some DP: any satisfier resolves the problem.

                                                        Equations
                                                        Instances For

                                                          Binary Question Value Decomposition #

                                                          For a binary partition [P, ¬P], the probability-weighted sum of conditional DP values equals Van Rooy's question utility plus the baseline DP value. This is the structural identity connecting "the value of asking a yes/no question" to the decision-theoretic question framework of @cite{van-rooy-2003}.

                                                          theorem Core.DecisionTheory.binary_cellProb_sum {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (P : W → Bool) (hPrior : Finset.univ.sum dp.prior = 1) :
                                                          cellProbability dp {w : W | P w = true} + cellProbability dp {w : W | (!P w) = true} = 1

                                                          Cell probabilities of a binary partition [P, ¬P] sum to 1 when the prior is a proper distribution.

                                                          theorem Core.DecisionTheory.binary_question_value_decomposition {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (P : W → Bool) (hPrior : Finset.univ.sum dp.prior = 1) :
                                                          have yesCell := {w : W | P w = true}; have noCell := {w : W | (!P w) = true}; cellProbability dp yesCell * valueAfterLearning dp actions yesCell + cellProbability dp noCell * valueAfterLearning dp actions noCell = questionUtility dp actions {yesCell, noCell} + dpValue dp actions

                                                          Binary question value decomposition: for a binary partition {P, ¬P},

                                                          Σ P(cell) · V(D|cell) = EUV({P,¬P}, D) + V(D)
                                                          

                                                          where the LHS is the probability-weighted sum of conditional DP values, EUV is questionUtility, and V(D) is dpValue.
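
                                                          The identity is immediate from UV(C) = V(D|C) − V(D) together with binary_cellProb_sum (the two cell probabilities sum to 1):

                                                          ```latex
                                                          \sum_{C \in \{P,\, \neg P\}} P(C)\, V(D \mid C)
                                                            \;=\; \sum_{C} P(C)\bigl[\mathrm{UV}(C) + V(D)\bigr]
                                                            \;=\; \mathrm{EUV}(\{P, \neg P\}) + V(D).
                                                          ```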

                                                          Answer and Question Orderings #

                                                          @cite{van-rooy-2003}'s relevance-based orderings on answers and questions.

                                                          def Core.DecisionTheory.moreRelevantAnswer {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (c d : Finset W) :

                                                          C >_Q D: answer C is strictly more relevant than D given question Q.

                                                          Equations
                                                          Instances For
                                                            def Core.DecisionTheory.betterQuestion {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (q q' : Finset (Finset W)) :

                                                            Q > Q': question Q is strictly better than Q'.

                                                            Equations
                                                            Instances For