Linglib.Core.Agent.DecisionTheory

Core Decision Theory #

@cite{van-rooy-2003} @cite{blackwell-1953}

Theory-neutral decision-theoretic infrastructure: decision problems, expected utility, maximin, and mention-some/mention-all classification.

Promoted from Theories.Semantics.Questions.DecisionTheory so that any module (RSA, causal decision theory, explanation models) can use decision problems without pulling in question-semantic types.

Design: Fintype + Finset #

Functions that sum over the full universe use [Fintype W] with ∑ w : W. Functions that operate over action sets or world subsets use Finset. questionUtility and expectedVSI take Finset (Finset W) (cells as sets). Question W = List (W → Bool) remains for ordering-sensitive operations (questionMaximin, isMentionSome).

Decision Problems #

structure Core.DecisionTheory.DecisionProblem (W : Type u_1) (A : Type u_2) :
Type (max u_1 u_2)

A decision problem D = (W, A, U, π) with utility and prior.

  • utility : W → A → ℚ

    Utility of action a in world w

  • prior : W → ℚ

    Prior beliefs over worlds (should sum to 1 for a proper probability distribution)

Instances For

    A uniform prior over a Fintype of worlds

    Equations
    Instances For
      def Core.DecisionTheory.DecisionProblem.withUniformPrior {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (utility : W → A → ℚ) :
      DecisionProblem W A

      Create a decision problem with uniform prior

      Equations
      Instances For
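
      As a usage sketch (the `Weather` and `Act` types and their payoffs are illustrative, not part of the library; `deriving Fintype` is provided by Mathlib):

      ```lean
      -- Hypothetical example types for a two-world, two-action problem.
      inductive Weather | rain | sun
        deriving DecidableEq, Fintype

      inductive Act | umbrella | bare
        deriving DecidableEq

      open Core.DecisionTheory in
      /-- An umbrella problem with a uniform prior (½ on each world). -/
      def umbrellaDP : DecisionProblem Weather Act :=
        DecisionProblem.withUniformPrior fun w a =>
          match w, a with
          | .rain, .umbrella => 1    -- stayed dry
          | .rain, .bare     => -1   -- got soaked
          | .sun,  .umbrella => 0    -- carried it for nothing
          | .sun,  .bare     => 1    -- best outcome
      ```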

        Expected Utility #

        def Core.DecisionTheory.expectedUtility {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (a : A) :
        ℚ

        Expected utility of action a given beliefs.

        Equations
        Instances For
          def Core.DecisionTheory.dpValue {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) :
          ℚ

          Value of a decision problem: the maximum expected utility over actions, or 0 if the action set is empty.

          Equations
          Instances For
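
          A worked sketch (illustrative; `Bool` worlds encode "is it raining?" and `Bool` actions encode "take the umbrella?"):

          ```lean
          open Core.DecisionTheory in
          def euDP : DecisionProblem Bool Bool :=
            DecisionProblem.withUniformPrior fun w a =>
              match w, a with
              | true,  true  => 1
              | true,  false => -1
              | false, true  => 0
              | false, false => 1

          -- With the uniform prior (½ each):
          --   EU(true)  = ½·1 + ½·0    = ½
          --   EU(false) = ½·(-1) + ½·1 = 0
          -- so the value over {true, false} is ½.
          #eval Core.DecisionTheory.expectedUtility euDP true
          #eval Core.DecisionTheory.dpValue euDP ({true, false} : Finset Bool)
          ```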
            def Core.DecisionTheory.conditionalEU {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (cell : Finset W) (a : A) :
            ℚ

            Conditional expected utility of action a given cell membership.

            Equations
            Instances For
              def Core.DecisionTheory.valueAfterLearning {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
              ℚ

              Value of the decision problem after learning cell (best conditional EU among the actions)

              Equations
              Instances For
                def Core.DecisionTheory.utilityValue {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
                ℚ

                UV(C) = V(D|C) - V(D), the utility value of learning proposition C.

                Equations
                Instances For
                  def Core.DecisionTheory.cellProbability {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (cell : Finset W) :
                  ℚ

                  Probability of a cell in the partition

                  Equations
                  Instances For

                    Maximin #

                    def Core.DecisionTheory.securityLevel {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (a : A) :
                    ℚ

                    S(a) = min_w U(w, a), security level of action a.

                    Equations
                    Instances For
                      def Core.DecisionTheory.maximinValue {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) :
                      ℚ

                      MV = max_a min_w U(w, a), maximin value.

                      Equations
                      Instances For
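
                      An illustrative contrast with expected utility (the payoffs below are hypothetical; the prior is ignored by maximin, only worst cases matter):

                      ```lean
                      open Core.DecisionTheory in
                      def riskDP : DecisionProblem Bool Bool :=
                        { utility := fun w a => if a then (if w then 10 else -4) else 0
                          prior   := fun _ => 1/2 }

                      -- Action true is "risky", false is "safe".
                      -- EU(risky) = ½·10 + ½·(-4) = 3 beats EU(safe) = 0, but
                      -- S(risky) = min 10 (-4) = -4 while S(safe) = 0, so maximin
                      -- prefers the safe action: the maximin value is 0.
                      #eval Core.DecisionTheory.maximinValue riskDP ({true, false} : Finset Bool) ({true, false} : Finset Bool)
                      ```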
                        def Core.DecisionTheory.conditionalSecurityLevel {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (a : A) (c : W → Bool) :
                        ℚ

                        Conditional security level: the worst-case utility within cell C

                        Equations
                        Instances For
                          def Core.DecisionTheory.maximinAfterLearning {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                          ℚ

                          Maximin value after learning C

                          Equations
                          Instances For
                            def Core.DecisionTheory.maximinUtilityValue {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                            ℚ

                            Maximin utility value of learning C

                            Equations
                            Instances For

                              Mention-Some / Mention-All #

                              @[reducible, inline]
                              abbrev Core.DecisionTheory.Question (W : Type u_1) :
                              Type u_1

                              Theory-neutral question type: a list of characteristic functions (cells). Used by ordering-sensitive operations (questionMaximin, isMentionSome).

                              Equations
                              Instances For
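
                                For instance, the polar question "is it raining?" over `Bool` worlds (illustrative):

                                ```lean
                                open Core.DecisionTheory in
                                -- Two cells: the rain worlds and the non-rain worlds.
                                def rainQ : Question Bool := [fun w => w, fun w => !w]
                                ```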

                                Convert a list of predicates to a Finset (Finset W) of cells.

                                Equations
                                Instances For
                                    def Core.DecisionTheory.resolves {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) :
                                    Bool

                                    C resolves the decision problem if some action dominates after learning C.

                                  Per @cite{van-rooy-2003}: C resolves DP iff after learning C, there exists an action a ∈ A that weakly dominates all other actions on every world in C.

                                  Equations
                                  Instances For
                                      def Core.DecisionTheory.resolvingAnswers {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                      List (W → Bool)

                                    Set of answers that resolve the decision problem

                                    Equations
                                    Instances For
                                        def Core.DecisionTheory.isMentionSome {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                        Bool

                                        A question has a mention-some reading if multiple non-disjoint cells resolve the DP.

                                      Equations
                                      • One or more equations did not get rendered due to their size.
                                      Instances For
                                          def Core.DecisionTheory.isMentionAll {W : Type u_1} {A : Type u_2} [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                          Bool

                                          Mention-all reading: the complete partition is needed to resolve the DP

                                        Equations
                                        Instances For

                                          Question Utility #

                                            def Core.DecisionTheory.questionUtility {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) :
                                            ℚ

                                            EUV(Q) = Σ_{q ∈ Q} P(q) · UV(q), the expected utility value of question Q.

                                          Equations
                                          Instances For
                                              def Core.DecisionTheory.questionMaximin {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) :
                                              ℚ

                                              MV(Q) = min_{q ∈ Q} MV(q), the maximin question value.

                                            Equations
                                            Instances For

                                              Value of Sample Information #

                                                noncomputable def Core.DecisionTheory.optimalAction {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) :
                                                A

                                              Optimal action: the one with highest expected utility.

                                              Noncomputable because it extracts a witness from an existential (Finset.exists_max_image via Classical.choice).

                                              Equations
                                              Instances For
                                                  noncomputable def Core.DecisionTheory.valueSampleInfo {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cell : Finset W) :
                                                  ℚ

                                                VSI(C) = V(D|C) - EU(a⁰|C): the value of sample information from learning proposition C, where a⁰ is the current optimal action.

                                                Unlike UV(C) = V(D|C) - V(D), VSI is always non-negative: learning C before choosing can never hurt relative to the current best action applied within C. @cite{van-rooy-2003}, p. 742.

                                                Equations
                                                • One or more equations did not get rendered due to their size.
                                                Instances For
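
                                                  Writing a⁰ for the current optimal action (so that V(D) = EU(a⁰)), the two notions line up as:

                                                  ```latex
                                                  \mathrm{UV}(C)  \;=\; V(D \mid C) - V(D),
                                                  \qquad
                                                  \mathrm{VSI}(C) \;=\; V(D \mid C) - \mathrm{EU}(a^{0} \mid C).
                                                  ```

                                                  Since V(D|C) = max_a EU(a|C) ≥ EU(a⁰|C), VSI(C) ≥ 0 always, whereas UV(C) is negative whenever learning C lowers the attainable value below the unconditional V(D).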
                                                    noncomputable def Core.DecisionTheory.expectedVSI {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) :
                                                    ℚ

                                                  EVSI(Q) = Σ P(C) · VSI(C): the expected value of sample information from asking question Q. @cite{van-rooy-2003}, p. 742.

                                                  Equations
                                                  Instances For

                                                    EUV = EVSI #

                                                    theorem Core.DecisionTheory.euv_eq_evsi {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (cells : Finset (Finset W)) (hLTE : ∀ (a : A), ∑ cell ∈ cells, cellProbability dp cell * conditionalEU dp cell a = expectedUtility dp a) (hSum : ∑ cell ∈ cells, cellProbability dp cell = 1) :
                                                    questionUtility dp actions cells = expectedVSI dp actions cells

                                                    EUV(Q) = EVSI(Q): the expected utility value of a question equals its expected value of sample information.

                                                    @cite{van-rooy-2003}, p. 742: "The expected utility value of a question is equal to its expected value of sample information."

                                                    The two hypotheses are the law of total expectation (Σ P(C)·EU(a|C) = EU(a) for all actions) and cell probability normalization (Σ P(C) = 1).
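
                                                    Under those two hypotheses the identity follows by unfolding both sides (a sketch of the algebra, writing a⁰ for the optimal action so that V(D) = EU(a⁰)):

                                                    ```latex
                                                    \mathrm{EUV}(Q)
                                                      = \sum_{C \in Q} P(C)\bigl[V(D \mid C) - V(D)\bigr]
                                                      = \sum_{C \in Q} P(C)\,V(D \mid C) \;-\; V(D)
                                                      \quad \text{(using } \textstyle\sum_{C} P(C) = 1\text{)}
                                                    \\[4pt]
                                                    \mathrm{EVSI}(Q)
                                                      = \sum_{C \in Q} P(C)\bigl[V(D \mid C) - \mathrm{EU}(a^{0} \mid C)\bigr]
                                                      = \sum_{C \in Q} P(C)\,V(D \mid C) \;-\; \mathrm{EU}(a^{0})
                                                      \quad \text{(total expectation)}
                                                    ```

                                                    and since V(D) = EU(a⁰), the two right-hand sides coincide.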

                                                    Maximin Monotonicity #

                                                    Security level and maximin value are anti-monotone in the world set: restricting to a subset can only improve worst-case guarantees. These lemmas support the Blackwell maximin forward theorem.

                                                    theorem Core.DecisionTheory.securityLevel_le_utility {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (worlds : Finset W) (a : A) {w : W} (hw : w ∈ worlds) :
                                                    securityLevel dp worlds a ≤ dp.utility w a

                                                    Security level is ≤ the utility of any world in the finset.

                                                    theorem Core.DecisionTheory.securityLevel_subset_ge {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (S₁ S₂ : Finset W) (a : A) (hne : S₁.Nonempty) (hsub : S₁ ⊆ S₂) :
                                                    securityLevel dp S₁ a ≥ securityLevel dp S₂ a

                                                    Security level on a subset ≥ security level on the superset.

                                                    min over fewer elements ≥ min over more elements.

                                                    theorem Core.DecisionTheory.maximinValue_subset_ge {W : Type u_1} {A : Type u_2} (dp : DecisionProblem W A) (S₁ S₂ : Finset W) (actions : Finset A) (hne : S₁.Nonempty) (hsub : S₁ ⊆ S₂) :
                                                    maximinValue dp S₁ actions ≥ maximinValue dp S₂ actions

                                                    Maximin value on a subset ≥ maximin value on the superset.

                                                    Since securityLevel increases on subsets (fewer worst cases), maximinValue (max over actions of security levels) also increases.

                                                    theorem Core.DecisionTheory.maximinUtilityValue_monotone_cell {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c1 c2 : W → Bool) (hSub : ∀ (w : W), c1 w = true → c2 w = true) (hNe : {w ∈ worlds | c1 w = true}.Nonempty) :
                                                    maximinUtilityValue dp worlds actions c1 ≥ maximinUtilityValue dp worlds actions c2

                                                    Maximin utility value is monotone under cell containment: learning a more specific proposition (subset of worlds) gives higher MUV.

                                                    theorem Core.DecisionTheory.maximinUtilityValue_nonneg {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (c : W → Bool) (hNonempty : {w ∈ worlds | c w = true}.Nonempty) :
                                                    maximinUtilityValue dp worlds actions c ≥ 0

                                                    Maximin value of information is non-negative for nonempty cells.

                                                    When the cell is nonempty, worlds.filter c ⊆ worlds, and the maximin over a subset considers fewer worst cases, so maximinAfterLearning ≥ maximinValue.

                                                    theorem Core.DecisionTheory.le_foldl_min {α : Type u_1} (f : α → ℚ) (xs : List α) (init b : ℚ) (hinit : b ≤ init) (hxs : ∀ x ∈ xs, b ≤ f x) :
                                                    b ≤ List.foldl (fun (m : ℚ) (x : α) => min m (f x)) init xs

                                                    A lower bound of all values is ≤ foldl min.

                                                    theorem Core.DecisionTheory.questionMaximin_le_muv {W : Type u_1} {A : Type u_2} [DecidableEq W] (dp : DecisionProblem W A) (worlds : Finset W) (actions : Finset A) (q : Question W) (cell : W → Bool) (hcell : cell ∈ q) :
                                                    questionMaximin dp worlds actions q ≤ maximinUtilityValue dp worlds actions cell

                                                    The question maximin value is ≤ MUV of each cell in the question.

                                                    Special Decision Problems #

                                                    def Core.DecisionTheory.epistemicDP {W : Type u_1} {A : Type u_2} [DecidableEq W] (target : W) :

                                                    An epistemic DP where the agent wants to know the exact world state.

                                                    Equations
                                                    Instances For

                                                      A complete-information DP where only exact-state knowledge is useful.

                                                      Equations
                                                      Instances For
                                                        def Core.DecisionTheory.mentionSomeDP {W : Type u_1} (satisfies : W → Bool) :

                                                        A mention-some DP: any satisfier resolves the problem.

                                                        Equations
                                                        Instances For

                                                          Binary Question Value Decomposition #

                                                          For a binary partition [P, ¬P], the probability-weighted sum of conditional DP values equals Van Rooy's question utility plus the baseline DP value. This is the structural identity connecting "the value of asking a yes/no question" to the decision-theoretic question framework of @cite{van-rooy-2003}.

                                                          theorem Core.DecisionTheory.binary_cellProb_sum {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] (dp : DecisionProblem W A) (P : W → Bool) (hPrior : Finset.univ.sum dp.prior = 1) :
                                                          cellProbability dp {w : W | P w = true} + cellProbability dp {w : W | (!P w) = true} = 1

                                                          Cell probabilities of a binary partition [P, ¬P] sum to 1 when the prior is a proper distribution.

                                                          theorem Core.DecisionTheory.binary_question_value_decomposition {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (P : W → Bool) (hPrior : Finset.univ.sum dp.prior = 1) :
                                                          have yesCell := {w : W | P w = true}; have noCell := {w : W | (!P w) = true}; cellProbability dp yesCell * valueAfterLearning dp actions yesCell + cellProbability dp noCell * valueAfterLearning dp actions noCell = questionUtility dp actions {yesCell, noCell} + dpValue dp actions

                                                          Binary question value decomposition: for a binary partition {P, ¬P},

                                                          Σ P(cell) · V(D|cell) = EUV({P,¬P}, D) + V(D)
                                                          

                                                          where the LHS is the probability-weighted sum of conditional DP values, EUV is questionUtility, and V(D) is dpValue.
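
                                                          The identity is immediate from UV(C) = V(D|C) − V(D) together with binary_cellProb_sum (the two cell probabilities sum to 1):

                                                          ```latex
                                                          \sum_{C \in \{P,\, \neg P\}} P(C)\, V(D \mid C)
                                                            \;=\; \sum_{C} P(C)\bigl[\mathrm{UV}(C) + V(D)\bigr]
                                                            \;=\; \mathrm{EUV}(\{P, \neg P\}) + V(D).
                                                          ```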

                                                          Answer and Question Orderings #

                                                          @cite{van-rooy-2003}'s relevance-based orderings on answers and questions.

                                                          def Core.DecisionTheory.moreRelevantAnswer {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (c d : Finset W) :

                                                          C >_Q D: answer C is strictly more relevant than D given question Q.

                                                          Equations
                                                          Instances For
                                                            def Core.DecisionTheory.betterQuestion {W : Type u_1} {A : Type u_2} [Fintype W] [DecidableEq W] [DecidableEq A] (dp : DecisionProblem W A) (actions : Finset A) (q q' : Finset (Finset W)) :

                                                            Q > Q': question Q is strictly better than Q'.

                                                            Equations
                                                            Instances For