
Linglib.Theories.Semantics.Dynamic.Effects.Probability.Basic

The Probability Monad #

@cite{lassiter-goodman-2017}

We define P α abstractly as a structure with pure and bind operations satisfying the monad laws. This allows us to reason about probabilistic programs without committing to a specific representation (PMF, measure, etc.).

In Grove & White's notation, pure is written ⌜·⌝ (as in ⌜v⌝ below).

Abstract probability monad interface.

A probability monad provides:

  • pure: Lift a value to a trivial distribution
  • bind: Sequence probabilistic computations
  • Monad laws as equalities

This is the semantic interface; implementations may use PMFs, measures, etc.

Note: We use Type instead of Type* to avoid universe issues. For semantic work, this is typically sufficient.

  • pure {α : Type} : α → P α

    Trivial distribution concentrated at a value

  • bind {α β : Type} : P α → (α → P β) → P β

    Sequence: sample from m, then continue with k

  • pure_bind {α β : Type} (v : α) (k : α → P β) : bind (pure v) k = k v

    Left identity: pure v >>= k = k v

  • bind_pure {α : Type} (m : P α) : bind m pure = m

    Right identity: m >>= pure = m

  • bind_assoc {α β γ : Type} (m : P α) (n : α → P β) (o : β → P γ) : bind (bind m n) o = bind m fun (x : α) => bind (n x) o

    Associativity: (m >>= n) >>= o = m >>= (λx. n x >>= o)
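
Collected into Lean, the interface above can be sketched as a type class (a sketch; the library's actual field declarations may differ slightly):

```lean
/-- Sketch of the abstract probability monad interface. -/
class ProbMonad (P : Type → Type) where
  pure : {α : Type} → α → P α
  bind : {α β : Type} → P α → (α → P β) → P β
  -- Monad laws as propositional equalities
  pure_bind : ∀ {α β : Type} (v : α) (k : α → P β), bind (pure v) k = k v
  bind_pure : ∀ {α : Type} (m : P α), bind m pure = m
  bind_assoc : ∀ {α β γ : Type} (m : P α) (n : α → P β) (o : β → P γ),
    bind (bind m n) o = bind m (fun x => bind (n x) o)
```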

def Semantics.Dynamic.Probabilistic.ProbMonad.map {P : Type → Type} [ProbMonad P] {α β : Type} (f : α → β) (m : P α) :
P β

Map a function over a distribution.

def Semantics.Dynamic.Probabilistic.ProbMonad.seq {P : Type → Type} [ProbMonad P] {α β : Type} (m : P α) (n : P β) :
P β

Sequence two distributions, ignoring the first result.

theorem Semantics.Dynamic.Probabilistic.ProbMonad.map_id {P : Type → Type} [ProbMonad P] {α : Type} (m : P α) :
map id m = m

Map preserves identity.

theorem Semantics.Dynamic.Probabilistic.ProbMonad.map_comp {P : Type → Type} [ProbMonad P] {α β γ : Type} (f : α → β) (g : β → γ) (m : P α) :
map g (map f m) = map (g ∘ f) m

Map composes.
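
The derived operations admit the standard definitions in terms of bind and pure. A plausible sketch (the library's actual equations are collapsed in this rendering):

```lean
variable {P : Type → Type} [ProbMonad P] {α β : Type}

/-- Sketch: map a function over a distribution — sample, then apply `f`. -/
def ProbMonad.map (f : α → β) (m : P α) : P β :=
  ProbMonad.bind m (fun x => ProbMonad.pure (f x))

/-- Sketch: sequence two distributions, ignoring the first result. -/
def ProbMonad.seq (m : P α) (n : P β) : P β :=
  ProbMonad.bind m (fun _ => n)
```

With these definitions, map_id and map_comp follow directly from the monad laws.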

Parameterized State Monad #

In Grove & White's parameterized state monad, the state type can change during computation. This models how discourse updates can modify the structure of the context (e.g., pushing questions onto the QUD stack).

    P^σ_σ' α = σ → P(α × σ')

The parameters σ and σ' are the input and output state types.

Parameterized probabilistic state monad.

Maps an input state σ to a distribution over (value, output state) pairs. The output state type σ' can differ from σ, allowing type-changing updates.

This is Grove & White's P^σ_σ' α.
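
Unfolding the notation above, the type is presumably defined directly as the Grove & White formula:

```lean
/-- Sketch: Grove & White's P^σ_σ' α as a Lean definition. -/
def PState (P : Type → Type) (σ σ' α : Type) : Type :=
  σ → P (α × σ')
```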
def Semantics.Dynamic.Probabilistic.PState.pure {P : Type → Type} [ProbMonad P] {σ α : Type} (v : α) :
PState P σ σ α

Return for the parameterized state monad.

Returns the value paired with the unchanged state. Grove & White: ⌜v⌝^σ = λs. ⌜(v, s)⌝

def Semantics.Dynamic.Probabilistic.PState.bind {P : Type → Type} [ProbMonad P] {σ σ' σ'' α β : Type} (m : PState P σ σ' α) (k : α → PState P σ' σ'' β) :
PState P σ σ'' β

Bind for the parameterized state monad.

Sequences stateful probabilistic computations, threading the state through. Grove & White: do { x ← m; k x } = λs. do { (x, s') ← m(s); k(x)(s') }
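
As a sketch (the actual equations are collapsed in this rendering), the two operations plausibly unfold as:

```lean
variable {P : Type → Type} [ProbMonad P] {σ σ' σ'' α β : Type}

/-- Sketch: return the value with the state unchanged. -/
def PState.pure (v : α) : PState P σ σ α :=
  fun s => ProbMonad.pure (v, s)

/-- Sketch: run `m`, then feed its value and output state to `k`. -/
def PState.bind (m : PState P σ σ' α) (k : α → PState P σ' σ'' β) :
    PState P σ σ'' β :=
  fun s => ProbMonad.bind (m s) (fun vs => k vs.1 vs.2)
```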
def Semantics.Dynamic.Probabilistic.PState.view {P : Type → Type} [ProbMonad P] {σ α : Type} (proj : σ → α) :
PState P σ σ α

View (get) a component of the state.

Returns the result of applying proj to the current state, without modifying it.

def Semantics.Dynamic.Probabilistic.PState.set {P : Type → Type} [ProbMonad P] {σ σ' : Type} (upd : σ → σ') :
PState P σ σ' Unit

Set (put) a component of the state.

Returns the new state produced by upd, with a trivial value.
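
The two state accessors plausibly unfold as follows (a sketch; the library's equations are collapsed in this rendering):

```lean
variable {P : Type → Type} [ProbMonad P] {σ σ' α : Type}

/-- Sketch: read a component of the state without changing it. -/
def PState.view (proj : σ → α) : PState P σ σ α :=
  fun s => ProbMonad.pure (proj s, s)

/-- Sketch: replace the state via `upd`, returning the trivial value. -/
def PState.set (upd : σ → σ') : PState P σ σ' Unit :=
  fun s => ProbMonad.pure ((), upd s)
```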
theorem Semantics.Dynamic.Probabilistic.PState.pure_bind {P : Type → Type} [ProbMonad P] {σ σ' α β : Type} (v : α) (k : α → PState P σ σ' β) :
(pure v).bind k = k v

Left identity for PState: pure v >>= k = k v

theorem Semantics.Dynamic.Probabilistic.PState.bind_pure {P : Type → Type} [ProbMonad P] {σ σ' α : Type} (m : PState P σ σ' α) :
m.bind pure = m

Right identity for PState: m >>= pure = m

theorem Semantics.Dynamic.Probabilistic.PState.bind_assoc {P : Type → Type} [ProbMonad P] {σ σ' σ'' σ''' α β γ : Type} (m : PState P σ σ' α) (n : α → PState P σ' σ'' β) (o : β → PState P σ'' σ''' γ) :
(m.bind n).bind o = m.bind fun (x : α) => (n x).bind o

Associativity for PState.

Conditioning #

Grove & White's observe operation conditions a distribution on a boolean.

    observe : Bool → P Unit
    -- observe true continues; observe false blocks

This is the mechanism for assertion: update the common ground by observing that the asserted proposition is true.

Extended probability monad with conditioning.

Adds observe for conditioning on boolean observations.

Observe filters: observe true followed by return is the identity.

Observe false blocks: any continuation after observe false gives fail.
Choice Operations #

RSA's S1 is not just conditioning: it chooses an utterance weighted by utility. This requires a choose operation in addition to observe.

    choose : (α → ℚ) → P α  -- sample from a weighted distribution

Probability monad with choice (for speaker models).

Adds choose for sampling from weighted distributions. This is what S1's softmax requires.
def Semantics.Dynamic.Probabilistic.ChoiceProbMonad.softmaxChoice {P : Type → Type} [ChoiceProbMonad P] {α : Type} [Fintype α] (utility : α → ℚ) (temperature : ℚ) :
P α

Softmax choice: choose with exp-scaled weights.

This is S1's decision rule: P(u) ∝ exp(α · utility(u))

Note: We use ℚ, so we can't compute exp directly. In practice, implementations use Float or work with log-probabilities. This definition is for the interface.
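
A sketch of the shape of the definition, assuming a hypothetical rational-valued stand-in ratExp for the exponential (as the note says, exact exp is not computable over ℚ):

```lean
variable {P : Type → Type} [ChoiceProbMonad P] {α : Type} [Fintype α]

/-- Hypothetical stand-in for the exponential over ℚ (illustration only). -/
opaque ratExp : ℚ → ℚ

/-- Sketch: S1's softmax rule, P(u) ∝ exp(temperature · utility(u)). -/
def softmaxChoice (utility : α → ℚ) (temperature : ℚ) : P α :=
  ChoiceProbMonad.choose (fun u => ratExp (temperature * utility u))
```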

Threshold Semantics #

Threshold semantics plus threshold uncertainty produces graded truth values. This is a special case of the PDS framework.

For a gradable adjective like "tall":

    ⟦tall⟧(x) = measure(x) > θ

With uncertainty over θ:

    Pr[tall(x)] = Σ_θ prior(θ) · 1[measure(x) > θ]

This probability is the graded truth value.

def Semantics.Dynamic.Probabilistic.thresholdSem {ε : Type} (measure : ε → ℚ) (threshold : ℚ) (x : ε) :
Bool

Threshold semantics: an entity satisfies the predicate if its measure exceeds the threshold.

def Semantics.Dynamic.Probabilistic.gradedFromThreshold {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε → ℚ) (thresholds prior : Θ → ℚ) (x : ε) :
ℚ

Graded truth from threshold uncertainty.

Given a prior over thresholds, the graded truth value is the probability that the entity's measure exceeds the threshold.

theorem Semantics.Dynamic.Probabilistic.graded_eq_bool_of_point_mass {ε Θ : Type} [Fintype Θ] [DecidableEq Θ] (measure : ε → ℚ) (thresholds : Θ → ℚ) (θ₀ : Θ) (x : ε) :
have pointMass := fun (θ : Θ) => if θ = θ₀ then 1 else 0; gradedFromThreshold measure thresholds pointMass x = if measure x > thresholds θ₀ then 1 else 0

For a point-mass prior (no uncertainty), graded truth reduces to a Boolean value.

This shows that graded semantics reduces to Boolean semantics when there is no parameter uncertainty.
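
The graded value is plausibly computed as an expectation over the finite threshold prior (a sketch; the library's equations are collapsed in this rendering):

```lean
/-- Sketch: probability that the entity's measure exceeds the threshold,
    under a finite prior over thresholds. -/
def gradedFromThreshold {ε Θ : Type} [Fintype Θ]
    (measure : ε → ℚ) (thresholds prior : Θ → ℚ) (x : ε) : ℚ :=
  ∑ θ, if measure x > thresholds θ then prior θ else 0
```

Under this definition, the point-mass theorem above holds because the sum collapses to the single term at θ₀.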

Connection to RSA #

Grove & White's framework connects to RSA as follows:

1. Literal meaning φ: a function ι → Bool (a proposition)
2. Common ground: a distribution P ι over indices
3. Assertion: observe(φ(i)) for i ← cg
4. Probability computation: Pr[φ] = E_i[1_{φ(i)}]

RSA's graded φ emerges from threshold semantics plus uncertainty over the threshold: the graded value of φ at x is Pr_θ[measure(x) > θ].

This is exactly Lassiter & Goodman's "threshold + uncertainty = graded".

def Semantics.Dynamic.Probabilistic.probProp {ι : Type} [Fintype ι] (mass : ι → ℚ) (φ : BProp ι) :
ℚ

Probability of a proposition in a finite distribution.

This is Pr[φ] = E_i[1_{φ(i)}] in Grove & White notation. For finite distributions, it is the sum of the masses at indices where φ holds.

theorem Semantics.Dynamic.Probabilistic.probProp_true {ι : Type} [Fintype ι] (mass : ι → ℚ) :
(probProp mass fun (x : ι) => true) = Finset.univ.sum mass

The probability of a tautology is the total mass (1 for normalized distributions).

theorem Semantics.Dynamic.Probabilistic.probProp_false {ι : Type} [Fintype ι] (mass : ι → ℚ) :
(probProp mass fun (x : ι) => false) = 0

The probability of a contradiction is 0.
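
A sketch of the definition, writing BProp ι unfolded as ι → Bool (per item 1 of the RSA section above):

```lean
/-- Sketch: Pr[φ] as the sum of masses at indices where φ holds. -/
def probProp {ι : Type} [Fintype ι] (mass : ι → ℚ) (φ : ι → Bool) : ℚ :=
  ∑ i, if φ i then mass i else 0
```

Both theorems above then follow by simplifying the if-expression under the sum.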