Bayesian Theory of Mind (BToM) @cite{baker-jara-ettinger-saxe-tenenbaum-2017} #
@cite{clark-1996} @cite{houlihan-kleiman-weiner-hewitt-tenenbaum-saxe-2023} @cite{kratzer-2006} @cite{ying-zhi-xuan-wong-mansinghka-tenenbaum-2025}
Domain-general cognitive architecture for action explanation: how observers explain agents' behavior by jointly inferring mental states, shared states, and environmental constraints.
BToM is not specific to language — it applies equally to physical action understanding, game-theoretic reasoning, emotion attribution, and communication. Linguistic specialization (RSA) is established via the RSA-BToM bridge.
Latent Variable Ontology #
Latent variables in the generative model fall into three ontological categories:
- Mental: Individual mental states private to the agent — Belief, Desire, Percept. These are properties of a single mind.
- Shared: Intersubjective states maintained between agents — common ground, mutual belief, established precedents. Not reducible to either agent's individual mental state.
- Medium: Non-mental environmental constraints on action — the structure of the communication channel, conventions, physical affordances.
The Generative Model #
An agent's action is generated by the causal chain:
World → Percept → Belief
Belief × Desire × Shared × Medium → Plan → Action
Shared × Action × World → Shared'
The observer sees an action and jointly infers the latent variables across all three categories via Bayesian inversion:
P(p, b, d, s, m, w | a) ∝ P(a | b, d, s, m) · P(b | p) · P(p | w) · P(d | w) · P(s) · P(m) · P(w)
Note that P(d | w) is world-conditioned: in many domains, the prior
distribution over desires depends on the world state. In RSA, this corresponds
to latentPrior(w, l) — the distribution over speaker latent states may
depend on the true world (e.g., observation probability in @cite{goodman-stuhlmuller-2013}).
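The proportionality can be unpacked in one step. This is a sketch of the standard Bayes inversion, not an additional modeling assumption: the causal graph above induces a chain-rule factorization of the joint, and conditioning on the observed action only divides by a constant.

```latex
% Chain rule over the causal graph (roots w, s, m; then d, p, b, a):
P(p, b, d, s, m, w, a)
  = P(w)\,P(d \mid w)\,P(s)\,P(m)\,P(p \mid w)\,P(b \mid p)\,P(a \mid b, d, s, m)
% Conditioning on a divides by P(a), which is constant in the latents:
P(p, b, d, s, m, w \mid a)
  \propto P(a \mid b, d, s, m)\,P(b \mid p)\,P(p \mid w)\,P(d \mid w)\,P(s)\,P(m)\,P(w)
```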
Causal Structure #
The perception chain W → P → B is one specific causal architecture —
it decomposes world-to-belief inference into observation then updating.
Other architectures are possible (direct world-to-belief, joint
perception-belief formation, belief from memory + percept). The current
structure follows @cite{baker-jara-ettinger-saxe-tenenbaum-2017} and is sufficient for RSA grounding,
where the chain collapses to identity (perfect perception and knowledge).
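One way to spell out the collapse: with point-mass perception and belief, the sums over p and b telescope, leaving an RSA-style posterior (reading a as the utterance and the remaining latents as the speaker's latent state, as in the latentPrior correspondence above).

```latex
% With P(p | w) = [p = w] and P(b | p) = [b = p]:
\sum_{p, b} P(a \mid b, d, s, m)\,[b = p]\,[p = w] = P(a \mid w, d, s, m)
% so the joint posterior reduces to
P(d, s, m, w \mid a) \propto P(a \mid w, d, s, m)\,P(d \mid w)\,P(s)\,P(m)\,P(w)
```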
Limitations #
This is a first-order model: the observer inverts a single agent's
generative model. It does not represent recursive mentalizing ("I think
that you think that I think..."). Extending to recursive BToM would require
nesting: the agent's planModel would itself contain a BToM inference step
over the observer's mental states. This is conceptually straightforward but
computationally expensive and not needed for single-utterance RSA models.
Extensibility #
The model is designed for domain-specific extension without modification:
- Discourse dynamics: sharedUpdate chains BToM inference steps via shared-state evolution (discourse referent introduction, QUD push/pop, presupposition accommodation, Clarkian conceptual pacts).
- Emotion attribution: post-inference appraisals (goal congruence, prediction error, counterfactual reasoning) can be computed from the inferred marginals without extending the generative model itself.
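The discourse-dynamics extension can be sketched as a fold: shared state is threaded through a sequence of observed actions by iterating the update step. This is a standalone illustration in plain Lean 4; the name `evolveShared`, the type parameters, and the argument order of the update function are hypothetical, not the library's API.

```lean
-- Hypothetical sketch: chaining shared-state updates across a discourse.
-- `sharedUpdate` stands in for the documented S × A × W → S' step;
-- all names and the argument order are illustrative.
def evolveShared {Shared Action World : Type}
    (sharedUpdate : Shared → Action → World → Shared)
    (w : World) (s₀ : Shared) (actions : List Action) : Shared :=
  actions.foldl (fun s a => sharedUpdate s a w) s₀
```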
Ontological classification of latent variables in BToM.
Different theoretical positions classify the same variable differently.
For example, in the RSA-BToM bridge, an RSA model's Interp parameter
might be classified as mental (speaker's intended meaning) or medium
(structural ambiguity in the grammar), depending on one's theory of
scope ambiguity.
- mental : LatentCategory
Individual mental states, private to a single agent's mind. Decomposes into Belief, Desire, and Percept.
- shared : LatentCategory
Intersubjective states maintained between agents: common ground, mutual beliefs, established precedents.
- medium : LatentCategory
Non-mental environmental constraints on action. Language structure, conventions, channel properties, affordances.
Temporal dynamics of a latent factor.
Orthogonal to LatentCategory: a belief can be episodic (specific to one
observation), dispositional (stable trait), or dynamic (evolving over time).
This distinction matters for multi-observation models where factors may be
shared across inference steps or updated between them.
| Dynamics | Example (mental) | Example (shared) |
|---|---|---|
| episodic | this-utterance belief/goal | current QUD |
| dispositional | personality, stable preference | cultural norms |
| dynamic | accumulating evidence | evolving common ground |
- episodic : FactorDynamics
Varies per observation. Fresh inference target at each step. Most RSA latent variables are episodic: the speaker's goal, intended interpretation, or lexicon for this specific utterance.
- dispositional : FactorDynamics
Stable trait of the agent or environment. Shared across observations. E.g., the speaker's general politeness preference, a language's conventionalized scale structure.
- dynamic : FactorDynamics
Evolves over observations via an update function. E.g., the listener's accumulating evidence about speaker knowledge, or common ground that grows through discourse.
A BToM generative model for action explanation.
The observer explains an agent's action by inverting a generative model with six latent variable types organized into three ontological categories:
Mental (private to the agent):
- Percept: what the agent observes (generated from world state)
- Belief: the agent's epistemic state (generated from percepts)
- Desire: the agent's goals

Shared (intersubjective):
- Shared: common ground, mutual beliefs, established conventions

Medium (environmental constraints):
- Medium: structure of the communication channel, conventions
The generative model's causal structure:
W → perceptModel → P → beliefModel → B
B × D × S × M → planModel → A
S × A × W → sharedUpdate → S'
For domains that don't use all components, set unused types to Unit
with constant-1 scoring functions.
The model is parameterized by a score type F (typically ℚ for exact
computation or ℝ for mathematical proofs).
- perceptModel : World → Percept → F
How percepts arise from the world: P(p | w). For agents with perfect perception, use a point mass [p = w].
- beliefModel : Percept → Belief → F
How beliefs arise from percepts: P(b | p). For agents with perfect inference, use a point mass [b = p].
- planModel : Belief → Desire → Shared → Medium → Action → F
Rational action given all latent states: P(a | b, d, s, m). This is the agent's planning/decision function: the forward model that the observer inverts.
- worldPrior : World → F
Prior over world states: P(w).
- desirePrior : World → Desire → F
Prior over desires: P(d | w). World-conditioned: the distribution over agent goals may depend on the world state. For world-independent priors, use fun _ => prior.
- sharedPrior : Shared → F
Prior over shared states: P(s).
- mediumPrior : Medium → F
Prior over medium constraints: P(m).
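As a standalone sketch, not the library's actual declarations, the documented fields can be mirrored in a minimal structure (including the sharedPrior used by the joint score) and instantiated with Unit components and constant-1 scores, as suggested above. The names `SketchModel` and `perfectPerception` are hypothetical; `Rat` is assumed available (core in recent Lean toolchains, otherwise via Batteries/Mathlib).

```lean
-- Standalone sketch, not the library's declarations: a structure mirroring
-- the documented fields, plus the sharedPrior used by the joint score.
structure SketchModel (World Percept Belief Desire Shared Medium Action F : Type) where
  perceptModel : World → Percept → F
  beliefModel  : Percept → Belief → F
  planModel    : Belief → Desire → Shared → Medium → Action → F
  worldPrior   : World → F
  desirePrior  : World → Desire → F
  sharedPrior  : Shared → F
  mediumPrior  : Medium → F

-- A degenerate-perception instantiation: Unit for the unused Desire, Shared,
-- and Medium components, constant-1 scores, and point-mass perception/belief,
-- so Percept and Belief are both the world type itself.
def perfectPerception {W A : Type} [DecidableEq W]
    (plan : W → A → Rat) (prior : W → Rat) :
    SketchModel W W W Unit Unit Unit A Rat where
  perceptModel := fun w p => if p = w then 1 else 0
  beliefModel  := fun p b => if b = p then 1 else 0
  planModel    := fun b _ _ _ a => plan b a
  worldPrior   := prior
  desirePrior  := fun _ _ => 1
  sharedPrior  := fun _ => 1
  mediumPrior  := fun _ => 1
```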
The observer's unnormalized joint score over all latent variables given an observed action:
joint(a, p, b, d, s, m, w) =
planModel(b,d,s,m,a) · beliefModel(p,b) · perceptModel(w,p) ·
worldPrior(w) · desirePrior(w,d) · sharedPrior(s) · mediumPrior(m)
This is the full generative model probability (unnormalized). All marginal posteriors are sums over subsets of this joint.
Equations
- model.jointScore a p b d s m w = model.planModel b d s m a * model.beliefModel p b * model.perceptModel w p * model.worldPrior w * model.desirePrior w d * model.sharedPrior s * model.mediumPrior m
World marginal: P(w | a) ∝ Σ_{p,b,d,s,m} joint(a, p, b, d, s, m, w).
This is the primary inference target: given an observed action, what is the state of the world?
Equations
- model.worldMarginal a w = ∑ p : P, ∑ b : B, ∑ d : D, ∑ s : S, ∑ m : M, model.jointScore a p b d s m w
The world prior factors out of the world marginal:
P(w | a) ∝ P(w) · Σ_{p,b,d,s,m} P(a|b,d,s,m) · P(b|p) · P(p|w) · P(d|w) · P(s) · P(m)
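The factoring is immediate: worldPrior(w) is the only factor of the joint that does not depend on the summation variables, so it pulls out of the sum.

```latex
P(w \mid a)
  \propto \sum_{p, b, d, s, m} \mathrm{joint}(a, p, b, d, s, m, w)
  = P(w) \sum_{p, b, d, s, m} P(a \mid b, d, s, m)\,P(b \mid p)\,P(p \mid w)\,P(d \mid w)\,P(s)\,P(m)
```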
Belief marginal: P(b | a) ∝ Σ_{p,d,s,m,w} joint(a, p, b, d, s, m, w).
Equations
- model.beliefMarginal a b = ∑ p : P, ∑ d : D, ∑ s : S, ∑ m : M, ∑ w : W, model.jointScore a p b d s m w
Desire marginal: P(d | a) ∝ Σ_{p,b,s,m,w} joint(a, p, b, d, s, m, w).
Equations
- model.desireMarginal a d = ∑ p : P, ∑ b : B, ∑ s : S, ∑ m : M, ∑ w : W, model.jointScore a p b d s m w
Percept marginal: P(p | a) ∝ Σ_{b,d,s,m,w} joint(a, p, b, d, s, m, w).
Equations
- model.perceptMarginal a p = ∑ b : B, ∑ d : D, ∑ s : S, ∑ m : M, ∑ w : W, model.jointScore a p b d s m w
Medium marginal: P(m | a) ∝ Σ_{p,b,d,s,w} joint(a, p, b, d, s, m, w).
Equations
- model.mediumMarginal a m = ∑ p : P, ∑ b : B, ∑ d : D, ∑ s : S, ∑ w : W, model.jointScore a p b d s m w
Belief-weighted expectation: given a scoring function on belief states, compute the belief-marginal weighted sum.
This is the general form of LaBToM's Pr(Agent, φ) computation: instantiate f b = if φ(b) then 1 else 0 to get
the unnormalized Pr_obs(Agent, φ | a) from BToM belief marginals.
More generally, any quantity that depends on the agent's belief state (credence, expected utility, prediction error) can be computed as a belief expectation.
Equations
- model.beliefExpectation f a = ∑ b : B, model.beliefMarginal a b * f b
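Spelling out the indicator instantiation mentioned above: the indicator of φ turns the expectation into a sum of belief-marginal mass over the φ-satisfying beliefs.

```latex
f(b) = \mathbf{1}[\varphi(b)] \;\Longrightarrow\;
\mathrm{beliefExpectation}(f, a)
  = \sum_{b \,:\, \varphi(b)} \mathrm{beliefMarginal}(a, b)
  \;\propto\; \Pr_{\mathrm{obs}}(\mathrm{Agent}, \varphi \mid a)
```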
Connection to Content Individuals #
BToM's Belief and Desire types correspond to content individuals — see Core.ContentIndividual for the ontological sort.
A BToMModel can be instantiated with ContentIndividual W as the Belief
type, making the observer's posterior a distribution over content individuals.