Clarification: When to Ask vs. When to Act
@cite{raiffa-schlaifer-1961}
When agents face uncertainty about an interlocutor's goals, they choose between acting under uncertainty and asking clarification questions (CQs). Both @cite{tsvilodub-etal-2026} and @cite{dong-etal-2026} find that this choice is governed by the expected value of perfect information (EVPI): agents clarify when EVPI exceeds communication cost.
EVPI captures the interaction of uncertainty and stakes: it is high only when (a) uncertainty is high and (b) acting incorrectly is costly. If either factor is small, so is EVPI. This interaction is the core empirical finding shared by both papers.
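The interaction can be checked on a toy problem. The following is a minimal Python sketch, not the library's code: the names `evpi`, `should_ask`, and `two_goal_problem` are hypothetical, and the two-goal payoff structure (0 for the matching action, a loss of `stakes` otherwise) is an assumption made purely for illustration.

```python
from fractions import Fraction as F

def evpi(prior, utility):
    """prior: {world: P(w)}; utility: {world: {action: U(w, a)}}."""
    actions = next(iter(utility.values())).keys()
    # Oracle value: choose the best action separately in each world.
    oracle = sum(p * max(utility[w].values()) for w, p in prior.items())
    # Value of acting now: commit to a single action across all worlds.
    act_now = max(sum(p * utility[w][a] for w, p in prior.items()) for a in actions)
    return oracle - act_now

def should_ask(prior, utility, cost):
    # Decision rule from the text: clarify when EVPI exceeds the communication cost.
    return evpi(prior, utility) > cost

def two_goal_problem(p, stakes):
    # Hypothetical setup: two candidate goals; a mismatched action loses `stakes`.
    prior = {"g1": p, "g2": 1 - p}
    utility = {"g1": {"a1": F(0), "a2": -stakes},
               "g2": {"a1": -stakes, "a2": F(0)}}
    return prior, utility

# EVPI on a 2x2 grid of (uncertainty, stakes), with a communication cost of 2.
results = {}
for label_p, p in [("uncertain", F(1, 2)), ("confident", F(9, 10))]:
    for label_s, stakes in [("high stakes", F(10)), ("low stakes", F(1))]:
        prior, utility = two_goal_problem(p, stakes)
        results[(label_p, label_s)] = (evpi(prior, utility), should_ask(prior, utility, cost=2))
```

In this toy setup EVPI works out to min(p, 1 - p) · stakes, so with a cost of 2 only the uncertain, high-stakes cell warrants a question.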
Connection to existing infrastructure
EVPI is the maximum possible questionUtility (@cite{van-rooy-2003}):
it equals questionUtility on the identity partition, where each world
is its own cell. Any specific clarification question yields at most EVPI.
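The bound can be illustrated numerically. The sketch below is Python rather than Lean and assumes that questionUtility is the expected gain in best achievable expected utility from learning which cell of the partition is actual; the function names and the three-world example are hypothetical.

```python
from fractions import Fraction as F

def best_eu(worlds, prior, utility, actions):
    # max_a sum_{w in worlds} P(w) * U(w, a), left unnormalized.
    return max(sum(prior[w] * utility[w][a] for w in worlds) for a in actions)

def question_utility(partition, prior, utility, actions):
    # sum_cell P(cell) * max_a EU(a | cell)  -  max_a EU(a).
    # P(cell) * EU(a | cell) = sum_{w in cell} P(w) * U(w, a), so no normalization is needed.
    informed = sum(best_eu(cell, prior, utility, actions) for cell in partition)
    return informed - best_eu(list(prior), prior, utility, actions)

worlds = ["w1", "w2", "w3"]
actions = ["a1", "a2", "a3"]
prior = {"w1": F(1, 2), "w2": F(1, 4), "w3": F(1, 4)}
# Assumed payoff: 1 if the action index matches the world index, else 0.
utility = {w: {a: F(int(w[1] == a[1])) for a in actions} for w in worlds}

identity = [["w1"], ["w2"], ["w3"]]   # each world is its own cell: this value is EVPI
coarse = [["w1"], ["w2", "w3"]]       # a question that leaves w2 and w3 unresolved
trivial = [["w1", "w2", "w3"]]        # a question that reveals nothing

evpi_value = question_utility(identity, prior, utility, actions)
coarse_value = question_utility(coarse, prior, utility, actions)
trivial_value = question_utility(trivial, prior, utility, actions)
```

Here the identity partition is worth 1/2, the coarser question 1/4, and the trivial question 0, consistent with EVPI as an upper bound.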
Maximum utility achievable at world w across actions.
With Finset actions, this is sup' over the utilities at world w when actions is nonempty, and 0 otherwise.
Equations
- Phenomena.Clarification.bestUtilityAt dp actions w = if h : actions.Nonempty then actions.sup' h (dp.utility w) else 0
Oracle value: expected utility under perfect information.
Σ_w P(w) · max_a U(w, a)
Equations
- Phenomena.Clarification.oracleValue dp actions = ∑ w : W, dp.prior w * Phenomena.Clarification.bestUtilityAt dp actions w
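As a worked instance of the two definitions above, here is a minimal Python sketch that mirrors `bestUtilityAt` and `oracleValue` with plain dictionaries. The Python names and the rain/sun example are hypothetical and are not part of the library.

```python
from fractions import Fraction as F

def best_utility_at(utility, actions, w):
    # Mirrors the `if actions.Nonempty ... else 0` convention: an empty action set yields 0.
    return max((utility[w][a] for a in actions), default=F(0))

def oracle_value(prior, utility, actions):
    # sum_w P(w) * max_a U(w, a)
    return sum(p * best_utility_at(utility, actions, w) for w, p in prior.items())

prior = {"rain": F(3, 10), "sun": F(7, 10)}
utility = {"rain": {"umbrella": F(5), "none": F(-10)},
           "sun":  {"umbrella": F(-1), "none": F(2)}}

# Best per-world utilities are 5 (rain) and 2 (sun), so the oracle value is
# 3/10 * 5 + 7/10 * 2 = 29/10.
value = oracle_value(prior, utility, ["umbrella", "none"])
empty_value = oracle_value(prior, utility, [])
```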
Expected value of perfect information (EVPI).
EVPI = oracleValue − dpValue = Σ_w P(w) · max_a U(w,a) − max_a Σ_w P(w) · U(w,a)
Equivalently, the expected regret of the current best action (@cite{tsvilodub-etal-2026}), or the upper bound on VoI for any question (@cite{dong-etal-2026}).
@cite{raiffa-schlaifer-1961}
Equations
- Phenomena.Clarification.evpi dp actions = Phenomena.Clarification.oracleValue dp actions - Core.DecisionTheory.dpValue dp actions
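The expected-regret reading can be verified on a small example. This is a Python sketch under assumed inputs: the rain/sun decision problem and all names in it are hypothetical, and `dp_value` stands in for `dpValue` as the best expected utility of a single committed action.

```python
from fractions import Fraction as F

prior = {"rain": F(3, 10), "sun": F(7, 10)}
utility = {"rain": {"umbrella": F(5), "none": F(-10)},
           "sun":  {"umbrella": F(-1), "none": F(2)}}
actions = ["umbrella", "none"]

def expected_utility(a):
    return sum(p * utility[w][a] for w, p in prior.items())

def best_at(w):
    return max(utility[w][a] for a in actions)

# Acting now: commit to the action with the highest expected utility.
best_action = max(actions, key=expected_utility)
dp_value = expected_utility(best_action)

# EVPI as oracle value minus the value of acting now.
oracle = sum(p * best_at(w) for w, p in prior.items())
evpi = oracle - dp_value

# EVPI as expected regret: per-world shortfall of the committed action, weighted by the prior.
expected_regret = sum(p * (best_at(w) - utility[w][best_action]) for w, p in prior.items())
```

Both routes give 21/10 here: the committed action `umbrella` has no regret in rain and a regret of 3 in sun, which occurs with probability 7/10.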
EVPI is non-negative: acting with perfect information is at least as good as acting without.
Proof sketch: For each action a, its expected utility is
EU(a) = Σ_w P(w) · U(w,a). Since max_a' U(w,a') ≥ U(w,a) for every world w
and P(w) ≥ 0, the oracle value Σ_w P(w) · max_a' U(w,a') is termwise
≥ Σ_w P(w) · U(w,a) = EU(a).
Therefore oracleValue ≥ EU(a) for every a, hence
oracleValue ≥ max_a EU(a) = dpValue.
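A randomized check is no substitute for the proof, but it is a cheap sanity test of the statement. The Python sketch below assumes non-negative priors that sum to 1 and uses exact rational arithmetic. The generator and its parameter ranges are arbitrary illustrative choices.

```python
import random
from fractions import Fraction as F

def evpi(prior, utility):
    """prior: list of P(w); utility: one row per world, one column per action."""
    n_actions = len(utility[0])
    oracle = sum(p * max(row) for p, row in zip(prior, utility))
    dp_value = max(sum(p * row[a] for p, row in zip(prior, utility))
                   for a in range(n_actions))
    return oracle - dp_value

def random_problem(rng, n_worlds, n_actions):
    # Non-negative prior obtained by normalizing positive integer weights.
    weights = [rng.randint(1, 10) for _ in range(n_worlds)]
    total = sum(weights)
    prior = [F(w, total) for w in weights]
    utility = [[F(rng.randint(-20, 20)) for _ in range(n_actions)]
               for _ in range(n_worlds)]
    return prior, utility

rng = random.Random(0)
values = [evpi(*random_problem(rng, rng.randint(1, 5), rng.randint(1, 4)))
          for _ in range(1000)]
all_nonnegative = all(v >= 0 for v in values)
```

With a single available action the two sums coincide and EVPI is exactly 0, which is the boundary case of the inequality.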