@cite{beller-gerstenberg-2025}: Causation, Meaning, and Communication #
Formalizes Beller & Gerstenberg (2025) "Causation, Meaning, and Communication," Psychological Review 133(2), 339–381.
A counterfactual simulation model of causal language combining:
- Causal knowledge module: computes whether-causation (W), how-causation (H), and sufficient-causation (S) from counterfactual simulations
- Semantics module: defines four causal expressions as logical combinations of W, H, S (Eqs. 4–7)
- Pragmatics module: RSA inference selects the most informative true expression
Semantics (Eqs. 4–7) #
- affected(A → e) = W ∨ H ∨ S (any causal involvement)
- enabled(A → e) = W ∨ S (necessity or sufficiency)
- caused(A → e) = H ∧ (W ∨ S) ∧_σ M ∧_σ U (how-cause + counterfactual, softened by movement M and uniqueness U; σ controls softening)
- made no difference(A → e) = ¬W ∧ ¬_ν H ∧ ¬S (no causal role, with soft negation ¬_ν of how-cause; ν controls softening)
The expressions form a hierarchy of specificity: caused ⊂ enabled ⊂ affected. This hierarchy drives scalar implicatures via RSA pragmatic reasoning.
Simplification #
This formalization uses Boolean W, H, S (matching Table 1's illustrative scenarios where aspect values are 0 or 1). The full paper computes W and S as graded probabilities from noisy counterfactual simulations (Eqs. 1, 3).
The core Boolean semantics omit M (movement) and U (uniqueness), which only affect "caused" via soft conjunction. These features are process constraints that reduce the probability of "caused" when the candidate cause was stationary (M=false) or not the unique contactor (U=false). For the illustrative Table 1 scenarios, M=true and U=true throughout.
Experiments #
- Experiment 1A: Norming study (N=50). 15/20 sentence frames had median acceptability ≥ 4 (scale midpoint) for all three causal expressions.
- Experiment 1B: Semantic relations (N=53). Contradictory orderings (e.g., "caused but did not enable") rated less acceptable than consistent orderings (e.g., "affected but did not cause"). Confirms overlapping (not inconsistent) semantics for "caused" and "enabled."
- Experiment 1C: Implicature cancellation (N=51). "in fact" cancellations (e.g., "affected, in fact caused") rated more acceptable than redundant specifications (e.g., "caused, in fact affected").
- Experiment 2: Speaker task (N=62, 30 billiard scenarios). Full model r=0.87, RMSE=0.13. Fitted: θ=1.0, σ=0.65, ν=0.25, λ=40.18.
- Experiment 3: Listener task (N=50, 36 trials). Full model r=0.91, RMSE=0.10.
The four causal expressions studied in the paper (Figure 2, p. 343).
Participants chose among these to describe billiard-ball interactions:
- "Ball A caused Ball B to go through the gate."
- "Ball A enabled Ball B to go through the gate."
- "Ball A affected Ball B's going through the gate."
- "Ball A made no difference to Ball B's going through the gate."
- caused : CausalExpression
- enabled : CausalExpression
- affected : CausalExpression
- madeNoDifference : CausalExpression
Instances For
Equations
Instances For
Equations
Instances For
Equations
Core causal aspects from the counterfactual simulation model (CSM).
- W (whether-causation): P(e' ≠ e | s, remove(A)). Was the cause counterfactually necessary? (Eq. 1)
- H (how-causation): Δe' ≠ Δe under s, change(A). Did the cause affect the fine-grained outcome? Binary, unlike W and S. (Eq. 2)
- S (sufficient-causation): P(W(A → e) | s, remove(¬A)). Would the cause have been a whether-cause if alternatives were removed? (Eq. 3)
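Rendered as display equations (a transcription of the summary above, not the paper's exact typography; the bracket in Eq. 2 marks the binary test):

```latex
\begin{align}
  \mathcal{W}(A \to e) &= P\big(e' \neq e \mid s,\ \mathrm{remove}(A)\big) \tag{1}\\
  \mathcal{H}(A \to e) &= \big[\Delta e' \neq \Delta e \mid s,\ \mathrm{change}(A)\big] \tag{2}\\
  \mathcal{S}(A \to e) &= P\big(\mathcal{W}(A \to e) \mid s,\ \mathrm{remove}(\neg A)\big) \tag{3}
\end{align}
```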
Instances For
Equations
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.instBEqCausalWorld.beq x✝¹ x✝ = false
Instances For
Equations
Instances For
Equations
Instances For
Equations
Equations
Boolean semantics of causal expressions (Eqs. 4–7, pp. 346–347).
This is the core of the semantics, omitting the soft constraints (σ for M/U in "caused", ν for H in "made no difference").
- affected: W ∨ H ∨ S — any causal involvement (Eq. 4)
- enabled: W ∨ S — necessity or sufficiency, not just how (Eq. 5)
- caused: H ∧ (W ∨ S) — how-cause plus counterfactual (core of Eq. 6)
- madeNoDifference: ¬W ∧ ¬H ∧ ¬S — no causal involvement (hard version of Eq. 7)
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.expressionMeaning cw Phenomena.Causation.Studies.BellerGerstenberg2025.CausalExpression.affected = (cw.whether || cw.how || cw.sufficient)
- Phenomena.Causation.Studies.BellerGerstenberg2025.expressionMeaning cw Phenomena.Causation.Studies.BellerGerstenberg2025.CausalExpression.enabled = (cw.whether || cw.sufficient)
- Phenomena.Causation.Studies.BellerGerstenberg2025.expressionMeaning cw Phenomena.Causation.Studies.BellerGerstenberg2025.CausalExpression.caused = (cw.how && (cw.whether || cw.sufficient))
Instances For
"caused" implies "enabled": H ∧ (W ∨ S) → W ∨ S.
"Caused" is the most specific expression (p. 349).
"enabled" implies "affected": W ∨ S → W ∨ H ∨ S.
Full scalar chain: caused → enabled → affected.
"madeNoDifference" is the negation of "affected" (Boolean version).
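The chain is a finite Boolean fact and can be confirmed by exhaustive checking; a minimal Lean sketch with local illustrative names (not the repository's actual declarations):

```lean
-- Local sketch of the Boolean semantics; primed names avoid the repo's.
def affected' (w h s : Bool) : Bool := w || h || s
def enabled'  (w h s : Bool) : Bool := w || s
def caused'   (w h s : Bool) : Bool := h && (w || s)

-- caused → enabled → affected, checked over all 2³ worlds.
example : ∀ w h s, caused' w h s → enabled' w h s := by decide
example : ∀ w h s, enabled' w h s → affected' w h s := by decide

-- "made no difference" is the Boolean complement of "affected".
example : ∀ w h s, (!affected' w h s) = (!w && !h && !s) := by decide
```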
Graded semantics with soft negation parameter ν (Eq. 7, p. 347).
The full paper uses ν to soften the negation of H in "made no difference": when H=true but W=false and S=false (Ball A is a how-cause only), "made no difference" gets semantic value ν instead of 0.
For "caused," the full model also softens M (movement) and U (uniqueness) via parameter σ (Eq. 6). Since the illustrative Table 1 scenarios all have M=true and U=true, we omit these here — the core H ∧ (W ∨ S) suffices.
Equations
Instances For
Scenario 1: Classic Michottean launch. Ball A collides with stationary Ball B, launching it through the gate. W=1, H=1, S=1.
Equations
Instances For
Scenario 2: Ball A knocks a box out of Ball B's path. Ball B was already heading toward the gate. W=1, H=0, S=1 (double prevention).
Equations
Instances For
Scenario 3: Ball B is heading toward the gate on its own. Ball A comes up from behind and pushes it along. W=0, H=1, S=0 (how-cause only).
Equations
Instances For
Scenario 4: Ball A does not interact with Ball B at all. No causal involvement. W=0, H=0, S=0.
Equations
Instances For
Illustrative ν parameter for Table 1 (p. 348: "set to 0.2").
Equations
Instances For
In an H-only world (Scenario 3), "made no difference" has non-zero semantic value ν=1/5 due to soft negation (Eq. 7). This is the key distinction from the hard Boolean semantics.
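A minimal Lean sketch of this soft negation, storing semantic values in hundredths (matching this file's convention for avoiding ℚ); the definition and names are illustrative, not the repository's:

```lean
-- Sketch of Eq. 7's soft negation; values in hundredths, local names.
def madeNoDifferenceVal (ν100 : Nat) (w h s : Bool) : Nat :=
  if w || s then 0       -- hard negation of W and S
  else if h then ν100    -- soft negation of H: value ν, not 0
  else 100

-- Scenario 3 (W=0, H=1, S=0): value ν = 20/100 = 1/5.
example : madeNoDifferenceVal 20 false true false = 20 := by decide
-- Scenario 4 (W=0, H=0, S=0): full value 100/100.
example : madeNoDifferenceVal 20 false false false = 100 := by decide
```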
@cite{beller-gerstenberg-2025} causal expression model as RSAConfig.
Meaning: Boolean expression semantics (1 if true, 0 if false). World prior: uniform over the 8 possible W-H-S worlds. S1 score: belief-based (rpow): score = L0(w|u)^α. α = 1 for the illustrative Table 1 computation.
The fitted model (Experiment 2) uses λ=40.18, but α=1 suffices for the qualitative predictions and matches Table 1d.
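In standard RSA notation (consistent with the description above; P is the uniform world prior and [[u]] the Boolean meaning):

```latex
\begin{align}
  L_0(w \mid u) &\propto [\![u]\!](w)\,P(w) \\
  S_1(u \mid w) &\propto L_0(w \mid u)^{\alpha} \\
  L_1(w \mid u) &\propto S_1(u \mid w)\,P(w)
\end{align}
```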
Equations
Instances For
S1 speaker predictions from the full 8-world Boolean model #
These theorems verify that the RSAConfig reproduces the qualitative
predictions from Table 1d using rsa_predict. The predictions arise
from the full space of 2³ = 8 causal worlds with uniform prior and
Boolean semantics.
In each scenario, the pragmatic speaker selects the most informative true expression, producing the same preference orderings as Table 1d.
Scenario 1 (full causation): S1 prefers "caused" over "enabled."
In (W=1, H=1, S=1), all positive expressions are literally true. "caused" applies to only 3/8 worlds while "enabled" applies to 6/8, so "caused" is more informative under L0 and S1 selects it.
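Worked numbers for this comparison, assuming a uniform prior and α = 1 (a hand computation, not repository output):

```latex
\begin{aligned}
L_0(w_{111} \mid \textit{caused}) &= \tfrac{1}{3}, &
L_0(w_{111} \mid \textit{enabled}) &= \tfrac{1}{6}, &
L_0(w_{111} \mid \textit{affected}) &= \tfrac{1}{7}, \\[2pt]
S_1(\textit{caused} \mid w_{111}) &= \tfrac{14}{27} \approx 0.52, &
S_1(\textit{enabled} \mid w_{111}) &= \tfrac{7}{27} \approx 0.26, &
S_1(\textit{affected} \mid w_{111}) &= \tfrac{6}{27} \approx 0.22.
\end{aligned}
```

Each S1 value normalizes the L0 row above by 1/3 + 1/6 + 1/7 = 9/14 ("madeNoDifference" is false at this world and contributes 0).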
Scenario 1: S1 prefers "enabled" over "affected."
Scenario 2 (double prevention): S1 prefers "enabled" over "affected."
In (W=1, H=0, S=1), "caused" is literally false (H=0), so the speaker chooses between "enabled" and "affected." "enabled" is more informative (6/8 vs 7/8 worlds), so S1 selects it.
Scenario 3 (H-only): S1 prefers "affected" over "caused."
In (W=0, H=1, S=0), "affected" is the only true positive expression. "caused" requires H ∧ (W ∨ S), which fails when W=S=0.
Scenario 3: "affected" also beats "madeNoDifference."
In the Boolean model (unlike the graded model with ν), "madeNoDifference" is strictly false in an H-only world: ¬W ∧ ¬H ∧ ¬S fails because H=1.
Scenario 4 (no causation): S1 prefers "madeNoDifference" over "affected."
In (W=0, H=0, S=0), only "madeNoDifference" is literally true.
L1 listener predictions: scalar implicature effects #
The pragmatic listener (L1) inverts S1 via Bayes' rule. Hearing a weaker expression triggers a scalar implicature: the listener infers the speaker chose not to use a stronger expression, shifting probability away from worlds where the stronger expression would have been true.
L1 hearing "caused": higher probability for full-causation world than no-causation world.
L1("caused") assigns positive mass only to worlds where "caused" is literally true (H ∧ (W ∨ S)). The no-causation world (F,F,F) gets zero.
L1 scalar implicature for "enabled": hearing "enabled" makes the listener prefer worlds where "caused" is false over worlds where it's true.
Both (T,F,F) and (T,T,T) make "enabled" literally true (W ∨ S). But at (T,T,T), S1 would have said "caused" instead (more informative), so L1 down-weights (T,T,T) upon hearing "enabled." This is the classic scalar implicature: "enabled" ⇝ ¬caused.
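Worked numbers, again assuming a uniform prior and α = 1 (hand computation): at (T,F,F) only "affected" and "enabled" are literally true, so

```latex
S_1(\textit{enabled} \mid w_{100}) = \frac{1/6}{1/6 + 1/7} = \frac{7}{13} \approx 0.54,
\qquad
S_1(\textit{enabled} \mid w_{111}) = \frac{1/6}{1/3 + 1/6 + 1/7} = \frac{7}{27} \approx 0.26
```

With the uniform prior cancelling, L1(w₁₀₀ | "enabled") : L1(w₁₁₁ | "enabled") = 27 : 13, so the listener shifts mass toward the caused-false world.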
L1 hearing "madeNoDifference": identifies the no-causation world.
"madeNoDifference" is true only at (F,F,F), so L1 assigns it probability 1.
Canonical test worlds #
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
In the W-only world, "enabled" is true but "caused" is false. The speaker would use "enabled" (Scenario 2-like).
In the H-only world, only "affected" applies among the positive expressions.
In the full-causation world, all positive expressions apply.
In the no-causation world, only "madeNoDifference" applies.
Fitted model parameters from Experiment 2 (p. 358, MLE).
θ=1.0 (counterfactual noise), σ=0.65 (softening for M/U in "caused"), ν=0.25 (softening for H in "made no difference"), λ=40.18 (RSA speaker optimality).
Instances For
Equations
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp2_params = { θ := 1, σ := 65 / 100, ν := 1 / 4, alpha := 4018 / 100 }
Instances For
Equations
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp2_fullModel = { model := "Full model", r_median := 87 / 100, rmse_median := 13 / 100 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp2_noPragmatics = { model := "No pragmatics", r_median := 74 / 100, rmse_median := 18 / 100 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp2_noSemNoPrag = { model := "No sem & no prag", r_median := 53 / 100, rmse_median := 25 / 100 }
Instances For
Full model outperforms alternatives in Experiment 2.
Experiment 3 cross-validation (Table 9, p. 366).
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp3_fullModel = { model := "Full model", r_median := 91 / 100, rmse_median := 10 / 100 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.exp3_noPragmatics = { model := "No pragmatics", r_median := 83 / 100, rmse_median := 13 / 100 }
Instances For
Full model outperforms in Experiment 3 (listener task).
Bridge to Core.StructuralEquationModel #
Beller & Gerstenberg's W, H, S dimensions can be COMPUTED from structural
causal models, grounding the primitive Boolean features in the counterfactual
reasoning machinery of Core.StructuralEquationModel.
| B&G aspect | Structural definition |
|---|---|
| W (whether) | causallyNecessary — would effect still occur without cause? |
| H (how) | hasDirectLaw — does a causal law directly connect cause to effect? |
| S (sufficient) | causallySufficient — does adding cause guarantee effect? |
The mapping of H to hasDirectLaw is an approximation. B&G's how-causation
(Eq. 2) tests whether the fine-grained outcome would differ under a small
perturbation of the cause, which is a richer notion than structural directness.
For simple causal models, the two coincide.
Compute a CausalWorld from a structural causal model.
Equations
Instances For
Solo cause → full causation world (W=1, H=1, S=1). When there's one direct cause and no alternatives, all three causal dimensions are active.
Overdetermination → W=false, H=true, S=true. The cause is sufficient (S) and directly connected (H), but NOT necessary (W=false) because the alternative cause in the background would produce the effect anyway.
Causal chain → W=false, H=false, S=true. Under @cite{nadathur-2024} Def 10b, the initial cause is sufficient (S) but NOT necessary (W=false): the intermediate can be set directly, bypassing the root cause. This is correct for Def 10b's domain (prerequisite semantics), though it diverges from simpler but-for tests for chain causation.
Solo cause: "caused" is literally true.
Chain causation: "caused" is NOT literally true (H=false). Despite sufficiency, the lack of a direct connection means "caused" doesn't apply. Speakers would use "enabled" instead.
Chain causation: "enabled" still applies (W ∨ S = true).
Overdetermination: "caused" is literally true (H ∧ S).
Bridge between B&G's "caused" and @cite{nadathur-lauer-2020}'s make/cause.
In overdetermination, N&L's makeSem holds (sufficient) but causeSem
fails (not necessary). B&G's "caused" applies (H ∧ S = true). The
divergence reflects different questions: N&L model verb choice
(make vs cause), B&G model expression choice (caused vs enabled).
The causal expression Horn scale: ⟨affected, enabled, caused⟩.
Ordered from weakest (true in most scenarios) to strongest (true in fewest).
"madeNoDifference" is excluded: it is the Boolean complement of "affected"
(proved in madeNoDifference_iff_not_affected), not a weaker scalar
alternative.
Equations
Instances For
Using "affected" implicates ¬enabled ∧ ¬caused (Experiment 1C).
Using "enabled" implicates ¬caused.
Scale ordering is grounded in semantic entailment: stronger expressions entail weaker ones at every world. This is the defining property of a Horn scale — the scale structure isn't stipulated, it's derived from the entailment relations proved in Section 4.
Structural Model → Causation Type → Expression → S1 #
Each theorem below chains across three levels of analysis:
- Structural: A CausalDynamics (structural equation model) yields a CausalProfile via extractProfile
- Causation type: The profile determines production vs dependence causation via profileCausationType (from ProductionDependence.lean)
- Semantic + Pragmatic: The derived CausalWorld determines which expressions are literally true, and the S1 pragmatic speaker selects the most informative one
These are the deepest integration points: a change to normalDevelopment,
extractProfile, expressionMeaning, or the L0/S1 computation would
break these theorems.
Solo cause → P-CAUSE → S1 prefers "caused".
From a structural model with one direct law (a → c), the full pipeline produces the correct pragmatic prediction: the S1 speaker selects "caused" over "enabled" and "affected."
Causal chain → S1 prefers "enabled".
From a causal chain (a → intermediate → c), direct interaction is absent
(H=false): the cause operates through an intermediate. "caused" is
literally FALSE, so the S1 speaker selects "enabled" instead.
Under @cite{nadathur-2024} Def 10b, the chain root is also NOT necessary
(W=false) because the intermediate can be set directly. The causation type
becomes none (neither production nor dependence).
Overdetermination: B&G's "caused" diverges from N&L's causeSem.
With disjunctive causation (a ∨ b → c, both present), the cause is
direct and sufficient but NOT necessary. B&G's "caused" applies
(H ∧ S = true) while @cite{nadathur-lauer-2020}'s causeSem returns
false (necessity fails). This is a genuine theoretical divergence:
B&G model expression choice (how speakers describe events),
N&L model verb argument structure (make vs cause).
Overall Acceptance Rates (Figure 2) #
Proportion of "Yes" (Accurate) responses by verb. Ordering: caused > made > forced.
Rates stored as percentages (Nat) to avoid heavy Mathlib imports.
An experimental observation: verb form paired with acceptance rate (%).
Instances For
Equations
Instances For
Equations
Instances For
Equations
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.causedRate = { verb := "caused", ratePct := 48 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.madeRate = { verb := "made", ratePct := 40 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.forcedRate = { verb := "forced", ratePct := 35 }
Instances For
Acceptance ordering: caused > made > forced (Figure 2).
Main Effect Coefficients (Table 8) #
Bayesian logistic regression with verb × SUFresidALT × INT × ALT.
Coefficients stored as (numerator, denominator=100) to avoid ℚ imports.
Equations
Instances For
Equations
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.instBEqCoefficient.beq x✝¹ x✝ = false
Instances For
A coefficient is reliable when its 95% CI excludes 0.
Equations
- c.isReliable = (decide (c.lowerCI100 > 0) && decide (c.upperCI100 > 0) || decide (c.lowerCI100 < 0) && decide (c.upperCI100 < 0))
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.coeff_sufResidAlt = { name := "SUFresidALT", estimate100 := 119, lowerCI100 := 89, upperCI100 := 150 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.coeff_int = { name := "INT", estimate100 := 54, lowerCI100 := 28, upperCI100 := 81 }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.coeff_alt = { name := "ALT", estimate100 := -82, lowerCI100 := -111, upperCI100 := -55 }
Instances For
SUF has the largest absolute main effect.
ALT has a negative main effect (more alternatives → less acceptable).
Per-Verb Interaction Reliability (Table 9) #
Estimates of interaction intercepts by verb.
Equations
Instances For
made uniquely has a reliable SUFresidALT×INT interaction (Table 9).
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.made_sufInt = { verb := "made", interaction := "SUFresidALT:INT", positive := true }
Instances For
All verbs share reliable SUFresidALT×ALT interaction.
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.caused_sufAlt = { verb := "caused", interaction := "SUFresidALT:ALT", positive := true }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.made_sufAlt = { verb := "made", interaction := "SUFresidALT:ALT", positive := true }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.forced_sufAlt = { verb := "forced", interaction := "SUFresidALT:ALT", positive := true }
Instances For
All verbs share reliable INT×ALT interaction.
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.caused_intAlt = { verb := "caused", interaction := "INT:ALT", positive := true }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.made_intAlt = { verb := "made", interaction := "INT:ALT", positive := true }
Instances For
Equations
- Phenomena.Causation.Studies.BellerGerstenberg2025.forced_intAlt = { verb := "forced", interaction := "INT:ALT", positive := true }
Instances For
Acceptability Contrasts #
Non-interchangeability of causative verbs across contexts.
Acceptability judgment for a causative verb in context.
- acceptable : Acceptability
- marginal : Acceptability
- unacceptable : Acceptability
Instances For
Equations
Instances For
Equations
Instances For
A single acceptability judgment: verb + context + status.
- verb : String
- context : String
- judgment : Acceptability
- source : String
Instances For
Equations
Instances For
Low sufficiency, low intention context: only cause is acceptable.
Equations
Instances For
Equations
Instances For
Equations
Instances For
High sufficiency, high intention, low alternatives: all verbs acceptable.
Equations
Instances For
Equations
Instances For
Equations
Instances For
In high-SUF, high-INT, low-ALT contexts, all three verbs are acceptable.
In low-SUF contexts, only cause is acceptable.