@cite{tessler-tenenbaum-goodman-2022} — Logic, Probability, and Pragmatics in Syllogistic Reasoning
Topics in Cognitive Science 14: 574–601.
Core Idea
Syllogistic reasoning decomposes into two pragmatic subproblems:
- Listener: interprets premises via Bayesian update over Venn diagram states
- Speaker: selects the conclusion that best communicates beliefs to a naive listener
Three speaker models are formalized:
- S₀ (Literal Speaker, eq. 3): scores conclusions by expected literal truth
- State Communication (eq. 4): scores by expected log-likelihood (standard RSA)
- Belief Alignment (eq. 6): scores by −KL divergence (the paper's winning model, r = .82 with 3 parameters: α, φ, β)
State Communication and Belief Alignment produce identical conclusion distributions
after softmax normalization (they differ by an additive constant H(post) that cancels);
their distinct fitted parameters reflect the fitting procedure, not the functional form.
This equivalence is proved as stateCom_eq_beliefAlignment.
Grounding in Linglib
- Syllogistic quantifiers are every_sem/some_sem/no_sem from Quantifier.lean, applied to Venn diagram regions as entities
- Subalternation (All → Some) proved via subalternation_a_i from Quantifier.lean
- Noisy semantics via RSA.Noise.noiseChannel
- Belief Alignment utility via Core.Divergence.klDivergence
- SC ≡ BA equivalence via Core.Divergence.kl_eq_neg_crossEntropy_plus_negEntropy
- "Nothing follows" as vacuous utterance (true in every state)
The 7 non-empty regions of a three-circle (A, B, C) Venn diagram. The empty region {¬A, ¬B, ¬C} does not affect quantifier truth conditions for all, some, some...not, none, and is excluded.
A Venn state: which regions are populated. 2⁷ = 128 possible states.
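One plausible Lean encoding of regions and states (a sketch only; the actual definitions live under the Phenomena.Quantification.Studies.TesslerTenenbaumGoodman2022 namespace and may differ):

```lean
-- The 7 non-empty regions of the three-circle (A, B, C) diagram.
inductive Region where
  | A | B | C | AB | AC | BC | ABC
  deriving DecidableEq, Repr

/-- A state records which regions are populated: 2^7 = 128 possible states. -/
def VennState := Region → Bool
```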
Region predicates: does region r have property X?
Equations
- hasA Region.A = true
- hasA Region.AB = true
- hasA Region.AC = true
- hasA Region.ABC = true
- hasA _ = false otherwise
Equations
- hasB Region.B = true
- hasB Region.AB = true
- hasB Region.BC = true
- hasB Region.ABC = true
- hasB _ = false otherwise
Equations
- hasC Region.C = true
- hasC Region.AC = true
- hasC Region.BC = true
- hasC Region.ABC = true
- hasC _ = false otherwise
Regions as a Montague model, enabling reuse of every_sem/some_sem/no_sem.
"All Xs are Ys" in state s: every populated X-region also has Y, AND there is at least one populated X-region (existential import).
Per @cite{tessler-tenenbaum-goodman-2022} footnote 4: "All As are Bs
is false if there are no As." This ensures All entails Some.
Grounded in every_sem and some_sem from Quantifier.lean.
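A minimal sketch of this truth condition, assuming a `Region` type with seven constructors and a list `regions` enumerating them (names hypothetical; the library's definition goes through every_sem/some_sem):

```lean
inductive Region where
  | A | B | C | AB | AC | BC | ABC

def regions : List Region := [.A, .B, .C, .AB, .AC, .BC, .ABC]

/-- "All X are Y" in state `s`: every populated X-region also has Y,
    plus existential import (at least one populated X-region exists). -/
def syllAll (s hasX hasY : Region → Bool) : Bool :=
  regions.all (fun r => !(s r && hasX r) || hasY r) &&
  regions.any (fun r => s r && hasX r)
```

The second conjunct is exactly what makes All entail Some: any witness for existential import is a populated X-region that, by the first conjunct, also has Y.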
The 9 possible conclusions: 4 quantifiers × 2 term orders + NVC.
- allAC : Conclusion
- allCA : Conclusion
- someAC : Conclusion
- someCA : Conclusion
- someNotAC : Conclusion
- someNotCA : Conclusion
- noAC : Conclusion
- noCA : Conclusion
- nvc : Conclusion
Does the conclusion use A→C term order (vs C→A)? Used for the figural bias parameter β.
Equations
- Conclusion.allAC.isAC = true
- Conclusion.someAC.isAC = true
- Conclusion.someNotAC.isAC = true
- Conclusion.noAC.isAC = true
- c.isAC = false otherwise
Figural bias prior weight, determined by the Aristotelian figure.
The "figural effect" biases toward conclusions whose end-term order matches the chain direction through the middle term B:
- Figure 1 (A-B, B-C): B is predicate of P1, subject of P2 → chain reads A→B→C → A-C conclusions get weight β
- Figure 4 (B-A, C-B): B is subject of P1, predicate of P2 → chain reads C→B→A → C-A conclusions get weight β
- Figures 2 & 3: B occupies the same position in both premises → no directional chain → all conclusions get weight 1
NVC always gets weight 1 (no directional bias for "nothing follows"). The paper fits β ≈ 2.01 (MAP).
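The weighting scheme above can be sketched as follows (the `Figure` type and parameter plumbing are assumptions; only the case analysis mirrors the text):

```lean
inductive Figure where
  | one | two | three | four

/-- Figural-bias weight for a conclusion, given its end-term order.
    `isAC` and `isCA` are both false for NVC, so NVC always gets weight 1. -/
def figuralWeight (β : Float) (fig : Figure) (isAC isCA : Bool) : Float :=
  match fig with
  | .one  => if isAC then β else 1  -- chain A→B→C favors A-C conclusions
  | .four => if isCA then β else 1  -- chain C→B→A favors C-A conclusions
  | _     => 1                      -- figures 2 and 3: no directional chain
```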
Literal meaning of each conclusion in a Venn state. "Nothing follows" is true in every state — the vacuous utterance.
Equations
- concMeaning Conclusion.nvc s = true
"Nothing follows" is always true: the key insight enabling the Belief Alignment model to rationally produce NVC when premises are uninformative.
Barbara (All A-B, All B-C ⊢ All A-C) is logically valid.
The proof chains through populated regions: if r is a populated
A-region, premise 1 gives it B, making it a populated B-region,
and premise 2 gives it C. This is a state-restricted form of
transitivity — stricter than every_transitive from Quantifier.lean,
which applies to unrestricted universal quantification. Here the
restrictors shift between premises (s ∧ hasA vs s ∧ hasB),
and the middle term B bridges them via the population predicate s.
Barbara also validates "Some A are C" (by subalternation: All → Some).
Uses barbara_valid for the All A-C premise, then extracts the
existential import — with existential import in syllAll, the
non-emptiness hypothesis is built in, so no separate witness needed.
"All A-B, All C-B" is logically invalid: no Aristotelian conclusion holds in all compatible states. Proof by two counterexamples.
Noisy semantics ℒ(u, s): a small probability φ of misjudging truth value.
Directly instantiates RSA.Noise.noiseChannel(1−φ, φ, ⟦u⟧):
ℒ(u,s) = 1−φ when ⟦u⟧(s) = true, φ when false.
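Concretely, the channel reduces to a two-point likelihood (a sketch; `truthValue` stands in for ⟦u⟧(s)):

```lean
/-- ℒ(u, s): probability of judging u true in s, given its literal
    truth value and noise parameter φ. -/
def noisyMeaning (φ : Float) (truthValue : Bool) : Float :=
  if truthValue then 1 - φ else φ
```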
When noise is zero, noisy meaning reduces to literal meaning.
Noisy semantics assigns the NVC utterance a constant value in every state,
since concMeaning .nvc s = true for all s. This means L₀(s|NVC) = P(s):
hearing "nothing follows" does not update the listener's beliefs.
L₀ joint likelihood of two premises in state s (unnormalized).
Computes ℒ(u₁,s) · ℒ(u₂,s) — the likelihood term only. The full L₀ posterior (eq. 2) also includes the state prior P(s). The paper fixes θ = 0.5 per region, making P(s) = 0.5⁷ = 1/128 for all states — a uniform prior that cancels in normalization. For this reason, the likelihood alone determines the relative posterior weights.
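The likelihood computation, sketched in Lean (hypothetical names; `p1True`/`p2True` are the premises' literal truth values in s):

```lean
/-- Unnormalized L₀ weight of state s given two premises under noise φ.
    With the uniform prior P(s) = 1/128, this determines the posterior
    up to normalization. -/
def jointLikelihood (φ : Float) (p1True p2True : Bool) : Float :=
  (if p1True then 1 - φ else φ) * (if p2True then 1 - φ else φ)
```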
S₀ (Literal Speaker, eq. 3): scores conclusions by expected literal truth under the reasoner's posterior.
S₀(u₃ | u₁,u₂) ∝ exp[α · Σ_s ℒ(u₃,s) · L₀(s|u₁,u₂)]
Here ℒ(u₃,s) is the deterministic semantic function (not the noisy version inside L₀). This speaker samples states from the posterior and randomly selects conclusions that are literally true.
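The S₀ scoring rule, sketched over an explicit state list (all names are hypothetical; `truth s` is the deterministic ⟦u₃⟧(s)):

```lean
/-- S₀ utility: expected literal truth of the conclusion under the reasoner's
    posterior `post`, passed through exp(α · ·). -/
def s0Score {σ : Type} (α : Float) (states : List σ)
    (post : σ → Float) (truth : σ → Bool) : Float :=
  Float.exp (α * states.foldl
    (fun acc s => acc + post s * (if truth s then 1 else 0)) 0)
```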
State Communication (S₁, eq. 4): scores conclusions by expected log-likelihood — standard RSA informativity applied to syllogisms.
S₁(u₃ | u₁,u₂) ∝ exp[α · Σ_s L₀(s|u₁,u₂) · ln L₀(s|u₃)]
The two L₀ agents are distinct: L₀(s|u₁,u₂) is the reasoner who interpreted the premises; L₀(s|u₃) is a hypothetical naive listener who interprets just the conclusion. Both use noisy semantics (same φ).
Belief Alignment (S₁, eq. 6): the paper's winning model. Scores conclusions by negative KL divergence between the reasoner's full posterior and the naive listener's posterior given the conclusion.
S₁(u₃ | u₁,u₂) ∝ exp[α · −KL(L₀(·|u₁,u₂) ‖ L₀(·|u₃))]
Uses Core.Divergence.klDivergence directly.
Equations
- beliefAlignmentScore premPost naivePost α c = Real.exp (α * -Core.Divergence.klDivergence premPost (naivePost c))
State Communication and Belief Alignment differ by an additive constant (the entropy H(post)) that does not depend on the conclusion.
By kl_eq_neg_crossEntropy_plus_negEntropy from Divergence.lean:
KL(P ∥ Q) = Σ P·log P − Σ P·log Q
So: −KL(P ∥ Q) = Σ P·log Q − Σ P·log P = [State Com utility] + H(P).
Since H(P) is constant in the conclusion c, it cancels in softmax normalization: both models produce identical conclusion distributions. The paper's different fit statistics (r = .67 vs .82) reflect different optimal α values found by MCMC, not different functional forms.
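Spelling out the cancellation, with P = L₀(·|u₁,u₂) and Q_c = L₀(·|u₃ = c):

```latex
\begin{aligned}
-\mathrm{KL}(P \,\|\, Q_c)
  &= \sum_s P(s)\log Q_c(s) \;-\; \sum_s P(s)\log P(s)
   \;=\; U_{\mathrm{SC}}(c) + H(P),\\
S_1(c) &\propto \exp\!\bigl[\alpha\,(U_{\mathrm{SC}}(c) + H(P))\bigr]
        = e^{\alpha H(P)}\,\exp\!\bigl[\alpha\,U_{\mathrm{SC}}(c)\bigr]
        \propto \exp\!\bigl[\alpha\,U_{\mathrm{SC}}(c)\bigr].
\end{aligned}
```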
The Belief Alignment score for NVC depends on how much the premises shifted beliefs from the prior. When premises are uninformative (posterior ≈ prior), KL(post ‖ prior) ≈ 0, so −KL ≈ 0, and exp(α · 0) = 1 — the maximum score. This is why the model naturally produces NVC for uninformative premise combinations.
Subalternation in the region model: "All A are C" entails "Some A are C".
With existential import built into syllAll, no separate non-emptiness
hypothesis is needed — syllAll guarantees at least one A exists.
For the Belief Alignment model, this means "All A-C" produces a more peaked L₀ posterior than "Some A-C", yielding lower KL divergence and hence higher speaker utility — explaining why Barbara participants prefer "All" over the also-valid "Some".
Truth value of premise 1 in state s.
Truth value of premise 2 in state s.
Barbara: All A-B, All B-C. Figure 1 (paradigmatic valid syllogism).
All A-B, All C-B. Figure 3 (paradigmatic invalid syllogism).
Some A-B, Some B-C. Figure 1.
Unnormalized L₀ likelihood for a syllogism in state s. Computes ℒ(p₁,s) · ℒ(p₂,s) where ℒ is noisy semantics. The uniform prior (θ = 0.5) cancels in normalization.
Normalization constant for naive L₀ on a single conclusion.
Naive L₀ posterior for a conclusion: L₀(s|c) ∝ ℒ(c,s). The naive listener has heard only the conclusion, not the premises.
Belief Alignment score for conclusion c given syllogism syl. Uses the full pipeline: premises → L₀ posterior → KL → exp. Parameters: α (rationality), φ (noise), β (figural bias).
Conclusion probability: P(c|syl) = baScore(c) / Σ_c' baScore(c').
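The normalization step, sketched generically (names hypothetical):

```lean
/-- Normalize scores into a distribution: P(c) = score c / Σ_c' score c'. -/
def normalize {σ : Type} (cs : List σ) (score : σ → Float) : σ → Float :=
  let z := cs.foldl (fun acc c => acc + score c) 0
  fun c => score c / z
```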
MAP estimates from the Bayesian data analysis on Ragni et al. 2019 data. α ≈ 6.88, φ ≈ 0.06, β ≈ 2.01.
KL divergence over allStates in Float arithmetic.
Skips states with P(s) = 0 (contributes 0 to KL by convention).
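A sketch of the Float-arithmetic KL computation (state list and names assumed):

```lean
/-- KL(p ‖ q) over an explicit state list, skipping states with p s = 0
    (they contribute 0 by the standard 0 · log 0 = 0 convention). -/
def klDivFloat {σ : Type} (states : List σ) (p q : σ → Float) : Float :=
  states.foldl (fun acc s =>
    if p s == 0 then acc else acc + p s * Float.log (p s / q s)) 0
```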
All 9 conclusions as a list.
Compute conclusion distribution for a syllogism using Float arithmetic.
L₀ posteriors are computed exactly in ℚ (via l0Post, naiveL0Post),
then converted to Float for the KL divergence and softmax steps.
Parameters: α (rationality, Float), φ and β (exact in ℚ).
Short name for display.
Equations
- Conclusion.allAC.short = "Aac"
- Conclusion.allCA.short = "Aca"
- Conclusion.someAC.short = "Iac"
- Conclusion.someCA.short = "Ica"
- Conclusion.someNotAC.short = "Oac"
- Conclusion.someNotCA.short = "Oca"
- Conclusion.noAC.short = "Eac"
- Conclusion.noCA.short = "Eca"
- Conclusion.nvc.short = "NVC"
Compact string output for a syllogism's predicted distribution, showing conclusions sorted by predicted probability.
For Barbara (All A-B, All B-C), every L₀-probable state satisfies
All A-C. Proof: states where both premises are literally true form
a subset of states where All A-C holds (by barbara_valid).
With noise φ, the L₀ posterior concentrates on these states: the likelihood ℒ(p₁,s)·ℒ(p₂,s) is (1−φ)² for consistent states but only (1−φ)·φ, φ·(1−φ), or φ² for inconsistent ones.
This theorem verifies computably that every state where BOTH premises are literally true also satisfies All A-C — the semantic backbone of the Belief Alignment model's "All A-C" preference for Barbara.
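A self-contained reconstruction of such a check, using Mathlib's `Fintype` deriving so that `decide` can enumerate all 128 states (the library's actual names and statement may differ):

```lean
import Mathlib

inductive Region where
  | A | B | C | AB | AC | BC | ABC
  deriving DecidableEq, Fintype

def regions : List Region := [.A, .B, .C, .AB, .AC, .BC, .ABC]

def hasA : Region → Bool
  | .A | .AB | .AC | .ABC => true
  | _ => false
def hasB : Region → Bool
  | .B | .AB | .BC | .ABC => true
  | _ => false
def hasC : Region → Bool
  | .C | .AC | .BC | .ABC => true
  | _ => false

/-- "All X are Y" with existential import. -/
def syllAll (s X Y : Region → Bool) : Bool :=
  regions.all (fun r => !(s r && X r) || Y r) &&
  regions.any (fun r => s r && X r)

-- Barbara's semantic core, checked by exhaustive enumeration of the 2^7 states.
example : ∀ s : Region → Bool,
    syllAll s hasA hasB → syllAll s hasB hasC → syllAll s hasA hasC := by
  decide
```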
For the invalid syllogism (All A-B, All C-B), the L₀ posterior does NOT concentrate on any single conclusion — some consistent states satisfy All A-C while others falsify it.