
Linglib.Phenomena.Quantification.Studies.TesslerTenenbaumGoodman2022

@cite{tessler-tenenbaum-goodman-2022} — Logic, Probability, and Pragmatics in Syllogistic Reasoning

Topics in Cognitive Science 14: 574–601.

Core Idea

Syllogistic reasoning decomposes into two pragmatic subproblems:

  1. Listener: interprets premises via Bayesian update over Venn diagram states
  2. Speaker: selects the conclusion that best communicates beliefs to a naive listener

Three speaker models are formalized:

  1. Literal Speaker (S₀, eq. 3): scores conclusions by expected literal truth under the reasoner's posterior
  2. State Communication (S₁, eq. 4): scores conclusions by the expected log-likelihood a naive listener assigns to the state — standard RSA informativity
  3. Belief Alignment (S₁, eq. 6): scores conclusions by negative KL divergence between the reasoner's posterior and the naive listener's posterior

State Communication and Belief Alignment produce identical conclusion distributions after softmax normalization (they differ by an additive constant H(post) that cancels); their distinct fitted parameters reflect the fitting procedure, not the functional form. This equivalence is proved as stateCom_eq_beliefAlignment.

Grounding in Linglib

The 7 non-empty regions of a three-circle (A, B, C) Venn diagram. The empty region {¬A, ¬B, ¬C} does not affect quantifier truth conditions for all, some, some...not, none, and is excluded.
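The region model can be sketched in a few lines of Lean. This is a minimal, self-contained sketch; `Region`, `Region.hasA`, and `VennState` are hypothetical names standing in for the file's actual identifiers.

```lean
-- Hypothetical sketch: the 7 populated regions of the A/B/C Venn diagram.
inductive Region
  | a | b | c | ab | ac | bc | abc
  deriving DecidableEq, Repr

-- Membership of a region in circle A (analogous predicates for B and C).
def Region.hasA : Region → Bool
  | .a | .ab | .ac | .abc => true
  | _ => false

-- A Venn state marks each of the 7 regions as populated or empty,
-- giving 2⁷ = 128 states.
def VennState := Region → Bool
```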


"All Xs are Ys" in state s: every populated X-region also has Y, AND there is at least one populated X-region (existential import).

Per @cite{tessler-tenenbaum-goodman-2022} footnote 4: "All As are Bs is false if there are no As." This ensures All entails Some. Grounded in every_sem and some_sem from Quantifier.lean.
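The truth conditions can be sketched as follows (hypothetical names; this version quantifies over an explicit region list rather than the file's actual encoding):

```lean
inductive Region | a | b | c | ab | ac | bc | abc

def allRegions : List Region := [.a, .b, .c, .ab, .ac, .bc, .abc]

-- "All Xs are Ys" with existential import: every populated X-region has Y,
-- and at least one X-region is populated (so All entails Some).
def syllAll (hasX hasY : Region → Bool) (s : Region → Bool) : Bool :=
  allRegions.all (fun r => !(s r && hasX r) || hasY r) &&
  allRegions.any (fun r => s r && hasX r)
```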

"Some Xs are Ys": some populated X-region also has Y.

"Some Xs are not Ys": some populated X-region lacks Y.

"No Xs are Ys": no populated X-region has Y.

Syllogistic quantifier: the four Aristotelian quantifiers.

A syllogism is a pair of quantified premises sharing middle term B. order1AB = true means premise 1 is "Q₁ A-B"; false means "Q₁ B-A". order2BC = true means premise 2 is "Q₂ B-C"; false means "Q₂ C-B". This gives 4 × 2 × 4 × 2 = 64 syllogisms.
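A sketch of the premise space, with hypothetical constructor names (the quantifier names are prefixed with `q` here to avoid clashing with `Option`'s `some`/`none`):

```lean
-- Hypothetical sketch: 4 quantifiers × 2 orders per premise = 64 syllogisms.
inductive Quant | qAll | qSome | qSomeNot | qNone
  deriving DecidableEq, Repr

structure Syllogism where
  q1       : Quant
  order1AB : Bool  -- true: "Q₁ A-B"; false: "Q₁ B-A"
  q2       : Quant
  order2BC : Bool  -- true: "Q₂ B-C"; false: "Q₂ C-B"
  deriving DecidableEq, Repr
```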

The 9 possible conclusions: 4 quantifiers × 2 term orders + NVC.

Figural bias prior weight, determined by the Aristotelian figure.

The "figural effect" biases toward conclusions whose end-term order matches the chain direction through the middle term B:

• Figure 1 (A-B, B-C): B is predicate of P1, subject of P2 → chain reads A→B→C → A-C conclusions get weight β
• Figure 4 (B-A, C-B): B is subject of P1, predicate of P2 → chain reads C→B→A → C-A conclusions get weight β
• Figures 2 & 3: B occupies the same position in both premises → no directional chain → all conclusions get weight 1

NVC always gets weight 1 (no directional bias for "nothing follows"). The paper fits β ≈ 2.01 (MAP).
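The three bullets above can be sketched directly as a case split on the conclusion's term order (hypothetical names; `TermOrder` collapses the 9 conclusions to the only distinction the bias cares about):

```lean
-- Hypothetical sketch of the figural-bias weight. β is the fitted bias
-- (≈ 2.01 in the paper); conclusions matching the chain direction get β.
inductive TermOrder | ac | ca | nvc

def figuralWeight (β : Float) (order1AB order2BC : Bool) : TermOrder → Float
  | .nvc => 1                                        -- NVC: never biased
  | .ac  => if order1AB && order2BC then β else 1    -- Figure 1 chain A→B→C
  | .ca  => if !order1AB && !order2BC then β else 1  -- Figure 4 chain C→B→A
```

Figures 2 and 3 (mixed order flags) fall through to weight 1 in both non-NVC cases, matching the "no directional chain" bullet.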

Literal meaning of each conclusion in a Venn state. "Nothing follows" is true in every state — the vacuous utterance.

"Nothing follows" is always true: the key insight enabling the Belief Alignment model to rationally produce NVC when premises are uninformative.

Barbara (All A-B, All B-C ⊢ All A-C) is logically valid.

The proof chains through populated regions: if r is a populated A-region, premise 1 gives it B, making it a populated B-region, and premise 2 gives it C. This is a state-restricted form of transitivity — stricter than every_transitive from Quantifier.lean, which applies to unrestricted universal quantification. Here the restrictors shift between premises (s ∧ hasA vs s ∧ hasB), and the middle term B bridges them via the population predicate s.

Barbara also validates "Some A are C" (by subalternation: All → Some). Uses barbara_valid to obtain All A-C, then extracts the existential import — with existential import built into syllAll, the non-emptiness hypothesis comes for free, so no separate witness is needed.

Noisy semantics ℒ(u, s): a small probability φ of misjudging the truth value. Directly instantiates RSA.Noise.noiseChannel(1−φ, φ, ⟦u⟧): ℒ(u,s) = 1−φ when ⟦u⟧(s) = true, φ when false.
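The noise channel is a two-case function; a minimal sketch (hypothetical name `noisyMeaning`):

```lean
-- Hypothetical sketch of ℒ(u, s): with probability φ the literal truth
-- value is misjudged, so true maps to 1 − φ and false to φ.
def noisyMeaning (φ : Float) (literal : Bool) : Float :=
  if literal then 1 - φ else φ
```

At φ = 0 this collapses to the 0/1 literal meaning, which is the zero-noise reduction stated below.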

When noise is zero, noisy meaning reduces to literal meaning.

Noisy semantics assigns the NVC utterance a constant value in every state, since concMeaning .nvc s = true for all s. This means L₀(s|NVC) = P(s): hearing "nothing follows" does not update the listener's beliefs.

L₀ joint likelihood of two premises in state s (unnormalized).

Computes ℒ(u₁,s) · ℒ(u₂,s) — the likelihood term only. The full L₀ posterior (eq. 2) also includes the state prior P(s). The paper fixes θ = 0.5 per region, making P(s) = 0.5⁷ = 1/128 for all states — a uniform prior that cancels in normalization. For this reason, the likelihood alone determines the relative posterior weights.
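The likelihood term is just a product of two noise-channel values; a sketch under the same hypothetical names as above:

```lean
-- Hypothetical sketch: the unnormalized L₀ likelihood multiplies the two
-- premises' noisy truth values; the uniform prior 0.5⁷ cancels in
-- normalization, so relative weights depend only on this product.
def noisy (φ : Float) (b : Bool) : Float := if b then 1 - φ else φ

def jointLikelihood (φ : Float) (p1 p2 : Bool) : Float :=
  noisy φ p1 * noisy φ p2
```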

S₀ (Literal Speaker, eq. 3): scores conclusions by expected literal truth under the reasoner's posterior.

S₀(u₃ | u₁,u₂) ∝ exp[α · Σ_s ℒ(u₃,s) · L₀(s|u₁,u₂)]

Here ℒ(u₃,s) is the deterministic semantic function (not the noisy version inside L₀). This speaker samples states from the posterior and randomly selects conclusions that are literally true.

State Communication (S₁, eq. 4): scores conclusions by expected log-likelihood — standard RSA informativity applied to syllogisms.

S₁(u₃ | u₁,u₂) ∝ exp[α · Σ_s L₀(s|u₁,u₂) · ln L₀(s|u₃)]

The two L₀ agents are distinct: L₀(s|u₁,u₂) is the reasoner who interpreted the premises; L₀(s|u₃) is a hypothetical naive listener who interprets just the conclusion. Both use noisy semantics (same φ).

Belief Alignment (S₁, eq. 6): the paper's winning model. Scores conclusions by negative KL divergence between the reasoner's full posterior and the naive listener's posterior given the conclusion.

S₁(u₃ | u₁,u₂) ∝ exp[α · −KL(L₀(·|u₁,u₂) ‖ L₀(·|u₃))]

Uses Core.Divergence.klDivergence directly.
theorem Phenomena.Quantification.Studies.TesslerTenenbaumGoodman2022.stateCom_eq_beliefAlignment
    (premPost : VennState → ℝ) (naivePost : Conclusion → VennState → ℝ) (α : ℝ)
    (c : Conclusion) (hQ : ∀ (s : VennState), 0 < naivePost c s) :
    beliefAlignmentScore premPost naivePost α c =
      Real.exp (α * -∑ s : VennState, premPost s * Real.log (premPost s)) *
        stateComScore premPost naivePost α c

State Communication and Belief Alignment differ by an additive constant (the entropy H(post)) that does not depend on the conclusion.

By kl_eq_neg_crossEntropy_plus_negEntropy from Divergence.lean: KL(P ∥ Q) = Σ P·log P − Σ P·log Q

So: −KL(P ∥ Q) = Σ P·log Q − Σ P·log P = [State Com utility] + H(P).

Since H(P) is constant in the conclusion c, it cancels in softmax normalization: both models produce identical conclusion distributions. The paper's different fit statistics (r = .67 vs .82) reflect different optimal α values found by MCMC, not different functional forms.

The Belief Alignment score for NVC depends on how much the premises shifted beliefs from the prior. When premises are uninformative (posterior ≈ prior), KL(post ‖ prior) ≈ 0, so −KL ≈ 0, and exp(α · 0) = 1 — the maximum score. This is why the model naturally produces NVC for uninformative premise combinations.

Subalternation in the region model: "All A are C" entails "Some A are C". With existential import built into syllAll, no separate non-emptiness hypothesis is needed — syllAll guarantees at least one A exists.

For the Belief Alignment model, this means "All A-C" produces a more peaked L₀ posterior than "Some A-C", yielding lower KL divergence and hence higher speaker utility — explaining why Barbara participants prefer "All" over the also-valid "Some".

Evaluate a syllogistic quantifier on given terms in a Venn state.

Truth value of premise 1 in state s.

Truth value of premise 2 in state s.

Barbara: All A-B, All B-C. Figure 1 (paradigmatic valid syllogism).

All A-B, All C-B. Figure 3 (paradigmatic invalid syllogism).

Some A-B, Some B-C. Figure 1.

All 128 Venn diagram states, enumerated for computable summation. Each state is a function Region → Bool indicating which regions are populated. Generated by the List monad over all 7 regions.
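The enumeration can be sketched with a fold over independent true/false choices (hypothetical name `allStates`; a state is simplified here to a list of 7 Booleans rather than a function on regions):

```lean
-- Hypothetical sketch: choose populated/empty independently per region,
-- yielding all 2⁷ = 128 combinations.
def allStates : List (List Bool) :=
  (List.replicate 7 [true, false]).foldr
    (fun choices acc => choices.flatMap fun b => acc.map fun s => b :: s)
    [[]]

#eval allStates.length  -- 128
```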

Unnormalized L₀ likelihood for a syllogism in state s. Computes ℒ(p₁,s) · ℒ(p₂,s) where ℒ is noisy semantics. The uniform prior (θ = 0.5) cancels in normalization.

Normalization constant: Σ_s L₀_unnorm(s). Computable via allStates.

Normalized L₀ posterior: L₀(s|premises) = L₀_unnorm(s) / Z.
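The unnormalized-likelihood → Z → posterior pipeline is a one-liner over any finite state list; a generic Float sketch with hypothetical names:

```lean
-- Hypothetical sketch of normalization: Z sums the unnormalized likelihood
-- over the explicit state list; the posterior divides through by Z.
def l0Post {σ : Type} (states : List σ) (lik : σ → Float) (s : σ) : Float :=
  let Z := (states.map lik).foldl (· + ·) 0
  lik s / Z
```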

Normalization constant for naive L₀ on a single conclusion.

Naive L₀ posterior for a conclusion: L₀(s|c) ∝ ℒ(c,s). The naive listener has heard only the conclusion, not the premises.

Belief Alignment score for conclusion c given syllogism syl. Uses the full pipeline: premises → L₀ posterior → KL → exp. Parameters: α (rationality), φ (noise), β (figural bias).

Conclusion probability: P(c|syl) = baScore(c) / Σ_c' baScore(c').

MAP estimates from the Bayesian data analysis on Ragni et al. 2019 data: α ≈ 6.88, φ ≈ 0.06, β ≈ 2.01.

KL divergence over allStates in Float arithmetic. Skips states with P(s) = 0, which contribute 0 to KL by convention.
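The zero-skipping convention can be sketched as a guarded fold (hypothetical name `klFloat`; generic over the state type):

```lean
-- Hypothetical sketch of KL(P ‖ Q) over an explicit state list in Float,
-- skipping zero-probability states (0 · log 0 is taken as 0 by convention).
def klFloat {σ : Type} (states : List σ) (p q : σ → Float) : Float :=
  states.foldl
    (fun acc s => if p s == 0 then acc else acc + p s * Float.log (p s / q s))
    0
```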

All 9 conclusions as a list.

Compute the conclusion distribution for a syllogism using Float arithmetic. L₀ posteriors are computed exactly in ℚ (via l0Post, naiveL0Post), then converted to Float for the KL divergence and softmax steps. Parameters: α (rationality, Float), φ and β (exact in ℚ).

Compact string output for a syllogism's predicted distribution, showing conclusions sorted by predicted probability.

For Barbara (All A-B, All B-C), every L₀-probable state satisfies All A-C. Proof: states where both premises are literally true form a subset of states where All A-C holds (by barbara_valid).

With noise φ, the L₀ posterior concentrates on these states: the likelihood ℒ(p₁,s)·ℒ(p₂,s) is (1−φ)² for consistent states but only (1−φ)·φ, φ·(1−φ), or φ² for inconsistent ones.

This theorem verifies computably that every state where BOTH premises are literally true also satisfies All A-C — the semantic backbone of the Belief Alignment model's "All A-C" preference for Barbara.
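The computable check amounts to a Boolean entailment test over a finite state list; a generic sketch (hypothetical name `checkEntails`):

```lean
-- Hypothetical sketch: over a finite state list, whenever both premises
-- are literally true, the conclusion must be too.
def checkEntails {σ : Type} (states : List σ) (p1 p2 c : σ → Bool) : Bool :=
  states.all fun s => !(p1 s && p2 s) || c s
```

Since the check is a closed Bool computation over the 128-state enumeration, a theorem of the form `checkEntails allStates prem1 prem2 concl = true` can be discharged by `decide` (or `native_decide`).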