Linglib.Phenomena.Conditionals.Studies.EvcenBaleBarner2026

@cite{evcen-bale-barner-2026}: Conditional Perfection

@cite{von-fintel-2001} @cite{horn-2000} @cite{cornulier-1983}

Empirical data from three experiments on conditional perfection (CP) by @cite{evcen-bale-barner-2026}, plus the bridge connecting these findings to the answer-level exhaustification theory of conditional perfection.

Paradigm

Participants watch short videos in which a character, Mary, presses three buttons (red, blue, orange), each producing an animal sound audible only to her through headphones. Another character asks a question, and Mary responds with a conditional like "If you press the blue button, it will play a dog barking." Participants then judge whether pressing a different button will play the same sound, choosing among "Yes", "No" (= perfected), and "Can't tell" (= not perfected).

Key Findings

  1. QUD (Experiment 1, N=98): Antecedent-focused QUDs ("Which of these buttons will play a dog sound?") yield significantly more "No" responses (M=0.65) than consequent-focused ("What will happen if I press the blue button?", M=0.22) or neutral ("What will happen if I press the buttons?", M=0.29) QUDs. No significant difference between consequent-focused and neutral (p > .05). Two follow-up experiments (each n=32) with alternative antecedent-focused phrasings replicate the effect (M=0.86, M=0.77), ruling out a uniqueness presupposition explanation.

  2. Overly informative answers (Experiment 2, N=55): Both optimally informative (M=0.92) and overly informative (M=0.84) answers trigger perfection at comparable rates under antecedent-focused QUDs (no significant difference, p = .16), suggesting overly informative answers are treated as viable alternatives for exhaustification.

  3. Speaker knowledge (Experiment 3, N=72): Speakers who have tested all buttons (full knowledge, M=0.72) yield far more "No" responses than speakers who tested only two buttons (partial knowledge, M=0.21).

All findings support @cite{von-fintel-2001}'s exhaustivity account over @cite{horn-2000}: perfection tracks the availability of alternatives (made salient by QUD) and the license to exclude them (from speaker competence).

Reported values are estimated marginal means from logistic mixed-effects regressions (on the probability scale), as reported in the paper.

QUD manipulation (Experiment 1).

The question asked before Mary's conditional answer.

  • antecedentFocused : QUDType

    "Which of these buttons will play a dog sound?" — antecedent-focus. Makes alternative antecedents (other buttons) salient.

  • consequentFocused : QUDType

    "What will happen if I press the blue button?" — consequent-focus. Makes consequences of the mentioned button salient, not alternatives.

  • neutral : QUDType

    "What will happen if I press the buttons?" — neutral. No specific focus on antecedents or consequences.

      Answer type manipulation (Experiment 2).

      Whether Mary's conditional response matches the QUD's partitioning (optimally informative) or refers to a strict subset of a QUD cell (overly informative).

      • optimallyInformative : AnswerType

        Answer matches QUD cell, e.g. "If you press the triangles, it will play a dog barking" in response to "Which shapes will play a dog sound?"

      • overlyInformative : AnswerType

        Answer is more specific than QUD cell, e.g. "If you press the blue square, it will play a dog barking" (a strict subset of the squares cell in the triangle/square partition).

          Speaker knowledge manipulation (Experiment 3).

          Whether Mary has tested all three buttons or only two of them.

          • fullKnowledge : KnowledgeCondition

            Mary pressed and listened to all three buttons — full knowledge.

          • partialKnowledge : KnowledgeCondition

            Mary pressed and listened to only two buttons — partial knowledge.

              A conditional perfection data point.

              Each datum records the estimated marginal mean proportion of "No" responses (perfection) for a given experimental condition, from logistic mixed-effects regression on the probability scale.

              • description : String

                Description of the experimental condition

              • perfectionRate :

                Estimated marginal mean proportion of "No" responses (perfection rate)

              • experiment :

                Experiment number (1, 2, or 3)

              • n :

                Number of participants (post-exclusion) in the experiment
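
A minimal sketch of how such a datum could be written down, assuming a simplified encoding. The field types are not shown in the rendered page, so the `Nat` types, the percentage-point field `perfectionPct`, and the namespace `CPDatumSketch` are hypothetical choices made only to keep the example free of imports.

```lean
namespace CPDatumSketch

/-- Hypothetical simplification: the rate is stored in percentage points
    as a `Nat`; the source's actual field types are not shown. -/
structure CPDatum where
  description : String
  perfectionPct : Nat
  experiment : Nat
  n : Nat
  deriving Repr

/-- Experiment 1, antecedent-focused QUD: M = 0.65, N = 98. -/
def exp1AntecedentFocused : CPDatum :=
  { description := "Exp 1, antecedent-focused QUD",
    perfectionPct := 65,
    experiment := 1,
    n := 98 }

example : exp1AntecedentFocused.perfectionPct = 65 := rfl

end CPDatumSketch
```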

                  Experiment 1 data indexed by QUD type.

                  Between-subjects: 104 recruited, N=98 post-exclusion, randomly assigned to one of three conditions.

                    Follow-up experiment 1a (n=32): alternative antecedent-focused phrasing. QUD: "What buttons will play a dog sound?" (omitting "of these"). Replicates the main effect, ruling out a uniqueness presupposition from "which of these." M=0.86, SE=0.11.

                      Follow-up experiment 1b (n=32): alternative antecedent-focused phrasing. QUD: "Which buttons will play a dog sound?" (omitting "of these"). M=0.77, SE=0.13.

                        Experiment 2 data indexed by answer type.

                        Within-subjects: 56 recruited, N=55 post-exclusion. Shapes (triangles and squares in two colors) replace buttons. QUD is always antecedent-focused: "Which of these shapes, triangles or squares, will play a dog barking?"

                          Experiment 3 data indexed by knowledge condition.

                          Within-subjects: 75 recruited, N=72 post-exclusion. QUD is always antecedent-focused. Mary either pressed all three buttons (full knowledge) or only two (partial knowledge) before making her conditional statement.

                            Consequent-focused and neutral QUDs produce similar (low) perfection rates: the gap between them (7pp) is smaller than either's gap to antecedent-focused (43pp, 36pp). The paper reports no significant difference between these two conditions (p > .05).
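
The gap arithmetic above can be checked mechanically. This is a sketch only: `exp1Pct` and the namespace are hypothetical names, and the rates are stored in percentage points as `Nat` values rather than in whatever numeric type the library uses.

```lean
namespace Exp1Sketch

inductive QUDType where
  | antecedentFocused
  | consequentFocused
  | neutral

/-- "No" rates from Experiment 1 in percentage points (hypothetical encoding). -/
def exp1Pct : QUDType → Nat
  | .antecedentFocused => 65
  | .consequentFocused => 22
  | .neutral => 29

-- Gap between the two low conditions: 29 - 22 = 7pp.
example : exp1Pct .neutral - exp1Pct .consequentFocused = 7 := by decide
-- Gaps to the antecedent-focused condition: 43pp and 36pp.
example : exp1Pct .antecedentFocused - exp1Pct .consequentFocused = 43 := by decide
example : exp1Pct .antecedentFocused - exp1Pct .neutral = 36 := by decide
-- The low conditions sit closer to each other than either does to antecedent-focused.
example : 7 < 36 ∧ 7 < 43 := by decide

end Exp1Sketch
```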

                            Follow-up experiments replicate the antecedent-focused effect with alternative QUD phrasings, ruling out a uniqueness presupposition.

                            Both answer types trigger perfection well above chance (> 0.50). The paper reports no significant difference between them (p = .16), consistent with both being treated as viable alternatives.

                            Optimally and overly informative answers produce similar perfection rates: the 8pp gap is small relative to the overall effect size.

                            The knowledge effect is larger than the QUD effect.

                            Full knowledge (72%) vs partial knowledge (21%) is a 51pp difference. Antecedent-focused (65%) vs consequent-focused (22%) is a 43pp difference. Speaker knowledge has a larger effect on perfection than QUD type, consistent with competence being a prerequisite for exhaustification.
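
The comparison of effect sizes reduces to a check on natural numbers. The definitions below are hypothetical names introduced for illustration, with rates copied from the sentence above as percentage points.

```lean
namespace EffectSizeSketch

-- Percentage-point rates (hypothetical names, values from Experiments 1 and 3).
def fullKnowledgePct : Nat := 72
def partialKnowledgePct : Nat := 21
def antecedentFocusedPct : Nat := 65
def consequentFocusedPct : Nat := 22

def knowledgeEffect : Nat := fullKnowledgePct - partialKnowledgePct
def qudEffect : Nat := antecedentFocusedPct - consequentFocusedPct

example : knowledgeEffect = 51 := by decide
example : qudEffect = 43 := by decide

theorem knowledge_effect_exceeds_qud_effect : qudEffect < knowledgeEffect := by
  decide

end EffectSizeSketch
```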

                            Bridge: Exhaustification Theory

                            Connects the experimental findings to the answer-level exhaustification theory of conditional perfection.

                            Full Argument Chain

                            The paper's argument proceeds in four steps:

                            1. Semantics: The material conditional "if A then C" does not semantically entail the biconditional (established in Conditionals.Basic).

                            2. Pragmatic mechanism: Conditional perfection arises from answer-level exhaustification (@cite{von-fintel-2001}, following @cite{cornulier-1983}). The QUD "which trigger causes C?" makes alternative triggers salient; exhaustifying the answer "A causes C" against these alternatives yields "only A causes C"; combined with coverage, this entails ¬A → ¬C.

                            3. Two prerequisites for perfection:

                              • QUD: The QUD must make alternative antecedents salient (antecedent-focused), triggering exhaustification.
                              • Speaker competence: The speaker must be assumed to know about the alternative triggers, licensing exclusion of unmentioned alternatives.
                              If either prerequisite is missing, perfection is not predicted.
                            4. Against @cite{horn-2000}: Horn proposes the alternative to "if A then C" is the unconditional "C regardless." This yields only an existential inference (some circumstance where ¬C), not the per-trigger universal that participants produce. The data support von Fintel's per-trigger alternatives.
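
The core of step 2 (exclusion plus coverage entails ¬A → ¬C) can be sketched as a small standalone theorem. This is not the library's `exhaustification_yields_perfection`; the statement, hypothesis names, and namespace are assumptions, and the innocent-exclusion machinery is replaced by a plain exclusion hypothesis `h_excl`.

```lean
namespace PerfectionSketch

/-- Exclusion plus coverage yields perfection at a world `w`.
    All names here are hypothetical, not the library's. -/
theorem exclusion_and_coverage_yield_perfection
    {Trigger W : Type} (causes : Trigger → W → Prop)
    (t : Trigger) (p C : W → Prop) (w : W)
    (h_req : causes t w → p w)               -- trigger t fires only when p holds
    (h_excl : ∀ t', t' ≠ t → ¬ causes t' w)  -- other triggers are excluded
    (h_cov : C w → ∃ t', causes t' w)        -- every C-event has some trigger
    (hnp : ¬ p w) : ¬ C w := by
  intro hC
  cases h_cov hC with
  | intro t' ht' =>
    cases Classical.em (t' = t) with
    | inl h =>
      -- The covering trigger is t itself, so p would have to hold.
      subst h
      exact hnp (h_req ht')
    | inr h =>
      -- The covering trigger is an alternative, which exclusion rules out.
      exact h_excl t' h ht'

end PerfectionSketch
```

Dropping `h_excl` makes the second case unprovable, which is the formal counterpart of the claim that perfection needs both prerequisites.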

                            Dependency Chain

                            exhaustifiedAnswer (exhIE at answer level)
                                ↓ all_alt_innocently_excludable (3-button IE)
                            exhaustification_yields_perfection (IE + coverage → perfection)
                                ↓
                            theory_chain_3button_perfection (instantiated for experimental scenario)
                                ↓
                            coverage_without_exclusion_insufficient (exclusion is necessary)
                            vonFintel_strictly_stronger_than_horn (per-trigger > existential)
                                ↓
                            Prediction: perfection iff QUD provides alternatives AND speaker is competent
                                ↓
                            Data: antecedent-focused > neutral ≈ consequent-focused (Exp 1)
                                  overly informative ≈ optimally informative (Exp 2)
                                  full knowledge >> partial knowledge (Exp 3)
                            

                            Triggers in the experimental paradigm: three buttons.

                                The 6 possible worlds in a 3-button scenario.

                                Each world represents pressing one button and observing whether the target sound plays.

                                    The answer space for the 3-button experimental paradigm.

                                    Maps the scenario into the AnswerSpace structure: three triggers (buttons), each with a causal relation to the target sound.

                                      Theory chain: exhaustification yields perfection in the 3-button scenario.

                                      This instantiates exhaustification_yields_perfection for the 3-button experimental paradigm. With 3 buttons, there are 2 alternative triggers (B and C). The all_alt_innocently_excludable lemma establishes that both are innocently excludable — the key step that requires the general lemma rather than the singleton version.

                                      The hypotheses map to experimental conditions:

                                      • h_exh: exhaustified answer holds (antecedent-focus QUD + speaker knowledge)
                                      • h_coverage: every sound event has a button cause (closed domain)
                                      • hnp: button A is not pressed

                                      The IE condition is discharged by all_alt_innocently_excludable: the witness pressA_plays establishes consistency of φ ∧ ∀a∈ALT. ¬a (at pressA_plays, A causes the sound but B and C do not).

                                      Direct verification: exclusion of B and C + ¬pressA → ¬sound.

                                      Verified by exhaustive case analysis on the 6-world type. Sanity check: the theory chain agrees with brute-force verification.
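
A sketch of what such a brute-force check might look like. Only `pressA_plays` and `pressB_plays` are named in this page; the remaining constructor names, the Boolean encodings of `pressA`, `sound`, and `causes`, and the theorem name are assumptions made for illustration.

```lean
namespace ThreeButtonSketch

inductive Button where
  | A | B | C

-- Six worlds: one button is pressed and the target sound plays or does not.
inductive World where
  | pressA_plays | pressA_silent
  | pressB_plays | pressB_silent
  | pressC_plays | pressC_silent

def pressA : World → Bool
  | .pressA_plays | .pressA_silent => true
  | _ => false

def sound : World → Bool
  | .pressA_plays | .pressB_plays | .pressC_plays => true
  | _ => false

def causes : Button → World → Bool
  | .A, .pressA_plays => true
  | .B, .pressB_plays => true
  | .C, .pressC_plays => true
  | _, _ => false

/-- Exclusion of B and C, plus "A not pressed", leaves no world with the sound. -/
theorem direct_check (w : World)
    (hB : causes .B w = false) (hC : causes .C w = false)
    (hnp : pressA w = false) : sound w = false := by
  cases w <;>
    first
      | rfl
      | exact absurd hnp (by decide)
      | exact absurd hB (by decide)
      | exact absurd hC (by decide)

end ThreeButtonSketch
```

In each of the three silent worlds the goal closes by `rfl`; in each of the three sounding worlds exactly one hypothesis is contradictory.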

                                      What @cite{von-fintel-2001}'s account predicts about non-asserted triggers: each specific alternative trigger is excluded (universal, per-trigger).

                                        What @cite{horn-2000}'s account predicts: some non-asserted trigger is excluded, but we don't know which (existential, unspecified).

                                          theorem Phenomena.Conditionals.Studies.EvcenBaleBarner2026.vonFintel_entails_horn {Trigger : Type u_1} {W : Type u_2} (as : Semantics.Conditionals.Exhaustivity.AnswerSpace Trigger W) (t : Trigger) (w : W) (h_other : ∃ t' ∈ as.triggers, t' ≠ t) (h_vf : vonFintelPrediction as t w) :

                                          Von Fintel entails Horn: per-trigger exclusion implies existential exclusion.

                                          If we know that every specific alternative trigger is excluded (von Fintel), then certainly some trigger is excluded (Horn).

                                          Horn does NOT entail von Fintel: existential exclusion does not determine which trigger is excluded.

                                          Counterexample: in the 3-button scenario at world pressB_plays, trigger B causes the sound and C does not. Horn's prediction holds (C doesn't cause it), but von Fintel's fails (B does cause it). The existential "some other button doesn't play the sound" is strictly weaker than the universal "each other button doesn't play the sound."

                                          This is the paper's key argument against Horn: participants respond "No" to specific other buttons (per-trigger judgment), not just "some other button won't play it."

                                          Von Fintel is strictly stronger than Horn.

                                          Combining the two: von Fintel's per-trigger alternatives generate strictly stronger predictions than Horn's unconditional alternative. The 3-button paradigm discriminates between the two accounts.
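
The asymmetry can be illustrated on a toy 3-button model. This is a sketch under stated assumptions: `vonFintel` and `horn` below are simplified stand-ins for the library's prediction predicates (which take an `AnswerSpace`), and the world constructors other than `pressA_plays` and `pressB_plays` are hypothetical.

```lean
namespace StrictlyStrongerSketch

inductive Button where
  | A | B | C
  deriving DecidableEq

inductive World where
  | pressA_plays | pressA_silent
  | pressB_plays | pressB_silent
  | pressC_plays | pressC_silent

def causes : Button → World → Bool
  | .A, .pressA_plays => true
  | .B, .pressB_plays => true
  | .C, .pressC_plays => true
  | _, _ => false

/-- Per-trigger (universal): every button other than `t` fails to cause the sound. -/
abbrev vonFintel (t : Button) (w : World) : Prop :=
  ∀ b : Button, b ≠ t → causes b w = false

/-- Existential: some button other than `t` fails to cause the sound. -/
abbrev horn (t : Button) (w : World) : Prop :=
  ∃ b : Button, b ≠ t ∧ causes b w = false

theorem vonFintel_entails_horn_A (w : World) (h : vonFintel Button.A w) :
    horn Button.A w :=
  ⟨Button.B, by decide, h Button.B (by decide)⟩

/-- At `pressB_plays`, C is excluded (Horn holds) but B is not (von Fintel fails). -/
theorem horn_not_entails_vonFintel :
    horn Button.A World.pressB_plays ∧ ¬ vonFintel Button.A World.pressB_plays :=
  ⟨⟨Button.C, by decide, rfl⟩,
   fun h => absurd (h Button.B (by decide)) (by decide)⟩

end StrictlyStrongerSketch
```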

                                          theorem Phenomena.Conditionals.Studies.EvcenBaleBarner2026.coverage_without_exclusion_insufficient :
                                          ∃ (W : Type) (Trigger : Type) (causes : Trigger → W → Prop) (triggers : Set Trigger) (t : Trigger) (_ : t ∈ triggers) (p : W → Prop) (C : W → Prop), (∀ (w : W), causes t w → p w) ∧ (∀ (w : W), C w → ∃ t' ∈ triggers, causes t' w) ∧ ∃ (w : W), ¬p w ∧ C w

                                          Without exclusion, perfection fails.

                                          Coverage alone (every C-event has some trigger) does NOT yield ¬p → ¬C. Witness: a scenario where trigger t requires p but trigger t' fires at ¬p-worlds. Coverage holds, but ¬p ∧ C.

                                          This is the other half of the theory's prediction: perfection requires exclusion (from exhaustification), not just coverage. Without exclusion (e.g., consequent-focus QUD or partial speaker knowledge), the theory predicts no perfection — matching the experimental findings.
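
One possible witness, sketched in a simplified form that drops the `triggers` set so that no imports are needed. Worlds and triggers are both `Bool` in this hypothetical toy model: trigger `true` fires only at the p-world `true`, and trigger `false` fires at the ¬p-world `false`.

```lean
namespace CoverageSketch

theorem coverage_alone_insufficient :
    ∃ (causes : Bool → Bool → Prop) (t : Bool) (p C : Bool → Prop),
      (∀ w, causes t w → p w) ∧
      (∀ w, C w → ∃ t', causes t' w) ∧
      ∃ w, ¬ p w ∧ C w := by
  -- causes t w := (t = w), asserted trigger := true, p w := (w = true), C := True
  refine ⟨fun t w => t = w, true, fun w => w = true, fun _ => True, ?_, ?_, ?_⟩
  · -- Trigger `true` requires p.
    intro w h
    exact Eq.symm h
  · -- Coverage: every world has a trigger, namely the one equal to it.
    intro w _
    exact ⟨w, rfl⟩
  · -- At world `false`, p fails but C holds.
    refine ⟨false, ?_, trivial⟩
    intro h
    cases h

end CoverageSketch
```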

                                          Whether a speaker's epistemic state licenses the competence assumption.

                                          When the speaker has tested all buttons (full knowledge), the hearer can assume competence: the speaker's silence about other buttons is informative, licensing exclusion. With partial knowledge, silence reflects ignorance.

                                            Exhaustification is licensed iff both prerequisites hold.

                                            The theory predicts perfection only when:

                                            1. The QUD makes alternative antecedents salient (→ alternatives for Exh)
                                            2. The speaker is assumed competent (→ exclusion of alternatives)

                                            This is derived from the conjunction of the two independent prerequisites, not stipulated as a single function.
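
Since the equations are not rendered here, the following is a reconstruction rather than the library's source. The function names and the conjunction `qudProvidesAlternatives qud && speakerCompetenceAssumed k` are taken from this page; the bodies of the two component functions are assumptions based on the prose (only antecedent-focused QUDs supply alternatives, only full knowledge licenses competence).

```lean
namespace LicensingSketch

inductive QUDType where
  | antecedentFocused | consequentFocused | neutral

inductive KnowledgeCondition where
  | fullKnowledge | partialKnowledge

def qudProvidesAlternatives : QUDType → Bool
  | .antecedentFocused => true
  | _ => false

def speakerCompetenceAssumed : KnowledgeCondition → Bool
  | .fullKnowledge => true
  | .partialKnowledge => false

def exhaustificationLicensed (qud : QUDType) (k : KnowledgeCondition) : Bool :=
  qudProvidesAlternatives qud && speakerCompetenceAssumed k

-- Licensed only when both prerequisites hold.
example : exhaustificationLicensed .antecedentFocused .fullKnowledge = true := rfl
example : exhaustificationLicensed .consequentFocused .fullKnowledge = false := rfl
example : exhaustificationLicensed .neutral .fullKnowledge = false := rfl
example : exhaustificationLicensed .antecedentFocused .partialKnowledge = false := rfl

end LicensingSketch
```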

                                              With consequent-focused QUD, exhaustification is not licensed (even with full knowledge) — because the QUD doesn't provide alternatives.

                                              With partial knowledge, exhaustification is not licensed (even with antecedent-focused QUD) — because competence isn't assumed.

                                              Data confirms: antecedent-focused QUD promotes perfection.

                                              The theory predicts higher perfection under antecedent-focused QUDs (which make alternative triggers salient, licensing exhaustification) than under consequent-focused or neutral QUDs (which do not). Experiment 1 confirms both orderings: the antecedent-focused rate (0.65) exceeds both the consequent-focused (0.22) and neutral (0.29) rates, so it is the maximum across all QUD types.

                                              Data confirms: overly informative answers trigger perfection.

                                              The theory predicts that if exhaustification operates over QUD-relevant alternatives, both optimally and overly informative answers should trigger perfection (since both are answers to the QUD). Experiment 2 confirms: both types yield high perfection rates with no significant difference.

                                              Data confirms: speaker knowledge promotes perfection.

                                              The theory predicts higher perfection when the speaker is knowledgeable (licensing the assumption that unmentioned alternatives are false, hence exclusion) than when ignorant (no exclusion license). Experiment 3 confirms with a large effect (51pp difference).

                                              Theory predicts the data pattern: perfection is high only when both prerequisites are met.

                                              The exhaustification theory predicts that perfection should be high only when exhaustificationLicensed returns true (antecedent-focused QUD and full knowledge). The data match this pattern: in Experiment 1 only the antecedent-focused condition (M=0.65) shows a high perfection rate, and in Experiment 3 only the full-knowledge condition (M=0.72) does, while the conditions missing a prerequisite stay at or below 0.29.

                                              Convergent evidence: CP is pragmatic.

                                              The theory's prediction chain depends entirely on pragmatic factors (QUD, speaker knowledge, exhaustification). Two independent lines of evidence confirm the pragmatic nature:

                                              1. Formal: perfection_not_entailed_variablyStrict proves the biconditional is not semantically entailed even under Stalnaker/Lewis variably strict semantics — the framework the paper adopts. This is stronger than showing it for material implication alone (a weaker semantics could fail to entail perfection while a stronger one succeeds).

                                              2. Experimental: Perfection rates vary with QUD type and speaker knowledge. Semantic entailments are invariant across these factors.

                                              Competence Bridge

                                              The competence assumption in conditional perfection is the same mechanism formalized in Implicature.Core.Competence and tested experimentally by @cite{bale-etal-2025} for scalar implicatures. Both paradigms:

                                              This section connects KnowledgeCondition to the shared infrastructure.

                                              speakerCompetenceAssumed agrees with the NeoGricean competent predicate applied via toBeliefStateCP. This connects the study-specific Boolean to the general implicature infrastructure.

                                              Partial knowledge: processAlternative yields weak-only inference (exclusion not licensed — silence reflects ignorance, not absence).

                                              ALT Constraint

                                              Footnote 7 of @cite{evcen-bale-barner-2026} states the alternatives constraint:

                                              ALT(p) ⊆ ANS(QUD) ∩ {q : Ks(q) ∨ Ks(¬q)}

                                              The alternatives used for exhaustification must be both answers to the QUD (contextual salience) AND propositions the speaker is competent about (epistemic license). This is exactly what exhaustificationLicensed encodes as a conjunction: qudProvidesAlternatives qud && speakerCompetenceAssumed k.

                                              The ALT constraint as an intersection: the set of alternatives is non-empty only when the QUD provides answers and the speaker is competent about them. exhaustificationLicensed encodes this as a conjunction.

                                              When QUD doesn't provide alternatives, exhaustification is blocked regardless of competence — the intersection has an empty first component.

                                              When speaker lacks competence, exhaustification is blocked regardless of QUD — the intersection has an empty second component.
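
The two blocking facts and the "both components" reading can be stated over plain Booleans. In this sketch `licensed` is a hypothetical stand-in for `exhaustificationLicensed`, with each component of the intersection abstracted to a `Bool`.

```lean
namespace AltConstraintSketch

/-- Hypothetical Boolean abstraction of the ALT intersection. -/
def licensed (qudProvides competent : Bool) : Bool := qudProvides && competent

/-- No QUD alternatives: blocked regardless of competence. -/
theorem blocked_without_alternatives (competent : Bool) :
    licensed false competent = false := rfl

/-- No competence: blocked regardless of the QUD. -/
theorem blocked_without_competence (qudProvides : Bool) :
    licensed qudProvides false = false := by
  cases qudProvides <;> rfl

/-- Licensed exactly when both components are present. -/
theorem licensed_iff (q c : Bool) :
    licensed q c = true ↔ q = true ∧ c = true := by
  cases q <;> cases c <;> decide

end AltConstraintSketch
```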