Linglib.Phenomena.ScalarImplicatures.Studies.GeurtsPouscoulous2009

@cite{geurts-pouscoulous-2009}: Embedded Implicatures?!

Theory-neutral empirical data and argumentation chain from @cite{geurts-pouscoulous-2009}.

Central Question

Do scalar implicatures arise "locally" inside embedded clauses? Conventionalists (Chierchia, Levinson, Landman) predict SIs occur "systematically and freely in arbitrarily embedded positions." Griceans predict SIs are global pragmatic inferences only.

Argumentative Structure

  1. Exp 1a-b (Table 2): SI endorsement rates vary wildly by embedding type (3–94%), contradicting "systematically and freely."

  2. Paradigm bias (§2): Three worries about the inference paradigm used in Exp 1; it likely inflates observed SI rates.

  3. Exp 2 (§3, n=29): Inference task (62%) vs verification task (34%) confirms paradigm bias on simple sentences.

  4. Exp 3 (Table 3, n=26): Verification shows zero local SIs in UE contexts (100% "true"), while inference shows ~50% across all conditions regardless of monotonicity. The inference paradigm produces spurious "local SIs."

  5. Paradigm correction: After accounting for bias, only think (57.5%) shows genuinely elevated SI rates. The rates for all (27%) and want (32%) may be entirely paradigm artifacts.

  6. Gricean explanation for think (§8): Global SI (¬Bsp[Bs(all)]) + competence assumption (Bs(all) ∨ Bs(¬all)) entails Bs(¬all), which looks like a local SI but is derived globally.

  7. Exp 4 (Tables 4–5, n=22): Minimal conventionalism predicts people should detect ambiguity in scalar sentences. Genuine ambiguities detected at 70%, alleged SI-ambiguities at only 6%. Both mainstream and minimal conventionalism are falsified.

The two experimental paradigms.

      Monotonicity of an embedding context.

          Quantifier contexts tested in Experiments 3–4.

                Inference task: "Does X imply Y?"

                  Verification task: "Is this true of the picture?"
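
As a rough illustration, the two paradigms and the questions they pose could be encoded as follows. The names `ParadigmSketch`, `Paradigm`, and `prompt` are assumptions made for this sketch; the module's actual declaration names are not shown on this page.

```lean
namespace ParadigmSketch

/-- Hypothetical encoding of the two experimental paradigms. -/
inductive Paradigm where
  | inference
  | verification
  deriving DecidableEq, Repr

/-- The question each task poses, quoted from the descriptions above. -/
def Paradigm.prompt : Paradigm → String
  | .inference => "Does X imply Y?"
  | .verification => "Is this true of the picture?"

example : Paradigm.prompt .inference = "Does X imply Y?" := rfl
example : Paradigm.prompt .verification = "Is this true of the picture?" := rfl

end ParadigmSketch
```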

                    Mainstream conventionalism predicts local SIs are preferred in non-DE contexts. In the inference paradigm, this means participants should endorse the inference that "some" implies "not all" when embedded under UE/NM quantifiers. In the verification paradigm, this means participants should reject the classical reading when it conflicts with the local SI.

                    We formalize this as: conventionalism predicts SI endorsement rates should be high (> 50%) in non-DE inference conditions, and verification rates should match the SI reading, not the classical reading.
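
A minimal sketch of how this prediction might be stated, assuming percentages are stored as natural numbers. The names `Monotonicity`, `predictsLocalSI`, and `isHighRate` are hypothetical and need not match the module's own definitions.

```lean
namespace PredictionSketch

/-- Monotonicity classes named in the text: upward-entailing (UE),
    downward-entailing (DE), and non-monotone (NM). -/
inductive Monotonicity where
  | ue
  | de
  | nm
  deriving DecidableEq, Repr

/-- Mainstream conventionalism as described above:
    local SIs are preferred in every non-DE context. -/
def predictsLocalSI : Monotonicity → Bool
  | .de => false
  | _ => true

/-- "High" endorsement is read here as strictly above 50 percent. -/
def isHighRate (ratePct : Nat) : Bool := decide (ratePct > 50)

example : predictsLocalSI .ue = true := rfl
example : predictsLocalSI .nm = true := rfl
example : predictsLocalSI .de = false := rfl
example : isHighRate 50 = false := by decide

end PredictionSketch
```

Under this reading, a non-DE inference condition with a rate at or below 50% counts against the prediction.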

                      Embedding types tested in Experiments 1a-b.

                              Experiment 1a results (Table 2, n=30).

                                Experiment 1b results (Table 2, n=31).

                                  SI rates vary from 3% (must) to 94% (simple 1b), a 91pp range. This contradicts the conventionalist claim that SIs occur "systematically and freely in arbitrarily embedded positions." If SIs were systematic, rates should be uniformly high across all embedding types.

                                  Only think shows substantial local SI endorsement among embedded conditions. At 50% (1a) and 65% (1b), think is the only embedding above 35%. All others (all: 27%, must: 3%, want: 32%) fall below. The paper later argues (§5–8) that even these rates may be artifacts of the inference paradigm.
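
The arithmetic behind these two claims can be checked mechanically. The sketch below uses only the percentages quoted above, stored as natural numbers; the names `thinkRates` and `otherEmbeddedRates` are illustrative and not part of the module.

```lean
namespace Exp1Sketch

/-- Endorsement rates (percent) for think in Exp 1a and 1b, as quoted above. -/
def thinkRates : List Nat := [50, 65]

/-- Rates (percent) quoted above for all, must, and want. -/
def otherEmbeddedRates : List Nat := [27, 3, 32]

-- Range across conditions: 94% (simple, 1b) minus 3% (must).
example : 94 - 3 = 91 := by decide

-- Both think rates are above 35%; the other embedded rates are below it.
example : thinkRates.all (fun r => decide (r > 35)) = true := by decide
example : otherEmbeddedRates.all (fun r => decide (r < 35)) = true := by decide

-- Mean of the two think rates, in tenths of a percent: 57.5%.
example : (50 + 65) * 10 / 2 = 575 := by decide

end Exp1Sketch
```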

                                  Experiment 2 results (§3.1–3.2, pp.16-17). 29 Dutch-speaking students at the University of Nijmegen. Within-subjects: same critical sentence ("Some of the B's are in the box on the left") tested in both inference and verification tasks. McNemar's test, n = 29, p < .01.

                                  • inferenceRate :
                                  • verificationRate :
                                  • controlAccuracy :
                                  • n :
                                        The inference paradigm inflates SI rates by 28pp (62% vs 34%). This confirms three a priori worries about the inference paradigm (§2): (1) endorsing an argument is easier than spontaneously drawing it, (2) the question "Does X imply Y?" makes the SI contextually relevant, (3) superficial similarity to valid inferences may cause errors.

                                        In the more neutral verification task, SI rate is below 50%. This argues against even weak defaultism.
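
A small sketch of the two comparisons just described, assuming rates are stored as whole percentages. `Result` and `exp2` are hypothetical names, and the `controlAccuracy` field is left out because its value is not stated on this page.

```lean
namespace Exp2Sketch

/-- Hypothetical record holding the Experiment 2 figures quoted above. -/
structure Result where
  inferenceRate : Nat
  verificationRate : Nat
  n : Nat
  deriving Repr

def exp2 : Result := { inferenceRate := 62, verificationRate := 34, n := 29 }

-- The inference paradigm exceeds verification by 28 percentage points.
example : exp2.inferenceRate - exp2.verificationRate = 28 := by decide

-- In the verification task the SI rate is below 50%.
example : exp2.verificationRate < 50 := by decide

end Exp2Sketch
```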

                                        Near-perfect control accuracy rules out a positive response bias.

                                        One row of Table 3 (p.22). The table has 6 rows because "exactly two" has two verification conditions (one where the classical reading is true, one where the local-SI reading is true) but a single shared inference rate.

                                        The verificationPred field records the conventionalist prediction for verification (the parenthetical 0/1 in Table 3): should participants say "true"? Similarly inferencePred for the inference column.

                                        • quantifier : QuantifierContext
                                        • verificationTrueRate :

                                          % saying "true" in verification task

                                        • verificationPred : Bool

                                          Conventionalist prediction: should participants say "true"?

                                        • inferenceRate :

                                          % endorsing local SI in inference task

                                        • inferencePred : Bool

                                          Conventionalist prediction: should participants endorse SI?
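
The field list above suggests a record along the following lines. This is only a sketch: the numeric field types are not rendered on this page, so `Nat` (whole percentages) is an assumption, and the constructor names of `QuantifierContext` are hypothetical labels for the five quantifiers reported for Experiment 3.

```lean
namespace Exp3Sketch

/-- Hypothetical constructors for the quantifier contexts of Experiments 3 and 4. -/
inductive QuantifierContext where
  | all | notAll | moreThanOne | notMoreThanOne | exactlyTwo
  deriving DecidableEq, Repr

/-- One row of Table 3, following the field names listed above. -/
structure Exp3Row where
  quantifier : QuantifierContext
  /-- Percent saying "true" in the verification task (type assumed). -/
  verificationTrueRate : Nat
  /-- Conventionalist prediction: should participants say "true"? -/
  verificationPred : Bool
  /-- Percent endorsing the local SI in the inference task (type assumed). -/
  inferenceRate : Nat
  /-- Conventionalist prediction: should participants endorse the SI? -/
  inferencePred : Bool
  deriving Repr

end Exp3Sketch
```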

                                            Experiment 3 results (Table 3, p.22, n=26). 26 first-year humanities students at the University of Nijmegen (§5.1, p.20). Pairwise McNemar tests (Bonferroni-corrected) were all significant: "all", p < .005; "not all", p < .001; "more than one", p < .0005; "not more than one", p < .05; "exactly two", p < .005 (both conditions).

                                              Verification shows zero local SIs in UE contexts: 100% say "true" (accepting the classical, non-SI reading). Conventionalism predicts participants should say "false" (the local SI makes UE sentences false in the depicted situation). This is the paper's central empirical finding against mainstream conventionalism.

                                              Inference rates cluster around 50% for ALL conditions (46–62%), regardless of monotonicity. The inference paradigm produces a roughly constant endorsement rate that does not discriminate between contexts where conventionalism predicts SIs and contexts where it does not. The paper reports: "all rates, for DE and non-DE items alike, clustered around chance level, give or take 12%" (p.23).

                                              Verification perfectly tracks the classical (non-SI) truth value. When the classical reading is true (verificationPred = false, i.e., conventionalism predicts "false" but classical reading predicts "true"), the rate is ≥ 96%. When the classical reading is false (exactly-two row 2 and DE items), the rate is ≤ 4%. Participants judge truth values by the classical reading, not by any local-SI reading.
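
One way to phrase this check, using the 96% and 4% bounds quoted above, is sketched here. The function name `tracksClassical` is an assumption of this sketch, not a declaration from the module.

```lean
namespace Exp3CheckSketch

/-- Does a verification "true" rate (percent) line up with the classical truth value?
    Thresholds are the bounds quoted in the text: at least 96 when the classical
    reading is true, at most 4 when it is false. -/
def tracksClassical (classicalTrue : Bool) (truePct : Nat) : Bool :=
  if classicalTrue then decide (truePct ≥ 96) else decide (truePct ≤ 4)

-- UE items: classical reading true, 100% of participants say "true".
example : tracksClassical true 100 = true := by decide

-- A rate at the quoted upper bound for classically false items still passes.
example : tracksClassical false 4 = true := by decide

-- A mid-range rate would not count as tracking either truth value.
example : tracksClassical true 50 = false := by decide

end Exp3CheckSketch
```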

                                              After Experiment 3 establishes that the inference paradigm inflates SI rates by ~50pp, the rates observed in Experiment 1 must be corrected. The paper argues (p.23): "it is quite possible that the rates previously observed for all (27%) and want (32%) are entirely due to a paradigm bias." Only think (avg 57.5%) exceeds the observed paradigm bias level (~50% baseline in Exp 3 inference conditions).

                                              theorem Phenomena.ScalarImplicatures.Studies.GeurtsPouscoulous2009.competence_explains_think {BobBelievesAll BobBelievesNotAll : Prop} (globalSI : ¬BobBelievesAll) (competence : BobBelievesAll ∨ BobBelievesNotAll) :
                                              BobBelievesNotAll

                                              The Gricean explanation for apparent local SIs under "think"/"believe".

                                              For "Bob believes Anna ate some of the cookies", the global SI yields: ¬(Bob believes Anna ate all the cookies)

                                              Under a competence assumption (Bob has an opinion on whether she ate all): (Bob believes all) ∨ (Bob believes ¬all)

                                              From these two premises, it follows that Bob believes ¬all — which looks like a local SI but is derived entirely from global pragmatics + competence.

                                              This explains why "think" shows elevated rates (57.5%) while other embeddings do not: the competence assumption is independently plausible for attitude verbs (people typically have opinions about what they believe) but not for quantifiers ("all students heard some" does not license "each student has an opinion about whether they heard all").
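
The derivation is a disjunctive syllogism. The page shows only the statement of `competence_explains_think`, so the proof term below is one possible way to discharge it, given under a different (hypothetical) name rather than as the module's actual proof.

```lean
theorem competence_explains_think_sketch {BobBelievesAll BobBelievesNotAll : Prop}
    (globalSI : ¬BobBelievesAll)
    (competence : BobBelievesAll ∨ BobBelievesNotAll) :
    BobBelievesNotAll :=
  match competence with
  -- First disjunct contradicts the global SI.
  | Or.inl hAll => absurd hAll globalSI
  -- Second disjunct is the conclusion itself.
  | Or.inr hNotAll => hNotAll
```

The same result follows in one step from the core lemma `Or.resolve_left`, as `competence.resolve_left globalSI`.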

                                              theorem Phenomena.ScalarImplicatures.Studies.GeurtsPouscoulous2009.competence_does_not_generalize {Customer : Type} {ShotAtAll ShotAtNotAll : Customer → Prop} (globalSI : ¬∀ (c : Customer), ShotAtAll c) (strongCompetence : ∀ (c : Customer), ShotAtAll c ∨ ShotAtNotAll c) :
                                              ∃ (c : Customer), ShotAtNotAll c

                                              The competence explanation does NOT generalize to universal quantifiers.

                                              For "All the customers shot at some of the salesmen", the global SI yields: ¬∀ (c : Customer), ShotAtAll c

                                              The analogous competence assumption would require: ∀ (c : Customer), ShotAtAll c ∨ ShotAtNotAll c

                                              This is a much stronger assumption — it requires every customer to have a determinate shooting pattern. The paper notes (p.29) that this assumption is "considerably less plausible" than for belief reports.

                                              The theorem still goes through formally, since it has the same logical structure as the belief case. What does not generalize is the pragmatic plausibility of its strongCompetence premise, which this docstring flags as implausible for quantifiers.
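
For reference, here is one way the formal step could be carried out. This is a sketch under a hypothetical name, not the module's own proof; it relies on classical reasoning (`Classical.byContradiction`) to turn the negated universal into an existential witness.

```lean
theorem competence_does_not_generalize_sketch {Customer : Type}
    {ShotAtAll ShotAtNotAll : Customer → Prop}
    (globalSI : ¬∀ c : Customer, ShotAtAll c)
    (strongCompetence : ∀ c : Customer, ShotAtAll c ∨ ShotAtNotAll c) :
    ∃ c : Customer, ShotAtNotAll c :=
  -- Suppose no customer satisfies ShotAtNotAll.
  Classical.byContradiction fun hNone =>
    -- Then strong competence forces ShotAtAll for every customer,
    -- which contradicts the global SI.
    globalSI fun c =>
      match strongCompetence c with
      | Or.inl hAll => hAll
      | Or.inr hNotAll => (hNone ⟨c, hNotAll⟩).elim
```

The proof uses `strongCompetence` at every customer, which is exactly the premise the docstring describes as much stronger than the single competence assumption in the belief case.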

                                              One row of Table 4 (p.27). Response rates for critical and DE control items in Experiment 4, where participants chose among "true", "false", and "could be either". As in Experiment 3, there were two verification conditions for exactly-two.

                                              • quantifier : QuantifierContext
                                              • trueRate :

                                                % saying "true" (yes)

                                              • falseRate :

                                                % saying "false" (no)

                                              • eitherRate :

                                                % saying "could be either"

                                                  Experiment 4 results (Table 4, p.27, n=22). 22 first-year linguistics students at University College London (p.26). Wilcoxon's Exact test: W = 208, n = 20, p < .0001.

                                                    Ambiguous control sentences from Table 5 (p.27). These are genuinely ambiguous sentences (e.g., "Visiting relatives can be boring") that participants should be able to recognize as ambiguous.

                                                    • sentence : String
                                                    • eitherRate :

                                                      % saying "could be either"

                                                        Table 5: ambiguous control items (p.27).

                                                          People detect genuine ambiguities at 70% on average but alleged SI-induced ambiguities at only 6% (non-DE items). The 64pp gap shows that people simply do not perceive the ambiguity that conventionalism predicts. This falsifies even minimal conventionalism, which merely claims that local-SI readings exist (not that they're preferred).

                                                          Total non-DE responses consistent with conventionalism: only ~10%. The paper reports (p.26-27): "only 9 out of 88 responses (i.e. 10%) were consistent with minimal conventionalism. Moreover, all but one of these responses were associated with non-monotonic exactly two."
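
The two headline figures can be sanity-checked with natural-number arithmetic. This is an illustrative check on the numbers quoted above, not a restatement of the module's theorems.

```lean
-- Gap between genuine and alleged ambiguity detection, in percentage points.
example : 70 - 6 = 64 := by decide

-- 9 of 88 responses, as a whole percentage (natural-number division rounds down).
example : 9 * 100 / 88 = 10 := by decide
```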

                                                          The gap between genuine and alleged ambiguity detection is massive: 70% vs 6% = 64pp. This is the strongest result against conventionalism.

                                                          Data point from the broader experimental literature on scalar inference, compiled in @cite{geurts-2010} Table 1 (a separate review chapter).

                                                          • citation : String
                                                          • scalarTerm : String
                                                          • upperBoundRate :

                                                            Rate of upper-bounded (SI) interpretation (percentage 0-100)

                                                              Sample of experimental data from @cite{geurts-2010} Table 1. Across the literature, SI rates are highly variable and generally below 65%, consistent with the Gricean view that SIs are context-dependent pragmatic inferences rather than defaults.

                                                                Average SI rate across the literature is below 50%. This is inconsistent with defaultism's prediction that SIs are the norm and should arise at high rates.

                                                                Connects NeoGricean scalar implicature theory (@cite{geurts-2010}) to the experimental findings above.

                                                                Gricean prediction for embedding types.

                                                                      Competence-based explanation for belief reports.
