
Linglib.Phenomena.Presupposition.Studies.GroveWhite2025

@cite{grove-white-2025}

Factivity, presupposition projection, and the role of discrete knowledge in gradient inference judgments. Natural Language Semantics 34:1–45.

Core Contribution

Grove & White compare two hypotheses about the gradience observed in inference judgments for clause-embedding predicates: the Fundamental Discreteness Hypothesis (FDH), on which factivity is a discrete property of each occasion of use, and the Fundamental Gradience Hypothesis (FGH), on which no discrete property distinguishes factive from non-factive occurrences.

The Four Models

The paper crosses two binary choices — factivity (discrete/gradient) × world knowledge (discrete/gradient) — yielding four models:

| Model | Factivity | World knowledge | Fits best? |
|---|---|---|---|
| discrete-factivity | discrete (τ_v) | gradient | Yes |
| wholly-discrete | discrete (τ_v) | discrete | Second |
| discrete-world | gradient | discrete | |
| wholly-gradient | gradient | gradient | Worst |

The discrete-factivity model extends the norming-gradient model (Sect. 4.2) by adding a Bernoulli switch τ_v on top of the gradient world knowledge model. The wholly-discrete model similarly extends the norming-discrete model.
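As a schematic gloss (our notation, not the paper's definition (13)), the Bernoulli switch makes the predicted inference strength a two-component mixture:

```latex
% Discrete-factivity mixture (schematic): on each occasion of use,
% predicate v is factive with probability \tau_v.
P(\text{inference} \mid v, w)
  = \tau_v \, P(\text{inference} \mid \text{factive}, w)
  + (1 - \tau_v) \, P(\text{inference} \mid \text{nonfactive}, w)
```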

Formalization Strategy

The discrete-factivity model is structurally a ParamPred over FactivityReading:
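A minimal sketch of that shape, with simplified stand-ins for the repo's types (Bool-valued semantics over an abstract world type `W`, `Float` probabilities); the actual `FactivityReading` and `ParamPred` definitions in Factivity.lean and ParamPred may differ:

```lean
-- Sketch only: simplified stand-ins for the repo's FactivityReading / ParamPred.
inductive FactivityReading where
  | factive
  | nonfactive

/-- Boolean semantics parameterized by a reading, plus a prior over readings. -/
structure ParamPred (W : Type) where
  sem   : FactivityReading → W → Bool
  prior : FactivityReading → Float

/-- Discrete-factivity construction: prior ⟨τ, 1 − τ⟩ over the two readings. -/
def discreteFactivityPred {W : Type} (τ : Float)
    (fact nonfact : W → Bool) : ParamPred W where
  sem
    | .factive    => fact
    | .nonfactive => nonfact
  prior
    | .factive    => τ
    | .nonfactive => 1 - τ

/-- Graded truth: marginalize the Boolean semantics over the reading prior. -/
def gradedTruth {W : Type} (p : ParamPred W) (w : W) : Float :=
  p.prior .factive * (if p.sem .factive w then 1 else 0)
    + p.prior .nonfactive * (if p.sem .nonfactive w then 1 else 0)
```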

This directly reuses Factivity.lean for the two readings and ParamPred for the parameterized semantics.

Connection to PDS

The paper's formal framework is Probabilistic Dynamic Semantics (PDS), developed in @cite{grove-white-2025b}. The discreteFactivityPred construction is structurally equivalent to applying PDS's probProp to a Boolean predicate parameterized by reading type — graded truth emerges from marginalizing over a discrete parameter, exactly as in Semantics.Dynamic.Probabilistic.

Connection to @cite{scontras-tonhauser-2025}

Scontras & Tonhauser's RSA model uses factivePos for know and nonFactivePos for think — exactly the two readings of clauseEmbeddingSem. Their model is the special case of the discrete-factivity model with τ=1 (know is always factive) and τ=0 (think is never factive). The bridge theorems certain_factive_eq_know and certain_nonfactive_eq_think make this connection explicit.
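Schematically (our notation), the bridge theorems are the endpoints of the τ-mixture:

```latex
% Endpoints of the discrete-factivity mixture (schematic):
\tau = 1:\quad \tau\,[\![v]\!]_{\text{fact}}(w) + (1 - \tau)\,[\![v]\!]_{\text{nonfact}}(w)
  = [\![v]\!]_{\text{fact}}(w) \quad \text{(know)}
\\
\tau = 0:\quad \tau\,[\![v]\!]_{\text{fact}}(w) + (1 - \tau)\,[\![v]\!]_{\text{nonfact}}(w)
  = [\![v]\!]_{\text{nonfact}}(w) \quad \text{(think)}
```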

Key Results

Across all four datasets — @cite{degen-tonhauser-2021} original, a replication, bleached contexts, and templatic contexts — the discrete-factivity model achieves the best ELPD (expected log pointwise predictive density), supporting the FDH over the FGH.

The Fundamental Discreteness Hypothesis (definition (7a), p. 10): factivity is a discrete property of an expression on a particular occasion of use. A given use either triggers a projective inference, or it does not. The FDH is neutral on why the resolved indeterminacy arises — it may be due to polysemy, structural ambiguity, or discourse sensitivity (QUD/common ground).


The Fundamental Gradience Hypothesis (definition (7b), p. 10): there is no property distinguishing factive from non-factive occurrences. Gradient distinctions reflect gradient contributions to inferences.


Possible mechanisms for resolved indeterminacy under the FDH. These are mentioned on p. 10 as different ways the discreteness could be cashed out. The FDH itself is neutral among them.

• polysemy : ResolvedMechanism
  Polysemy: a predicate has multiple senses, at least one factive and at least one nonfactive (conventionalist account, Sect. 6.1).

• structuralAmbiguity : ResolvedMechanism
  Structural ambiguity: a predicate occurs in multiple structures, at least one implicated in triggering projection and one not.

• discourseSensitivity : ResolvedMechanism
  Discourse sensitivity: the predicate's complement content may or may not be entailed by a discourse construct like the QUD (conversationalist account, Sect. 6.2).


Per-predicate factivity probability. On each occasion of use, a clause-embedding predicate is factive with probability τ_v and nonfactive with probability 1 − τ_v. This is the key parameter of the discrete-factivity model (Sect. 3.7, definition (13)).

• τ
  The probability of the factive reading.

• τ_nonneg : 0 ≤ self.τ
• τ_le_one : self.τ ≤ 1

The two readings of a clause-embedding predicate under the FDH.


Construct a ParamPred for a clause-embedding predicate from its factivity parameter τ_v. This is the discrete-factivity model: Boolean semantics parameterized by a binary reading, with a prior ⟨τ_v, 1 − τ_v⟩ over readings.


The graded truth value of a clause-embedding predicate under the discrete-factivity model equals the τ-weighted mixture of the two Boolean readings.

The discrete-factivity model's graded truth is exactly PDS's probProp: the probability of a Boolean predicate under a finite distribution. This is the formal content of the paper's core claim: graded inference judgments emerge from marginalizing over a discrete reading parameter.

The four models from the paper (Sect. 4.3–4.4), crossing factivity × world knowledge. Each model is a completion of one of the two norming models (Sect. 4.2) with a factivity component.

• discreteFactivity : ModelVariant
  Discrete factivity + gradient world knowledge. Best fit. Extends norming-gradient (Sect. 4.2.1).

• whollyDiscrete : ModelVariant
  Discrete factivity + discrete world knowledge. Extends norming-discrete (Sect. 4.2.2).

• whollyGradient : ModelVariant
  Gradient factivity + gradient world knowledge. Worst fit. Extends norming-gradient with gradient factivity.

• discreteWorld : ModelVariant
  Gradient factivity + discrete world knowledge. Extends norming-discrete with gradient factivity.


The best and worst models both use gradient world knowledge but differ in their treatment of factivity. This isolates discrete factivity as the key factor driving model fit: holding world knowledge constant at gradient, switching from discrete to gradient factivity drops ELPD from best to worst.

Each factivity model extends one of the two norming models, and the extension relationship is determined by how the model treats world knowledge: models with gradient world knowledge extend norming-gradient, and models with discrete world knowledge extend norming-discrete.

• gradient : NormingModel
  Norming-gradient (Sect. 4.2.1): world knowledge as unresolved.

• discrete : NormingModel
  Norming-discrete (Sect. 4.2.2): world knowledge as resolved.


@cite{scontras-tonhauser-2025}'s literalMeaning .knowPos is exactly the factive reading of clauseEmbeddingSem. Their model implicitly sets τ = 1 for know.

@cite{scontras-tonhauser-2025}'s literalMeaning .thinkPos is exactly the nonfactive reading of clauseEmbeddingSem. Their model implicitly sets τ = 0 for think.

S&T's binary model is the limiting case of the discrete-factivity model: know uses τ = 1 (certain factive), think uses τ = 0 (certain nonfactive). The discrete-factivity model generalizes this by allowing intermediate τ values for the same predicate across occasions.

The discrete-factivity model's theoretical prediction: higher τ means more projection. This is a monotonicity property: if τ₁ > τ₂, and the factive reading satisfies factivePos w but the nonfactive reading does not satisfy nonFactivePos w, then the predicate with higher τ gets higher graded truth at w.
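Under those hypotheses the mixture collapses to τ itself, so the monotonicity claim is a one-line computation (our notation):

```latex
% With the factive reading true at w and the nonfactive reading false at w:
\text{gradedTruth}_\tau(w) = \tau \cdot 1 + (1 - \tau) \cdot 0 = \tau,
\qquad
\tau_1 > \tau_2 \;\Longrightarrow\; \text{gradedTruth}_{\tau_1}(w) > \text{gradedTruth}_{\tau_2}(w)
```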

The empirical ordering from @cite{degen-tonhauser-2022} (know projects more than think) is consistent with the model's τ ordering. Under the discrete-factivity model, this ordering holds when τ_know > τ_think. The S&T limiting case (τ_know = 1, τ_think = 0) is a special case.

The prior-belief modulation finding from @cite{degen-tonhauser-2021} is the empirical observation that the discrete-factivity model explains: observed gradience arises from uncertainty over the discrete τ parameter interacting with world knowledge (prior beliefs about complement content). Both experiments confirm that higher prior → stronger projection.