
Linglib.Phenomena.SyntacticAmbiguity.Studies.PaapeVasishth2026

Paape & Vasishth (2026) @cite{paape-vasishth-2026} #

Context ameliorates but does not eliminate garden-pathing: Novel insights from latent-process modeling. Journal of Memory and Language 148, 104748.

Overview #

The paper replicates @cite{altmann-garnham-dennis-1992}'s CC/RC × referential context design (3 × 2: {amb-CC, amb-RC, unamb-RC} × {unique, non-unique referents}) with N = 319 using masked bidirectional self-paced reading (BSPR). The central contribution is a multinomial processing tree (MPT) model that decomposes reading time distributions into latent cognitive processes: attention, first-pass attachment (garden-pathing), optional triage, covert/overt reanalysis, and success/failure.

Key Findings #

The MPT decomposition yields three conclusions (p. 11) unavailable from standard factorial comparison of means:

I. Regardless of context, RC disambiguation results in a higher probability of garden-pathing (choosing an incorrect first-pass parse).

II. Contextual match decreases but does not eliminate the first-pass anti-RC bias.

III. The cost of in-situ reanalysis is higher for RC than for CC disambiguation, and is lower but still non-zero in cases of context match versus mismatch.

MPT vs. Surprisal #

The no-triage MPT outperforms LLM surprisal-based models (GPT-2, LLaMA-2, Mistral, Falcon) and standard factorial models in cross-validated predictive fit (PSIS-LOO elpd). The key architectural difference: the MPT assumes reading times at disambiguation are mixture distributions — bimodal with garden-pathed (slow) and non-garden-pathed (fast) components. Surprisal-based models assume unimodal distributions shifted by surprisal value, which cannot capture this bimodality (@cite{van-schijndel-linzen-2021}, @cite{huang-etal-2024}).

Connections #

MPT Structure (Fig. 1) #

The MPT models each trial as a cascade of probabilistic binary decisions. Each path through the tree yields a distinct combination of processing stages and a corresponding reading time distribution.

The full MPT includes a triage path (garden-pathed readers sometimes reject immediately without reanalysis). The best-fitting model variant (no-triage MPT) drops this path, but we include it in the type for completeness.

Outcome of a single trial through the processing cascade.

Each constructor corresponds to a leaf node in the MPT (Fig. 1), determining which processing costs accumulate and whether the sentence is ultimately accepted or rejected.

  • inattentive : TrialOutcome

    Inattentive trial: biased guess (accept with p_bias ≈ 0.79). Fast reading time, no garden-path processing.

  • correctFirstPass : TrialOutcome

    Attentive, correct first-pass parse: no garden-pathing.

  • triage : TrialOutcome

    Garden-pathed, triaged: reader rejects immediately without attempting reanalysis. Present in the full MPT but dropped in the best-fitting no-triage variant.

  • overtSuccess : TrialOutcome

    Garden-pathed, overt reanalysis (regression to earlier material), reanalysis succeeds.

  • overtFail : TrialOutcome

    Garden-pathed, overt reanalysis (regression), fails → reject.

  • covertImmediateSuccess : TrialOutcome

    Garden-pathed, covert (in-situ) reanalysis, immediate, succeeds.

  • covertImmediateFail : TrialOutcome

    Garden-pathed, covert reanalysis, immediate, fails → reject.

  • covertPostponedSuccess : TrialOutcome

    Garden-pathed, covert reanalysis, postponed to spillover, succeeds.

  • covertPostponedFail : TrialOutcome

    Garden-pathed, covert reanalysis, postponed to spillover, fails → reject.


Whether a trial outcome involves a first-pass regression (rereading earlier material). In the MPT, regressions occur only on the overt reanalysis path. Covert reanalysis is signaled by in-situ reading slowdown, not regression.

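A minimal Lean sketch of the outcome type and the regression predicate described above. Constructor and function names follow the docstrings; this is an illustration of the structure, not the repository's actual definitions:

```lean
/-- Leaf nodes of the MPT (Fig. 1): one constructor per processing path. -/
inductive TrialOutcome where
  | inattentive             -- biased guess, fast RT, no garden-path processing
  | correctFirstPass        -- attentive, correct first-pass parse
  | triage                  -- garden-pathed, rejected without reanalysis (full MPT only)
  | overtSuccess            -- garden-pathed, regression, reanalysis succeeds
  | overtFail               -- garden-pathed, regression, reanalysis fails → reject
  | covertImmediateSuccess  -- garden-pathed, in-situ reanalysis now, succeeds
  | covertImmediateFail     -- garden-pathed, in-situ reanalysis now, fails → reject
  | covertPostponedSuccess  -- garden-pathed, in-situ reanalysis at spillover, succeeds
  | covertPostponedFail     -- garden-pathed, in-situ reanalysis at spillover, fails
  deriving Repr, DecidableEq

/-- Regressions occur only on the overt reanalysis path. -/
def hasRegression : TrialOutcome → Bool
  | .overtSuccess | .overtFail => true
  | _ => false
```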

Process Probabilities #

Each branching node in the MPT has a probability parameter. Some parameters are affected by the experimental manipulation; the predictor structure is:

Parameters p_attentive, p_covert, p_postpone can vary between participants but are not assumed to vary by condition.

Estimated via Bayesian inference in Stan (4 MCMC chains × 2000 iterations, 1000 burn-in each).

The probability parameters of the MPT model. Values are stored as percentages (0–100).

• pAttentive :
• pGardenPath :
• pCovert :
• pPostpone :
• pCovertSuccess :
• pOvertSuccess :
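The parameter record above can be sketched as a Lean structure. The Float field types are an assumption, since the rendered types were lost:

```lean
/-- Branching probabilities of the MPT, stored as percentages (0–100).
    Field names follow the bullet list; Float is an assumed representation. -/
structure ProcessProbabilities where
  pAttentive : Float
  pGardenPath : Float
  pCovert : Float
  pPostpone : Float
  pCovertSuccess : Float
  pOvertSuccess : Float
  deriving Repr
```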

Study Design #

Replication of @cite{altmann-garnham-dennis-1992} with N = 319 English speakers (Prolific) via masked BSPR with end-of-sentence acceptability judgments. 36 stimulus sentences adapted from the original (replacing "told" with varied verbs per @cite{staub-etal-2018}), crossed with 2-level context (unique vs. non-unique referents).

Experimental conditions in the 3 × 2 design.
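The two design factors can be sketched as Lean inductives. The names Disambig, RefContext, and StudyCondition are hypothetical; the repository maps the experimental conditions onto its abstract Condition type:

```lean
/-- Three-level disambiguation factor of the study. -/
inductive Disambig where
  | ambCC    -- ambiguous, complement-clause disambiguation
  | ambRC    -- ambiguous, relative-clause disambiguation
  | unambRC  -- unambiguous relative clause
  deriving Repr, DecidableEq

/-- Two-level referential context factor. -/
inductive RefContext where
  | uniqueReferent | nonUniqueReferents
  deriving Repr, DecidableEq

/-- A cell of the 3 × 2 design. -/
structure StudyCondition where
  disambiguation : Disambig
  context : RefContext
  deriving Repr, DecidableEq
```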


Map experimental conditions to the abstract Condition type.


Bayesian Analysis Results #

Standard Bayesian linear mixed-effects analysis using brms, with sum contrasts for CONTEXT (unique = −1, non-unique = +1) and treatment contrasts for DISAMBIGUATION (amb-RC = baseline). Lognormal likelihood for first-pass reading times (FPRT), Bernoulli for first-pass regressions.

BF10 values > 3 indicate evidence for the alternative hypothesis. We record the critical-region effects, which are the earliest measures potentially affected by garden-pathing.

Bayes factor (BF10) for key effects at the critical disambiguating region. BF10 > 3 indicates evidence; values > 1000 are decisive.
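The threshold convention above can be sketched as a small Lean helper (Evidence and interpretBF10 are hypothetical names):

```lean
/-- Conventional BF10 evidence categories used in the docstring above. -/
inductive Evidence where
  | inconclusive | present | decisive
  deriving Repr, DecidableEq

/-- BF10 > 1000 is decisive; BF10 > 3 counts as evidence; otherwise inconclusive. -/
def interpretBF10 (bf : Float) : Evidence :=
  if bf > 1000 then .decisive
  else if bf > 3 then .present
  else .inconclusive
```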


Key Bayesian analysis results from Table 2 (critical region).


All recorded Bayes factors exceed the evidence threshold of 3.

Parameter Estimates #

The no-triage MPT model achieves the best cross-validated predictive fit (Fig. 6). The following estimates are approximate posterior medians from Figs. 8–10. The intercept corresponds to the mean probability in the ambiguous RC condition across matching and mismatching contexts.

Baseline MPT parameters (ambiguous RC, averaged over contexts). Approximate posterior medians from the no-triage model.


Garden-Path Probability by Condition #

Approximate cell estimates derived from the intercept + slope posteriors (Figs. 8–9). The slopes are on the logit scale; back-transformed estimates are approximate.

The key constraints are taken from the paper (p. 8).
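The back-transform mentioned above is the inverse logit. A sketch, with invLogit and cellProb as hypothetical helper names:

```lean
/-- Inverse logit: maps a logit-scale value back to a probability in (0, 1). -/
def invLogit (x : Float) : Float := 1 / (1 + Float.exp (-x))

/-- Approximate cell probability from a logit-scale intercept and slope
    (the derivation described above). -/
def cellProb (intercept slope : Float) : Float := invLogit (intercept + slope)
```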

Finding I: RC always harder (p. 11) #

"Regardless of context, RC disambiguation results in a higher probability of garden-pathing, that is, choosing an incorrect first-pass attachment."

Finding II: Context reduces but does not eliminate (p. 11) #

"Contextual match decreases but does not eliminate the first-pass anti-RC bias."

Context does not eliminate RC garden-pathing — the rate with supporting context still far exceeds the CC baseline.

Context × disambiguation crossover: non-unique referents help RC but hurt CC. This is the interaction predicted by the context-sensitive attachment hypothesis.

The results support a graded version of context-sensitive attachment: context affects first-pass parsing (there IS an interaction), but the bias is not fully overridden (there IS a main effect of disambiguation). Neither hypothesis alone suffices.

Finding III: Reanalysis cost is separable (p. 11) #

"The cost of in-situ reanalysis is higher for RC than for CC disambiguation, and is lower but still non-zero in cases of context–disambiguation match versus mismatch."

Processing cost structure from posterior estimates (Fig. 10). All values in milliseconds.

• gardenPathCost :
• attentionCost :
• regressionCost :
• covertReanalysisCost :
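A Lean sketch of the cost record; the Float field types are an assumption, since the rendered types were lost:

```lean
/-- Processing costs in milliseconds; field names follow the bullet list above. -/
structure ProcessingCosts where
  gardenPathCost : Float
  attentionCost : Float
  regressionCost : Float
  covertReanalysisCost : Float
  deriving Repr
```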

Baseline costs for RC disambiguation (intercept posteriors, Fig. 10).


CC disambiguation costs: covert reanalysis is roughly 150–200 ms cheaper than for RC (Fig. 10, CC vs. RC slope centered around −150 to −200 ms).


Finding III (second part): covert reanalysis cost is non-trivial even for CC disambiguation (i.e., still non-zero).

Pure garden-pathing cost is small relative to reanalysis cost. This is a key MPT insight: the cost of being garden-pathed is tiny; the cost comes from reanalysis.

Why the MPT outperforms surprisal #

Surprisal theory (@cite{hale-2001}, @cite{levy-2008}) predicts that processing difficulty is proportional to the negative log probability of a word in context. For garden-path sentences, surprisal at the disambiguating region should be higher for less expected continuations.

The MPT outperforms all four LLM surprisal models and the standard factorial model in cross-validated predictive fit. The key architectural difference is the mixture assumption: reading times at disambiguation are generated by a mixture of latent populations (garden-pathed vs. not), each with distinct cost distributions. Single-stage surprisal models predict a unimodal distribution shifted by the surprisal value.

This is structurally analogous to the limitation identified by @cite{van-schijndel-linzen-2021} and @cite{huang-etal-2024}: single-stage surprisal models underestimate the magnitude of garden-path effects because they lack a distinct, costly reanalysis mechanism.
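The mixture assumption can be sketched as a two-component density (mixtureDensity is a hypothetical name):

```lean
/-- Two-component mixture: reading times come from a garden-pathed (slow) or a
    non-garden-pathed (fast) latent population, weighted by the garden-path
    probability p. A surprisal model instead shifts one unimodal density. -/
def mixtureDensity (p : Float) (fGP fFast : Float → Float) (t : Float) : Float :=
  p * fGP t + (1 - p) * fFast t
```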

Model rankings from cross-validated predictive fit (PSIS-LOO elpd, Fig. 6); rank 1 = best.

The ranking shows that even the simplest MPT model (mpt-simple) outperforms the best surprisal model, and adding an inattention component to a factorial model (factorial-inattention) only barely exceeds mpt-simple.


Cross-validated predictive fit ranking (1 = best).


From ordinal to quantitative #

The ProcessingProfile from Core.ProcessingModel captures the ordinal claim that RC is harder than CC (see rc_pareto_harder in Basic.lean). The MPT explains this ordering by decomposing it into quantitative components: RC has higher garden-path probability, higher reanalysis cost, and lower reanalysis success rate. The ordinal profile is the observable consequence; the MPT is the latent mechanism.

Uniqueness presupposition grounds referential context #

The experimental manipulation of referential context (unique vs. non-unique referents) is grounded in the uniqueness presupposition of definite descriptions (the_uniq in Theories/Semantics/Lexical/Determiner/Definite.lean).

In the non-unique condition (He was introduced to two women. He told the woman that...), a bare definite "the woman" fails uniqueness because two entities satisfy the restrictor. An RC modifier ("the woman that he'd risked his life for") intersects the restrictor with the RC predicate, narrowing to a single entity and rescuing uniqueness.

This makes the RC pragmatically necessary in non-unique contexts — the same mechanism underlying @cite{sedivy-etal-1999}'s contrastive inference: a modifier is informative when a contrast set is available. In Sedivy et al.'s visual-world paradigm, the contrast set is perceptual (tall glass vs. short glass); here, it is discourse-referential (woman₁ vs. woman₂). Both are instances of modifier informativity increasing with the availability of alternatives.

The argument chain:

1. the_uniq presupposes exactly one restrictor-satisfier
2. Non-unique context → presupposition fails for bare NP
3. RC modifier narrows restrictor → presupposition can succeed
4. Therefore RC is pragmatically licensed in non-unique contexts
5. This matches contextSupports .nonUniqueReferents .relativeClause = true
6. Parser is biased toward RC → less garden-pathing (Finding II)
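Steps 5–6 rest on the contextSupports predicate from Basic.lean. A minimal sketch consistent with the equations cited in the chain (type and constructor names follow the dot notation above; this is an illustration, not the repository's definition):

```lean
inductive Context where
  | uniqueReferent | nonUniqueReferents

inductive Disambiguation where
  | complementClause | relativeClause

/-- A context supports a disambiguation when the continuation is pragmatically
    licensed there: non-unique referents license an RC, a unique referent a CC. -/
def contextSupports : Context → Disambiguation → Bool
  | .nonUniqueReferents, .relativeClause   => true
  | .uniqueReferent,     .complementClause => true
  | _, _ => false
```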

Toy discourse entity for the uniqueness worked example.


Non-unique context domain: a man and two women.


Unique context domain: a man and one woman.


In non-unique context, bare "the woman" FAILS uniqueness: two entities satisfy the restrictor, so the presupposition of the_uniq is not met.

In non-unique context, modified "the woman that he'd risked his life for" SUCCEEDS: the RC modifier narrows to one entity.

In unique context, bare "the woman" already SUCCEEDS: only one entity satisfies the restrictor, so no modifier needed.

The full argument chain: uniqueness presupposition grounds contextSupports from Basic.lean.

1. Non-unique context → bare definite fails → RC modifier needed
2. This is exactly contextSupports .nonUniqueReferents .relativeClause
3. Unique context → bare definite succeeds → no modifier needed
4. This is exactly contextSupports .uniqueReferent .complementClause

Shared mechanism with contrastive inference #

modifierNecessary (defined in Definite.lean) captures the abstract predicate: a modifier rescues a failed uniqueness presupposition. Both Paape & Vasishth's context-sensitive attachment and @cite{sedivy-etal-1999}'s contrastive inference are instances of the same predicate — the modifier type differs (RC vs. scalar adjective) but the referential mechanism is identical.
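A minimal sketch of such a predicate over a finite domain, with hypothetical toy entities for the worked example (the actual modifierNecessary lives in Definite.lean):

```lean
/-- A modifier is necessary when the bare restrictor picks out more than one
    entity but restrictor-plus-modifier picks out exactly one. -/
def modifierNecessary {E : Type} (domain : List E)
    (restrictor modifier : E → Bool) : Bool :=
  decide ((domain.filter restrictor).length > 1) &&
  decide ((domain.filter (fun e => restrictor e && modifier e)).length = 1)

/-- Hypothetical entities for the non-unique discourse context. -/
inductive Entity where
  | man | woman1 | woman2
  deriving Repr, DecidableEq

def isWoman : Entity → Bool
  | .woman1 | .woman2 => true
  | _ => false

def riskedLifeFor : Entity → Bool
  | .woman1 => true
  | _ => false

-- Two women satisfy the restrictor, one also satisfies the RC predicate:
#eval modifierNecessary [Entity.man, .woman1, .woman2] isWoman riskedLifeFor  -- true
```

The same call with a one-woman domain returns false: with a unique referent the bare definite already succeeds, so the modifier is redundant.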

In non-unique context, the RC modifier is referentially necessary: bare "the woman" is ambiguous, modified "the woman that P" is unique.

Toy visual-world entity for the @cite{sedivy-etal-1999} scenario.


Contrast display: two glasses (tall and short) plus distractors.


No-contrast display: one glass plus distractors.


With a contrast set (two glasses), "tall" is referentially necessary: bare "the glass" is ambiguous, "the tall glass" is unique.

Structural identity: the same modifierNecessary predicate governs both phenomena. When alternatives are available, the modifier is necessary; when the referent is already unique, the modifier is redundant. The modifier type is irrelevant — RC (Paape & Vasishth) and scalar adjective (Sedivy et al.) behave identically at this level of abstraction.