Linglib.Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986

Bach, Brown & Marslen-Wilson (1986)

@cite{bach-brown-marslen-wilson-1986}

Crossed and Nested Dependencies in German and Dutch: A Psycholinguistic Study. Language and Cognitive Processes, 1(4), 249–262.

Core Finding

Dutch crossed verb-cluster dependencies (NP₁ NP₂ NP₃ V₁ V₂ V₃) are easier to process than German nested dependencies (NP₁ NP₂ NP₃ V₃ V₂ V₁) at two or more levels of embedding (Level 3 and above), in both comprehensibility ratings and comprehension accuracy. At one level of embedding (Level 2), German/Participle does not differ from Dutch, although German/Infinitive shows a significant baseline disadvantage across all levels. This supports @cite{evers-1975}'s intuition that crossed dependencies are easier and provides the first controlled experimental evidence for it.

Incremental Integration Model

The paper argues qualitatively that crossed dependencies allow incremental top-down integration, while nested dependencies force bottom-up accumulation of floating propositions. We formalize this via totalIntegrationCost: the cumulative count of NPs awaiting matrix-connected integration during verb-cluster processing. This metric is our formalization, not the paper's: the authors argue informally about when partial interpretations become available.

The cost ratio nested/crossed is exactly 2 for all n ≥ 2 (both costs are 0 for n ≤ 1, so the ratio is undefined there), but the absolute difference n(n−1)/2 grows quadratically. This is consistent with the finding that the processing difference is undetectable at n=2 (gap = 1) but large at n=3 (gap = 3).
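
The arithmetic behind these figures can be checked with a minimal standalone sketch in core Lean 4 (no Mathlib). The `CostSketch` namespace and every name inside it are hypothetical re-statements of the definitions documented on this page, not the Linglib declarations themselves, and `List.foldl` stands in for the `Finset` sum.

```lean
namespace CostSketch  -- hypothetical namespace, not part of Linglib

/-- NPs still unintegrated after `k` of `n` verbs, crossed (Dutch) order. -/
def crossedUnintegrated (n k : Nat) : Nat := n - min k n

/-- NPs still unintegrated after `k` of `n` verbs, nested (German) order. -/
def nestedUnintegrated (n k : Nat) : Nat := n - (if k ≥ n then n else 0)

/-- Sum a per-position cost over verb positions 1..n. -/
def totalCost (f : Nat → Nat → Nat) (n : Nat) : Nat :=
  (List.range n).foldl (fun acc k => acc + f n (k + 1)) 0

-- Triples are (n, crossed cost, nested cost).
#eval [2, 3, 4].map fun n =>
  (n, totalCost crossedUnintegrated n, totalCost nestedUnintegrated n)
-- [(2, 1, 2), (3, 3, 6), (4, 6, 12)]

-- At n = 3 the nested cost is twice the crossed cost ...
example : totalCost nestedUnintegrated 3 = 2 * totalCost crossedUnintegrated 3 := by
  decide

-- ... and the gap equals n(n-1)/2 = 3.
example : totalCost nestedUnintegrated 3 - totalCost crossedUnintegrated 3 = 3 := by
  decide

end CostSketch
```

The evaluated triples agree with the values stated in level2_costs, level3_costs and level4_costs.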

Dependency Length Invariance

Crossed and nested patterns have identical total NP-verb dependency length (n²). The Bach et al. finding therefore cannot be explained by dependency length minimization alone: the advantage of crossed dependencies concerns when information becomes available for matrix integration, not dependency distance.

Formal–Processing Dissociation

Crossed dependencies require mildly context-sensitive power (@cite{shieber-1985}, @cite{bresnan-etal-1982}) while nested dependencies are context-free, yet crossed dependencies are psycholinguistically easier. This refutes models in which parsing difficulty tracks the Chomsky hierarchy and provides evidence against push-down-store models of human parsing (@cite{evers-1975}).

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.integratedBindings :

CrossSerial.DependencyPattern → (n k : ℕ) → ℕ

NP-verb bindings connected to the matrix verb after k of n verbs heard.

Crossed (Dutch): matrix verb (V₁) arrives first → k bindings top-down.
Nested (German): innermost verb first → 0 until matrix (last) → then n.

This counts only matrix-rooted integration. German listeners do build partial bottom-up structure (e.g., NP₃→V₃ after the first verb), but that proposition floats without a matrix root to attach to.

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.integratedBindings Phenomena.WordOrder.CrossSerial.DependencyPattern.crossSerial x✝¹ x✝ = min x✝ x✝¹
Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.integratedBindings Phenomena.WordOrder.CrossSerial.DependencyPattern.nested x✝¹ x✝ = if x✝ ≥ x✝¹ then x✝¹ else 0
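
To make the two equations concrete, here is a minimal standalone sketch in core Lean 4 that traces both cases for n = 3 as k runs from 0 to 3. The `Pattern` type and the `IntegratedBindingsSketch` namespace are hypothetical stand-ins for `CrossSerial.DependencyPattern` and the Linglib definition; only the two right-hand sides are taken from the equations above.

```lean
namespace IntegratedBindingsSketch  -- hypothetical namespace, not part of Linglib

/-- Illustrative stand-in for `CrossSerial.DependencyPattern`. -/
inductive Pattern where
  | crossSerial
  | nested

/-- Matrix-connected bindings after `k` of `n` verbs (arguments in the order n, k). -/
def integratedBindings : Pattern → Nat → Nat → Nat
  | .crossSerial, n, k => min k n
  | .nested, n, k => if k ≥ n then n else 0

-- Crossed order integrates one binding per verb heard.
#eval (List.range 4).map (integratedBindings .crossSerial 3)  -- [0, 1, 2, 3]

-- Nested order integrates nothing until the matrix verb arrives last.
#eval (List.range 4).map (integratedBindings .nested 3)       -- [0, 0, 0, 3]

end IntegratedBindingsSketch
```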

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.unintegratedAt (pat : CrossSerial.DependencyPattern) (n k : ℕ) :

ℕ

NPs awaiting matrix-connected integration at verb position k.

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.unintegratedAt pat n k = n - Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.integratedBindings pat n k

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.totalIntegrationCost (pat : CrossSerial.DependencyPattern) (n : ℕ) :

ℕ

Cumulative unintegrated NPs across verb positions 1..n.

Crossed: (n−1) + (n−2) + ··· + 0 = n(n−1)/2
Nested: n + n + ··· + n + 0 = n(n−1)

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.totalIntegrationCost pat n = ∑ k ∈ Finset.range n, Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.unintegratedAt pat n (k + 1)

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level2_costs :

totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 2 = 1 ∧ totalIntegrationCost CrossSerial.DependencyPattern.nested 2 = 2

Level 2 (n=2): minimal gap (1 vs 2).

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level3_costs :

totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 3 = 3 ∧ totalIntegrationCost CrossSerial.DependencyPattern.nested 3 = 6

Level 3 (n=3): gap widens (3 vs 6).

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level4_costs :

totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 4 = 6 ∧ totalIntegrationCost CrossSerial.DependencyPattern.nested 4 = 12

Level 4 (n=4): gap widens further (6 vs 12).

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.gap_grows :

totalIntegrationCost CrossSerial.DependencyPattern.nested 3 - totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 3 > totalIntegrationCost CrossSerial.DependencyPattern.nested 2 - totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 2

The absolute cost gap grows with embedding depth.
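
Substituting the values from level2_costs and level3_costs by hand, the statement reduces to a comparison of two natural-number differences. The following is a minimal illustration in core Lean 4, not the Linglib proof:

```lean
-- Gaps taken from level2_costs (2 - 1) and level3_costs (6 - 3).
example : 6 - 3 > 2 - 1 := by decide

-- Subtraction on ℕ truncates at zero, so the comparison is informative only
-- because the nested cost is at least the crossed cost in both differences.
example : (3 - 6 : Nat) = 0 := by decide
```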

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.crossed_lt_nested (n : ℕ) (h : n ≥ 2) :

totalIntegrationCost CrossSerial.DependencyPattern.crossSerial n < totalIntegrationCost CrossSerial.DependencyPattern.nested n

Crossed is strictly cheaper for n ≥ 2.

Proof by element-wise comparison via Finset.sum_lt_sum: at each verb position k ∈ {1,…,n}, unintegratedAt .crossSerial ≤ unintegratedAt .nested, with strict inequality at k = 1 (the first verb heard).
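
The element-wise comparison can be inspected numerically. This minimal sketch in core Lean 4 lists the unintegrated-NP counts at each verb position for n = 4; the `PointwiseSketch` namespace and its definitions are hypothetical re-statements of unintegratedAt for the two patterns, not the Linglib declarations.

```lean
namespace PointwiseSketch  -- hypothetical namespace, not part of Linglib

/-- Unintegrated NPs at verb position `k`, crossed order. -/
def crossedUnintegrated (n k : Nat) : Nat := n - min k n

/-- Unintegrated NPs at verb position `k`, nested order. -/
def nestedUnintegrated (n k : Nat) : Nat := n - (if k ≥ n then n else 0)

-- Triples are (k, crossed, nested) for n = 4 and k = 1..4.
#eval [1, 2, 3, 4].map fun k =>
  (k, crossedUnintegrated 4 k, nestedUnintegrated 4 k)
-- [(1, 3, 4), (2, 2, 4), (3, 1, 4), (4, 0, 0)]

-- The strict inequality at the first verb heard.
example : crossedUnintegrated 4 1 < nestedUnintegrated 4 1 := by decide

end PointwiseSketch
```

Every crossed entry is at most the nested entry, and the first position is strictly smaller, which is the shape of hypothesis that Finset.sum_lt_sum is used with here.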

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.npVerbDistance :

CrossSerial.DependencyPattern → (n i : ℕ) → ℕ

Absolute string distance between NP_i (1-indexed) and its verb.

In a string NP₁...NPₙ V?₁...V?ₙ, NP_i is at absolute position i.
Crossed: V_i is the i-th verb → position n + i → distance = n.
Nested: V_i is the (n+1−i)-th verb → position n + (n+1−i) → distance = 2(n−i) + 1.

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.npVerbDistance Phenomena.WordOrder.CrossSerial.DependencyPattern.crossSerial x✝¹ x✝ = x✝¹
Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.npVerbDistance Phenomena.WordOrder.CrossSerial.DependencyPattern.nested x✝¹ x✝ = 2 * (x✝¹ - x✝) + 1
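
As a worked instance of the two equations, the sketch below lists the per-NP distances for n = 3 in core Lean 4. The `DistanceSketch` namespace and function names are hypothetical; only the right-hand sides come from the equations above.

```lean
namespace DistanceSketch  -- hypothetical namespace, not part of Linglib

/-- Crossed order: every NP is exactly `n` positions from its verb. -/
def crossedDistance (n _i : Nat) : Nat := n

/-- Nested order: NP_i is `2 * (n - i) + 1` positions from its verb. -/
def nestedDistance (n i : Nat) : Nat := 2 * (n - i) + 1

-- Triples are (i, crossed distance, nested distance) for n = 3.
#eval [1, 2, 3].map fun i => (i, crossedDistance 3 i, nestedDistance 3 i)
-- [(1, 3, 5), (2, 3, 3), (3, 3, 1)]

end DistanceSketch
```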

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.totalNPVerbDist (pat : CrossSerial.DependencyPattern) (n : ℕ) :

ℕ

Total NP-verb dependency length across all n pairs.

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.totalNPVerbDist pat n = ∑ i ∈ Finset.range n, Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.npVerbDistance pat n (i + 1)

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.dep_length_equal_at_2 :

totalNPVerbDist CrossSerial.DependencyPattern.crossSerial 2 = totalNPVerbDist CrossSerial.DependencyPattern.nested 2

Crossed and nested have identical total dependency length.

Crossed: all distances = n → total = n × n = n².
Nested: distances are 2n−1, 2n−3, ..., 3, 1 → total = Σ_{k=0}^{n−1} (2k+1) = n².
The Bach et al. finding is therefore NOT about dependency distance.

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.dep_length_equal_at_3 :

totalNPVerbDist CrossSerial.DependencyPattern.crossSerial 3 = totalNPVerbDist CrossSerial.DependencyPattern.nested 3

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.dep_length_equal_at_4 :

totalNPVerbDist CrossSerial.DependencyPattern.crossSerial 4 = totalNPVerbDist CrossSerial.DependencyPattern.nested 4

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.dep_length_equal (n : ℕ) :

totalNPVerbDist CrossSerial.DependencyPattern.crossSerial n = totalNPVerbDist CrossSerial.DependencyPattern.nested n

General case: both patterns yield total distance n².
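
A quick numerical check of the n² claim for small n, as a minimal sketch in core Lean 4. The `DistanceTotalSketch` namespace and its names are hypothetical, and `List.foldl` replaces the `Finset` sum used by totalNPVerbDist; this illustrates the identity and does not reproduce the general proof.

```lean
namespace DistanceTotalSketch  -- hypothetical namespace, not part of Linglib

def crossedDistance (n _i : Nat) : Nat := n
def nestedDistance (n i : Nat) : Nat := 2 * (n - i) + 1

/-- Sum a per-NP distance over NP indices 1..n. -/
def totalDist (d : Nat → Nat → Nat) (n : Nat) : Nat :=
  (List.range n).foldl (fun acc j => acc + d n (j + 1)) 0

-- Triples are (n, crossed total, nested total); both totals equal n * n.
#eval [1, 2, 3, 4, 5].map fun n =>
  (n, totalDist crossedDistance n, totalDist nestedDistance n)
-- [(1, 1, 1), (2, 4, 4), (3, 9, 9), (4, 16, 16), (5, 25, 25)]

end DistanceTotalSketch
```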

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.formal_processing_dissociation :

CrossSerial.crossSerialRequires = Core.FormalLanguageType.mildlyContextSensitive ∧ CrossSerial.nestedRequires = Core.FormalLanguageType.contextFree ∧ totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 3 < totalIntegrationCost CrossSerial.DependencyPattern.nested 3

Crossed dependencies are formally harder (mildly context-sensitive) but psycholinguistically easier: formal complexity ≠ processing complexity.

Two independent arguments against push-down automaton (PDA) parsing:

Dutch is comprehensible at Level 2 despite requiring mildly context-sensitive (MCS) power (a PDA cannot handle crossed dependencies at any depth).
Dutch is easier than German at Level 3+ (a PDA predicts nested should be easier or equal).

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.cost_differs_despite_equal_dep_length :

totalIntegrationCost CrossSerial.DependencyPattern.crossSerial 3 < totalIntegrationCost CrossSerial.DependencyPattern.nested 3 ∧ totalNPVerbDist CrossSerial.DependencyPattern.crossSerial 3 = totalNPVerbDist CrossSerial.DependencyPattern.nested 3

Integration cost difference is NOT explained by dependency length.

inductive Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.LangGroup :

Language group. German was tested with two verb-form versions (infinitive and past participle) due to normative disagreement among informants.

dutch : LangGroup
germanInf : LangGroup
germanPart : LangGroup

instance Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instDecidableEqLangGroup :

DecidableEq LangGroup

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instDecidableEqLangGroup x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instBEqLangGroup.beq :

LangGroup → LangGroup → Bool

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instBEqLangGroup.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)

instance Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instBEqLangGroup :

BEq LangGroup

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instBEqLangGroup = { beq := Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instBEqLangGroup.beq }

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instReprLangGroup.repr :

LangGroup → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

instance Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instReprLangGroup :

Repr LangGroup

Equations

Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instReprLangGroup = { reprPrec := Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.instReprLangGroup.repr }

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.testRating :

LangGroup → Fin 4 → ℕ

Test sentence comprehensibility ratings × 100 (Table 1). Original scale: 1 = easy, 9 = hard. Levels 1–4 indexed 0–3.

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.paraRating :

LangGroup → Fin 3 → ℕ

Paraphrase sentence ratings × 100 (Table 1, Levels 2–4 indexed 0–2). Paraphrases express the same propositions using right-branching structure, controlling for propositional complexity.

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.testComprehension :

LangGroup → Fin 2 → ℕ

Comprehension accuracy × 100 for Test sentences (Table 3). Questions tested whether each subject NP was correctly associated with its predicate verb phrase. Levels 2–3 indexed 0–1.

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.comprehensionByNP :

LangGroup → Fin 3 → ℕ

Comprehension accuracy × 100 by NP position at Level 3, Test (Table 4). NP1 = matrix subject (highest clause), NP3 = most deeply embedded.

def Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.errorDiffByNP :

LangGroup → Fin 3 → ℕ

Test−Paraphrase error rate difference × 100 by NP at Level 3 (Table 5). Higher = more syntactic disruption (Test harder relative to Paraphrase).

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level2_german_part_similar :

testRating LangGroup.germanPart 1 - testRating LangGroup.dutch 1 ≤ 30

At Level 2, German/Participle does not differ from Dutch (spread = 29 on the ×100 scale, i.e. 0.29 rating points). German/Infinitive is slightly worse throughout (spread = 43, i.e. 0.43 rating points). The paper reports a significant overall German/Infinitive disadvantage but no difference between German/Participle and Dutch at Level 2.

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level3_dutch_easier_rating :

testRating LangGroup.dutch 2 < testRating LangGroup.germanInf 2 ∧ testRating LangGroup.dutch 2 < testRating LangGroup.germanPart 2

At Level 3, Dutch participants rate Test sentences as easier than both German groups do.

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.level3_dutch_better_comprehension :

testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanInf 1 ∧ testComprehension LangGroup.dutch 1 > testComprehension LangGroup.germanPart 1

At Level 3, Dutch comprehension accuracy exceeds both German groups.

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.syntactic_effect_grows_faster_for_german :

have dutch_l2 := testRating LangGroup.dutch 1 - paraRating LangGroup.dutch 0; have dutch_l3 := testRating LangGroup.dutch 2 - paraRating LangGroup.dutch 1; have gerInf_l2 := testRating LangGroup.germanInf 1 - paraRating LangGroup.germanInf 0; have gerInf_l3 := testRating LangGroup.germanInf 2 - paraRating LangGroup.germanInf 1; have gerPart_l2 := testRating LangGroup.germanPart 1 - paraRating LangGroup.germanPart 0; have gerPart_l3 := testRating LangGroup.germanPart 2 - paraRating LangGroup.germanPart 1; gerInf_l3 - dutch_l3 > gerInf_l2 - dutch_l2 ∧ gerPart_l3 - dutch_l3 > gerPart_l2 - dutch_l2

The syntactic complexity effect (Test − Paraphrase) grows faster for both German groups than Dutch from Level 2 to Level 3, paralleling the model's prediction that the integration cost gap grows with depth.

NP2 (middle NP) is hardest for all three groups (Table 4, Test). This is an interference effect: NP2 is distinguished by neither primacy (NP1) nor recency (NP3), making it hardest to retrieve regardless of the dependency pattern.

theorem Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.dutch_np3_advantage :

errorDiffByNP LangGroup.dutch 2 = 0 ∧ errorDiffByNP LangGroup.germanInf 2 > 0 ∧ errorDiffByNP LangGroup.germanPart 2 > 0

Dutch advantage is largest for NP3 (most deeply embedded clause).

Dutch shows zero Test−Paraphrase error difference for NP3 (errorDiffByNP .dutch 2 = 0), while both German groups show a substantial difference (41, 36). The paper explains: in Dutch, NP3's verb (V₃) arrives last and integrates into an already-built matrix structure. In German, NP3's verb (V₃) arrives first: the proposition is immediately parseable but floats without a matrix root, so the information decays before it can be used.

The model predicts crossed < nested, the data confirms it, and dependency length cannot explain the difference.