Linglib.Phenomena.WordOrder.Studies.ArnoldEtAl2000

Heaviness vs. Newness in Constituent Ordering

@cite{arnold-wasow-losongco-ginstrom-2000}

A corpus analysis and an elicitation experiment disentangle two confounded predictors of English constituent ordering:

Heaviness — structural complexity, measured by relative word count
Newness — discourse status: given/inferable vs. new

These factors are naturally confounded: new referents require more descriptive material, so they tend to be heavier. Arnold et al. use logistic regression to show that in both constructions studied — dative alternation and heavy NP shift — both weight and newness independently predict construction choice.

Studies

Corpus analysis (§2): Aligned-Hansard corpus (Canadian parliamentary debates). Examines dative alternation (verb give, N=269) and heavy NP shift (bring...to, N=223; take...into account, N=167). Both heaviness and newness significantly predict ordering in both constructions; no interactions.
Give experiment (§3): Elicitation experiment, 48 participants (24 pairs), Stanford community. Dative alternation only (give), N=1684 instructions post-exclusion. Both factors significant, plus a significant interaction: heaviness has the largest effect when both constituents share newness status.

Constructions

Double Object (DO): V Recipient Theme — "give Mary the book"
Prepositional Dative (PD): V Theme to-Recipient — "give the book to Mary"
Nonshifted (HNPS): V DO PP — "bring the news to the committee"
Shifted (HNPS): V PP DO — "bring to the committee the news that..."

The "heavy/new last" principle: speakers place heavier and newer constituents later. In the dative alternation (DA), DO puts the theme last and PD puts the recipient last. In heavy NP shift (HNPS), shifting moves the direct object after the PP, into the later position.

Central Finding

Both heaviness and newness independently contribute to ordering in both constructions. Neither factor can be reduced to the other. The interaction between them (significant only in the experiment) shows they function as competing constraints: each factor's effect is larger when the other is less constraining.

Bridges

Core.InformationStructure.DiscourseStatus: Arnold et al. collapse @cite{prince-1981}'s three-way given/inferable/new into two categories. Their "given" (given + inferable) is coarser than @cite{kratzer-selkirk-2020}'s partition.
DependencyLength.lean: the "heavy last" effect corresponds to the short-before-long preference of dependency length minimization (DLM), i.e. Behaghel's Gesetz der wachsenden Glieder. DLM cannot, however, model the independent newness effect that Arnold et al. demonstrate.

inductive Phenomena.WordOrder.Studies.ArnoldEtAl2000.Construction :

Constructions studied in the corpus analysis.

dativeAlternation : Construction
Dative alternation with "give": DO (V Rec Theme) vs. PD (V Theme to-Rec).
heavyNPShift : Construction
Heavy NP shift: nonshifted (V DO PP) vs. shifted (V PP DO). Uses "bring...to" and "take...into account."

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqConstruction :

DecidableEq Construction

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqConstruction x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqConstruction.beq :

Construction → Construction → Bool

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqConstruction.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqConstruction :

BEq Construction

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqConstruction = { beq := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqConstruction.beq }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprConstruction.repr :

Construction → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprConstruction :

Repr Construction

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprConstruction = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprConstruction.repr }

structure Phenomena.WordOrder.Studies.ArnoldEtAl2000.VerbData :

Corpus verb token counts (Table 1).

verb : String
construction : Construction
n : ℕ

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprVerbData.repr :

VerbData → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprVerbData :

Repr VerbData

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprVerbData = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprVerbData.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.bringTo :

VerbData

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.bringTo = { verb := "bring...to", construction := Phenomena.WordOrder.Studies.ArnoldEtAl2000.Construction.heavyNPShift, n := 223 }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.takeIntoAccount :

VerbData

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.takeIntoAccount = { verb := "take...into account", construction := Phenomena.WordOrder.Studies.ArnoldEtAl2000.Construction.heavyNPShift, n := 167 }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.giveCorpus :

VerbData

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.giveCorpus = { verb := "give", construction := Phenomena.WordOrder.Studies.ArnoldEtAl2000.Construction.dativeAlternation, n := 269 }

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.corpus_total :

bringTo.n + takeIntoAccount.n + giveCorpus.n = 659

Total corpus examples: 659 (Table 1).

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnps_total :

bringTo.n + takeIntoAccount.n = 390

HNPS subcorpus: 390 examples.
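The two totals follow directly from the Table 1 counts listed above. A minimal standalone sketch, which re-declares simplified copies of `Construction` and `VerbData` inside a hypothetical `Sketch` namespace so that it compiles without Linglib:

```lean
namespace Sketch

-- Simplified stand-ins for the Linglib declarations (assumption: same fields).
inductive Construction where
  | dativeAlternation
  | heavyNPShift
  deriving DecidableEq, Repr

structure VerbData where
  verb : String
  construction : Construction
  n : Nat
  deriving Repr

-- Token counts as given in the equations above (Table 1).
def bringTo : VerbData := ⟨"bring...to", .heavyNPShift, 223⟩
def takeIntoAccount : VerbData := ⟨"take...into account", .heavyNPShift, 167⟩
def giveCorpus : VerbData := ⟨"give", .dativeAlternation, 269⟩

-- HNPS subcorpus total.
example : bringTo.n + takeIntoAccount.n = 390 := by decide

-- Full corpus total.
example : bringTo.n + takeIntoAccount.n + giveCorpus.n = 659 := by decide

end Sketch
```

Both checks are closed arithmetic facts, so `decide` is enough; how the original proofs are written is not shown on this page.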

inductive Phenomena.WordOrder.Studies.ArnoldEtAl2000.DAHeaviness :

Heaviness categories for dative alternation (Table 2). Measured as relative length: theme NP length − goal NP length.

themeShorter : DAHeaviness
Theme shorter: theme − goal ≤ −2
themeEqualGoal : DAHeaviness
Theme ≈ goal: theme − goal between −1 and 1
themeLonger : DAHeaviness
Theme longer: theme − goal ≥ 2

Instances For
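The three thresholds can be read as a classifier over the length difference. The sketch below is an illustration only: `classifyDA` is a hypothetical helper that does not appear in the module, and it assumes lengths are word counts, as in the module overview.

```lean
namespace Sketch

inductive DAHeaviness where
  | themeShorter
  | themeEqualGoal
  | themeLonger
  deriving DecidableEq, Repr

/-- Hypothetical helper: bin a theme/goal pair by `theme − goal` in words. -/
def classifyDA (themeLen goalLen : Nat) : DAHeaviness :=
  let diff : Int := (themeLen : Int) - (goalLen : Int)
  if diff ≤ -2 then .themeShorter      -- theme − goal ≤ −2
  else if diff ≥ 2 then .themeLonger   -- theme − goal ≥ 2
  else .themeEqualGoal                 -- −1, 0, or 1

example : classifyDA 1 4 = .themeShorter := by decide    -- diff = −3
example : classifyDA 3 2 = .themeEqualGoal := by decide  -- diff = 1
example : classifyDA 6 2 = .themeLonger := by decide     -- diff = 4

end Sketch
```

The subtraction is done in `Int` because `Nat` subtraction truncates at zero and would lose the sign of the difference.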

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqDAHeaviness :

DecidableEq DAHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqDAHeaviness x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqDAHeaviness.beq :

DAHeaviness → DAHeaviness → Bool

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqDAHeaviness.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqDAHeaviness :

BEq DAHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqDAHeaviness = { beq := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqDAHeaviness.beq }

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprDAHeaviness :

Repr DAHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprDAHeaviness = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprDAHeaviness.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprDAHeaviness.repr :

DAHeaviness → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

inductive Phenomena.WordOrder.Studies.ArnoldEtAl2000.HNPSHeaviness :

Heaviness categories for heavy NP shift (Table 3). Measured as relative length: DO length − PP length.

doMuchShorter : HNPSHeaviness
DO ≪ PP: DO − PP ≤ −4
doShorter : HNPSHeaviness
DO < PP: DO − PP between −3 and −1
doEqual : HNPSHeaviness
DO = PP: DO − PP = 0
doLonger : HNPSHeaviness
DO > PP: DO − PP between 1 and 3
doMuchLonger : HNPSHeaviness
DO ≫ PP: DO − PP ≥ 4

Instances For
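Unlike the dative categories, the HNPS scheme reserves the middle bin for an exact tie and splits each side into two bands. A hedged sketch of the binning, where `classifyHNPS` is a hypothetical helper and lengths are assumed to be word counts:

```lean
namespace Sketch

inductive HNPSHeaviness where
  | doMuchShorter
  | doShorter
  | doEqual
  | doLonger
  | doMuchLonger
  deriving DecidableEq, Repr

/-- Hypothetical helper: bin a DO/PP pair by `DO − PP` in words. -/
def classifyHNPS (doLen ppLen : Nat) : HNPSHeaviness :=
  let diff : Int := (doLen : Int) - (ppLen : Int)
  if diff ≤ -4 then .doMuchShorter   -- DO − PP ≤ −4
  else if diff < 0 then .doShorter   -- −3 to −1
  else if diff = 0 then .doEqual     -- exact tie
  else if diff ≤ 3 then .doLonger    -- 1 to 3
  else .doMuchLonger                 -- ≥ 4

example : classifyHNPS 2 7 = .doMuchShorter := by decide  -- diff = −5
example : classifyHNPS 3 5 = .doShorter := by decide      -- diff = −2
example : classifyHNPS 4 4 = .doEqual := by decide        -- diff = 0
example : classifyHNPS 6 4 = .doLonger := by decide       -- diff = 2
example : classifyHNPS 10 3 = .doMuchLonger := by decide  -- diff = 7

end Sketch
```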

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqHNPSHeaviness :

DecidableEq HNPSHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqHNPSHeaviness x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqHNPSHeaviness.beq :

HNPSHeaviness → HNPSHeaviness → Bool

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqHNPSHeaviness.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqHNPSHeaviness :

BEq HNPSHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqHNPSHeaviness = { beq := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqHNPSHeaviness.beq }

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprHNPSHeaviness :

Repr HNPSHeaviness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprHNPSHeaviness = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprHNPSHeaviness.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprHNPSHeaviness.repr :

HNPSHeaviness → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.fig1n :

DAHeaviness → ℕ

Figure 1 cell sizes: "give" dative corpus, by heaviness category.

Equations

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.fig1_sums_to_give :

fig1n DAHeaviness.themeShorter + fig1n DAHeaviness.themeEqualGoal + fig1n DAHeaviness.themeLonger = giveCorpus.n

Figure 1 cell sizes sum to the give corpus total.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.da_skews_theme_heavy :

fig1n DAHeaviness.themeLonger > fig1n DAHeaviness.themeShorter + fig1n DAHeaviness.themeEqualGoal

Most DA items have theme longer than goal (57%): English datives typically have longer themes, consistent with the heavy-last tendency.

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.fig2n :

HNPSHeaviness → ℕ

Figure 2 cell sizes: HNPS corpus, by heaviness category.

Equations

Instances For

Figure 2 cell sizes sum to the HNPS total.

The DO ≫ PP category is the largest single cell (133/390 = 34%), reflecting the prevalence of heavy direct objects in shifted constructions.

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_participants :

48 participants (24 pairs), 42 sessions included post-exclusion, 1684 instructions in final analysis.

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_participants = 48

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_pairs :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_pairs = 24

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_sessions_included :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_sessions_included = 42

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_n :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.exp_n = 1684

Instances For

inductive Phenomena.WordOrder.Studies.ArnoldEtAl2000.ExpNewness :

Newness conditions in the experiment.

themeGiven : ExpNewness
Theme is given (= goal is new)
bothGiven : ExpNewness
Both constituents are given
goalGiven : ExpNewness
Goal is given (= theme is new)

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqExpNewness :

DecidableEq ExpNewness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqExpNewness x✝ y✝ = if h : x✝.ctorIdx = y✝.ctorIdx then isTrue ⋯ else isFalse ⋯

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqExpNewness.beq :

ExpNewness → ExpNewness → Bool

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqExpNewness.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqExpNewness :

BEq ExpNewness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqExpNewness = { beq := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqExpNewness.beq }

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprExpNewness :

Repr ExpNewness

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprExpNewness = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprExpNewness.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprExpNewness.repr :

ExpNewness → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.fig8n :

ExpNewness → ℕ

Figure 8 cell sizes by newness condition.

Equations

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.fig8_sums_to_exp :

fig8n ExpNewness.themeGiven + fig8n ExpNewness.bothGiven + fig8n ExpNewness.goalGiven = exp_n

Figure 8 cell sizes sum to experiment total.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.both_given_rare :

fig8n ExpNewness.bothGiven * 100 / exp_n < 2

"Both given" is extremely rare (< 2%), confirming the experiment successfully manipulated newness as a between-constituent contrast.
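The percentage bound is stated with natural-number division, which rounds down. With a total of 1684, the inequality holds exactly when the cell count is at most 33. The counts below are hypothetical probes chosen to show where the bound flips; the actual `fig8n` values are not shown on this page.

```lean
-- Hypothetical counts, used only to illustrate floor division on ℕ.
example : 20 * 100 / 1684 < 2 := by decide      -- 2000 / 1684 floors to 1
example : 33 * 100 / 1684 < 2 := by decide      -- 3300 / 1684 floors to 1
example : ¬ (34 * 100 / 1684 < 2) := by decide  -- 3400 / 1684 floors to 2
```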

structure Phenomena.WordOrder.Studies.ArnoldEtAl2000.RegressionResult :

Which factors were selected by the logistic regression for each analysis.

label : String
heavinessSig : Bool
Heaviness is a significant predictor
newnessSig : Bool
Newness is a significant predictor
interactionSig : Bool
Newness × heaviness interaction is significant

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprRegressionResult :

Repr RegressionResult

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprRegressionResult = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprRegressionResult.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprRegressionResult.repr :

RegressionResult → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqRegressionResult.decEq (x✝ x✝¹ : RegressionResult) :

Decidable (x✝ = x✝¹)

Equations

One or more equations did not get rendered due to their size.

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqRegressionResult :

DecidableEq RegressionResult

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqRegressionResult = Phenomena.WordOrder.Studies.ArnoldEtAl2000.instDecidableEqRegressionResult.decEq

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqRegressionResult.beq :

RegressionResult → RegressionResult → Bool

Equations

One or more equations did not get rendered due to their size.
Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqRegressionResult.beq x✝¹ x✝ = false

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqRegressionResult :

BEq RegressionResult

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqRegressionResult = { beq := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instBEqRegressionResult.beq }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpusResult :

RegressionResult

Corpus DA: both heaviness and newness significant, no interaction.

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpusResult = { label := "DA corpus (give)", heavinessSig := true, newnessSig := true, interactionSig := false }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpusResult :

RegressionResult

Corpus HNPS: heaviness, newness, AND verb significant, no interactions. (Verb effect: take into account has higher shifting rate than bring to, likely because it is an opaque collocation.)

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpusResult = { label := "HNPS corpus (bring to + take into account)", heavinessSig := true, newnessSig := true, interactionSig := false }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExperimentResult :

RegressionResult

Experiment DA: heaviness, newness, AND their interaction significant. (Production difficulty also significant but omitted from structure.)

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExperimentResult = { label := "DA experiment (give)", heavinessSig := true, newnessSig := true, interactionSig := true }

Instances For

Central finding: BOTH factors significantly predict ordering in ALL analyses. Neither can be reduced to the other.
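The statement this docstring belongs to is not rendered on the page. One way such a claim could be phrased over the three `RegressionResult` values is sketched below; the theorem name and its formulation are assumptions, while the field values are copied from the equations above.

```lean
namespace Sketch

structure RegressionResult where
  label : String
  heavinessSig : Bool
  newnessSig : Bool
  interactionSig : Bool

def daCorpusResult : RegressionResult :=
  ⟨"DA corpus (give)", true, true, false⟩
def hnpsCorpusResult : RegressionResult :=
  ⟨"HNPS corpus (bring to + take into account)", true, true, false⟩
def daExperimentResult : RegressionResult :=
  ⟨"DA experiment (give)", true, true, true⟩

/-- Hypothetical formulation: every analysis selects both main effects. -/
theorem both_factors_everywhere :
    [daCorpusResult, hnpsCorpusResult, daExperimentResult].all
      (fun r => r.heavinessSig && r.newnessSig) = true := by
  decide

end Sketch
```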

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.corpus_no_interactions :

(!daCorpusResult.interactionSig) = true ∧ (!hnpsCorpusResult.interactionSig) = true

No interaction in either corpus analysis: heaviness and newness contribute independently.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.experiment_has_interaction :

daExperimentResult.interactionSig = true

The experiment finds a significant interaction: heaviness has the largest effect when both constituents share newness status, and vice versa.

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpus_heavinessLR :

−2 × log-likelihood-ratio values from the paper's logistic regressions, multiplied by 10 so that they can be stored as integers (995 encodes 99.5). Larger values indicate a stronger predictor.

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpus_heavinessLR = 995

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpus_newnessLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpus_newnessLR = 70

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_heavinessLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_heavinessLR = 1209

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_newnessLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_newnessLR = 235

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_verbLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsCorpus_verbLR = 314

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_newnessLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_newnessLR = 2980

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_heavinessLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_heavinessLR = 95

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_interactionLR :

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExp_interactionLR = 200

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.corpus_heaviness_dominates :

daCorpus_heavinessLR > daCorpus_newnessLR ∧ hnpsCorpus_heavinessLR > hnpsCorpus_newnessLR

In the corpus, heaviness has a far larger −2 log-likelihood-ratio statistic than newness in both constructions (99.5 vs. 7.0 for DA; 120.9 vs. 23.5 for HNPS).

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.experiment_newness_dominates :

daExp_newnessLR > daExp_heavinessLR

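In the experiment, newness dominates: its −2 log-likelihood-ratio statistic is roughly 31 times that of heaviness (298.0 vs. 9.5). This reversal reflects the narrower heaviness range in the experiment (Table 6: category means from −0.8 to 2.0 words, stored ×10 as −8 to 20) compared with the corpus (−2.9 to 3.5 words, stored as −29 to 35).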
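The relative sizes of the statistics can be checked with integer division on the ×10-encoded values listed earlier; the common scale factor cancels in the ratio. This is a quick arithmetic sketch, not part of the module:

```lean
-- Corpus, DA: heaviness / newness
example : 995 / 70 = 14 := by decide    -- 70 * 14 = 980, remainder 15

-- Corpus, HNPS: heaviness / newness
example : 1209 / 235 = 5 := by decide   -- 235 * 5 = 1175, remainder 34

-- Experiment, DA: newness / heaviness
example : 2980 / 95 = 31 := by decide   -- 95 * 31 = 2945, remainder 35
```

A ratio of likelihood-ratio statistics is only a rough guide to relative predictive strength, so these quotients are best read as orders of magnitude.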

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.heaviness_stronger_in_corpus :

daCorpus_heavinessLR > daExp_heavinessLR

Heaviness effect is stronger in the corpus than in the experiment, consistent with the wider weight range in naturally occurring data.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.newness_stronger_in_experiment :

daExp_newnessLR > daCorpus_newnessLR

Newness effect is stronger in the experiment than in the corpus, consistent with the experiment's more controlled newness manipulation (immediate mention vs. within-agenda-item mention).

structure Phenomena.WordOrder.Studies.ArnoldEtAl2000.LengthDiffRange :

Average difference in NP length (phrase 1 − phrase 2, × 10) for each heaviness category, from Table 6. Shows the actual weight contrasts across the three data sets.

For DA: phrase 1 = theme NP, phrase 2 = goal NP. For HNPS: phrase 1 = direct object NP, phrase 2 = prepositional phrase.

label : String
rangeMin : ℤ
rangeMax : ℤ

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprLengthDiffRange.repr :

LengthDiffRange → ℕ → Std.Format

Equations

One or more equations did not get rendered due to their size.

Instances For

instance Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprLengthDiffRange :

Repr LengthDiffRange

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprLengthDiffRange = { reprPrec := Phenomena.WordOrder.Studies.ArnoldEtAl2000.instReprLengthDiffRange.repr }

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsRange :

LengthDiffRange

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.hnpsRange = { label := "HNPS corpus", rangeMin := -21, rangeMax := 44 }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpusRange :

LengthDiffRange

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daCorpusRange = { label := "DA corpus", rangeMin := -29, rangeMax := 35 }

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExpRange :

LengthDiffRange

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.daExpRange = { label := "DA experiment", rangeMin := -8, rangeMax := 20 }

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.corpus_wider_range :

daCorpusRange.rangeMax - daCorpusRange.rangeMin > daExpRange.rangeMax - daExpRange.rangeMin

The corpus data span a wider heaviness range than the experiment: 64 vs. 28 in the ×10 encoding, i.e. 6.4 vs. 2.8 words between the extreme category means. This explains why heaviness dominates in the corpus but not the experiment: with less variation in weight, there is less for the weight factor to predict.

HNPS has the widest heaviness range overall: its extreme category means differ by 6.5 words (65 in the ×10 encoding).
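The spans compared here can be recomputed from the three range records. In this standalone sketch `span` is a hypothetical helper, and the values stay in the ×10 encoding described in the `LengthDiffRange` docstring.

```lean
namespace Sketch

structure LengthDiffRange where
  label : String
  rangeMin : Int
  rangeMax : Int

/-- Hypothetical helper: width of a range, still in ×10 units. -/
def span (r : LengthDiffRange) : Int := r.rangeMax - r.rangeMin

def hnpsRange : LengthDiffRange := ⟨"HNPS corpus", -21, 44⟩
def daCorpusRange : LengthDiffRange := ⟨"DA corpus", -29, 35⟩
def daExpRange : LengthDiffRange := ⟨"DA experiment", -8, 20⟩

example : span hnpsRange = 65 := by decide
example : span daCorpusRange = 64 := by decide
example : span daExpRange = 28 := by decide

-- The DA corpus range is wider than the DA experiment range.
example : span daCorpusRange > span daExpRange := by decide

end Sketch
```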

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.arnoldGiven :

Core.InformationStructure.DiscourseStatus

Arnold et al.'s "given" (previously mentioned or inferable from something mentioned within the current agenda item in the corpus; established by question or mention in the immediately preceding utterance in the experiment) maps to DiscourseStatus.given.

Their classification collapses @cite{prince-1981}'s three-way given/inferable/new into two categories: inferables are grouped with given.

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.arnoldGiven = Core.InformationStructure.DiscourseStatus.given

Instances For

def Phenomena.WordOrder.Studies.ArnoldEtAl2000.arnoldNew :

Core.InformationStructure.DiscourseStatus

Arnold et al.'s "new" (not previously mentioned and not inferable) maps to DiscourseStatus.new. This is broader than @cite{kratzer-selkirk-2020}'s .new — it includes material that K&S would mark as .focused ([FoC]-marked, contrasted).

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.arnoldNew = Core.InformationStructure.DiscourseStatus.new

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.given_aligns :

arnoldGiven = Core.InformationStructure.DiscourseStatus.given

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.new_aligns :

arnoldNew = Core.InformationStructure.DiscourseStatus.new
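The collapse from a three-way to a two-way classification can be pictured as a function between two small enumerations. The types and the `collapse` function below are hypothetical stand-ins; the actual `DiscourseStatus` type lives in `Core.InformationStructure` and may have more constructors.

```lean
namespace Sketch

-- Three-way scheme attributed to prince-1981 in the text above.
inductive PrinceStatus where
  | given
  | inferable
  | new

-- Two-way scheme used by Arnold et al.
inductive BinaryStatus where
  | given
  | new
  deriving DecidableEq

/-- Hypothetical mapping: inferables are grouped with given. -/
def collapse : PrinceStatus → BinaryStatus
  | .given => .given
  | .inferable => .given
  | .new => .new

-- The mapping cannot tell inferable from given ...
example : collapse .inferable = collapse .given := rfl

-- ... but still separates both from new.
example : collapse .inferable ≠ collapse .new := by decide

end Sketch
```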

DLM: Correct on weight, blind to discourse

totalDepLength is defined over Dependency = (headIdx × depIdx × DepRel). The function never accesses t.words, so no property of the words — form, category, features, discourse status — enters the computation.

Arnold et al.'s finding that newness significantly predicts ordering in BOTH constructions (even after controlling for heaviness) means DLM alone is insufficient as a complete account of constituent ordering.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.totalDepLength_word_invariant (deps : List DepGrammar.Dependency) (rootIdx : ℕ) (words1 words2 : List Word) :

DepGrammar.DependencyLength.totalDepLength { words := words1, deps := deps, rootIdx := rootIdx } = DepGrammar.DependencyLength.totalDepLength { words := words2, deps := deps, rootIdx := rootIdx }

DLM word-invariance. totalDepLength yields the same value for any two trees sharing the same dependency structure, regardless of the words.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.dlm_discourse_blind (deps : List DepGrammar.Dependency) (rootIdx : ℕ) (givenWords newWords : List Word) :

DepGrammar.DependencyLength.totalDepLength { words := givenWords, deps := deps, rootIdx := rootIdx } = DepGrammar.DependencyLength.totalDepLength { words := newWords, deps := deps, rootIdx := rootIdx }

DLM assigns identical cost to trees differing only in whether NPs are discourse-given or discourse-new.

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.depLength_ignores_relation (h d : ℕ) (r1 r2 : UD.DepRel) :

DepGrammar.DependencyLength.depLength { headIdx := h, depIdx := d, depType := r1 } = DepGrammar.DependencyLength.depLength { headIdx := h, depIdx := d, depType := r2 }

Even at the single-dependency level, depLength ignores the grammatical relation. The cost is purely |headIdx - depIdx|.
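Both invariance results hold because the cost function never inspects the fields in question. The toy model below makes that visible: `Dep`, `Tree`, `depLength`, and `totalDepLength` are simplified stand-ins written for this sketch, not the `DepGrammar` definitions, whose bodies are not shown on this page.

```lean
namespace DLMSketch

structure Dep where
  headIdx : Nat
  depIdx : Nat
  rel : String   -- stand-in for UD.DepRel

structure Tree where
  words : List String
  deps : List Dep

/-- Absolute distance |headIdx − depIdx|, computed on ℕ. -/
def depLength (d : Dep) : Nat :=
  if d.headIdx ≥ d.depIdx then d.headIdx - d.depIdx else d.depIdx - d.headIdx

/-- Sum of dependency lengths; note that `t.words` is never used. -/
def totalDepLength (t : Tree) : Nat :=
  (t.deps.map depLength).foldl (· + ·) 0

-- Swapping the word list leaves the cost unchanged, by unfolding alone.
theorem word_invariant (deps : List Dep) (w1 w2 : List String) :
    totalDepLength { words := w1, deps := deps } =
    totalDepLength { words := w2, deps := deps } := rfl

-- Swapping the relation label leaves a single dependency's cost unchanged.
theorem ignores_relation (h d : Nat) (r1 r2 : String) :
    depLength { headIdx := h, depIdx := d, rel := r1 } =
    depLength { headIdx := h, depIdx := d, rel := r2 } := rfl

example : depLength ⟨0, 3, "obj"⟩ = 3 := by decide

end DLMSketch
```

In this toy setting both theorems close with `rfl`, which is the formal counterpart of "blind by construction".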

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.dlm_predicts_heavy_shift :

DepGrammar.DependencyLength.totalDepLength DepGrammar.DependencyLength.heavyNPShiftOptimal < DepGrammar.DependencyLength.totalDepLength DepGrammar.DependencyLength.heavyNPShiftSuboptimal

DLM correctly predicts the weight direction: heavy NP shift reduces dependency length.
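A small worked case shows why shifting helps. Assume a verb at position 0 followed by two head-initial phrases, and count only the two verb-to-head dependencies; phrase-internal dependencies are assumed to be the same in either order. The function names and the lengths 5 and 2 are made up for illustration and are unrelated to `heavyNPShiftOptimal` and `heavyNPShiftSuboptimal`.

```lean
/-- Cost of verb-to-head links for phrases placed after `offset` words.
    Each phrase head is the first word of its phrase. -/
def verbDepCostFrom (offset : Nat) : List Nat → Nat
  | [] => 0
  | len :: rest => (offset + 1) + verbDepCostFrom (offset + len) rest

/-- Hypothetical helper: total verb-to-head distance for a phrase order. -/
def verbDepCost (lens : List Nat) : Nat := verbDepCostFrom 0 lens

-- V DO PP with a 5-word DO and a 2-word PP: heads at 1 and 6.
example : verbDepCost [5, 2] = 7 := by decide

-- V PP DO (shifted): heads at 1 and 3.
example : verbDepCost [2, 5] = 4 := by decide

-- Placing the short phrase first lowers the total.
example : verbDepCost [2, 5] < verbDepCost [5, 2] := by decide
```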

structure Phenomena.WordOrder.Studies.ArnoldEtAl2000.PureDiscourseModel :

A pure-discourse ordering model: the preference for placing a constituent in late position is determined solely by its discourse status.

latePref : Core.InformationStructure.DiscourseStatus → ℕ
new_after_given : self.latePref Core.InformationStructure.DiscourseStatus.new > self.latePref Core.InformationStructure.DiscourseStatus.given
The core given-before-new claim.

Instances For

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.pure_discourse_weight_blind (m : PureDiscourseModel) (s : Core.InformationStructure.DiscourseStatus) (_weight1 _weight2 : ℕ) :

m.latePref s = m.latePref s

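A pure-discourse model is weight-blind by type: for a fixed discourse status, it assigns the same preference regardless of constituent length. The statement itself holds by reflexivity; the unused _weight1 and _weight2 arguments record that latePref has no parameter through which weight could enter.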
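For concreteness, here is one inhabitant of such a structure. The two-constructor `Status` type is a hypothetical stand-in for `DiscourseStatus`, and the preference values 0 and 1 are arbitrary.

```lean
namespace Sketch

inductive Status where
  | given
  | new

structure PureDiscourseModel where
  latePref : Status → Nat
  new_after_given : latePref .new > latePref .given

/-- Arbitrary illustrative preferences: new material prefers late position. -/
def binaryPref : Status → Nat
  | .given => 0
  | .new => 1

def binaryModel : PureDiscourseModel := ⟨binaryPref, by decide⟩

-- Constituent length has no way to enter: only the status is consulted.
example : binaryModel.latePref .new = 1 := rfl

end Sketch
```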

theorem Phenomena.WordOrder.Studies.ArnoldEtAl2000.heaviness_refutes_pure_discourse :

daCorpusResult.heavinessSig = true ∧ hnpsCorpusResult.heavinessSig = true

Arnold et al.'s corpus results refute pure-discourse accounts: heaviness is significant in BOTH constructions even after controlling for newness. A weight-blind model cannot explain these results.

@[reducible, inline]

abbrev Phenomena.WordOrder.Studies.ArnoldEtAl2000.OrderingModel :

The minimal adequate model type: a function of both weight and discourse status, encoding Arnold et al.'s central finding.

Equations

Phenomena.WordOrder.Studies.ArnoldEtAl2000.OrderingModel = (ℕ → Core.InformationStructure.DiscourseStatus → ℕ)

Instances For
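A function of this type can respond to each factor while holding the other fixed, which neither a weight-only nor a discourse-only model can do. The additive model below is a hypothetical example: `Status` stands in for `DiscourseStatus`, and the bonus of 3 is an arbitrary constant, not an estimate from the paper.

```lean
namespace Sketch

inductive Status where
  | given
  | new

abbrev OrderingModel := Nat → Status → Nat

/-- Hypothetical model: late-position preference = word count + newness bonus. -/
def additive : OrderingModel := fun weight status =>
  match status with
  | .given => weight
  | .new => weight + 3

-- Weight matters at a fixed discourse status.
example : additive 5 .given < additive 9 .given := by decide

-- Discourse status matters at a fixed weight.
example : additive 5 .given < additive 5 .new := by decide

end Sketch
```

A purely additive model has no interaction term, so it matches the corpus pattern (two independent main effects) more closely than the experimental one.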