Explaining Argument-Coding Splits @cite{haspelmath-2021} #
Explaining argument-coding splits with role-reference associations. Linguistics 59(5): 1231–1270.
Overview #
Haspelmath proposes a single meta-universal — the Role-Reference Association Universal (Universal 1) — that subsumes differential object marking, split ergativity, ditransitive splits, and person-scenario splits under one generalization: deviations from the usual associations of role rank and referential prominence tend to be coded by longer grammatical forms.
The paper further argues that this universal itself reduces to a form-frequency correspondence (§10.2): the "usual" associations are the frequent ones, and frequent expressions tend to get shorter forms (Zipf).
The Paper's 14 Universals #
The paper states 14 numbered universals organized hierarchically (Figure 1, §11.1):
Meta-universals (§2) #
- Universal 1: Role-Reference Association Universal — deviations from usual role-prominence associations get longer coding
- Universal 2: Usual role-reference associations — A/R tend to be prominent, P/T tend to be non-prominent
Single-argument coding splits (§3–5) #
- Universal 3: Single-argument coding universal (general)
- Universal 4: Split P flagging / DOM (§4.1)
- Universal 5: Scenario coding universal (§3, §6)
- Universal 6: Split A flagging / DSM (§4.2)
- Universal 7: Split R flagging (§5)
- Universal 8: Split T flagging (§5)
Scenario and voice splits (§6–9) #
- Universal 9: Monotransitive scenario splits (§6)
- Universal 10: Ditransitive person-role splits (§7)
- Universal 11: Relative scenario splits (§8)
- Universal 12: Inverse (§9)
- Universal 13: Passive (§9)
- Universal 14: Dative alternation (§9)
What We Formalize #
This file captures Universals 1–9 using existing linglib infrastructure.
Universals 4 and 6 are already proven in Aissen2003.lean and
DeHoopMalchukov2008.lean respectively; we re-export those results.
Universals 10–14 (ditransitive scenarios, inverse, passive, dative
alternation) are noted below with supporting definitions, but a full
treatment requires additional infrastructure.
Universal 1 (@cite{haspelmath-2021}, §2, statement (5)):
"Deviations from the usual associations of role rank and referential
prominence tend to be coded by longer grammatical forms (if the coding
is asymmetric)."
This is the paper's central claim — a single meta-universal that
subsumes all 13 other universals. It is formalized here as the
conjunction of two properties: (a) the direction of differential
marking matches the role's default prominence zone, and (b) the
form-frequency correspondence assigns higher frequency to the
usual (default) zone.
Universal 1: The Role-Reference Association Universal holds for all monotransitive and ditransitive roles. For each role, the direction of differential marking matches the deviation from the usual association.
- P/T: prominent end is unusual → differential marking targets prominent
- A/R: non-prominent end is unusual → differential marking targets non-prominent
Universal 1 is grounded by the form-frequency correspondence (§10.2): the frequency proxy assigns higher frequency to the "usual" (default) zone for every role. Usual = frequent = short coding.
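The two conjuncts (a) and (b) can be sketched in a few lines of standalone Lean 4. This is a minimal illustration, not the file's actual formalization: the names `Role`, `Zone`, `usualZone`, `markedZone`, and `freqClass` are assumptions introduced here, S is left out because the text states no marking direction for it, and the values 2 and 1 are arbitrary ordinal proxies rather than corpus frequencies.

```lean
namespace U1Sketch

/-- The four roles for which the text states a marking direction. -/
inductive Role where
  | A | P | R | T
  deriving DecidableEq, Repr

/-- The two ends of a prominence scale. -/
inductive Zone where
  | prominent | nonProminent
  deriving DecidableEq, Repr

/-- The opposite end of the scale. -/
def Zone.other : Zone → Zone
  | .prominent => .nonProminent
  | .nonProminent => .prominent

/-- Usual (default) zone: A/R prominent, P/T non-prominent. -/
def usualZone : Role → Zone
  | .A => .prominent
  | .R => .prominent
  | .P => .nonProminent
  | .T => .nonProminent

/-- Zone targeted by differential marking, following the two bullets above. -/
def markedZone : Role → Zone
  | .A => .nonProminent
  | .R => .nonProminent
  | .P => .prominent
  | .T => .prominent

/-- Hypothetical ordinal frequency proxy: 2 inside the usual zone, 1 outside. -/
def freqClass (r : Role) (z : Zone) : Nat :=
  match usualZone r, z with
  | .prominent, .prominent => 2
  | .nonProminent, .nonProminent => 2
  | _, _ => 1

/-- (a) Marking targets exactly the deviation from the usual association. -/
theorem marking_targets_deviation (r : Role) :
    markedZone r = Zone.other (usualZone r) := by
  cases r <;> rfl

/-- (b) The usual zone is the more frequent one, so it is predicted to be coded shorter. -/
theorem usual_is_frequent (r : Role) :
    freqClass r (markedZone r) < freqClass r (usualZone r) := by
  cases r <;> decide

end U1Sketch
```

Both proofs are case splits over the four roles, which is all a finite statement of this kind should need.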
Universal 2 (@cite{haspelmath-2021}, §2, statement (6)):
"Arguments with higher-ranked roles tend to be more referentially
prominent, and vice versa."
This defines the *baseline* for Universal 1: A/R arguments (higher role
rank) tend to be human, definite, topical. P/T arguments (lower role
rank) tend to be inanimate, indefinite, new-information.
Formalized as: high-rank roles have high default prominence, low-rank
roles have low default prominence.
Universal 2: Role rank determines default prominence direction. A and R are high-default (expect prominent referents). P and T are low-default (expect non-prominent referents).
Universal 2 is consistent with role rank: A > R > S > T > P. Higher role rank → higher default prominence expectation.
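As a rough illustration of how role rank and default prominence line up, the following standalone Lean 4 sketch encodes the rank A > R > S > T > P as natural numbers. The identifiers `Role.rank`, `Role.isHighDefault`, and `Role.isLowDefault` are hypothetical, and S is deliberately left in neither class because the text assigns defaults only to A/R and P/T.

```lean
namespace U2Sketch

/-- Argument roles, including intransitive S. -/
inductive Role where
  | A | R | S | T | P
  deriving DecidableEq, Repr

/-- Role rank A > R > S > T > P, encoded as a natural number (assumed encoding). -/
def Role.rank : Role → Nat
  | .A => 5
  | .R => 4
  | .S => 3
  | .T => 2
  | .P => 1

/-- High-default roles expect prominent referents. -/
def Role.isHighDefault : Role → Bool
  | .A => true
  | .R => true
  | _ => false

/-- Low-default roles expect non-prominent referents. -/
def Role.isLowDefault : Role → Bool
  | .P => true
  | .T => true
  | _ => false

/-- Every high-default role outranks every low-default role. -/
theorem high_outranks_low (r₁ r₂ : Role) :
    r₁.isHighDefault = true → r₂.isLowDefault = true → r₂.rank < r₁.rank := by
  cases r₁ <;> cases r₂ <;> decide

end U2Sketch
```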
Universal 3 (@cite{haspelmath-2021}, §3, statement (13)):
"The single-argument flagging universal: If a language has an asymmetric
single-argument flagging split depending on some prominence scale, then
the coding is longer for prominent P/T-arguments or for non-prominent
A/R-arguments."
This is the general form from which Universals 4, 6, 7, 8 follow as
specific cases for each argument role. It applies to both flagging and
indexing (statement (15)).
Universal 3: The monotonicity direction is determined by role type. For P/T (low-default roles), differential marking targets the prominent end (upper set / monotone). For A/R (high-default roles), it targets the non-prominent end (lower set / anti-monotone).
This subsumes both flagging and indexing channels.
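The upper-set / lower-set distinction can be made concrete with a small standalone Lean 4 sketch. It assumes a split pattern is a `List Bool` ordered from the most prominent scale position to the least, with `true` meaning "takes the longer coding". The names `isUpperSet`, `isLowerSet`, and `admissible` are illustrative and need not match the file's own definitions.

```lean
namespace U3Sketch

inductive Role where
  | A | P | R | T

/-- Upper set: the marked positions form a block at the prominent end
    (the list has the shape true ... true false ... false). -/
def isUpperSet : List Bool → Bool
  | [] => true
  | true :: rest => isUpperSet rest
  | false :: rest => rest.all fun b => !b

/-- Lower set: the marked positions form a block at the non-prominent end. -/
def isLowerSet (p : List Bool) : Bool := isUpperSet p.reverse

/-- P/T splits must be upper sets; A/R splits must be lower sets. -/
def admissible : Role → List Bool → Bool
  | .P, p => isUpperSet p
  | .T, p => isUpperSet p
  | .A, p => isLowerSet p
  | .R, p => isLowerSet p

-- Hypothetical three-point scale, most prominent position first.
example : admissible .P [true, true, false] = true := by decide
example : admissible .P [false, true, false] = false := by decide
example : admissible .A [false, false, true] = true := by decide
example : admissible .A [true, false, false] = false := by decide

end U3Sketch
```

Uniform patterns (all marked or all unmarked) satisfy both predicates, which matches the universal's restriction to asymmetric splits: it constrains the direction of a split, not whether a split exists.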
Universal 4 (@cite{haspelmath-2021}, §4.1, statement (14)):
"Split P flagging (Differential Object Marking): If a language has an
asymmetric split in P flagging depending on some prominence scale, then
the special flag is used on the prominent P-argument."
This is @cite{aissen-2003}'s core result: the OT factorial typology generates
only monotone DOM patterns — marking always starts from the prominent
end of the animacy/definiteness scale.
Universal 4: All OT-generated animacy DOM patterns are monotone (prominent Ps marked before non-prominent Ps). Re-exported from @cite{aissen-2003}.
Universal 5:
The scenario coding universal: if a language has an asymmetric scenario
split, then the coding is longest for upstream scenarios (P more
prominent than A), shortest for downstream scenarios (A more prominent
than P), and intermediate for balanced scenarios.
This is the second major branch under Universal 1 (alongside Universal
3). While Universal 3 conditions coding on the prominence of a single
argument, Universal 5 conditions it on the *combination* of A-person
and P-person.
Downstream scenario: SAP acts on 3rd (A more prominent than P).
Upstream scenario: 3rd acts on SAP (P more prominent than A).
Balanced scenario: both 3rd person (same prominence level).
Local scenario: 1st acts on 2nd (both SAP; downstream sub-case).
Universal 5: Downstream scenarios have higher A-person rank than upstream. Higher A-person rank = more "usual" = predicted shorter coding.
Universal 5: The downstream/upstream/balanced trichotomy is exhaustive for all 9 person-pair scenarios.
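A minimal standalone sketch of the scenario classification, under one explicit assumption: person prominence is ranked 1 > 2 > 3, which is inferred from the text treating 1→2 as a downstream sub-case. The structure name `Scenario` and its fields `aPerson` and `pPerson` follow the names used in this file's prose; the `Person` constructors, `Person.rank`, and the three predicates are hypothetical stand-ins.

```lean
namespace U5Sketch

/-- First, second, and third person. -/
inductive Person where
  | p1 | p2 | p3
  deriving DecidableEq, Repr

/-- Person prominence, assuming the ranking 1 > 2 > 3. -/
def Person.rank : Person → Nat
  | .p1 => 3
  | .p2 => 2
  | .p3 => 1

/-- A monotransitive scenario: the person of A and the person of P. -/
structure Scenario where
  aPerson : Person
  pPerson : Person

def Scenario.isDownstream (s : Scenario) : Bool :=
  decide (s.pPerson.rank < s.aPerson.rank)

def Scenario.isUpstream (s : Scenario) : Bool :=
  decide (s.aPerson.rank < s.pPerson.rank)

def Scenario.isBalanced (s : Scenario) : Bool :=
  s.aPerson.rank == s.pPerson.rank

/-- Each of the 3 × 3 = 9 person pairs falls into one of the three classes. -/
theorem trichotomy (a p : Person) :
    (Scenario.isDownstream ⟨a, p⟩ || Scenario.isBalanced ⟨a, p⟩
      || Scenario.isUpstream ⟨a, p⟩) = true := by
  cases a <;> cases p <;> decide

-- The four scenarios singled out in the text.
example : Scenario.isDownstream ⟨.p1, .p3⟩ = true := by decide  -- SAP acts on 3rd
example : Scenario.isUpstream ⟨.p3, .p1⟩ = true := by decide    -- 3rd acts on SAP
example : Scenario.isBalanced ⟨.p3, .p3⟩ = true := by decide    -- both 3rd person
example : Scenario.isDownstream ⟨.p1, .p2⟩ = true := by decide  -- local, 1st acts on 2nd

end U5Sketch
```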
Universal 6 (@cite{haspelmath-2021}, §4.2, statement (21)):
"Split A flagging (Differential Subject Marking): If a language has an
asymmetric split in A flagging depending on some prominence scale, then
the special flag is used on the non-prominent A-argument."
The mirror image of Universal 4. Verified via the Distinguish constraint of @cite{de-hoop-malchukov-2008}: weak (non-prominent) subjects get overt
ergative marking.
Universal 6: Under Distinguish, weak subjects are marked (Fore pattern). Re-exported from @cite{de-hoop-malchukov-2008}.
Universal 7 (@cite{haspelmath-2021}, §5, statement (26)):
"Split R flagging: If a language has an asymmetric split in R flagging
depending on some prominence scale, then the special flag is used on
the non-prominent R-argument."
R behaves like A: both are high-rank roles whose differential marking
targets the non-prominent end.
Universal 8 (@cite{haspelmath-2021}, §5, statement (27)):
"Split T flagging: If a language has an asymmetric split in T flagging
depending on some prominence scale, then the special flag is used on
the prominent T-argument."
T behaves like P: both are low-rank roles whose differential marking
targets the prominent end.
Universal 7: R targets the non-prominent end (like A).
Universal 8: T targets the prominent end (like P).
R monotonicity uses the same direction as A monotonicity.
T monotonicity uses the same direction as P monotonicity.
Universal 9: Monotransitive scenario splits.
When argument coding depends on the person combination (A-person ×
P-person), coding is longest for upstream scenarios (3→SAP) and
shortest for downstream (SAP→3). Local scenarios (SAP↔SAP) tend
to get short coding as both arguments are highly prominent.
Universal 9: Local scenarios have SAP in both slots — the most prominent combination.
Universal 9: Direct (SAP→3) scenarios are downstream.
Universal 9: Inverse (3→SAP) scenarios are upstream.
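The local / direct / inverse labels can be sketched with a coarser two-level prominence (SAP above 3rd person). This is an illustrative standalone Lean 4 fragment; `Person.isSAP`, `Person.prominence`, and the three `Scenario` predicates are assumed names, not necessarily those of the file.

```lean
namespace U9Sketch

/-- First, second, and third person. -/
inductive Person where
  | p1 | p2 | p3
  deriving DecidableEq, Repr

/-- Speech-act participants (SAP) are 1st and 2nd person. -/
def Person.isSAP : Person → Bool
  | .p3 => false
  | _ => true

/-- Two-level prominence: SAP above 3rd person. -/
def Person.prominence (p : Person) : Nat := if p.isSAP then 1 else 0

structure Scenario where
  aPerson : Person
  pPerson : Person

/-- Local: SAP in both slots. -/
def Scenario.isLocal (s : Scenario) : Bool := s.aPerson.isSAP && s.pPerson.isSAP

/-- Direct: SAP acts on 3rd. -/
def Scenario.isDirect (s : Scenario) : Bool := s.aPerson.isSAP && !s.pPerson.isSAP

/-- Inverse: 3rd acts on SAP. -/
def Scenario.isInverse (s : Scenario) : Bool := !s.aPerson.isSAP && s.pPerson.isSAP

/-- Direct scenarios are downstream: A is more prominent than P. -/
theorem direct_is_downstream (a p : Person) :
    Scenario.isDirect ⟨a, p⟩ = true → p.prominence < a.prominence := by
  cases a <;> cases p <;> decide

/-- Inverse scenarios are upstream: P is more prominent than A. -/
theorem inverse_is_upstream (a p : Person) :
    Scenario.isInverse ⟨a, p⟩ = true → a.prominence < p.prominence := by
  cases a <;> cases p <;> decide

/-- Local scenarios put both arguments at the top prominence level. -/
theorem local_both_prominent (a p : Person) :
    Scenario.isLocal ⟨a, p⟩ = true → a.prominence = 1 ∧ p.prominence = 1 := by
  cases a <;> cases p <;> decide

end U9Sketch
```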
The parallel between monotransitive and ditransitive alignment is a structural consequence of the role-rank hierarchy (Universal 2):
- Indirective (R marked, T = P) parallels accusative (P marked, A = S)
- Secundative (T marked, R = P) parallels ergative (A marked, P = S)
This follows from Universals 7–8: R behaves like A, T behaves like P.
Ditransitive alignment parallels monotransitive alignment: indirective marks R (the high-rank role), secundative marks T (the low-rank role).
The correlation between DOM and accusative alignment, and between DSM and ergative alignment, is independently derived in @cite{de-hoop-malchukov-2008} via the PaIP (Primary Actant Immunity Principle). @cite{haspelmath-2021} discusses this as background but does not number it as one of his 14 universals.
This theorem re-exports the De Hoop & Malchukov result for reference.
Differential marking patterns (@cite{de-hoop-malchukov-2008}, not a numbered Haspelmath universal — included for cross-reference). Under weak BiOT, all rankings produce differential marking; the asymmetry DOM ↔ nom-acc / DSM ↔ ergative requires voice alternation (passive/antipassive) to resolve PaIP-I/D conflicts.
Usual discourse-status association: A/R tend to be given (topical); P/T tend to be new (focal). S is mapped to given alongside A and R. The mapping lives in the study file, bridging ArgumentRole (Prominence) and DiscourseStatus (InformationStructure), so that Core stays loosely coupled.
Equations
- Phenomena.Case.Studies.Haspelmath2021.usualDiscourseStatus Core.Prominence.ArgumentRole.A = Core.InformationStructure.DiscourseStatus.given
- Phenomena.Case.Studies.Haspelmath2021.usualDiscourseStatus Core.Prominence.ArgumentRole.R = Core.InformationStructure.DiscourseStatus.given
- Phenomena.Case.Studies.Haspelmath2021.usualDiscourseStatus Core.Prominence.ArgumentRole.S = Core.InformationStructure.DiscourseStatus.given
- Phenomena.Case.Studies.Haspelmath2021.usualDiscourseStatus Core.Prominence.ArgumentRole.P = Core.InformationStructure.DiscourseStatus.new
- Phenomena.Case.Studies.Haspelmath2021.usualDiscourseStatus Core.Prominence.ArgumentRole.T = Core.InformationStructure.DiscourseStatus.new
Universal 10:
"If a language has a person-role constraint in its ditransitive
construction, it is restricted to T=SAP, R=3."
Reuse `Scenario` for R×T pairs (aPerson = R, pPerson = T). When T is
SAP and R is 3rd (upstream ditransitive), coding is longer — predicted
by the role-reference association (R is high-rank, so R=3rd is unusual;
T is low-rank, so T=SAP is unusual).
Upstream ditransitive scenario: R=3rd, T=SAP (both deviate from usual).
Downstream ditransitive: R=SAP, T=3rd (both match usual).
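A rough standalone sketch of the reuse described above, with `aPerson` holding R and `pPerson` holding T. The helper `Scenario.deviations`, which counts how many of the two arguments depart from the usual association, is an assumption added for illustration and is not claimed to exist in the file.

```lean
namespace U10Sketch

/-- First, second, and third person. -/
inductive Person where
  | p1 | p2 | p3
  deriving DecidableEq, Repr

def Person.isSAP : Person → Bool
  | .p3 => false
  | _ => true

/-- Reused for ditransitives: `aPerson` holds R, `pPerson` holds T. -/
structure Scenario where
  aPerson : Person
  pPerson : Person

/-- Upstream ditransitive: R is 3rd person and T is SAP. -/
def Scenario.isUpstreamDitransitive (s : Scenario) : Bool :=
  !s.aPerson.isSAP && s.pPerson.isSAP

/-- Downstream ditransitive: R is SAP and T is 3rd person. -/
def Scenario.isDownstreamDitransitive (s : Scenario) : Bool :=
  s.aPerson.isSAP && !s.pPerson.isSAP

/-- Hypothetical helper: number of arguments deviating from the usual
    association (R = 3rd counts as one deviation, T = SAP as another). -/
def Scenario.deviations (s : Scenario) : Nat :=
  (if s.aPerson.isSAP then 0 else 1) + (if s.pPerson.isSAP then 1 else 0)

/-- Upstream ditransitives deviate in both arguments. -/
theorem upstream_both_deviate (r t : Person) :
    Scenario.isUpstreamDitransitive ⟨r, t⟩ = true →
    Scenario.deviations ⟨r, t⟩ = 2 := by
  cases r <;> cases t <;> decide

/-- Downstream ditransitives deviate in neither argument. -/
theorem downstream_none_deviate (r t : Person) :
    Scenario.isDownstreamDitransitive ⟨r, t⟩ = true →
    Scenario.deviations ⟨r, t⟩ = 0 := by
  cases r <;> cases t <;> decide

end U10Sketch
```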
Universal 11:
"If a language has a relative scenario split, then the coding of
upstream scenarios is longer (or at least not shorter) than the
coding of downstream scenarios."
Frequency class decreases monotonically: downstream > balanced > upstream.
By the form-frequency correspondence, coding length increases in the
same order.
Universal 12:
"Inverse verb forms (marking upstream scenarios) tend to be
morphologically more complex than direct verb forms."
Upstream = unusual = lower frequency class = predicted longer by FFC.
Universal 13:
"Passive voice is preferred when A is non-given (unusual for A) and/or
P is non-new (unusual for P)."
Active voice is the "usual" frame where A=given, P=new. Passive is
preferred when discourse statuses deviate from these defaults.
Passive is preferred when A's or P's discourse status deviates from the usual association.
Universal 14:
"The prepositional dative (longer form) is preferred when R is non-given
and/or T is non-new."
The double-object construction is the shorter form, preferred when
discourse statuses match the usual associations (R=given, T=new).
PP-dative (longer) is preferred when R's or T's discourse status deviates from the usual association.
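The passive and PP-dative preferences share one shape: the longer construction is preferred when either argument's discourse status deviates from the usual one for its role. A minimal standalone sketch follows. The equations for these two definitions are not rendered on this page, so `deviates`, `passivePreferred`, and `ppDativePreferred` below are reconstructions from the prose and should be read as assumptions. `usualStatus` mirrors `usualDiscourseStatus` with S omitted.

```lean
namespace VoiceSketch

inductive DiscourseStatus where
  | given | new
  deriving DecidableEq, Repr

inductive Role where
  | A | P | R | T
  deriving DecidableEq, Repr

/-- Usual association: A/R given, P/T new. -/
def usualStatus : Role → DiscourseStatus
  | .A => .given
  | .R => .given
  | .P => .new
  | .T => .new

/-- An argument deviates when its status differs from the usual one for its role. -/
def deviates (r : Role) (s : DiscourseStatus) : Bool :=
  match usualStatus r, s with
  | .given, .given => false
  | .new, .new => false
  | _, _ => true

/-- Universal 13 (sketch): passive is preferred when A or P deviates. -/
def passivePreferred (a p : DiscourseStatus) : Bool :=
  deviates .A a || deviates .P p

/-- Universal 14 (sketch): the PP-dative is preferred when R or T deviates. -/
def ppDativePreferred (r t : DiscourseStatus) : Bool :=
  deviates .R r || deviates .T t

-- Usual frame (given acts on new): the shorter construction is expected.
example : passivePreferred .given .new = false := by decide
example : ppDativePreferred .given .new = false := by decide

-- One deviation is enough to prefer the longer construction.
example : passivePreferred .new .new = true := by decide
example : ppDativePreferred .given .given = true := by decide

end VoiceSketch
```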
Form-frequency unification (@cite{haspelmath-2021}, §10.2, statement (68)):
All 14 universals reduce to Zipf's form-frequency correspondence.
The scenario-level check: for every pair of scenarios where one is
downstream and the other upstream, the downstream one has a strictly
higher frequency class.
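The scenario-level check can be sketched as a finite case analysis in standalone Lean 4. The frequency classes 3, 2, 1 for downstream, balanced, and upstream are hypothetical ordinal proxies (not corpus counts), and the person ranking 1 > 2 > 3 is an assumption, as are all identifiers other than `Scenario`, `aPerson`, and `pPerson`.

```lean
namespace FFCSketch

/-- First, second, and third person. -/
inductive Person where
  | p1 | p2 | p3
  deriving DecidableEq, Repr

/-- Person prominence, assuming the ranking 1 > 2 > 3. -/
def Person.rank : Person → Nat
  | .p1 => 3
  | .p2 => 2
  | .p3 => 1

structure Scenario where
  aPerson : Person
  pPerson : Person

def Scenario.isDownstream (s : Scenario) : Bool :=
  decide (s.pPerson.rank < s.aPerson.rank)

def Scenario.isUpstream (s : Scenario) : Bool :=
  decide (s.aPerson.rank < s.pPerson.rank)

/-- Hypothetical ordinal frequency class: downstream 3, balanced 2, upstream 1. -/
def Scenario.freqClass (s : Scenario) : Nat :=
  if s.isDownstream then 3 else if s.isUpstream then 1 else 2

/-- For every downstream/upstream pair, the downstream scenario has the
    strictly higher frequency class. -/
theorem downstream_more_frequent (a p a' p' : Person) :
    Scenario.isDownstream ⟨a, p⟩ = true →
    Scenario.isUpstream ⟨a', p'⟩ = true →
    Scenario.freqClass ⟨a', p'⟩ < Scenario.freqClass ⟨a, p⟩ := by
  cases a <;> cases p <;> cases a' <;> cases p' <;> decide

end FFCSketch
```

The proof enumerates all 81 combinations of two scenarios. Under the form-frequency correspondence, the higher frequency class is the one predicted to receive the shorter coding.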