Complementation Typology #
Cross-linguistic data on complement types, complement-taking predicates (CTPs), and subordination strategies.
Part I: CTP Typology #
@cite{dryer-haspelmath-2013} @cite{noonan-2007}
Based on:
- Noonan, M. (2007). Complementation. In T. Shopen (ed.), Language Typology and Syntactic Description, vol. 2, 2nd ed. Cambridge University Press.
Key contributions:
- Six complement types attested cross-linguistically
- Twelve CTP classes organized by semantics
- Realis/irrealis split predicts complement type selection
- Equi-deletion restricted to reduced complement types
- Implicational hierarchy on complement type distribution
- Negative raising data (fills a gap in linglib)
Part II: Subordination Strategies (WALS Chapters 94--95) #
WALS data on the cross-linguistic distribution of subordination structures:
- Ch 94: Order of Adverbial Subordinator and Clause
- Ch 95: Relationship between OV Order and Adposition Order
Part III: Complementation Strategies (WALS Chapters 124--128) #
@cite{cristofaro-2013}
WALS data on complement clause types across five subordination domains:
- Ch 124A: 'Want' Complement Subjects (283 languages)
- Ch 125A: Purpose Clauses — balanced vs deranked (170 languages)
- Ch 126A: 'When' Clauses — balanced vs deranked (174 languages)
- Ch 127A: Reason Clauses — balanced vs deranked (169 languages)
- Ch 128A: Utterance Complement Clauses — balanced vs deranked (143 languages)
Additional dimensions beyond WALS:
- Complementizer position (initial, final, none)
- Relative clause position (post-nominal, pre-nominal, internally headed, correlative)
- Purpose clause strategy (subjunctive, infinitive, nominalization, serial verb)
Key generalizations:
- Initial subordinators correlate with VO order; final subordinators with OV
- Post-nominal relative clauses are the global majority
- Pre-nominal RCs strongly correlate with OV order
- Complementizer position mirrors subordinator position
- SOV languages overwhelmingly use postpositions (Ch 95)
- Purpose clause strategy correlates with finiteness availability
A. Complement types (@cite{noonan-2007} §1) #
The six major complement types attested cross-linguistically. Ordered roughly from most to least "finite" (compare the "balanced" vs "deranked" distinction used in Part III).
- indicative : NoonanCompType
- subjunctive : NoonanCompType
- paratactic : NoonanCompType
- infinitive : NoonanCompType
- nominalized : NoonanCompType
- participle : NoonanCompType
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
B. Complement-taking predicate classes (@cite{noonan-2007} Table 2.1) #
Noonan's twelve CTP classes, organized by semantic contribution.
The ordering follows Noonan's Table 2.1 from most to least "assertive":
- Utterance/propAttitude/pretence: report/judge propositional content
- Commentative/knowledge: evaluate/know propositional content
- Perception: direct experience
- Desiderative/manipulative/modal: irrealis orientation
- Achievement/phasal: aspectual
- Negative: negation as CTP
- utterance : CTPClass
- propAttitude : CTPClass
- pretence : CTPClass
- commentative : CTPClass
- knowledge : CTPClass
- perception : CTPClass
- desiderative : CTPClass
- manipulative : CTPClass
- modal : CTPClass
- achievement : CTPClass
- phasal : CTPClass
- negative : CTPClass
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- Phenomena.Complementation.Typology.instBEqCTPClass.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Instances For
C. Reality status (@cite{noonan-2007} §2.3) #
The fundamental realis/irrealis split that predicts complement type selection. Realis CTPs tend toward indicative; irrealis toward subjunctive/infinitive.
- realis : RealityStatus
- irrealis : RealityStatus
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Reality status of each CTP class (@cite{noonan-2007} Table 2.3).
Equations
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.utterance = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.propAttitude = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.pretence = Phenomena.Complementation.Typology.RealityStatus.irrealis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.commentative = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.knowledge = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.perception = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.desiderative = Phenomena.Complementation.Typology.RealityStatus.irrealis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.manipulative = Phenomena.Complementation.Typology.RealityStatus.irrealis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.modal = Phenomena.Complementation.Typology.RealityStatus.irrealis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.achievement = Phenomena.Complementation.Typology.RealityStatus.irrealis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.phasal = Phenomena.Complementation.Typology.RealityStatus.realis
- Phenomena.Complementation.Typology.ctpRealityStatus Phenomena.Complementation.Typology.CTPClass.negative = Phenomena.Complementation.Typology.RealityStatus.irrealis
Instances For
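The mapping above can be sketched in a few lines of Lean. This is a minimal reconstruction that assumes the enumerations are plain inductive types. The namespace `RealitySketch` is hypothetical, and the actual linglib definitions may differ in detail.

```lean
namespace RealitySketch

/-- The twelve CTP classes listed in section B. -/
inductive CTPClass where
  | utterance | propAttitude | pretence | commentative | knowledge | perception
  | desiderative | manipulative | modal | achievement | phasal | negative
  deriving DecidableEq, Repr

/-- The two-way reality-status split. -/
inductive RealityStatus where
  | realis | irrealis
  deriving DecidableEq, Repr

/-- Default reality status per class, transcribed from the equations above. -/
def ctpRealityStatus : CTPClass → RealityStatus
  | .utterance | .propAttitude | .commentative | .knowledge | .perception | .phasal => .realis
  | .pretence | .desiderative | .manipulative | .modal | .achievement | .negative => .irrealis

-- Spot checks against two of the listed equations.
example : ctpRealityStatus CTPClass.phasal = RealityStatus.realis := rfl
example : ctpRealityStatus CTPClass.pretence = RealityStatus.irrealis := by decide

end RealitySketch
```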
D. CTP data structure #
A cross-linguistic datum about a complement-taking predicate.
Each datum records:
- Language and verb identification
- CTP class (Noonan Table 2.1)
- Which complement types this verb allows in this language
- Reality status (derived from CTP class, but overridable for exceptions)
- Control/raising properties (Noonan §2.1-2.2)
- Negative raising (fills a gap in linglib)
- language : String
- verb : String
- ctpClass : CTPClass
- allowedCompTypes : List NoonanCompType
- realityStatus : RealityStatus
- hasEquiDeletion : Bool
- hasRaising : Bool
- hasNegativeRaising : Bool
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Phenomena.Complementation.Typology.instBEqCTPDatum.beq x✝¹ x✝ = false
Instances For
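A sketch of how a datum with the fields above might be written. The field names come from the list in this section; the namespace `DatumSketch` and the `englishWant` entry are illustrative assumptions (based on the example "John wants to leave"), not the values stored in linglib.

```lean
namespace DatumSketch

inductive NoonanCompType where
  | indicative | subjunctive | paratactic | infinitive | nominalized | participle
  deriving DecidableEq, Repr

inductive CTPClass where
  | utterance | propAttitude | pretence | commentative | knowledge | perception
  | desiderative | manipulative | modal | achievement | phasal | negative
  deriving DecidableEq, Repr

inductive RealityStatus where
  | realis | irrealis
  deriving DecidableEq, Repr

/-- One cross-linguistic observation about a complement-taking predicate. -/
structure CTPDatum where
  language : String
  verb : String
  ctpClass : CTPClass
  allowedCompTypes : List NoonanCompType
  realityStatus : RealityStatus
  hasEquiDeletion : Bool
  hasRaising : Bool
  hasNegativeRaising : Bool
  deriving Repr

/-- Illustrative entry only; the flag values are assumptions for this sketch. -/
def englishWant : CTPDatum where
  language := "English"
  verb := "want"
  ctpClass := .desiderative
  allowedCompTypes := [.infinitive]
  realityStatus := .irrealis
  hasEquiDeletion := true
  hasRaising := false
  hasNegativeRaising := true

-- The entry allows an infinitive complement.
example : englishWant.allowedCompTypes.contains NoonanCompType.infinitive = true := by decide

end DatumSketch
```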
E. Cross-linguistic data #
English #
English attests all six complement types (@cite{noonan-2007} §1.1):
- Indicative: "John said that he was tired"
- Subjunctive: "I demand that he leave" (mandative)
- Paratactic: "John told Mary go away" (marginal)
- Infinitive: "John wants to leave"
- Nominalized: "John enjoys swimming"
- Participle: "I saw him leaving"
Equations
- One or more equations did not get rendered due to their size.
Instances For
Latin #
Latin uses indicative/subjunctive split along the realis/irrealis line (@cite{noonan-2007} §1.3).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Turkish #
Turkish strongly favors nominalized complements (@cite{noonan-2007} §1.4). Key contrast: even realis CTPs use nominalized forms.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Irish #
Irish uses a finite/non-finite split with interesting paratactic patterns (@cite{noonan-2007} §1.5).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Persian #
Persian shows a clear subjunctive/indicative split along CTP lines (@cite{noonan-2007} §2.3).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Hindi-Urdu #
Hindi-Urdu connects to existing Questions/Typology data. Uses subjunctive complement with desideratives (@cite{noonan-2007} §2.3).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Japanese #
Japanese connects to existing Q-particle data. Uses nominalized complements extensively.
Equations
- One or more equations did not get rendered due to their size.
Instances For
F. Data collections #
Equations
- One or more equations did not get rendered due to their size.
Instances For
G. Verified generalizations #
G1. Realis/irrealis split (@cite{noonan-2007} Table 2.3) #
Utterance, propAttitude, commentative, knowledge, perception, and phasal CTPs are realis; desiderative, manipulative, modal, achievement, pretence, and negative are irrealis.
Each datum's reality status matches the CTP class's default.
G2. Equi-deletion restriction (@cite{noonan-2007} §2.1) #
Equi-deletion (subject deletion under coreference) only occurs with reduced complement types (infinitive, nominalized), not with finite complements (indicative, subjunctive).
Is this complement type "reduced" (non-finite)?
Equations
Instances For
Equi-deletion only occurs when some allowed complement type is reduced.
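The restriction can be sketched as a Boolean check over a list of data. This is a hedged reconstruction: `EquiDatum` is a cut-down hypothetical structure, the three entries are illustrative, and `isReduced` follows the two types named above (infinitive, nominalized). Whether the library also treats participles as reduced is not shown on this page.

```lean
namespace EquiSketch

inductive NoonanCompType where
  | indicative | subjunctive | paratactic | infinitive | nominalized | participle
  deriving DecidableEq, Repr

/-- Assumption: only the two types named in the text count as reduced. -/
def isReduced : NoonanCompType → Bool
  | .infinitive | .nominalized => true
  | _ => false

/-- Hypothetical cut-down datum with only the fields this check needs. -/
structure EquiDatum where
  verb : String
  allowedCompTypes : List NoonanCompType
  hasEquiDeletion : Bool

/-- Illustrative English entries, not the library's data. -/
def sample : List EquiDatum :=
  [ { verb := "say", allowedCompTypes := [.indicative], hasEquiDeletion := false },
    { verb := "want", allowedCompTypes := [.infinitive], hasEquiDeletion := true },
    { verb := "enjoy", allowedCompTypes := [.nominalized], hasEquiDeletion := true } ]

/-- Equi-deletion implies that some allowed complement type is reduced. -/
def equiRespectsReduction (d : EquiDatum) : Bool :=
  !d.hasEquiDeletion || d.allowedCompTypes.any isReduced

example : sample.all equiRespectsReduction = true := by decide

end EquiSketch
```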
G3. Negative raising data (fills gap in linglib) #
Negative raising: "I don't think he left" ≈ "I think he didn't leave". Only propAttitude and desiderative CTPs support it. Knowledge/commentative do not.
This is the first negative-raising data in linglib.
Negative raising verbs are exclusively propAttitude or desiderative.
Knowledge CTPs never support negative raising.
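Both statements can be phrased as Boolean checks over the data. The sketch below assumes a hypothetical cut-down structure `NRDatum` and four illustrative English verbs; it shows the shape of the checks, not the library's actual theorems.

```lean
namespace NegRaisingSketch

inductive CTPClass where
  | utterance | propAttitude | pretence | commentative | knowledge | perception
  | desiderative | manipulative | modal | achievement | phasal | negative
  deriving DecidableEq, Repr

/-- Hypothetical cut-down datum with only the fields this check needs. -/
structure NRDatum where
  verb : String
  ctpClass : CTPClass
  hasNegativeRaising : Bool

/-- Illustrative entries, not the library's data. -/
def sample : List NRDatum :=
  [ ⟨"think", .propAttitude, true⟩,
    ⟨"want", .desiderative, true⟩,
    ⟨"know", .knowledge, false⟩,
    ⟨"regret", .commentative, false⟩ ]

/-- Negative raising implies the class is propAttitude or desiderative. -/
def negRaisingClassOk (d : NRDatum) : Bool :=
  !d.hasNegativeRaising || d.ctpClass == .propAttitude || d.ctpClass == .desiderative

example : sample.all negRaisingClassOk = true := by decide

-- Knowledge CTPs never carry the negative-raising flag.
example : sample.all (fun d => d.ctpClass != .knowledge || !d.hasNegativeRaising) = true := by
  decide

end NegRaisingSketch
```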
G4. Implicational hierarchy (@cite{noonan-2007} §2.4) #
If a language uses indicative for desiderative CTPs, it also uses indicative for propositional attitude CTPs. This is checked per-language.
Implicational hierarchy per-language: if indicative desiderative exists, then indicative propAttitude also exists.
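A per-language check of this implication might look as follows. The names `Datum`, `usesIndicativeFor`, and `hierarchyHolds` are hypothetical, and both data lists are constructed for illustration; `violating` is a deliberately artificial counterexample that shows what a failing language would look like.

```lean
namespace HierarchySketch

inductive CompType where
  | indicative | subjunctive | paratactic | infinitive | nominalized | participle
  deriving DecidableEq, Repr

inductive CTPClass where
  | propAttitude | desiderative | knowledge
  deriving DecidableEq, Repr

/-- Hypothetical cut-down datum; one list of these per language. -/
structure Datum where
  verb : String
  ctpClass : CTPClass
  allowedCompTypes : List CompType

/-- Does some verb of class `c` allow an indicative complement? -/
def usesIndicativeFor (langData : List Datum) (c : CTPClass) : Bool :=
  langData.any fun d => d.ctpClass == c && d.allowedCompTypes.contains .indicative

/-- Indicative with desideratives implies indicative with propositional attitudes. -/
def hierarchyHolds (langData : List Datum) : Bool :=
  !usesIndicativeFor langData .desiderative || usesIndicativeFor langData .propAttitude

/-- Illustrative English-like data. -/
def englishLike : List Datum :=
  [ ⟨"think", .propAttitude, [.indicative]⟩,
    ⟨"want", .desiderative, [.infinitive]⟩ ]

/-- Artificial data that would violate the hierarchy. -/
def violating : List Datum :=
  [ ⟨"want", .desiderative, [.indicative]⟩ ]

example : hierarchyHolds englishLike = true := by decide
example : hierarchyHolds violating = false := by decide

end HierarchySketch
```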
H. WALS Chapter 94: Order of Adverbial Subordinator and Clause #
@cite{dryer-2013-wals} classifies languages by where the adverbial subordinator (e.g., "because", "when", "if") appears relative to its clause. The fundamental distinction is between word-level and suffix-level subordinators, crossed with initial vs final position.
Sample: 659 languages.
The dominant pattern worldwide is clause-initial subordinator words (e.g., English "because he left"), which overwhelmingly correlates with VO order. Final subordinator suffixes (e.g., Turkish "-dIgI icin") correlate with OV order. This is one of the strongest head-direction correlations.
WALS Ch 94: How adverbial subordinators are positioned relative to their clause.
Five categories: a subordinator word in initial, final, or clause-internal position; a final subordinator suffix; and a mixed/no-dominant category.
- Initial word: English "because he left"
- Final word: Japanese "kare-ga kaetta kara" (less common)
- Internal word: subordinator between subject and verb (very rare)
- Final suffix: Turkish "-dIgI icin" on the verb
- Mixed: no single dominant pattern
- initialWord : SubordinatorOrder
Subordinator is a free word preceding the clause. E.g., English "because he left", Arabic "li'anna-hu ghaadara". The most common type worldwide (398/659 = 60.4%).
- finalWord : SubordinatorOrder
Subordinator is a free word following the clause. E.g., Japanese "kare-ga kaetta kara" 'he-NOM returned because'. 96/659 = 14.6%.
- internalWord : SubordinatorOrder
Subordinator is a word appearing clause-internally (between subject and verb). E.g., Nkore-Kiga "when Brer Rabbit challenged the elephant". 8/659 = 1.2%.
- finalSuffix : SubordinatorOrder
Subordinator is a suffix on the verb at the end of the clause. E.g., Turkish "-dIgI icin" 'because of V-NMZ'. 64/659 = 9.7%.
- mixed : SubordinatorOrder
Mixed or no dominant subordination pattern. 93/659 = 14.1%.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
A single row in a WALS frequency table: a category label and its count.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
Instances For
Sum of counts in a WALS table.
Equations
- Phenomena.Complementation.Typology.WALSCount.totalOf cs = List.foldl (fun (acc : Nat) (c : Phenomena.Complementation.Typology.WALSCount) => acc + c.count) 0 cs
Instances For
Chapter 94 distribution: subordinator order (N = 659). Counts computed from @cite{dryer-2013-wals}, WALS Online, Ch 94.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Ch 94 total: 659 languages.
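The total can be re-derived from the per-category counts quoted above. In this sketch the field name `label` and the label strings are assumptions; `count` and the fold in `totalOf` follow the equation shown for `WALSCount.totalOf`.

```lean
namespace Ch94Sketch

/-- One row of a frequency table: a category label and its count. -/
structure WALSCount where
  label : String
  count : Nat
  deriving Repr

/-- Sum of the counts, written as a left fold. -/
def WALSCount.totalOf (cs : List WALSCount) : Nat :=
  cs.foldl (fun (acc : Nat) (c : WALSCount) => acc + c.count) 0

/-- Counts as quoted in the constructor descriptions above. -/
def ch94Counts : List WALSCount :=
  [ ⟨"Initial subordinator word", 398⟩,
    ⟨"Final subordinator word", 96⟩,
    ⟨"Internal subordinator word", 8⟩,
    ⟨"Final subordinator suffix", 64⟩,
    ⟨"Mixed", 93⟩ ]

-- 398 + 96 + 8 + 64 + 93 = 659
example : WALSCount.totalOf ch94Counts = 659 := by decide

-- No category exceeds the initial-word count.
example : ch94Counts.all (fun c => decide (c.count ≤ 398)) = true := by decide

end Ch94Sketch
```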
I. WALS Chapter 95: OV Order and Adposition Order #
@cite{dryer-2013-wals} examines the correlation between verb-object order and adposition type. This is one of the strongest head-direction correlations in typology: OV languages overwhelmingly use postpositions, and VO languages overwhelmingly use prepositions.
Sample: 1142 languages.
The harmonic patterns (VO+prepositions, OV+postpositions) account for 928/1142 = 81.3% of languages. The disharmonic patterns (VO+postpositions, OV+prepositions) are rare (56/1142 = 4.9%); the remaining 158 languages fall outside these four types.
WALS Ch 95: Four-way classification combining verb-object order with adposition type.
The two "harmonic" patterns (matching head direction) dominate; the two "disharmonic" patterns are rare.
- voPrep : OVAdpositionType
VO order with prepositions (head-initial harmony). E.g., English "in the house", "sees the cat". 456/1142 = 39.9%.
- ovPostp : OVAdpositionType
OV order with postpositions (head-final harmony). E.g., Japanese "neko-o miru", "ie-ni" (house-in). 472/1142 = 41.3%.
- voPostp : OVAdpositionType
VO order with postpositions (disharmonic). E.g., some Austronesian languages. Rare: 42/1142 = 3.7%.
- ovPrep : OVAdpositionType
OV order with prepositions (disharmonic). E.g., some Iranian languages. Very rare: 14/1142 = 1.2%.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Chapter 95 distribution: OV order × adposition type (N = 1142). Counts computed from @cite{dryer-2013-wals}, WALS Online, Ch 95.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Ch 95 total: 1142 languages.
J. Additional Subordination Dimensions #
Beyond the WALS chapters, three further dimensions characterize how languages handle subordination: complementizer position, relative clause position, and purpose clause strategy. These dimensions interact with the subordinator order in systematic ways.
Position of the complementizer (the subordinating morpheme introducing a complement clause, e.g., English "that").
The complementizer position strongly mirrors the subordinator order from WALS Ch 94: languages with initial subordinators tend to have initial complementizers, and vice versa.
- initial : ComplementizerPosition
Complementizer precedes the clause. E.g., English "that he left", Arabic "'inna-hu ghaadara".
- final : ComplementizerPosition
Complementizer follows the clause. E.g., Japanese "kare-ga kaetta to", Korean "ku-ka tteonass-ta-ko".
- none : ComplementizerPosition
No overt complementizer; complementation via juxtaposition or verb morphology. E.g., Mandarin serial verb constructions.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Position of the relative clause with respect to the head noun.
WALS Ch 90 documents the cross-linguistic distribution. Post-nominal is the global majority, but pre-nominal dominates in East and Central Asia.
- postNominal : RelativeClausePosition
RC follows the head noun (post-nominal). E.g., English "the man [who left]", Arabic "ar-rajul [alladhi ghaadara]". The most common type worldwide.
- preNominal : RelativeClausePosition
RC precedes the head noun (pre-nominal). E.g., Japanese "[kaetta] hito" '[left] person', Mandarin "[zou-le de] ren" '[left DE] person'. Strongly correlated with OV order.
- internallyHeaded : RelativeClausePosition
Head noun appears inside the RC (internally headed). E.g., Bambara, Navajo. Rare.
- correlative : RelativeClausePosition
A correlative construction: RC appears in one clause, head noun resumed by a pronoun in the main clause. E.g., Hindi "jo aadmii aayaa, vo lambaa hai" 'which man came, he tall is'.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Strategy for expressing purpose clauses ("in order to V").
The purpose clause strategy correlates with finiteness availability: languages with productive infinitives use infinitive purpose clauses, while languages lacking infinitives use subjunctive, nominalization, or serial verb constructions.
- subjunctive : PurposeClauseStrategy
Purpose clause uses subjunctive/irrealis mood. E.g., Greek "gia na fiji" 'for SUBJ leave.3SG'.
- infinitive : PurposeClauseStrategy
Purpose clause uses infinitive. E.g., English "to leave", German "um zu gehen".
- nominalization : PurposeClauseStrategy
Purpose clause uses a nominalized verb form. E.g., Turkish "git-mek icin" 'go-NMZ for'.
- serialVerb : PurposeClauseStrategy
Purpose expressed via serial verb construction. E.g., Yoruba, many West African and Oceanic languages.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
K. Language Profile Structure #
A language's subordination profile combining all five dimensions.
Each profile records:
- WALS Ch 94: subordinator order
- WALS Ch 95 (derived): OV-adposition correlation type
- Complementizer position
- Relative clause position
- Purpose clause strategy
- Basic word order (for cross-referencing with WordOrder/Typology)
- ISO 639-3 code
- language : String
- iso : String
- subordinatorOrder : SubordinatorOrder
Ch 94: order of adverbial subordinator and clause.
- ovAdposition : OVAdpositionType
Ch 95: OV order × adposition type.
- compPosition : ComplementizerPosition
Complementizer position (initial, final, or none).
- rcPosition : RelativeClausePosition
Relative clause position.
- purposeStrategy : PurposeClauseStrategy
Purpose clause strategy.
- basicOrder : String
Basic word order label for cross-referencing.
- notes : String
Notes on the subordination system.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
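A sketch of the profile structure with one filled-in entry. The constructor and field names follow the listings on this page; the namespace `ProfileSketch` is hypothetical, and the `english` entry is reconstructed from the prose description of English, so the stored values may differ.

```lean
namespace ProfileSketch

inductive SubordinatorOrder where
  | initialWord | finalWord | internalWord | finalSuffix | mixed
  deriving DecidableEq, Repr

inductive OVAdpositionType where
  | voPrep | ovPostp | voPostp | ovPrep
  deriving DecidableEq, Repr

inductive ComplementizerPosition where
  | initial | final | none
  deriving DecidableEq, Repr

inductive RelativeClausePosition where
  | postNominal | preNominal | internallyHeaded | correlative
  deriving DecidableEq, Repr

inductive PurposeClauseStrategy where
  | subjunctive | infinitive | nominalization | serialVerb
  deriving DecidableEq, Repr

/-- A language's subordination profile. -/
structure SubordinationProfile where
  language : String
  iso : String
  subordinatorOrder : SubordinatorOrder
  ovAdposition : OVAdpositionType
  compPosition : ComplementizerPosition
  rcPosition : RelativeClausePosition
  purposeStrategy : PurposeClauseStrategy
  basicOrder : String
  notes : String
  deriving Repr

/-- Reconstructed from the prose description; an assumption, not the stored entry. -/
def english : SubordinationProfile where
  language := "English"
  iso := "eng"
  subordinatorOrder := .initialWord
  ovAdposition := .voPrep
  compPosition := .initial
  rcPosition := .postNominal
  purposeStrategy := .infinitive
  basicOrder := "SVO"
  notes := "Prototypical head-initial subordination profile."

example : english.rcPosition = RelativeClausePosition.postNominal := rfl
example : english.ovAdposition = OVAdpositionType.voPrep := rfl

end ProfileSketch
```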
L. Language Profiles #
Typologically diverse sample covering all major word-order types and subordination strategies. Each profile is documented with the key properties that distinguish it.
English: canonical VO language with initial subordinators and complementizers, post-nominal relative clauses, and infinitive purpose clauses. The prototypical head-initial subordination profile.
- "because he left" (initial subordinator word)
- "I know [that he left]" (initial complementizer)
- "the man [who left]" (post-nominal RC)
- "he came [to help]" (infinitive purpose)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Japanese: canonical OV language with final subordinators and complementizers, pre-nominal relative clauses, and nominalized purpose clauses. The prototypical head-final subordination profile.
- "kare-ga kaetta kara" (final subordinator word)
- "kare-ga kaetta to" (final complementizer "to")
- "[kaetta] hito" (pre-nominal RC, no relative pronoun)
- "tabe-ru tame-ni" (nominalization purpose: eat-NMZ for-DAT)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Turkish: OV language with subordinator suffixes on the verb, no overt complementizer (nominalized complements), pre-nominal relatives, and nominalized purpose clauses.
- "gel-digi icin" (subordinator suffix: come-NMZ for)
- "[gel-en] adam" (pre-nominal RC: come-PTCP man)
- "git-mek icin" (nominalization purpose: go-INF for)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Hindi-Urdu: OV language with initial subordinator words (unusual for OV), correlative relative clauses (a South Asian areal feature), and subjunctive purpose clauses.
- "kyonki vo gayaa" (initial subordinator word "because")
- "ki vo gayaa" (initial complementizer "ki")
- "jo aadmii aayaa, vo lambaa hai" (correlative RC)
- "jaane ke liye" (infinitive purpose: go-INF GEN for)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Mandarin Chinese: SVO language with no overt complementizer (or sentence-final particle "de"), pre-nominal relative clauses (unusual for VO), and serial verb purpose clauses.
- Pre-nominal RC despite VO: "[zou-le de] ren" (left-PERF DE person)
- Serial verb purpose: "lai bang ni" 'come help you'
- Mixed headedness: head-initial VP but head-final NP
Equations
- One or more equations did not get rendered due to their size.
Instances For
Arabic (Modern Standard): VSO language with initial subordinators and complementizers, post-nominal relative clauses, and subjunctive purpose clauses.
- "li'anna-hu ghaadara" (initial subordinator "because")
- "'anna-hu ghaadara" (initial complementizer)
- "ar-rajul [alladhi ghaadara]" (post-nominal RC)
- "li-ya-dhus" (subjunctive purpose: for-3MSG-enter.SUBJ)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Korean: rigid OV language with final subordinators and complementizers, pre-nominal relative clauses, and nominalized purpose clauses. Very similar to Japanese.
- "ku-ka tteonass-ki ttaemune" (final subordinator: he-NOM left-NMZ because)
- "ku-ka tteonass-ta-ko" (final complementizer "-ko")
- "[ttonass-ten] saram" (pre-nominal RC)
- "ka-gi wihae" (nominalization purpose: go-NMZ for)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Irish: VSO language with initial subordinators and complementizers, post-nominal relative clauses, and subjunctive purpose clauses. Celtic VSO languages are consistently head-initial.
- "mar gur imigh se" (initial subordinator "because")
- "go bhfuil" (initial complementizer "go")
- "an fear [a d'imigh]" (post-nominal RC)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Swahili: SVO language with initial subordinators and complementizers, post-nominal relative clauses, and subjunctive purpose clauses.
- "kwa sababu alikwenda" (initial subordinator "because")
- "kwamba alikwenda" (initial complementizer "that")
- "mtu [ambaye alikwenda]" (post-nominal RC)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Persian (Farsi): SOV language with initial subordinators (disharmonic for an OV language), initial complementizer "ke", post-nominal relative clauses (also disharmonic), and subjunctive purpose clauses.
- "chon raft" (initial subordinator "because")
- "ke raft" (initial complementizer "that")
- "mard-i [ke raft]" (post-nominal RC, disharmonic for OV)
- "baraye raftan" (infinitive purpose)
Equations
- One or more equations did not get rendered due to their size.
Instances For
German: V2 in main clauses, SOV in embedded; mixed subordination pattern with initial subordinators and complementizers.
- "weil er ging" (initial subordinator "because")
- "dass er ging" (initial complementizer "that")
- "der Mann, [der ging]" (post-nominal RC)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Russian: SVO language with initial subordinators and complementizers, post-nominal relative clauses, and infinitive purpose clauses.
- "potomu chto on ushel" (initial subordinator "because")
- "chto on ushel" (initial complementizer "that")
- "chelovek, [kotoryj ushel]" (post-nominal RC)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Quechua (Southern): rigid SOV with final subordinator suffixes, no overt complementizer (nominalized complements), pre-nominal relative clauses, and nominalized purpose clauses.
- "-pti" suffixed to verb (adverbial subordinator)
- Nominalized complement: "ri-na-n-ta" 'go-NMZ-3-ACC'
- "[hamu-q] runa" (pre-nominal RC: come-AG person)
Equations
- One or more equations did not get rendered due to their size.
Instances For
Yoruba (Kwa): SVO language with initial subordinators, post-nominal relative clauses, and serial verb purpose clauses.
- "toripe o lo" (initial subordinator "because")
- "pe o lo" (initial complementizer "that")
- "eniyan [ti o lo]" (post-nominal RC)
- Serial verb purpose: "wa ran mi" 'come help me'
Equations
- One or more equations did not get rendered due to their size.
Instances For
Tagalog: V-initial (VSO/VOS) with initial subordinators, post-nominal relative clauses, and infinitive/nominalized purpose clauses.
- "dahil umalis siya" (initial subordinator "because")
- "na umalis siya" (initial linker/complementizer "na")
- "ang tao [na umalis]" (post-nominal RC, with linker "na")
Equations
- One or more equations did not get rendered due to their size.
Instances For
Basque: SOV language with final subordinator suffixes, no overt complementizer (nominalized complements), pre-nominal relative clauses, and nominalized purpose clauses.
- "-lako" suffixed to verb (causal subordinator: because)
- "-(e)la" suffixed to verb (complement clause marker)
- "[etorri den] gizona" (pre-nominal RC: come AUX.REL man)
- "-tzeko" nominalized purpose
Equations
- One or more equations did not get rendered due to their size.
Instances For
Bambara: SOV language with initial subordinators, post-nominal relative clauses, and serial verb purpose clauses. One of the best-studied languages with internally headed RCs.
- "a tagara bawo" (initial subordinator "because")
- Internally headed RCs: a distinctive Mande feature
- Serial verb purpose constructions
Equations
- One or more equations did not get rendered due to their size.
Instances For
Amharic: SOV language with final subordinators and complementizers, pre-nominal relative clauses (as expected for an OV language), and subjunctive purpose clauses.
- Subordinator suffixed to verb: "-s" (conditional), "-na" (when)
- Pre-nominal RC: "ye-hedde sew" (REL-went man)
- Subjunctive purpose: "le-yi-hed" 'for-3MSG-go.SUBJ'
Equations
- One or more equations did not get rendered due to their size.
Instances For
Malagasy: VOS language with initial subordinators and complementizers, post-nominal relative clauses, and infinitive purpose clauses. Austronesian with head-initial subordination.
- "satria lasa izy" (initial subordinator "because")
- "fa lasa izy" (initial complementizer "that")
- "ny olona [izay lasa]" (post-nominal RC)
Equations
- One or more equations did not get rendered due to their size.
Instances For
M. Data Collections #
All subordination profiles in the sample.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Sample size: 20 languages.
N. WALS Aggregate Total Verification #
Ch 94: initial subordinator words are the most common type (398/659).
Ch 95: harmonic patterns (VO+Prep, OV+Postp) dominate (928/1142 = 81.3%).
Ch 95: OV+Postpositions is the single most common pairing.
Ch 95: disharmonic patterns are rare (56/1142 = 4.9%).
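The Ch 95 figures above can be checked with plain arithmetic. In this sketch the counts are stored as bare `Nat` constants rather than table rows, and `other` is an assumption: it is the remainder needed for the four named types to reach the stated total of 1142. Percentages are stated in division-free form.

```lean
namespace Ch95Sketch

def voPrep : Nat := 456
def ovPostp : Nat := 472
def voPostp : Nat := 42
def ovPrep : Nat := 14
/-- Assumption: languages outside the four named types (1142 - 984). -/
def other : Nat := 158
def total : Nat := 1142

-- The five groups add up to the stated sample size.
example : voPrep + ovPostp + voPostp + ovPrep + other = total := by decide

-- Harmonic patterns: 456 + 472 = 928.
example : voPrep + ovPostp = 928 := by decide

-- Disharmonic patterns: 42 + 14 = 56.
example : voPostp + ovPrep = 56 := by decide

-- Harmonic share is above 80% (928 * 5 > 1142 * 4).
example : (voPrep + ovPostp) * 5 > total * 4 := by decide

-- Disharmonic share is below 5% (56 * 20 < 1142).
example : (voPostp + ovPrep) * 20 < total := by decide

-- OV+Postpositions is the single largest group.
example : ovPostp > voPrep ∧ ovPostp > other := by decide

end Ch95Sketch
```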
O. Helper Predicates #
Does this profile have an initial subordinator (word or suffix)?
Equations
- One or more equations did not get rendered due to their size.
Instances For
Does this profile have a final subordinator (word or suffix)?
Equations
- One or more equations did not get rendered due to their size.
Instances For
Does this profile have VO order?
Equations
Instances For
Does this profile have OV order?
Equations
Instances For
Does this profile have pre-nominal RCs?
Equations
Instances For
Does this profile have post-nominal RCs?
Equations
Instances For
Count of profiles matching a predicate.
Equations
Instances For
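Since the equations for these helpers are not rendered here, the following is only a plausible sketch. It assumes that VO/OV status is read off the `ovAdposition` field and that "initial subordinator" means the `initialWord` constructor (the enumeration has no initial-suffix case). `Profile` is a hypothetical cut-down structure, and the three entries follow the prose descriptions of English, Japanese, and Hindi-Urdu.

```lean
namespace HelperSketch

inductive SubordinatorOrder where
  | initialWord | finalWord | internalWord | finalSuffix | mixed
  deriving DecidableEq, Repr

inductive OVAdpositionType where
  | voPrep | ovPostp | voPostp | ovPrep
  deriving DecidableEq, Repr

/-- Hypothetical cut-down profile with only the fields the helpers need. -/
structure Profile where
  language : String
  subordinatorOrder : SubordinatorOrder
  ovAdposition : OVAdpositionType

def Profile.hasInitialSubordinator (p : Profile) : Bool :=
  p.subordinatorOrder == .initialWord

def Profile.hasFinalSubordinator (p : Profile) : Bool :=
  p.subordinatorOrder == .finalWord || p.subordinatorOrder == .finalSuffix

/-- Assumption: VO status is derived from the Ch 95 value. -/
def Profile.isVO (p : Profile) : Bool :=
  p.ovAdposition == .voPrep || p.ovAdposition == .voPostp

def Profile.isOV (p : Profile) : Bool :=
  p.ovAdposition == .ovPostp || p.ovAdposition == .ovPrep

/-- Number of profiles satisfying a predicate. -/
def countWhere (ps : List Profile) (pred : Profile → Bool) : Nat :=
  (ps.filter pred).length

def sample : List Profile :=
  [ ⟨"English", .initialWord, .voPrep⟩,
    ⟨"Japanese", .finalWord, .ovPostp⟩,
    ⟨"Hindi-Urdu", .initialWord, .ovPostp⟩ ]

example : countWhere sample Profile.isOV = 2 := by decide

-- Every VO profile in this small sample has an initial subordinator.
example : sample.all (fun p => !p.isVO || p.hasInitialSubordinator) = true := by decide

end HelperSketch
```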
P. Per-Language Verification #
Q. Typological Generalizations #
Q1. Initial subordinators correlate with VO order #
In our sample, every VO language has an initial subordinator. The disharmonic cases (OV + initial subordinator) include Hindi-Urdu and Persian, both Indo-Iranian languages with known mixed headedness.
All VO languages in our sample have initial subordinators.
Most OV languages with final subordinators use postpositions.
Q2. Post-nominal relative clauses are the global majority #
Post-nominal RCs dominate in our sample. Pre-nominal RCs appear only in OV languages (Japanese, Turkish, Korean, Quechua, Basque, Amharic) plus Mandarin (which has mixed headedness).
Post-nominal RCs are the most common type in our sample.
Count of post-nominal RC languages.
Count of pre-nominal RC languages.
Q3. Pre-nominal RCs strongly correlate with OV order #
In our sample, all pre-nominal RC languages except Mandarin have OV order. Mandarin is the well-known exception: SVO but pre-nominal RC, reflecting its mixed headedness (head-initial VP, head-final NP).
All pre-nominal RC languages in our sample are OV, except Mandarin.
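A generalization with a named exception, like the one above, can be phrased as a decidable check over a sample. The sketch below uses a three-language toy sample; the names `Lang`, `WordOrder`, `RC`, `Entry`, and `preNominalImpliesOV` are hypothetical and do not reproduce the library's definitions.

```lean
inductive Lang where
  | english | japanese | mandarin
  deriving DecidableEq, Repr

inductive WordOrder where
  | vo | ov
  deriving DecidableEq, Repr

inductive RC where
  | postNominal | preNominal
  deriving DecidableEq, Repr

structure Entry where
  lang : Lang
  order : WordOrder
  rc : RC

/-- Toy sample, not the library's actual data. -/
def toy : List Entry :=
  [ ⟨.english, .vo, .postNominal⟩,
    ⟨.japanese, .ov, .preNominal⟩,
    ⟨.mandarin, .vo, .preNominal⟩ ]

/-- Every pre-nominal RC entry is OV, unless it is the listed exception. -/
def preNominalImpliesOV (exc : Lang) (es : List Entry) : Bool :=
  es.all fun e => e.rc != .preNominal || e.order == .ov || e.lang == exc

-- The check passes once Mandarin is set aside.
example : preNominalImpliesOV .mandarin toy = true := by decide

-- It fails if a different language is exempted instead.
example : preNominalImpliesOV .english toy = false := by decide
```

Making the exception an explicit argument keeps the claim honest: the second example shows that the check does real work and is not vacuously true.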
Q4. Complementizer position mirrors subordinator position #
Languages with initial subordinators tend to have initial complementizers (or no overt complementizer). Languages with final subordinators tend to have final complementizers (or no overt complementizer).
No language in our sample has initial subordinator + final complementizer.
No language in our sample has final subordinator + initial complementizer.
Q5. SOV languages overwhelmingly use postpositions (Ch 95) #
The OV-postposition correlation is one of the strongest in typology. In WALS Ch 95 data, OV+Postpositions (472) dwarfs OV+Prepositions (41). In our sample, OV languages with postpositions outnumber OV languages with prepositions.
In WALS data: OV+Postpositions (472) is 11x more common than OV+Prepositions (41).
In our sample, most OV languages use postpositions.
Q6. Purpose clause strategy correlates with finiteness #
Languages with productive infinitives tend to use infinitive purpose clauses. Languages without infinitives use subjunctive, nominalization, or serial verb constructions instead. In our sample, nominalization purpose clauses appear only in OV languages with suffixal morphology (Japanese, Turkish, Korean, Quechua, Basque).
All nominalization purpose languages in our sample are OV.
Serial verb purpose clauses appear in both VO and OV languages.
Q7. Head-direction consistency across constructions #
Languages that are consistently head-initial show initial subordinators, initial complementizers, and post-nominal RCs. Languages that are consistently head-final show the opposite pattern. Disharmonic languages (Persian, Hindi-Urdu, Mandarin) are the interesting exceptions.
Count of "consistently head-initial" languages (initial sub + initial comp + post-nominal RC + VO).
Count of "consistently head-final" languages (final sub + (final comp or none) + pre-nominal RC + OV).
Most languages in our sample are consistently head-initial or head-final.
Number of consistently head-initial languages (English, Arabic, Irish, Swahili, German, Russian, Yoruba, Tagalog, Malagasy).
Number of consistently head-final languages.
Q8. Disharmonic languages are typologically interesting #
Persian (OV + prepositions + initial comp + post-nominal RC), Hindi-Urdu (OV + initial subordinator + correlative RC), and Mandarin (VO + pre-nominal RC + no comp) are the three disharmonic languages in our sample.
Persian is disharmonic: OV with prepositions (Ch 95 type).
Hindi-Urdu is disharmonic: OV with initial subordinator.
Mandarin is disharmonic: VO with pre-nominal RC.
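For the Ch 95 dimension, disharmony reduces to a two-argument check on verb-object order and adposition type. The following is a sketch under assumptions: `WordOrder`, `Adposition`, and `isHarmonic` are hypothetical names chosen for illustration.

```lean
inductive WordOrder where
  | vo | ov
  deriving DecidableEq, Repr

inductive Adposition where
  | preposition | postposition
  deriving DecidableEq, Repr

/-- Harmonic pairings in the Ch 95 sense: VO+Prep and OV+Postp. -/
def isHarmonic : WordOrder → Adposition → Bool
  | .vo, .preposition => true
  | .ov, .postposition => true
  | _, _ => false

-- Persian as described above: OV with prepositions.
example : isHarmonic .ov .preposition = false := by decide

-- OV with postpositions is harmonic.
example : isHarmonic .ov .postposition = true := by decide
```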
Q9. Correlative RCs are restricted to South Asian languages #
In our sample, only Hindi-Urdu has correlative RCs. Correlative RCs are an areal feature of South Asian languages.
Exactly one language in our sample has correlative RCs.
The correlative RC language is Hindi-Urdu.
Q10. Internally headed RCs are rare #
Internally headed RCs (where the head noun appears inside the relative clause) are typologically rare, attested in Navajo and Bambara in our sample.
Exactly two languages in our sample have internally headed RCs.
Q11. Ch 94 initial words dominate overall #
Initial subordinator words (398/659 = 60.4%) are by far the most common pattern in WALS Ch 94. Final subordinator words and subordinating suffixes together total 160/659, less than half of the initial word count.
Initial subordinator words outnumber final words + suffixes combined.
Initial subordinator words account for over 60% of Ch 94 sample.
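The Ch 94 comparisons above are plain arithmetic over the quoted counts. In the Lean sketch below, the constant names are hypothetical; only the numbers come from the text.

```lean
/-- Hypothetical constants holding the Ch 94 counts quoted above. -/
def ch94Total : Nat := 659
def ch94InitialWord : Nat := 398
def ch94FinalWordPlusSuffix : Nat := 160

-- Initial words outnumber final words and suffixes combined.
example : ch94InitialWord > ch94FinalWordPlusSuffix := by decide

-- 160 is less than half of 398.
example : 2 * ch94FinalWordPlusSuffix < ch94InitialWord := by decide

-- Over 60% of the sample: 398 * 100 > 60 * 659.
example : ch94InitialWord * 100 > 60 * ch94Total := by decide
```

Cross-multiplying turns each percentage claim into an inequality over `Nat`, which `decide` can discharge directly.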
Q12. Subordinator suffixes are restricted to OV languages #
In WALS Ch 94, subordinating suffixes and internal subordinator words are typologically rare (64 + 8 = 72/659). In our sample, all languages with subordinator suffixes are OV. This follows from morphological typology: suffixal subordination requires the subordinated verb to be identifiable by position, which OV order provides.
All subordinator suffix languages in our sample are OV.
R. WALS Complementation Enums #
@cite{cristofaro-2013}'s balanced/deranked typology classifies complement clauses by how much they resemble main clauses. "Balanced" means the complement retains main-clause morphology; "deranked" means it uses reduced/non-finite forms; "balanced/deranked" means both strategies exist.
This maps naturally to the Noonan complement type hierarchy:
- Balanced ≈ indicative/subjunctive (finite)
- Deranked ≈ infinitive/nominalized/participle (non-finite)
Cristofaro's balanced/deranked typology for complement clauses (shared across WALS Chapters 125A--128A).
- balanced : BalancedDeranked
- balancedDeranked : BalancedDeranked
- deranked : BalancedDeranked
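The mapping from the three-way balanced/deranked value to finiteness can be sketched as a pair of Boolean helpers. The constructor names below follow the listing above; the helpers `allowsFinite` and `allowsNonFinite` are hypothetical and are introduced only for illustration.

```lean
inductive BalancedDeranked where
  | balanced
  | balancedDeranked
  | deranked
  deriving DecidableEq, Repr

/-- Hypothetical helper: is a finite (balanced) strategy available? -/
def BalancedDeranked.allowsFinite : BalancedDeranked → Bool
  | .balanced | .balancedDeranked => true
  | .deranked => false

/-- Hypothetical helper: is a non-finite (deranked) strategy available? -/
def BalancedDeranked.allowsNonFinite : BalancedDeranked → Bool
  | .deranked | .balancedDeranked => true
  | .balanced => false

-- Only the mixed value allows both strategies.
example : ∀ b : BalancedDeranked,
    (b.allowsFinite && b.allowsNonFinite) = (b == .balancedDeranked) := by
  intro b
  cases b <;> rfl
```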
Cristofaro's 'want' complement subject typology (WALS Ch 124A). Captures whether desiderative CTPs leave the complement subject implicit or express it overtly — plus the desiderative affix/particle alternative where 'want' is not a separate verb at all.
- subjectImplicit : WantCompStrategy
- subjectOvert : WantCompStrategy
- both : WantCompStrategy
- desidAffix : WantCompStrategy
- desidParticle : WantCompStrategy
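The distinction between strategies where 'want' is a separate verb and those where it is an affix or particle can be made explicit with a small helper. The constructor names below follow the listing above; `hasSeparateWantVerb` is a hypothetical helper that is not claimed to exist in the library.

```lean
inductive WantCompStrategy where
  | subjectImplicit
  | subjectOvert
  | both
  | desidAffix
  | desidParticle
  deriving DecidableEq, Repr

/-- Hypothetical helper: is 'want' expressed as a separate verb? -/
def WantCompStrategy.hasSeparateWantVerb : WantCompStrategy → Bool
  | .desidAffix | .desidParticle => false
  | _ => true

-- A desiderative affix is not a separate 'want' verb.
example : WantCompStrategy.hasSeparateWantVerb .desidAffix = false := rfl

-- An implicit complement subject presupposes a separate 'want' verb.
example : WantCompStrategy.hasSeparateWantVerb .subjectImplicit = true := rfl
```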
S. Complementation Profile #
A language's complementation profile across WALS Chapters 124A--128A, capturing how the language handles complement clauses across five subordination domains. Fields are optional because not every language appears in every WALS chapter's sample.
A language's complementation profile across WALS Chapters 124A--128A.
- language : String
Language name.
- walsCode : String
WALS code.
- wantComp : Option WantCompStrategy
Ch 124A: 'want' complement subject strategy.
- purposeClause : Option BalancedDeranked
Ch 125A: purpose clause type (balanced/deranked).
- whenClause : Option BalancedDeranked
Ch 126A: 'when' clause type (balanced/deranked).
- reasonClause : Option BalancedDeranked
Ch 127A: reason clause type (balanced/deranked).
- utteranceComp : Option BalancedDeranked
Ch 128A: utterance complement clause type (balanced/deranked).
- notes : String
Notes on the complementation system.
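A structure with optional per-chapter fields might be declared along the following lines. The field names and types follow the listing above. The default values, the helper `attestedChapters`, and the placeholder profile `example128Only` (with the made-up code `"xxx"`) are assumptions added for illustration.

```lean
inductive BalancedDeranked where
  | balanced | balancedDeranked | deranked
  deriving DecidableEq, Repr

inductive WantCompStrategy where
  | subjectImplicit | subjectOvert | both | desidAffix | desidParticle
  deriving DecidableEq, Repr

structure ComplementationProfile where
  language : String
  walsCode : String
  -- Defaults of `none` are an assumption of this sketch.
  wantComp : Option WantCompStrategy := none
  purposeClause : Option BalancedDeranked := none
  whenClause : Option BalancedDeranked := none
  reasonClause : Option BalancedDeranked := none
  utteranceComp : Option BalancedDeranked := none
  notes : String := ""
  deriving Repr

/-- Hypothetical helper: number of the five chapters with a recorded value. -/
def ComplementationProfile.attestedChapters (p : ComplementationProfile) : Nat :=
  ([p.wantComp.isSome, p.purposeClause.isSome, p.whenClause.isSome,
    p.reasonClause.isSome, p.utteranceComp.isSome].filter id).length

/-- Placeholder profile covered only by Ch 128A; not real data. -/
def example128Only : ComplementationProfile :=
  { language := "Example", walsCode := "xxx", utteranceComp := some .balanced }

example : example128Only.attestedChapters = 1 := by decide
```

Using `Option` rather than a dedicated "unknown" constructor keeps missing WALS coverage separate from the typological values themselves.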
T. Language Complementation Profiles #
Profiles for languages already in the subordination sample, grounded against WALS Chapters 124A--128A where coverage exists.
English: implicit 'want' complement subject (equi/PRO); balanced+deranked in purpose, 'when', and reason clauses; balanced utterance complements.
Japanese: desiderative verbal affix (-tai); balanced+deranked purpose and 'when' clauses; balanced reason and utterance complements.
Turkish: implicit 'want' complement subject; deranked purpose and 'when' clauses; balanced+deranked reason and utterance complements.
Hindi: implicit 'want' complement subject; balanced+deranked 'when' and reason clauses; balanced utterance complements. Not in the F125A sample.
Mandarin: implicit 'want' complement subject; balanced purpose, 'when', reason, and utterance clauses.
Korean: implicit 'want' complement subject; balanced purpose, 'when', reason, and utterance clauses.
German: implicit 'want' complement subject; balanced+deranked purpose and 'when' clauses; balanced reason clauses. Not in the F128A sample.
Russian: implicit 'want' complement subject; balanced+deranked purpose, 'when', and reason clauses; balanced utterance complements.
Persian: overt 'want' complement subject; deranked purpose clauses; balanced 'when', reason, and utterance clauses.
Irish: not in the F124A sample; balanced+deranked purpose, 'when', and reason clauses; balanced utterance complements.
Basque: not in the F124A sample; deranked purpose clauses; balanced+deranked 'when' and reason clauses; balanced utterance complements.
Yoruba: implicit 'want' complement subject; balanced purpose, 'when', reason, and utterance clauses.
Tagalog: implicit 'want' complement subject; deranked purpose clauses; balanced+deranked 'when' and reason clauses; balanced utterance complements.
Swahili: balanced utterance complements. Only in the F128A sample among these chapters.
Arabic (Gulf): balanced+deranked across purpose, 'when', reason, and utterance clause types. Not in the F124A sample under this code; Egyptian Arabic uses "aeg".
Finnish: implicit 'want' complement subject; balanced+deranked purpose, 'when', and reason clauses; balanced+deranked utterance complements.
Spanish: implicit 'want' complement subject; deranked purpose clauses; balanced+deranked 'when' and reason clauses; balanced+deranked utterance complements.
French: implicit 'want' complement subject; deranked purpose clauses; balanced+deranked 'when' and reason clauses; balanced+deranked utterance complements.
U. Complementation Profile Collections #
Complementation profile sample size.
V. WALS Grounding Theorems #
Per-language theorems proving that each ComplementationProfile field matches the corresponding WALS chapter's data via the converter function.
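A grounding theorem of this kind could take roughly the following shape. Everything in the sketch is hypothetical: `WalsF128A` stands in for a generated WALS value type, `fromF128A` for the converter, and the two `demo` definitions for a generated datum and a hand-written profile field. None of these names are taken from the library.

```lean
/-- Hypothetical stand-in for a generated WALS value type (Ch 128A). -/
inductive WalsF128A where
  | balanced | balancedDeranked | deranked
  deriving DecidableEq, Repr

inductive BalancedDeranked where
  | balanced | balancedDeranked | deranked
  deriving DecidableEq, Repr

/-- Hypothetical converter from the WALS value to the profile-level enum. -/
def fromF128A : WalsF128A → BalancedDeranked
  | .balanced => .balanced
  | .balancedDeranked => .balancedDeranked
  | .deranked => .deranked

/-- Hypothetical generated datum for a demo language. -/
def demoWalsValue : Option WalsF128A := some .balanced

/-- Hypothetical hand-written profile field for the same language. -/
def demoProfileField : Option BalancedDeranked := some .balanced

/-- The profile field equals the converted WALS value. -/
theorem demo_utterance_grounded :
    demoProfileField = demoWalsValue.map fromF128A := rfl
```

Because both sides compute to the same constructor, the proof is `rfl`; if the hand-written field ever drifted from the generated data, the theorem would stop type-checking.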
W. WALS Distribution Verification #
Counts derived from the WALS generated data modules, replacing hand-coded distribution numbers.
Ch 124A: 'subject is left implicit' is the most common strategy.
Ch 124A: desiderative verbal affix is the second most common.
Ch 125A: deranked purpose clauses are the most common.
Ch 125A: balanced purpose clauses.
Ch 128A: balanced utterance complements dominate.
Ch 128A: deranked utterance complements are rare.
X. Cross-Chapter Generalizations #
Typological generalizations connecting complementation strategies across WALS chapters 124A--128A.
Utterance complements (Ch 128A) are overwhelmingly balanced: 'say/tell' complements tend to retain main-clause morphology cross-linguistically. This is consistent with the typological observation that complements of utterance predicates resist deranking.
Purpose clauses (Ch 125A) favor deranking more than utterance complements (Ch 128A). This reflects the irrealis orientation of purpose clauses.