Chuj Verb Building: Empirical Data and Bridge Theorems
@cite{coon-2019}
Theory-neutral empirical data from @cite{coon-2019} "Building verbs in Chuj: Consequences for the nature of roots." Journal of Linguistics 55(1): 35–81.
Chuj is a Q'anjob'alan (Mayan) language spoken in Guatemala and Mexico. The data here encodes the paper's primary empirical observations about root classes, voice morphology, and argument structure, without committing to the theoretical analysis.
Data encoded
- Root classes (§§2–3): four morphosyntactic classes of roots (√TV, √ITV, √POS, √NOM), identified by their surface distribution.
- Voice suffixes (Table 58/78): Ø, -ch, -j, -w with their morphological and distributional properties.
- Paradigm grammaticality (§§2–5): which root×voice combinations are grammatical.
- -aj distribution (§5): existential closure suffix tracks implicit arguments.
- Agent diagnostics (§4.1–4.2): agent-oriented adverbs and by-phrases distinguish -ch from -j.
- Example verbs with glosses, organized by root class.
Bridge theorems
Chuj fragment bridge
Connects the Chuj fragment (Fragments/Chuj/VerbBuilding.lean) to the
empirical data.
- Root class ↔ Root arity: The phenomena's CRootClass maps to the fragment's Root values. √TV = selectsTheme, others = noTheme.
- Voice suffix ↔ VoiceHead: Each suffix maps to the fragment's VoiceHead, with matching properties (theta assignment, D feature, phase head status).
- Paradigm predictions: The fragment's isGrammatical matches the data's paradigm attestation for all root×voice combinations.
- -aj predictions: The fragment's hasImplicitExternal and triggersAj match the data's -aj distribution.
- Agent diagnostics: The fragment's assignsTheta matches the data's agent adverb and by-phrase diagnostics.
- Division of labor: The data's formsBareTransitive aligns with the fragment's arity distinction: only roots with selectsTheme form bare transitives.
Root typology bridge
Connects the theory-side predictions of Theories/Morphology/RootTypology.lean
(@cite{beavers-etal-2021} formalization) to the empirical data in
Phenomena/Causatives/Studies/BeaversEtAl2021.lean.
- Classification isomorphism: The theory's RootType and the phenomena's CoSRootClass are provably isomorphic; they describe the same partition.
- Diagnostic alignment: The phenomena's semantic diagnostics (changeDenialTest, restitutiveAgainTest) agree exactly with the theory's Boolean correlates (entailsChange, allowsRestitutiveAgain).
- Prediction ↔ attestation: The theory predicts PC roots HAVE simple statives and result roots LACK them; the empirical data confirms this (PC: 7/8 sample roots ≥ 50%; result: all 10 sample roots ≤ 10%).
- Markedness prediction: The theory predicts PC verbs are marked and result verbs are unmarked; the statistical comparison confirms PC median (56.01%) exceeds result median (15.20%).
- Fragment grounding: The Chuj fragment's Root values instantiate the theory's predictions, e.g., rootTV_res.entailsChange = true matches the theory's RootType.entailsChange.result = true.
The four morphosyntactic root classes in Chuj, identified by surface distribution (which suffixes they combine with, whether they form bare transitive stems). Labels follow Coon's notation.
- tv : CRootClass
- itv : CRootClass
- pos : CRootClass
- nom : CRootClass
The four voice suffixes in Chuj (Table 58, p. 76).
- null : ChujVoiceSuffix
- ch : ChujVoiceSuffix
- j : ChujVoiceSuffix
- w : ChujVoiceSuffix
The morphological form of each suffix.
Status of the external argument for each voice form.
- overt_erg : ExtArgStatus
- overt_abs : ExtArgStatus
- implicit : ExtArgStatus
- absent : ExtArgStatus
External argument status for each voice suffix (Table 58).
Equations
- Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null.extArgStatus = Phenomena.Causatives.Studies.Coon2019.ExtArgStatus.overt_erg
- Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.ch.extArgStatus = Phenomena.Causatives.Studies.Coon2019.ExtArgStatus.implicit
- Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.j.extArgStatus = Phenomena.Causatives.Studies.Coon2019.ExtArgStatus.absent
- Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w.extArgStatus = Phenomena.Causatives.Studies.Coon2019.ExtArgStatus.overt_abs
Whether the voice suffix assigns a thematic role to an external argument (observed via agent-oriented adverb diagnostics, §4.1–4.2).
Whether a root class can combine with a voice suffix to form a grammatical verb stem.
Based on the distributional facts in §§2–5:
- √TV: all four voices (Ø, -ch, -j, -w) — Table 58
- √ITV: null v only (§3.1, p. 40)
- √POS: -w only (§3.2, p. 44)
- √NOM: -w only (§3.3, p. 46)
Equations
- Phenomena.Causatives.Studies.Coon2019.isGrammatical Phenomena.Causatives.Studies.Coon2019.CRootClass.tv vs = true
- Phenomena.Causatives.Studies.Coon2019.isGrammatical Phenomena.Causatives.Studies.Coon2019.CRootClass.itv Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null = true
- Phenomena.Causatives.Studies.Coon2019.isGrammatical Phenomena.Causatives.Studies.Coon2019.CRootClass.pos Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = true
- Phenomena.Causatives.Studies.Coon2019.isGrammatical Phenomena.Causatives.Studies.Coon2019.CRootClass.nom Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = true
- Phenomena.Causatives.Studies.Coon2019.isGrammatical rc vs = false
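The equations above read as a first-match pattern match with a catch-all `false` case. A minimal self-contained sketch, assuming the two enumerations are plain inductive types (the `deriving` clauses and the short unqualified names are assumptions, not taken from the source file):

```lean
inductive CRootClass where
  | tv | itv | pos | nom
  deriving DecidableEq, Repr

inductive ChujVoiceSuffix where
  | null | ch | j | w
  deriving DecidableEq, Repr

/-- Paradigm attestation, mirroring the five equations listed above. -/
def isGrammatical : CRootClass → ChujVoiceSuffix → Bool
  | .tv,  _     => true   -- √TV: all four voices
  | .itv, .null => true   -- √ITV: null v only
  | .pos, .w    => true   -- √POS: -w only
  | .nom, .w    => true   -- √NOM: -w only
  | _,    _     => false  -- every other root×voice pair

-- Spot checks on individual cells of the paradigm.
example : isGrammatical .tv .j = true := rfl
example : isGrammatical .itv .w = false := rfl

-- √TV combines with every voice suffix.
example : ∀ vs, isGrammatical .tv vs = true := by
  intro vs
  cases vs <;> rfl
```

Because both types are finite enumerations, each paradigm claim reduces to a case split closed by `rfl`.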
√TV is the only class that forms bare transitive stems (§2.2, p. 37).
Whether -aj (existential closure) appears on a √TV stem in each voice form (Table 58, p. 76).
-aj marks the presence of an implicit argument:
- Ø: no implicit arg → no -aj
- -ch: implicit external arg → -aj on stem (ex. 36, p. 59)
- -j: no external arg at all → no -aj
- -w (absolutive): implicit internal arg → -aj (ex. 54a, p. 64)
- -w (incorporation): overt bare NP internal arg → no -aj (ex. 55, p. 65)
For the -w cases, we encode the two antipassive subtypes separately.
- absolutive : AntipassiveType
- incorporation : AntipassiveType
-aj on √TV stems in passive/agentless contexts.
Equations
- Phenomena.Causatives.Studies.Coon2019.ajOnPassive Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null = false
- Phenomena.Causatives.Studies.Coon2019.ajOnPassive Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.ch = true
- Phenomena.Causatives.Studies.Coon2019.ajOnPassive Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.j = false
- Phenomena.Causatives.Studies.Coon2019.ajOnPassive Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = false
-aj on √TV stems in antipassive (-w) contexts.
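The two -aj functions can be sketched as follows. The passive function copies the equations listed above; the antipassive function's name (`ajOnAntipassive`) and body are assumptions reconstructed from the prose description (absolutive → -aj, incorporation → no -aj), since its equations are not shown on this page.

```lean
inductive ChujVoiceSuffix where
  | null | ch | j | w
  deriving DecidableEq, Repr

inductive AntipassiveType where
  | absolutive | incorporation
  deriving DecidableEq, Repr

/-- -aj in passive/agentless contexts: only -ch has an implicit external argument. -/
def ajOnPassive : ChujVoiceSuffix → Bool
  | .ch => true
  | _   => false

/-- Hypothetical name and body: -aj in the two -w antipassive subtypes. -/
def ajOnAntipassive : AntipassiveType → Bool
  | .absolutive    => true   -- implicit internal argument (ex. 54a)
  | .incorporation => false  -- overt bare NP internal argument (ex. 55)

-- -ch takes -aj, -j does not.
example : ajOnPassive .ch = true ∧ ajOnPassive .j = false := ⟨rfl, rfl⟩

-- The two antipassive subtypes differ in -aj marking.
example : ajOnAntipassive .absolutive ≠ ajOnAntipassive .incorporation := by
  decide
```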
Agent-oriented adverb test (§4.1, exx. 47–48). "chi yuj" ('on purpose') is grammatical with -ch but not -j.
(47) Ix-mak'-ch-aj-i nok' wakax (yuj ix) chi yuj. 'The cow was hit (by her) on purpose.' ✓
(48) *Ix-mak'-j-i nok' wakax chi yuj. 'The cow was hit on purpose.' ✗
Equations
- Phenomena.Causatives.Studies.Coon2019.agentAdverbOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null = true
- Phenomena.Causatives.Studies.Coon2019.agentAdverbOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.ch = true
- Phenomena.Causatives.Studies.Coon2019.agentAdverbOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.j = false
- Phenomena.Causatives.Studies.Coon2019.agentAdverbOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = true
By-phrase test (§4.1, exx. 47, 49). "yuj ix" ('by her') is grammatical with -ch but not -j.
(47) ... (yuj ix) ... 'by her' ✓ with -ch
(49) *Ix-mak'-j-i nok' wakax yuj ix. 'The cow was hit by her.' ✗ with -j
Equations
- Phenomena.Causatives.Studies.Coon2019.byPhraseOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null = false
- Phenomena.Causatives.Studies.Coon2019.byPhraseOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.ch = true
- Phenomena.Causatives.Studies.Coon2019.byPhraseOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.j = false
- Phenomena.Causatives.Studies.Coon2019.byPhraseOK Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = false
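A minimal sketch of how the two diagnostics jointly separate -ch from -j. The function bodies restate the equations above in compressed form; the theorem name `diagnostics_split_ch_j` is hypothetical.

```lean
inductive ChujVoiceSuffix where
  | null | ch | j | w
  deriving DecidableEq, Repr

/-- Agent-oriented adverbs are blocked only with -j. -/
def agentAdverbOK : ChujVoiceSuffix → Bool
  | .j => false
  | _  => true

/-- By-phrases are licensed only with -ch. -/
def byPhraseOK : ChujVoiceSuffix → Bool
  | .ch => true
  | _   => false

/-- Hypothetical name: both diagnostics accept -ch and reject -j. -/
theorem diagnostics_split_ch_j :
    agentAdverbOK .ch = true ∧ agentAdverbOK .j = false ∧
    byPhraseOK .ch = true ∧ byPhraseOK .j = false :=
  ⟨rfl, rfl, rfl, rfl⟩
```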
A Chuj verb entry with its root class and gloss.
- root : String
- gloss : String
- rootClass : CRootClass
Equations
- Phenomena.Causatives.Studies.Coon2019.mak' = { root := "mak'", gloss := "hit", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.jax = { root := "jax", gloss := "grind", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.k'ux = { root := "k'ux", gloss := "bite", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.il = { root := "il", gloss := "see", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.jatz' = { root := "jatz'", gloss := "hit (injure)", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.tzak' = { root := "tzak'", gloss := "wrap", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.a'_give = { root := "a'", gloss := "give", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.lok' = { root := "lok'", gloss := "pull out", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.tv }
- Phenomena.Causatives.Studies.Coon2019.way = { root := "way", gloss := "sleep", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.ok' = { root := "ok'", gloss := "cry", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.jaw = { root := "jaw", gloss := "arrive", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.b'at = { root := "b'at", gloss := "go", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.kam = { root := "kam", gloss := "die", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.atin = { root := "atin", gloss := "bathe", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.itv }
- Phenomena.Causatives.Studies.Coon2019.chot = { root := "chot", gloss := "sit/crouch", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.pos }
- Phenomena.Causatives.Studies.Coon2019.kot = { root := "kot", gloss := "on all fours", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.pos }
- Phenomena.Causatives.Studies.Coon2019.watz = { root := "watz", gloss := "lie face down", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.pos }
- Phenomena.Causatives.Studies.Coon2019.buch = { root := "buch", gloss := "sit cross-legged", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.pos }
- Phenomena.Causatives.Studies.Coon2019.chanhal = { root := "chanhal", gloss := "dance", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.nom }
- Phenomena.Causatives.Studies.Coon2019.a'_water = { root := "a'", gloss := "water/swim", rootClass := Phenomena.Causatives.Studies.Coon2019.CRootClass.nom }
(8) Active transitive: √TV + Ø (§2.2, p. 37).
(20) √ITV + null v (§3.1, p. 40).
(23b) √POS + -w (§3.2, p. 44).
(16b) √NOM + -w (§3.3, p. 46).
(36) √TV + -ch (passive, §4.1, p. 59).
(43a) √TV + -j (agentless passive, §4.2, p. 62).
(47) Agent adverb with -ch: grammatical (§4.1, p. 61).
(48) Agent adverb with -j: ungrammatical (§4.2, p. 62).
(54a) √TV + -w absolutive antipassive (§4.3, p. 64).
(55) √TV + -w incorporation antipassive (§4.3, p. 65).
All example √TV roots are classified as tv.
All example √ITV roots are classified as itv.
All example √POS roots are classified as pos.
All example √NOM roots are classified as nom.
√TV combines with all four voice suffixes.
√ITV combines only with null v.
√POS combines only with -w.
√NOM combines only with -w.
Only √TV forms bare transitive stems.
-ch has an implicit agent; -j does not.
Agent adverbs distinguish -ch (OK) from -j (blocked).
By-phrases distinguish -ch (OK) from -j (blocked).
-aj tracks implicit arguments: -ch (implicit ext) → -aj; -j (no ext) → no -aj.
Grammatical examples are predicted grammatical; ungrammatical examples are predicted ungrammatical.
Map the phenomena's root class to the fragment's Root. This connects theory-neutral distributional classes to the theoretically analyzed Root structure.
Equations
- Phenomena.Causatives.Studies.Coon2019.toFragmentRoot Phenomena.Causatives.Studies.Coon2019.CRootClass.tv = Fragments.Chuj.rootTV_res
- Phenomena.Causatives.Studies.Coon2019.toFragmentRoot Phenomena.Causatives.Studies.Coon2019.CRootClass.itv = Fragments.Chuj.rootITV
- Phenomena.Causatives.Studies.Coon2019.toFragmentRoot Phenomena.Causatives.Studies.Coon2019.CRootClass.pos = Fragments.Chuj.rootPOS
- Phenomena.Causatives.Studies.Coon2019.toFragmentRoot Phenomena.Causatives.Studies.Coon2019.CRootClass.nom = Fragments.Chuj.rootNOM
√TV maps to a theme-selecting root; all others map to non-theme roots. This is the formal content of the observation that only √TV forms bare transitive stems (§2.2).
The data's formsBareTransitive matches the fragment's hasInternalArg.
Only roots that select a theme can form bare transitive stems.
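The shape of this bridge can be sketched with a simplified stand-in for the fragment. Here `Arity`, the one-field `Root` structure, and the theorem name are assumptions for illustration; the real fragment's `Root` carries more fields, and `toFragmentRoot` targets the named values `rootTV_res`, `rootITV`, `rootPOS`, and `rootNOM`.

```lean
inductive CRootClass where
  | tv | itv | pos | nom
  deriving DecidableEq, Repr

/-- Hypothetical stand-in for the fragment's arity distinction. -/
inductive Arity where
  | selectsTheme | noTheme
  deriving DecidableEq, Repr

/-- Hypothetical, simplified stand-in for the fragment's `Root`. -/
structure Root where
  arity : Arity
  deriving Repr

/-- A root has an internal argument exactly when it selects a theme. -/
def Root.hasInternalArg : Root → Bool
  | ⟨.selectsTheme⟩ => true
  | ⟨.noTheme⟩      => false

/-- Data side: only √TV forms bare transitive stems. -/
def formsBareTransitive : CRootClass → Bool
  | .tv => true
  | _   => false

/-- Bridge map: √TV to a theme-selecting root, all others to non-theme roots. -/
def toFragmentRoot : CRootClass → Root
  | .tv => ⟨.selectsTheme⟩
  | _   => ⟨.noTheme⟩

/-- Hypothetical name: the data-side and fragment-side properties agree. -/
theorem bare_transitive_matches_arity (rc : CRootClass) :
    formsBareTransitive rc = (toFragmentRoot rc).hasInternalArg := by
  cases rc <;> rfl
```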
Map the phenomena's voice suffix to the fragment's VoiceHead.
Equations
- Phenomena.Causatives.Studies.Coon2019.toFragmentVoice Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.null = Fragments.Chuj.vØ
- Phenomena.Causatives.Studies.Coon2019.toFragmentVoice Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.ch = Fragments.Chuj.v_ch
- Phenomena.Causatives.Studies.Coon2019.toFragmentVoice Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.j = Fragments.Chuj.v_j
- Phenomena.Causatives.Studies.Coon2019.toFragmentVoice Phenomena.Causatives.Studies.Coon2019.ChujVoiceSuffix.w = Fragments.Chuj.v_w
Theta assignment matches: the data's hasAgent agrees with the
fragment's assignsTheta for all four voice suffixes.
External argument status matches D feature: overt external arg ↔ hasD = true.
Only Ø is a phase head (assigns ERG case).
The data's agent adverb diagnostic matches the fragment's theta assignment. Agent-oriented adverbs require a theta-role-bearing Voice head.
The -ch vs -j contrast is the critical test: both are passives (no overt external arg), but they differ in theta assignment. The agent diagnostic data confirms the fragment's distinction.
The data's -aj on passives matches the fragment's hasImplicitExternal.
-aj appears when there is an implicit (but not absent) external argument.
The fragment's triggersAj predicts the data's full -aj distribution:
- -ch (implicit ext) → -aj
- -j (no ext) → no -aj
- -w absolutive (implicit int) → -aj
- -w incorporation (overt int) → no -aj
The fragment predicts correct event decompositions for each root×voice combination attested in the data.
- √TV result + Ø → causative (active transitive)
- √TV result + -j → inchoative (agentless passive / anticausative)
- √TV result + -ch → causative (passive with implicit agent)
- √ITV + -w → activity (intransitive)
The core empirical claim (Table 2/77, p. 76): roots determine internal arguments, Voice determines external arguments.
The data confirms this in two ways:
- Theme persistence: √TV always has an internal arg regardless of Voice
- Voice determines agent: same root with Ø has overt agent, with -ch has implicit agent, with -j has no agent
Theme persistence across all four voice forms for √TV. The data shows √TV maintains its internal argument in active (Ø), passive (-ch), agentless passive (-j), and antipassive (-w). The fragment encodes this as a root property (arity), not a derived property — so it holds by construction.
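To see why theme persistence "holds by construction", consider a sketch in which a verb stem pairs a root with a voice head and the internal-argument property is read off the root alone. All names below (`VoiceHead` constructors, `VerbStem`, `theme_persists`) are hypothetical simplifications, not the fragment's actual definitions.

```lean
/-- Hypothetical stand-in for the fragment's arity distinction. -/
inductive Arity where
  | selectsTheme | noTheme
  deriving DecidableEq, Repr

/-- Hypothetical, simplified root: only arity is modelled. -/
structure Root where
  arity : Arity

/-- Hypothetical voice heads corresponding to Ø, -ch, -j, -w. -/
inductive VoiceHead where
  | vNull | vCh | vJ | vW
  deriving DecidableEq, Repr

/-- Hypothetical verb stem: a root combined with a voice head. -/
structure VerbStem where
  root  : Root
  voice : VoiceHead

def Root.hasInternalArg : Root → Bool
  | ⟨.selectsTheme⟩ => true
  | ⟨.noTheme⟩      => false

/-- The internal argument is a property of the root; Voice is never consulted. -/
def VerbStem.hasInternalArg (v : VerbStem) : Bool :=
  v.root.hasInternalArg

/-- Changing the voice head cannot change whether there is an internal argument. -/
theorem theme_persists (r : Root) (v₁ v₂ : VoiceHead) :
    (VerbStem.mk r v₁).hasInternalArg = (VerbStem.mk r v₂).hasInternalArg :=
  rfl
```

The proof is `rfl` because both sides unfold to `r.hasInternalArg`; no case analysis on the voice head is needed.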
The four root classes fall into three denotation types (@cite{coon-2019}, (3)).
The fragment's denotationType field captures these: √TV/√ITV = eventPred ⟨e,⟨s,t⟩⟩, √POS = measureFn ⟨e,⟨s,d⟩⟩, √NOM = entityPred ⟨e,t⟩.
√TV and √ITV share semantic type (event predicate) but differ in arity. This is the formal content of the observation that both compose with an entity argument per @cite{davis-1997}, but only √TV projects a syntactic complement.
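A minimal sketch of this observation, assuming a two-field `Root` record (the real fragment's `Root` has more structure, and `rootTV` here stands in for values such as `rootTV_res`):

```lean
inductive DenotationType where
  | eventPred | measureFn | entityPred
  deriving DecidableEq, Repr

inductive Arity where
  | selectsTheme | noTheme
  deriving DecidableEq, Repr

/-- Hypothetical, simplified root record. -/
structure Root where
  arity          : Arity
  denotationType : DenotationType

def rootTV  : Root := ⟨.selectsTheme, .eventPred⟩
def rootITV : Root := ⟨.noTheme, .eventPred⟩
def rootPOS : Root := ⟨.noTheme, .measureFn⟩
def rootNOM : Root := ⟨.noTheme, .entityPred⟩

-- √TV and √ITV share a denotation type ...
example : rootTV.denotationType = rootITV.denotationType := rfl

-- ... but differ in arity.
example : rootTV.arity ≠ rootITV.arity := by decide
```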
The -w suffix cross-class generalization: -w verbalizes √POS and √NOM roots (data: both take -w), and the fragment predicts different event structures depending on the root's lower structure.
Map the theory's root type to the phenomena's root class. These are parallel enums — the bridge makes the correspondence explicit.
Map back from phenomena to theory.
The mapping is a bijection (left inverse).
The mapping is a bijection (right inverse).
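The two inverse statements can be sketched on a pair of parallel two-constructor enums. The constructor names and the function and theorem names below are assumptions; the page does not show the actual definitions.

```lean
/-- Theory side (constructor names assumed). -/
inductive RootType where
  | propertyConcept | result
  deriving DecidableEq, Repr

/-- Phenomena side (constructor names assumed). -/
inductive CoSRootClass where
  | propertyConcept | result
  deriving DecidableEq, Repr

def toPhenomena : RootType → CoSRootClass
  | .propertyConcept => .propertyConcept
  | .result          => .result

def toTheory : CoSRootClass → RootType
  | .propertyConcept => .propertyConcept
  | .result          => .result

/-- Theory → phenomena → theory is the identity. -/
theorem toTheory_toPhenomena (r : RootType) :
    toTheory (toPhenomena r) = r := by
  cases r <;> rfl

/-- Phenomena → theory → phenomena is the identity. -/
theorem toPhenomena_toTheory (c : CoSRootClass) :
    toPhenomena (toTheory c) = c := by
  cases c <;> rfl
```

Together the two round-trip theorems establish that the maps are mutually inverse, which is what makes the correspondence a bijection.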
The phenomena's changeDenialTest agrees with the theory's entailsChange.
Theory: RootType.entailsChange.result = true (result roots entail change)
Phenomena: changeDenialTest.result = .negative ("#The shattered vase has never shattered" is contradictory: the state entails prior change)
The relationship is: entailsChange = true ↔ changeDenial = negative. That is, entailing change means the change-denial test FAILS.
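The polarity flip can be made explicit in a small sketch. For simplicity both functions are defined on a single `RootType` here, whereas the source places the test on the phenomena side; the `TestResult` type, the `.positive` value for PC roots, and the theorem name are assumptions.

```lean
inductive RootType where
  | propertyConcept | result
  deriving DecidableEq, Repr

/-- Hypothetical outcome type for a semantic diagnostic. -/
inductive TestResult where
  | positive | negative
  deriving DecidableEq, Repr

/-- Theory side: result roots entail change. -/
def entailsChange : RootType → Bool
  | .result          => true
  | .propertyConcept => false

/-- Phenomena side: denying prior change is contradictory for result roots. -/
def changeDenialTest : RootType → TestResult
  | .result          => .negative
  | .propertyConcept => .positive  -- assumed value for PC roots

/-- Entailing change corresponds to failing the change-denial test. -/
theorem change_denial_agrees (r : RootType) :
    entailsChange r = true ↔ changeDenialTest r = .negative := by
  cases r <;> decide
```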
The phenomena's restitutiveAgainTest agrees with the theory's
allowsRestitutiveAgain.
Both diagnostics jointly align with the full semantic correlate package.
This is the bridge version of semantic_determines_morphosyntax.
Theory predicts: PC roots have simple statives. Data confirms: 7 of 8 PC sample roots have ≥ 50% attestation. The one exception (oldRoot, age class) has 0 — noted by Beavers et al. as a crosslinguistic outlier.
Theory predicts: result roots LACK simple statives. Data confirms: all 10 result sample roots have ≤ 10% attestation.
Theory predicts: PC verbs are morphologically marked; result verbs are unmarked. (Markedness Generalization, eq. 44.) Data confirms: PC median marked % (56.01) > result median (15.20).
The theory's markedness complementarity predicts that if a language
marks PC verbs, it should NOT also show result verbs as more marked
than PC verbs. The fourth logically possible language type (result
marked, PC unmarked) is unattested — exactly 3 types are attested.
This matches the theory: markedness_complementarity says verbal and
stative markedness are always opposite.
Chuj √TV result roots instantiate the theory's result root predictions: entails change, no simple stative, unmarked verb.
Chuj √TV PC roots instantiate the theory's PC root predictions: no change entailment, has simple stative, marked verb.
The Chuj fragment witnesses the full orthogonality theorem: all four cells of the (arity × changeType) matrix are inhabited.
Per-root class verification: each Chuj root's change entailment matches
its predicted morphosyntactic correlates via grand_unification.
Every PC root in the empirical sample is classified as PC, and the theory predicts PC roots should have simple statives — they do.
Every result root in the empirical sample is classified as result, and the theory predicts result roots lack simple statives — they do.
The subclass taxonomies are parallel: the theory's PCClass and the
phenomena's PCSubclass have the same constructors, as do
ResultClass and ResultSubclass. This is verified by exhaustive
mapping (each side has 6 PC subclasses and 8 result subclasses).