Linglib.Phenomena.Case.Typology

Case Typology (WALS Chapters 49--52) #

@cite{dryer-haspelmath-2013} @cite{iggesen-2013} @cite{stolz-veselinova-2013}

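Formalizes four chapters from the World Atlas of Language Structures (WALS) covering the typology of case systems:

• Ch 49: number of cases
• Ch 50: asymmetrical case-marking
• Ch 51: position of case affixes
• Ch 52: comitatives and instrumentals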

Each chapter is encoded as an inductive type with distributions derived from generated WALS data (Ch 49--51) or hand-coded counts (Ch 52). Language profiles combine all four dimensions, and typological generalizations are verified over the sample by native_decide.

Number-of-cases categories (WALS Ch. 49, @cite{iggesen-2013}).

Languages are classified by the number of morphological case distinctions in their nominal paradigm. "No morphological case-marking" means the language has no affixal or clitic case at all (e.g., Mandarin, Thai).
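
As a rough illustration of how these bins could be encoded, the sketch below defines an inductive type with one constructor per WALS Ch. 49 value and a helper that bins a raw count. The constructor names, the `ofRaw` helper, and the choice to treat a count of 0 or 1 as "no case" are assumptions made for this example; the module's actual definition may differ.

```lean
/-- Hypothetical sketch of the WALS Ch. 49 bins (constructor names are illustrative). -/
inductive CaseCount where
  | noCase       -- no morphological case-marking (e.g., Mandarin, Thai)
  | two
  | three
  | four
  | five
  | sixToSeven
  | eightToNine
  | tenOrMore
  | borderline   -- exclusively borderline case-marking; not derivable from a count
  deriving DecidableEq, Repr

/-- Assumed helper: map a raw number of cases to its bin.
    A count of 0 or 1 is treated as "no case", since a single form makes no distinction. -/
def CaseCount.ofRaw (n : Nat) : CaseCount :=
  if n ≤ 1 then .noCase
  else if n = 2 then .two
  else if n = 3 then .three
  else if n = 4 then .four
  else if n = 5 then .five
  else if n ≤ 7 then .sixToSeven
  else if n ≤ 9 then .eightToNine
  else .tenOrMore

example : CaseCount.ofRaw 15 = CaseCount.tenOrMore := by decide   -- Finnish-sized system
example : CaseCount.ofRaw 6 = CaseCount.sixToSeven := by decide   -- Turkish-sized system
example : CaseCount.ofRaw 0 = CaseCount.noCase := by decide       -- caseless language
```

The `borderline` bin is never produced by `ofRaw`, because it reflects a qualitative judgment rather than a count.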

      Chapter 49 total sample size (from generated data).

      Asymmetrical (differential) case-marking types (WALS Ch. 50, @cite{iggesen-2013}).

      Differential case marking (DCM) means that case marking on a noun phrase depends on properties of that NP -- its animacy, definiteness, or whether it is a full noun vs. a pronoun. For example, in Hindi-Urdu the accusative marker -ko appears on animate/definite objects but not inanimate/indefinite ones.

          Chapter 50 total sample size (from generated data).

          Position of case affixes (WALS Ch. 51, @cite{iggesen-2013}).

          Classifies where the case morpheme sits relative to the nominal stem. Languages with no case affixes at all (either no case or case expressed only by adpositions) are distinguished from those with suffixes, prefixes, tonal marking, or mixed strategies.

              Chapter 51 total sample size (from generated data).

              Comitative-instrumental syncretism (WALS Ch. 52, @cite{stolz-veselinova-2013}).

In many languages the marker for 'with X' (comitative: accompaniment) and 'by means of X' (instrumental: means/instrument) is the same morpheme. For example, Russian uses the instrumental case (-om, -oj) for both "I went with Ivan" and "I cut it with a knife", although the comitative reading additionally requires the preposition s. Other languages distinguish them, e.g. Japanese -to (comitative) vs. -de (instrumental).

                  A language's case profile, combining classifications from all four WALS case chapters.

                  This structure records a single language's position in each of the four typological dimensions. The rawCaseCount field stores the actual number of morphological cases (not just the WALS bin), enabling finer-grained generalizations.

                          Whether the raw case count is consistent with the WALS bin.

                            Whether the profile is internally consistent across chapters: no-case in Ch 49 should align with no-case in Ch 50 and no-affixes in Ch 51.
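
A minimal sketch of such a cross-chapter check, under simplifying assumptions: the three enums below are hypothetical, collapsed stand-ins for the Ch. 49-51 types, and `Profile` and `crossConsistent` are illustrative names rather than the module's own. The rule encoded is the one stated here, plus the requirement that case-bearing languages are not `noCase` in Ch. 50.

```lean
-- Simplified, hypothetical stand-ins for the Ch. 49-51 classifications.
inductive CountBin where
  | noCase | hasCase
  deriving DecidableEq

inductive Asymmetry where
  | noCase | symmetrical | asymmetrical
  deriving DecidableEq

inductive AffixPosition where
  | noAffixes | suffixes | prefixes
  deriving DecidableEq

structure Profile where
  count : CountBin
  asym  : Asymmetry
  affix : AffixPosition

/-- Assumed form of the check: caseless languages must be caseless in every
    chapter; case-bearing languages must not be `noCase` in Ch. 50. -/
def Profile.crossConsistent (p : Profile) : Bool :=
  match p.count with
  | .noCase  => p.asym == Asymmetry.noCase && p.affix == AffixPosition.noAffixes
  | .hasCase => p.asym != Asymmetry.noCase

def mandarin : Profile := ⟨.noCase, .noCase, .noAffixes⟩
def japanese : Profile := ⟨.hasCase, .asymmetrical, .noAffixes⟩  -- case via clitics, not affixes
def broken   : Profile := ⟨.noCase, .symmetrical, .suffixes⟩     -- deliberately inconsistent

example : [mandarin, japanese].all Profile.crossConsistent = true := by decide
example : broken.crossConsistent = false := by decide
```

Note that a case-bearing language may still have `noAffixes`, as the Japanese-style entry shows, so the check is one-directional for Ch. 51.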

                              Finnish: 15 morphological cases (nom, gen, acc, part, iness, elat, illat, adess, ablat, allat, ess, transl, instruct, comit, abes). Suffixal. No DCM. Comitative and instrumental are distinct cases.

                                Hungarian: 18 morphological cases (nom, acc, dat, instrum, causal-final, translative, terminative, essive-formal, essive-modal, inessive, elative, illative, superessive, delative, sublative, adessive, ablative, allative). Suffixal agglutinative. Comitative (-val, -vel) = instrumental.

                                  Turkish: 6 cases (nom, acc, gen, dat, loc, abl). Suffixal agglutinative. Differential object marking: definite objects take -I, indefinite do not.

                                    Latin: 6 cases (nom, acc, gen, dat, abl, voc; locative vestigial). Suffixal fusional. No asymmetrical case-marking. Comitative (cum + abl) vs. instrumental (plain abl) are technically distinct strategies.

                                      Russian: 6 cases (nom, acc, gen, dat, instrum, prep/loc). Suffixal fusional. Differential accusative: animate nouns take genitive form in accusative, inanimates keep nominative form.

                                        German: 4 cases (nom, acc, gen, dat). Suffixal fusional with articles carrying most case marking. No systematic DCM. Comitative (mit + dat) and instrumental (mit + dat) use the same marker.

                                          Japanese: case particles (ga, o, ni, no, de, e, to, kara, made,...). Postpositional clitics rather than affixes in WALS's classification. Differential object marking with -o conditioned by specificity/topicality. Comitative -to vs. instrumental -de are distinct.

                                            English: 2-case system surviving only in pronouns (nom/acc: I/me, he/him, she/her, we/us, they/them). No case affixes on nouns. Comitative 'with' and instrumental 'with' are identical.

                                              Korean: case particles (-i/ga nom, -(l)eul acc, -ui gen, -e dat/loc, -eseo loc/source, -(eu)ro instr/dir, -wa/gwa comit). Particles are postpositional clitics. Optional object marking conditioned by definiteness/topicality. Comitative -wa and instrumental -(eu)ro are distinct.

                                                Mandarin Chinese: no morphological case. Fixed SVO word order encodes grammatical relations. No case markers, no DCM, comitative and instrumental expressed by distinct prepositions (he 'with-COM' vs. yong 'with-INSTR').

Hindi-Urdu: 3 cases (direct, oblique, vocative). Postpositional system with -ne (ergative), -ko (accusative/dative), -se (instrumental/ablative), -me (locative). Differential object marking: -ko appears on animate/specific objects. Comitative -ke saath vs. instrumental -se are distinct.

                                                    Arabic (Modern Standard): 3 cases (nom -u, acc -a, gen -i). Suffixal. Full case marking on indefinite nouns (tanwin); definite nouns often show reduced marking in spoken varieties, but MSA maintains it. Comitative (maʕa) and instrumental (bi-) are distinct.

                                                      Georgian: 7 cases (nom, erg, dat, gen, instrum, adverbial, vocative). Suffixal agglutinative. Split-ergative system conditioned by tense/aspect (not NP properties), so no DCM in the WALS sense. Instrumental -it and comitative -tan are distinct.

                                                        Quechua (Cusco): 12+ cases (nom, acc -ta, gen -pa or -q, dat -man, loc -pi, abl -manta, instrum -wan, comit -wan, limit -kama, causal -rayku, benef -paq, topic -qa,...). Suffixal agglutinative. Comitative and instrumental both use -wan (identity).

                                                          Basque: ergative-absolutive system with 11+ cases (abs, erg, dat, gen, comit -ekin, instrum -z, iness, allat, ablat, destinat, motivat). Suffixal. Differential ergative marking in some analyses. Comitative -ekin and instrumental -z are distinct.

Tamil: 8 cases (nom, acc, dat, gen, instrum, loc, ablat, sociative/comitative). Suffixal agglutinative. Differential object marking: accusative -ai on animate/definite objects. Comitative -ootu and instrumental -aal are distinct.

                                                              All language profiles in our sample.

                                                                Every language's raw case count falls within its declared WALS category.

                                                                All raw case counts are consistent with their WALS bins.

                                                                Cross-chapter consistency: no-case in Ch 49 aligns with noCase in Ch 50 and noAffixes in Ch 51; case-bearing languages do not have noCase in Ch 50.

                                                                All profiles are cross-chapter consistent.

                                                                Generalization 1: Case-rich languages are overwhelmingly suffixal. #

                                                                Among the world's languages, suffixal case marking is far more common than prefixal. In our sample, every language with case affixes uses suffixes (either exclusively or in combination with prefixes). This reflects the strong universal preference documented by @cite{hawkins-1983} and @cite{dryer-1992}.

                                                                Generalization 2: No prefixal-only case in our sample. #

                                                                No language in our sample uses exclusively prefixal case marking. Cross-linguistically, prefixal-only case is very rare (WALS Ch 51 reports only 7 out of 261 languages).

                                                                Generalization 3: DCM is conditioned by animacy or definiteness. #

                                                                Among languages with differential case marking in our sample, the conditioning factors are animacy, definiteness, or pronoun status -- never some other property like gender or number alone.

Generalization 4: Comitative-instrumental identity is common but not universal. #

Identity (syncretism) and differentiation both occur across language families.

                                                                Generalization 5: No-case languages have no asymmetrical marking. #

                                                                By definition, if there is no morphological case, there can be no asymmetrical (differential) case marking.

                                                                Generalization 6: No-case languages have no case affixes. #

                                                                Again by definition: without morphological case, there are no case affixes to position.

                                                                Generalization 7: 10+-case languages all have suffixal case. #

                                                                Highly agglutinative case-rich systems (Finnish, Hungarian, Quechua, Basque) uniformly use suffixes. No case-rich language uses prefixes only or tone only.
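
One way such a generalization could be checked over a sample is sketched below. The `Entry` structure, the `AffixPosition` constructor names, and `tenPlusSuffixal` are hypothetical; the raw counts are those given in the language descriptions, with 12 and 11 used as lower bounds for Quechua ("12+") and Basque ("11+"). The source states that the real theorems are discharged by native_decide; plain `decide` suffices for a list this small.

```lean
-- Affix positions named after the options listed for WALS Ch. 51 (names are illustrative).
inductive AffixPosition where
  | noAffixes | suffixes | prefixes | tone | mixed
  deriving DecidableEq

structure Entry where
  rawCaseCount : Nat
  affix        : AffixPosition

-- Hypothetical excerpt of the sample.
def sample : List Entry :=
  [ ⟨15, .suffixes⟩,    -- Finnish
    ⟨18, .suffixes⟩,    -- Hungarian
    ⟨12, .suffixes⟩,    -- Quechua (lower bound)
    ⟨11, .suffixes⟩,    -- Basque (lower bound)
    ⟨6, .suffixes⟩,     -- Turkish
    ⟨0, .noAffixes⟩ ]   -- Mandarin

/-- "10+ cases implies suffixes", written as a Boolean implication. -/
def tenPlusSuffixal (e : Entry) : Bool :=
  (!decide (10 ≤ e.rawCaseCount)) || (e.affix == AffixPosition.suffixes)

example : sample.all tenPlusSuffixal = true := by decide
```

Entries with fewer than ten cases satisfy the implication vacuously, so only the four case-rich languages actually constrain the result.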

Generalization 8: Languages with 2 cases tend toward asymmetrical marking. #

                                                                When a language has only two cases, case marking often applies differentially (to pronouns only, or conditioned by definiteness). English is the classic example: only pronouns show nominative/accusative.

Generalization 9: Comitative-instrumental identity correlates with case-rich systems. #

                                                                Among our case-rich languages (5+ cases), those with identity include Hungarian, Russian, Turkish, Quechua -- all agglutinative or fusional languages where an instrumental case doubles for comitative.

                                                                Generalization 10: All CaseCount bins are attested in the sample. #

                                                                Our 16-language sample covers every WALS Chapter 49 category.

                                                                Spot-checks that each language has the expected WALS category values.

                                                                Number of caseless languages in our sample.

                                                                  All ISO 639-3 codes are non-empty.

                                                                  All ISO 639-3 codes are exactly 3 characters (standard length).

                                                                  No duplicate ISO codes (each language appears once).
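
The three ISO sanity checks can be sketched as follows. The `isoCodes` list is a hypothetical excerpt, not the module's data, and the duplicate test via `List.eraseDups` is one possible formulation chosen for this example. Evaluation uses native_decide, since it involves string operations.

```lean
-- Hypothetical excerpt of the sample's ISO 639-3 codes.
def isoCodes : List String := ["fin", "hun", "tur", "lat", "rus", "deu"]

-- Every code is exactly three characters long (and therefore non-empty).
example : (isoCodes.all fun c => c.length == 3) = true := by native_decide

-- Removing duplicates leaves the length unchanged, so no code repeats.
example : isoCodes.eraseDups.length = isoCodes.length := by native_decide
```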

                                                                  Languages with DCM (Ch 50) all have at least 2 cases (Ch 49).

                                                                  Languages with case affixes (Ch 51) all have at least 2 cases (Ch 49).

                                                                  No language with 10+ cases uses identity for comitative-instrumental in our sample that also has no DCM and uses suffixes. This checks a three-way conjunction across chapters.

                                                                  @cite{aissen-2003} DOM Hierarchy #

Formalizes the bidimensional DOM predictions of @cite{aissen-2003}.

                                                                  The prominence scales (AnimacyLevel, DefinitenessLevel) and their orderings are defined in Core.Prominence and re-exported here. DOM is the P-flagging specialization of the general differential marking framework.

                                                                  @[reducible, inline]

                                                                  A DOM (Differential Object Marking) profile: a DifferentialMarkingProfile specialized to role P + channel flagging.

                                                                  Each cell (a, d) records whether an object with animacy level a and definiteness level d obligatorily receives an overt DOM marker (e.g., Spanish a, Turkish -(y)I, Hindi -ko).

                                                                  DOM is the P-flagging instance of @cite{just-2024}'s general differential marking framework. Monotonicity (isMonotone), isAnimacyOnly, and isDefinitenessOnly are all inherited from DifferentialMarkingProfile.
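
To make the cell-based view concrete, here is a simplified sketch in which a profile is just a Boolean function over the grid. It does not reproduce DifferentialMarkingProfile: the `MarkingGrid` abbreviation, the constructor names for the two scales, and the two example patterns are assumptions for illustration (the module takes the real scales from Core.Prominence).

```lean
-- Assumed level names, listed from most to least prominent.
inductive AnimacyLevel where
  | human | animate | inanimate
  deriving DecidableEq, Repr

inductive DefinitenessLevel where
  | pronoun | properName | definite | indefSpecific | nonSpecific
  deriving DecidableEq, Repr

/-- Simplified stand-in for a DOM profile: `true` means an overt marker is obligatory. -/
abbrev MarkingGrid := AnimacyLevel → DefinitenessLevel → Bool

/-- Animacy-only pattern (Spanish-like): human objects are marked. -/
def spanishLike : MarkingGrid := fun a _ =>
  match a with
  | .human => true
  | _      => false

/-- Definiteness-only pattern (Turkish-like): definite or more prominent objects are marked. -/
def turkishLike : MarkingGrid := fun _ d =>
  match d with
  | .pronoun | .properName | .definite => true
  | _ => false

example : spanishLike AnimacyLevel.human DefinitenessLevel.nonSpecific = true := by decide
example : turkishLike AnimacyLevel.inanimate DefinitenessLevel.definite = true := by decide
example : turkishLike AnimacyLevel.human DefinitenessLevel.indefSpecific = false := by decide
```

Each pattern ignores one argument entirely, which is what makes it one-dimensional.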

                                                                    Spanish: a-marking for human direct objects regardless of definiteness. One-dimensional (animacy-based), cutoff between human and animate.

                                                                      Russian: animate accusative (genitive form used as accusative for animate nouns). One-dimensional (animacy-based), cutoff between animate and inanimate.

                                                                        Turkish: -(y)I marking for definite direct objects regardless of animacy. One-dimensional (definiteness-based), cutoff between definite and indefinite specific.

                                                                          Hebrew: ʔet marking for definite direct objects regardless of animacy. Same one-dimensional definiteness cutoff as Turkish.

                                                                            Persian: -rā marking for definite direct objects. One-dimensional (definiteness-based) for obligatory marking; optional extension to specific indefinite animates. Modeled here with the definiteness-based obligatory core.

Catalan: a-marking restricted to personal pronouns. The most restrictive DOM pattern attested: only the highest level of the definiteness scale (personal pronouns) receives marking.

                                                                                Hindi-Urdu: -ko marking conditioned by BOTH animacy and definiteness. Two-dimensional DOM with a staircase cutoff:

                                                                                • Human objects: marked when indefinite specific or more prominent
                                                                                • Animate objects: marked when definite or more prominent
                                                                                • Inanimate objects: not obligatorily marked

                                                                                This captures the obligatory marking core. Optional/variable marking extends further down the staircase at the boundary cells.

                                                                                  No DOM: no differential marking (either no case at all, or uniform case on all objects). Trivially monotone.

                                                                                    All DOM profiles in the sample.

                                                                                      Each language's DOM pattern forms an upper set in the bidimensional animacy × definiteness grid — Aissen's central prediction.

                                                                                      Aissen's DOM monotonicity universal: all attested DOM patterns in the sample form upper sets in the bidimensional animacy × definiteness grid. No language marks a less-prominent object while leaving a more-prominent one unmarked.
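
The upper-set condition can be sketched as an exhaustive Boolean check over the grid. This is an illustrative re-implementation, not the inherited isMonotone: the numeric `rank` functions, `isUpperSet`, `MarkingGrid`, and both example grids are assumptions of this sketch. The `hindiLike` grid follows the Hindi-Urdu staircase described earlier in this section.

```lean
inductive AnimacyLevel where
  | human | animate | inanimate
  deriving DecidableEq

inductive DefinitenessLevel where
  | pronoun | properName | definite | indefSpecific | nonSpecific
  deriving DecidableEq

-- Numeric prominence ranks (higher = more prominent); an assumption of this sketch.
def AnimacyLevel.rank : AnimacyLevel → Nat
  | .human => 2 | .animate => 1 | .inanimate => 0

def DefinitenessLevel.rank : DefinitenessLevel → Nat
  | .pronoun => 4 | .properName => 3 | .definite => 2
  | .indefSpecific => 1 | .nonSpecific => 0

abbrev MarkingGrid := AnimacyLevel → DefinitenessLevel → Bool

def animacies : List AnimacyLevel := [.human, .animate, .inanimate]
def definitenesses : List DefinitenessLevel :=
  [.pronoun, .properName, .definite, .indefSpecific, .nonSpecific]

/-- Upper-set check: a marked cell forces every cell that is at least as
    prominent on both scales to be marked too. -/
def isUpperSet (g : MarkingGrid) : Bool :=
  animacies.all fun a => definitenesses.all fun d =>
  animacies.all fun a' => definitenesses.all fun d' =>
    !(g a d && decide (a.rank ≤ a'.rank) && decide (d.rank ≤ d'.rank)) || g a' d'

/-- Hindi-Urdu staircase as described in the text. -/
def hindiLike : MarkingGrid := fun a d =>
  match a with
  | .human     => decide (1 ≤ d.rank)   -- indefinite specific or higher
  | .animate   => decide (2 ≤ d.rank)   -- definite or higher
  | .inanimate => false

/-- Counterexample: marking inanimates only violates the prediction. -/
def inanimateOnly : MarkingGrid := fun a _ => a == AnimacyLevel.inanimate

example : isUpperSet hindiLike = true := by native_decide
example : isUpperSet inanimateOnly = false := by native_decide
```

The counterexample fails because it marks an inanimate cell while leaving the human cell at the same definiteness level unmarked.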

                                                                                      Verify that the one-dimensional profiles are indeed one-dimensional, and that Hindi is genuinely two-dimensional.

                                                                                      Hindi DOM depends on both animacy and definiteness — it cannot be reduced to a single scale.

                                                                                      Consequences of monotonicity: higher prominence on one dimension implies at least as much marking, holding the other dimension constant.

                                                                                      In all sample languages, human objects are never less marked than animate objects at the same definiteness level.

                                                                                      In all sample languages, animate objects are never less marked than inanimate objects at the same definiteness level.

                                                                                      The most prominent cell (human, pronoun) is always marked when any DOM exists; the least prominent cell (inanimate, non-specific) is never marked in our sample.

                                                                                      The least prominent cell (inanimate, non-specific) is unmarked in all DOM languages in the sample.

                                                                                      Total marked cells across all sample languages.

                                                                                        Marked cells: Spanish (5) + Russian (10) + Turkish (9) + Hebrew (9) + Persian (9) + Catalan (3) + Hindi (7) + NoDOM (0) = 52.
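
The per-language counts can be reconstructed from the shape of each pattern, assuming a grid of three animacy levels by five definiteness levels. That grid size is an assumption of this sketch, though it agrees with the counts quoted above (for instance, Spanish marks one full row of five cells and Catalan one full column of three).

```lean
-- Assumed grid: 3 animacy levels × 5 definiteness levels = 15 cells per language.
example :
    1 * 5          -- Spanish: the human row, all 5 definiteness levels
  + 2 * 5          -- Russian: the human and animate rows
  + 3 * 3          -- Turkish: pronoun, proper-name, and definite columns
  + 3 * 3          -- Hebrew: same cutoff as Turkish
  + 3 * 3          -- Persian: obligatory core, same cutoff
  + 3 * 1          -- Catalan: the pronoun column only
  + (4 + 3 + 0)    -- Hindi: human 4, animate 3, inanimate 0
  + 0              -- no DOM
  = 52 := by decide
```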