Pronoun Typology: PER/DEM Classification + Gradient Measures #
@cite{cardinaletti-starke-1999} @cite{elbourne-2005} @cite{patel-grosz-grosz-2017} @cite{postal-1966} @cite{schwarz-2009} @cite{schwarz-2013} @cite{levshina-stoynova-2023}
@cite{patel-grosz-grosz-2017} "Revisiting Pronominal Typology" (LI 48(2)) argue that 3rd-person pronouns split into two structural types:
- PER (personal): D_det + NP (weak article only)
- DEM (demonstrative): D_deix + D_det + NP (strong article)
Minimize DP! makes PER the default; DEM requires pragmatic licensing (for example emotivity, disambiguation, or register).
Key Claims #
- If a language has DEM pronouns, it also has PER pronouns (the languages with DEM are a subset of the languages with PER)
- DEM use requires pragmatic licensing (Minimize DP!)
- Article system predicts D-layer structure
Gradient Component #
Following @cite{levshina-stoynova-2023} / WordOrder/Gradience.lean, we encode
continuous measures of pronoun system complexity: inventory sizes, licensing
context counts, and strength-level counts.
@cite{patel-grosz-grosz-2017}: structural classification of 3rd-person pronouns.
PER pronouns project only D_det (weak article layer). DEM pronouns project D_deix + D_det (strong article layer).
- per : PronounClass
- dem : PronounClass
Instances For
Equations
- Phenomena.Anaphora.Typology.instBEqPronounClass.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
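As a minimal sketch of the classification above, the two constructors can be written as a plain enumeration. The helper `dHeads` is a hypothetical name introduced only for illustration; it records the number of D-heads each class projects according to the description (D_det for PER, D_deix + D_det for DEM).

```lean
/-- Structural class of a 3rd-person pronoun (constructor names as listed above). -/
inductive PronounClass where
  | per  -- D_det + NP
  | dem  -- D_deix + D_det + NP
  deriving DecidableEq, Repr

/-- Hypothetical helper: number of D-heads each class projects. -/
def PronounClass.dHeads : PronounClass → Nat
  | .per => 1
  | .dem => 2

-- DEM projects strictly more structure than PER.
example : PronounClass.per.dHeads < PronounClass.dem.dHeads := by decide
```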
@cite{cardinaletti-starke-1999}: pronoun strength.
Three-way typology based on phonological/syntactic deficiency:
- Strong: full, stressed (can be coordinated, modified, focused)
- Weak: reduced, unstressed (cannot be coordinated/focused)
- Clitic: phonologically deficient, must attach to host
- strong : PronounStrength
- weak : PronounStrength
- clitic : PronounStrength
Instances For
Equations
- Phenomena.Anaphora.Typology.instBEqPronounStrength.beq x✝ y✝ = (x✝.ctorIdx == y✝.ctorIdx)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
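The three-way strength typology can be sketched the same way. The predicates `canBeFocused` and `needsHost` are hypothetical helpers, not part of the documented API; they only restate the diagnostic properties listed above, on the assumption that clitics pattern with weak forms in resisting coordination and focus.

```lean
inductive PronounStrength where
  | strong | weak | clitic
  deriving DecidableEq, Repr

/-- Hypothetical helper: can a form of this strength be coordinated or focused? -/
def PronounStrength.canBeFocused : PronounStrength → Bool
  | .strong => true
  | .weak   => false
  | .clitic => false  -- assumption: clitics pattern with weak forms here

/-- Hypothetical helper: must the form attach to a phonological host? -/
def PronounStrength.needsHost : PronounStrength → Bool
  | .clitic => true
  | _       => false

example : PronounStrength.weak.canBeFocused = false := rfl
example : PronounStrength.clitic.needsHost = true := rfl
```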
Pragmatic contexts that license DEM pronoun use (@cite{patel-grosz-grosz-2017} §3).
Minimize DP! requires DEM to be pragmatically licensed. These are the five licensing contexts identified by PG&G.
- emotivity : DEMLicensingContext
- disambiguation : DEMLicensingContext
- register : DEMLicensingContext
- deixis : DEMLicensingContext
- contrast : DEMLicensingContext
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
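A small sketch of how the five licensing contexts might be enumerated and used. The names `allDEMContexts` and `demLicensed` are assumptions made for this example; the idea is only that a DEM use counts as licensed when at least one context is recorded.

```lean
inductive DEMLicensingContext where
  | emotivity | disambiguation | register | deixis | contrast
  deriving DecidableEq, Repr

/-- Hypothetical helper: every licensing context, in declaration order. -/
def allDEMContexts : List DEMLicensingContext :=
  [.emotivity, .disambiguation, .register, .deixis, .contrast]

-- The licensing-count scale therefore tops out at five.
example : allDEMContexts.length = 5 := rfl

/-- Hypothetical check: DEM use is licensed if at least one context is recorded. -/
def demLicensed (ctxs : List DEMLicensingContext) : Bool := !ctxs.isEmpty

example : demLicensed [] = false := rfl
example : demLicensed [.register] = true := rfl
```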
A 3rd-person pronoun form in a language's inventory.
- form : String
- pronClass : PronounClass
- strengths : List PronounStrength
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Phenomena.Anaphora.Typology.instBEqPronounForm.beq x✝¹ x✝ = false
Instances For
Per-language pronoun system datum (@cite{patel-grosz-grosz-2017} + @cite{cardinaletti-starke-1999}).
Each datum records the full 3rd-person pronoun inventory, article system, D-layer count, DEM licensing contexts, and DEM productivity.
- language : String
- isoCode : String
- forms : List PronounForm
Available 3rd-person pronoun forms
- articleType : Core.Definiteness.ArticleType
Article system type
- dLayers : Nat
Number of D-layers: 1 = D_det only (PER), 2 = D_deix + D_det (PER+DEM)
- demLicensing : List DEMLicensingContext
Pragmatic contexts licensing DEM use (empty for PER-only languages)
- demProductive : Bool
Whether DEM pronouns are productive (freely usable) as 3rd-person reference
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Phenomena.Anaphora.Typology.instBEqPronounSystemDatum.beq x✝¹ x✝ = false
Instances For
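The field lists above suggest structures along the following lines. This is a sketch under stated assumptions: `ArticleType` here is a local stand-in for `Core.Definiteness.ArticleType` (only `weakAndStrong` is named on this page, the other constructors are guesses), and `toyPerOnly` is an invented language, not one of the surveyed data points.

```lean
inductive PronounClass where
  | per | dem
  deriving DecidableEq, Repr

inductive PronounStrength where
  | strong | weak | clitic
  deriving DecidableEq, Repr

inductive DEMLicensingContext where
  | emotivity | disambiguation | register | deixis | contrast
  deriving DecidableEq, Repr

/-- Stand-in for `Core.Definiteness.ArticleType`; constructors other than
    `weakAndStrong` are assumptions. -/
inductive ArticleType where
  | noArticles | weakOnly | weakAndStrong
  deriving DecidableEq, Repr

structure PronounForm where
  form      : String
  pronClass : PronounClass
  strengths : List PronounStrength
  deriving Repr

structure PronounSystemDatum where
  language      : String
  isoCode       : String
  forms         : List PronounForm
  articleType   : ArticleType
  dLayers       : Nat
  demLicensing  : List DEMLicensingContext
  demProductive : Bool
  deriving Repr

/-- Invented PER-only language, for illustration only. -/
def toyPerOnly : PronounSystemDatum where
  language := "Toy"
  isoCode := "xxx"
  forms := [⟨"ta", .per, [.strong]⟩]
  articleType := .weakOnly
  dLayers := 1
  demLicensing := []
  demProductive := false

-- A PER-only language records no DEM licensing contexts.
example : toyPerOnly.demLicensing.isEmpty = true := rfl
```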
All 11 languages from the @cite{patel-grosz-grosz-2017} survey.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Finnish: "hän" (3sg human, PER, no gender), "he" (3pl human, PER), "se" (3sg non-human / DEM), "tämä" (proximal DEM), "tuo" (distal DEM). Finnish has no articles, and "se" is used productively for 3rd-person reference in colloquial Finnish. Finnish is not part of the @cite{patel-grosz-grosz-2017} sample; it is a counterexample to the article-DEM productivity correlation (2 D-layers and productive DEM, but no articles).
Equations
- One or more equations did not get rendered due to their size.
Instances For
Gradient pronoun system profile, analogous to GradientWOProfile.
Captures continuous variation in pronoun system complexity across languages.
- name : String
- isoCode : String
- perInventory : Nat
Number of distinct PER pronoun forms
- demInventory : Nat
Number of distinct DEM pronoun forms usable as pronouns
- demLicensingCount : Nat
Number of pragmatic contexts licensing DEM use (0–5 scale)
- strengthLevels : Nat
Pronoun strength levels available: 1=strong only, 2=strong+weak, 3=strong+weak+clitic
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Phenomena.Anaphora.Typology.instBEqPronounComplexityProfile.beq x✝¹ x✝ = false
Instances For
Compute gradient profile from a PronounSystemDatum.
Equations
- One or more equations did not get rendered due to their size.
Instances For
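One way the profile computation could work is sketched below. Since the rendered equations are not shown, everything here is an assumption: `MiniDatum` is a cut-down datum carrying only the fields the profile needs, `toProfile` counts forms per class and the distinct strength levels attested anywhere in the inventory, and `toy` is invented data.

```lean
inductive PronounClass where
  | per | dem
  deriving DecidableEq

inductive PronounStrength where
  | strong | weak | clitic
  deriving DecidableEq

structure PronounForm where
  form      : String
  pronClass : PronounClass
  strengths : List PronounStrength

/-- Cut-down datum; `demLicensingCount` stands in for `demLicensing.length`. -/
structure MiniDatum where
  language          : String
  isoCode           : String
  forms             : List PronounForm
  demLicensingCount : Nat

structure PronounComplexityProfile where
  name              : String
  isoCode           : String
  perInventory      : Nat
  demInventory      : Nat
  demLicensingCount : Nat
  strengthLevels    : Nat

/-- Hypothetical profile computation. -/
def MiniDatum.toProfile (d : MiniDatum) : PronounComplexityProfile where
  name := d.language
  isoCode := d.isoCode
  perInventory := (d.forms.filter (fun f => f.pronClass == PronounClass.per)).length
  demInventory := (d.forms.filter (fun f => f.pronClass == PronounClass.dem)).length
  demLicensingCount := d.demLicensingCount
  strengthLevels :=
    (([.strong, .weak, .clitic] : List PronounStrength).filter
      (fun s => d.forms.any (fun f => f.strengths.contains s))).length

/-- Invented inventory: two PER forms, one DEM form, no clitics. -/
def toy : MiniDatum where
  language := "Toy"
  isoCode := "xxx"
  forms := [⟨"ta", .per, [.strong, .weak]⟩, ⟨"to", .per, [.strong]⟩, ⟨"dit", .dem, [.strong]⟩]
  demLicensingCount := 2

example : toy.toProfile.perInventory = 2 := by decide
example : toy.toProfile.demInventory = 1 := by decide
example : toy.toProfile.strengthLevels = 2 := by decide
```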
All 11 gradient pronoun system profiles.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Finnish has productive DEM with no articles — a counterexample to the PG&G sample's dem_productivity_from_article_system generalization.
PG&G Core Claims #
Minimize DP! (@cite{patel-grosz-grosz-2017} §3): Languages where DEM is productive all require pragmatic licensing (demLicensing is non-empty).
DEM is the marked choice; PER is the default.
Implicational universal: If DEM exists in a language's inventory, PER also exists. No language has DEM without PER.
This follows from PG&G's structural claim: DEM = D_deix + D_det + NP, where D_det is the PER layer. DEM presupposes PER structurally.
Article-D-layer correlation (@cite{schwarz-2009} → PG&G): Languages with both weak and strong articles have 2 D-layers.
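PER-only languages (1 D-layer) have only weak or no articles.
The contrapositive of strong_article_two_layers, restricted to 1-layer languages. The true converse (2 D-layers implies weakAndStrong articles) does not hold: Hebrew and Czech have 2 D-layers without articles.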
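The core claims can be checked mechanically over a data list. The sketch below uses invented data (`toyData`) and a cut-down `MiniDatum`; all definition names except the ones mirrored from this page are hypothetical. `ToyB` is modelled on the pattern described for no-article languages with 2 D-layers, and shows why the 1-layer statement is the contrapositive of the strong-article claim and not its converse.

```lean
inductive PronounClass where
  | per | dem
  deriving DecidableEq

/-- Stand-in for `Core.Definiteness.ArticleType`; constructors other than
    `weakAndStrong` are assumptions. -/
inductive ArticleType where
  | noArticles | weakOnly | weakAndStrong
  deriving DecidableEq

/-- Cut-down datum with only the fields the claims mention. -/
structure MiniDatum where
  language       : String
  classes        : List PronounClass  -- classes attested in the inventory
  articleType    : ArticleType
  dLayers        : Nat
  licensingCount : Nat                -- stands in for `demLicensing.length`
  demProductive  : Bool

/-- Invented data, for illustration only. -/
def toyData : List MiniDatum :=
  [ ⟨"ToyA", [.per, .dem], .weakAndStrong, 2, 3, true⟩,
    ⟨"ToyB", [.per, .dem], .noArticles,    2, 1, false⟩,
    ⟨"ToyC", [.per],       .weakOnly,      1, 0, false⟩ ]

/-- Minimize DP!: productive DEM still needs at least one licensing context. -/
def productiveIsLicensed (d : MiniDatum) : Bool :=
  (!d.demProductive) || (d.licensingCount != 0)

/-- Implicational universal: DEM in the inventory implies PER in the inventory. -/
def demImpliesPer (d : MiniDatum) : Bool :=
  (!d.classes.contains PronounClass.dem) || d.classes.contains PronounClass.per

/-- Weak and strong articles imply 2 D-layers. -/
def strongArticleTwoLayers (d : MiniDatum) : Bool :=
  (!(d.articleType == ArticleType.weakAndStrong)) || (d.dLayers == 2)

/-- Contrapositive for 1-layer languages: no weak-and-strong article system. -/
def oneLayerNotWeakAndStrong (d : MiniDatum) : Bool :=
  (!(d.dLayers == 1)) || (d.articleType != ArticleType.weakAndStrong)

example : toyData.all productiveIsLicensed = true := by decide
example : toyData.all demImpliesPer = true := by decide
example : toyData.all strongArticleTwoLayers = true := by decide
example : toyData.all oneLayerNotWeakAndStrong = true := by decide

-- The true converse (2 layers implies weakAndStrong) fails on ToyB.
example : toyData.all
    (fun d => (!(d.dLayers == 2)) || (d.articleType == ArticleType.weakAndStrong))
    = false := by decide
```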
Gradient Claims #
PER inventory is continuous: ranges from 2 (Kutchi Gujarati) to 3 (most languages with m/f/n), not a binary split.
DEM inventory correlates with article system: languages with weakAndStrong articles have non-zero DEM inventory.
Strength levels vary: Romance languages (French, Italian, Spanish, Catalan) have 3 strength levels (strong+weak+clitic), while Germanic typically has 2.
Germanic languages with DEM (German, Bavarian) have 2 strength levels.
DEM licensing count ranges from 0 to 5, forming a continuum rather than a binary productive/non-productive distinction.
Open Problem #
DEM productivity tracks overt strong articles (pattern in PG&G data):
Among 2-layer languages, only those with overt weak+strong article morphology (German, Bavarian) have productive DEM. Languages with 2 D-layers but no overt articles (Hebrew, Czech), or with limited article systems, restrict DEM.
@cite{schwarz-2013} §5.5 provides the theoretical link: the strong article conventionalizes the D_deix layer, making DEM pronouns (which also project D_deix) more accessible. Without overt strong articles, D_deix is available syntactically but not conventionalized for reference tracking.
Open question: why does article-system conventionalization affect pronoun productivity? PG&G suggest familiarity/frequency; @cite{schwarz-2013} suggests the strong article's anaphoric function naturally extends to pronominal use.
Definite use types (@cite{hawkins-1978}, @cite{schwarz-2013} §2.1) #
Types and mappings are defined in Core/Definiteness.lean:
DefiniteUseType, BridgingSubtype, useTypeToPresupType, bridgingPresupType.
@cite{schwarz-2013} cross-linguistic article paradigm data #
Per-language article paradigm from @cite{schwarz-2013}.
- language : String
- isoCode : String
Morphological form of the strong article (if any)
Morphological form of the weak article (if any)
- weakStrategy : Core.Definiteness.WeakArticleStrategy
How weak definites are expressed
- strongForAnaphoric : Bool
Strong article used for anaphoric definites
- weakForUniqueness : Bool
Weak article/bare nominal used for uniqueness/situational
- bridgingSplit : Bool
Bridging shows the split (part-whole = weak, producer = strong)
Instances For
Equations
- One or more equations did not get rendered due to their size.
Instances For
Equations
- One or more equations did not get rendered due to their size.
- Phenomena.Anaphora.Typology.instBEqSchwarzArticleDatum.beq x✝¹ x✝ = false
Instances For
All 7 languages from the @cite{schwarz-2013} survey.
Equations
- One or more equations did not get rendered due to their size.
Instances For
@cite{schwarz-2013} verified generalizations #
Strong article → anaphoric use (@cite{schwarz-2013} §3.1.1): All surveyed languages use the strong article for anaphoric definites.
Weak form → uniqueness/situational use (@cite{schwarz-2013} §3.1.2): All surveyed languages use weak articles (or bare nominals) for uniqueness-based definites.
Bridging split (@cite{schwarz-2013} §3.2): Most languages split bridging across article forms (part-whole = weak, producer = strong). 5 of 7 languages show this pattern; Hausa lacks data, and Haitian Creole uses a single form for everything.
Bare-nominal strategy (@cite{schwarz-2013} §4.1): Languages with only one overt article form (Akan, Mauritian Creole) use bare nominals for weak-article definites.
Haitian Creole is exceptional (@cite{schwarz-2013} §4.3): a single determiner "la" serves both anaphoric and uniqueness uses, with no weak/strong split.
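The generalizations above amount to simple checks over a list of article data. The sketch below is an assumption-laden illustration: `WeakArticleStrategy` is a local stand-in with guessed constructor names, the field names `strongForm` and `weakForm` are not shown on this page and are assumed, and the three entries are invented languages whose patterns merely mimic the two-form, bare-nominal, and single-form types described.

```lean
/-- Stand-in for `Core.Definiteness.WeakArticleStrategy`; constructor names assumed. -/
inductive WeakArticleStrategy where
  | overtWeakArticle | bareNominal | sameAsStrong
  deriving DecidableEq, Repr

structure SchwarzArticleDatum where
  language           : String
  isoCode            : String
  strongForm         : Option String  -- field name assumed
  weakForm           : Option String  -- field name assumed
  weakStrategy       : WeakArticleStrategy
  strongForAnaphoric : Bool
  weakForUniqueness  : Bool
  bridgingSplit      : Bool
  deriving Repr

/-- Invented entries, for illustration only. -/
def toySchwarz : List SchwarzArticleDatum :=
  [ ⟨"ToyTwoForm", "xxa", some "da", some "a", .overtWeakArticle, true, true, true⟩,
    ⟨"ToyBare",    "xxb", some "no", none,     .bareNominal,      true, true, true⟩,
    ⟨"ToySingle",  "xxc", some "la", none,     .sameAsStrong,     true, true, false⟩ ]

-- Strong article is used for anaphoric definites everywhere in the toy list.
example : toySchwarz.all (fun d => d.strongForAnaphoric) = true := by decide

-- Bare-nominal strategy implies that no overt weak form is recorded.
example : toySchwarz.all
    (fun d => (d.weakStrategy != WeakArticleStrategy.bareNominal) || d.weakForm.isNone)
    = true := by decide

-- Counting how many entries show the bridging split.
example : (toySchwarz.filter (fun d => d.bridgingSplit)).length = 2 := by decide
```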
Bridge: Schwarz article types ↔ PG&G pronoun D-layers #
@cite{schwarz-2013} §5.5 explicitly connects the article contrast to pronouns: "pronouns are definite articles without overt NP". German d-pronouns (der/die/das) are identical to strong articles. The pronominal domain shows parallel contrasts (/2007, /2011).
The structural mapping:
- Schwarz weak article = PG&G D_det layer = PER pronoun
- Schwarz strong article = PG&G D_deix + D_det = DEM pronoun
Languages with two overt article forms in @cite{schwarz-2013} correspond to 2-D-layer languages in @cite{patel-grosz-grosz-2017}. Verified for German, which appears in both datasets.
The semantic mapping is compositional (@cite{schwarz-2013} §2.2):
- Weak article contributes uniqueness presupposition (ι-operator)
- Strong article contributes familiarity/anaphoricity (index variable)
This parallels PG&G's D_det (weak/uniqueness) vs D_deix (strong/deixis).
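The structural mapping just described can be sketched as two small functions. `ArticleForm`, `DHead`, `toPronounClass`, and `heads` are hypothetical names chosen for this example; only `PronounClass` and its constructors come from the page.

```lean
/-- Hypothetical two-way article contrast (weak vs strong). -/
inductive ArticleForm where
  | weak | strong
  deriving DecidableEq, Repr

inductive PronounClass where
  | per | dem
  deriving DecidableEq, Repr

/-- Hypothetical inventory of D-heads. -/
inductive DHead where
  | dDeix | dDet
  deriving DecidableEq, Repr

/-- Weak article corresponds to PER, strong article to DEM. -/
def ArticleForm.toPronounClass : ArticleForm → PronounClass
  | .weak   => .per
  | .strong => .dem

/-- D-heads projected by each pronoun class, outermost first. -/
def PronounClass.heads : PronounClass → List DHead
  | .per => [.dDet]
  | .dem => [.dDeix, .dDet]

-- Every class projects D_det; only the strong/DEM side adds D_deix.
example : ∀ a : ArticleForm, a.toPronounClass.heads.contains DHead.dDet = true := by
  intro a
  cases a <;> rfl

example : ArticleForm.strong.toPronounClass.heads.length = 2 := rfl
```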
Bridge 1: PronounClass ↔ AnaphorType (Coreference.lean) #
PER pronouns correspond to AnaphorType.pronoun in Coreference.lean.
DEM pronouns have no direct AnaphorType counterpart — they are structurally
richer than simple pronouns but not descriptions either.
Equations
Instances For
All PER forms map to the pronoun binding pattern (Principle B domain).
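A partial map captures this bridge: PER has a counterpart, DEM does not. In the sketch below `AnaphorType` is a local stand-in for the type in Coreference.lean; only its `pronoun` constructor is named on this page, so the other constructors and the function name `toAnaphorType?` are assumptions.

```lean
inductive PronounClass where
  | per | dem
  deriving DecidableEq, Repr

/-- Stand-in for `AnaphorType`; constructors other than `pronoun` are assumptions. -/
inductive AnaphorType where
  | reflexive | pronoun | rExpression
  deriving DecidableEq, Repr

/-- Hypothetical bridge: PER maps to `pronoun`, DEM has no direct counterpart. -/
def PronounClass.toAnaphorType? : PronounClass → Option AnaphorType
  | .per => some .pronoun
  | .dem => none

example : PronounClass.per.toAnaphorType? = some AnaphorType.pronoun := rfl
example : PronounClass.dem.toAnaphorType? = none := rfl
```

Returning `Option` keeps the gap for DEM explicit instead of forcing it into an existing binding class.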
Bridge 2: DEM pronouns ↔ Kaplan-style true demonstratives #
DEM pronouns require D_deix — the same structural layer that hosts
Kaplan's demonstration. True demonstratives in Demonstratives.lean
have a Demonstration component; DEM pronouns require D_deix licensing.
The connection: D_deix is the syntactic home of the demonstration. PER pronouns lack D_deix, so they cannot be true demonstratives.
DEM pronouns require D_deix (dLayers = 2), which is the structural position for Kaplan's demonstration. PER-only languages (dLayers = 1) cannot have true demonstrative pronouns.
Bridge 3: PER pronouns ↔ Direct Reference #
PER pronouns are directly referential in Kaplan's sense: they contribute their referent to the proposition, with no descriptive content (no D_deix, no demonstration, no descriptive component).
This connects to DirectReference.lean's modal argument: PER
pronouns, like names, are rigid designators. DEM pronouns may
involve a descriptive/deictic component (D_deix), making them
potentially non-rigid under some analyses.
PER-only languages have no descriptive D-layer: all forms are directly referential (rigid designators).
Bridge 4: Article system ↔ D-layer count #
@cite{schwarz-2009} establishes that the weak/strong article distinction is structurally real (D_det vs D_deix + D_det). PG&G build on this: languages with both article types have the structural space for DEM.
No-article languages with DEM (Hebrew, Czech) show that D-layers can exist without overt article morphology. The D_deix layer is present in the syntax even without morphological exponence.