Linglib.Phenomena.PhonologicalAlternation.Studies.Magri2025

@cite{magri-2025}: Constraint Interaction in Probabilistic Phonology

Replication of @cite{magri-2025} "Constraint Interaction in Probabilistic Phonology: Deducing Maximum Entropy Grammars from Hayes and Zuraw's Shifted Sigmoids Generalization" (Linguistic Inquiry, Early Access).

Main result

Within harmony-based probabilistic phonology, an n-ary harmony function predicts the shifted-sigmoids generalization of Hayes and Zuraw (HZ; @cite{zuraw-hayes-2017}; @cite{hayes-2022}) if and only if the harmony is separable, that is, it decomposes as ∏ₖ hₖ(Cₖ)^{wₖ}. Since MaxEnt (ME) harmony is separable (each hₖ = exp(−·)), ME predicts HZ as a corollary. Conversely, any separable harmony can be construed as ME through the constraint rescaling Ĉₖ = −log hₖ(Cₖ), so the characterization is complete.
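The claim that each ME factor has the separable form hₖ(Cₖ)^{wₖ} with hₖ = exp(−·) can be checked for a single constraint. The following is a minimal sketch, assuming Mathlib is available; the variables `w` and `c` stand for one weight and one violation count and are illustrative names, not identifiers from this module.

```lean
import Mathlib

-- One separable factor: exp(-c) raised to the weight w equals the
-- usual MaxEnt term exp(-(w * c)). Uses Mathlib's `Real.exp_mul`.
example (w c : ℝ) : Real.exp (-c) ^ w = Real.exp (-(w * c)) := by
  rw [← Real.exp_mul]
  congr 1
  ring
```

Taking the product over all k would then give exp(−∑ₖ wₖ Cₖ), the familiar ME score; that step is not shown here.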

Formalization

This study file instantiates @cite{magri-2025}'s theory with the Tagalog nasal substitution case study from the paper's §2–3, verifying:

  1. The six constraints satisfy ConstraintIndependence (§2.3, Figure 3)
  2. The violation differences inherit independence (ViolDiffIndependence)
  3. ME predicts HZ's constant logit-rate difference identity (§3.6, eq. 22)
  4. The identity holds for any weight assignment (not just specific values)

The constraint data comes from Fragments.Tagalog.Phonology.

Constraint independence (§2.3): for each fixed output, the six constraints satisfy ConstraintIndependence on the nasal substitution square.

C₁–C₄ (markedness) are insensitive to row (prefix); C₅–C₆ (faithfulness) are insensitive to column (stem obstruent).

ME predicts HZ for Tagalog nasal substitution (§3.6): for any weight assignment w : Fin 6 → ℝ, the MaxEnt logit rates of nasal substitution satisfy the constant-difference identity:

LR(/maŋb/) − LR(/maŋk/) = LR(/paŋb/) − LR(/paŋk/)

This is a direct instantiation of me_predicts_hz with the Tagalog violation differences and their verified independence.

LR(/maŋk/) = w₁ + w₂ − w₃ − w₄ − w₅

LR(/paŋk/) = w₁ + w₂ − w₃ − w₄ − w₆

The constant logit-rate difference equals −w₂ + w₃ + w₄ for both rows, regardless of weights. This follows from the insensitivity structure of the six constraints (§2.3).
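Why insensitivity yields a weight-independent identity can be illustrated with a small abstraction. The sketch below assumes Mathlib and uses hypothetical names: `f` collects the terms that depend only on the row (prefix), `g` the terms that depend only on the column (stem obstruent), and `Bool` indexes the two rows and two columns. It is not the statement proved in this module.

```lean
import Mathlib

-- If the logit rate splits into a row part plus a column part,
-- the column difference is the same in both rows.
example (f g : Bool → ℝ) (LR : Bool → Bool → ℝ)
    (h : ∀ r c, LR r c = f r + g c) :
    LR true true - LR true false = LR false true - LR false false := by
  simp only [h]
  ring

-- The two /k/-column formulas stated above differ only in the
-- faithfulness weights w₅ and w₆; the markedness weights cancel.
example (w₁ w₂ w₃ w₄ w₅ w₆ : ℝ) :
    (w₁ + w₂ - w₃ - w₄ - w₅) - (w₁ + w₂ - w₃ - w₄ - w₆) = w₆ - w₅ := by
  ring
```

The second example uses only the two formulas for LR(/maŋk/) and LR(/paŋk/) given above; the row difference w₆ − w₅ contains no markedness weight, which is the same identity read along rows instead of columns.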

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.odds_ratios_close :
6412 * 514494 = 3298935528 ∧ 39494 * 83412 = 3294273528

The two odds ratios are close: 6412/83412 ≈ 0.0769 and 39494/514494 ≈ 0.0768. Their cross products, 3298935528 and 3294273528, differ by about 0.14%, which is consistent with HZ's empirical observation. Exact equality of these ratios would mean logit(R(tl)) − logit(R(tr)) = logit(R(bl)) − logit(R(br)).
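The arithmetic behind this comparison can be re-checked independently. The sketch below assumes Mathlib (for `norm_num`) and restates the cross products as anonymous examples; it is not the proof used in this module.

```lean
import Mathlib

-- Cross products of the two odds ratios, as natural numbers.
example : 6412 * 514494 = 3298935528 ∧ 39494 * 83412 = 3294273528 := by
  norm_num

-- The cross products differ by 4662000, so the ratios are close
-- but not exactly equal.
example : 39494 * 83412 + 4662000 = 6412 * 514494 := by
  norm_num
```

Two ratios a/b and c/d with positive denominators are equal exactly when a * d = c * b, which is why comparing cross products avoids any floating-point rounding.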

ME predicts HZ at the probability level: the log-probability-ratio log(P(YES|x)/P(NO|x)) under ME satisfies HZ's constant-difference identity for Tagalog nasal substitution, for any weight assignment.
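The reason the log-probability ratio is linear in the weights under ME can be sketched as follows, assuming Mathlib. Here `hYes` and `hNo` are hypothetical names for the weighted violation sums of the two candidates, and the normalizing constant is omitted because it cancels in the ratio.

```lean
import Mathlib

-- With P(YES|x) ∝ exp(-hYes) and P(NO|x) ∝ exp(-hNo), the log of the
-- probability ratio is the difference of the weighted violation sums.
example (hYes hNo : ℝ) :
    Real.log (Real.exp (-hYes) / Real.exp (-hNo)) = hNo - hYes := by
  rw [← Real.exp_sub, Real.log_exp]
  ring
```

Because `hNo - hYes` is a weighted sum of violation differences, the constant-difference identity at the probability level reduces to a statement about those differences.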

This instantiates separable_predicts_hz with meSeparable and the Tagalog constraints. Since ME rescaling is the identity (meSeparable_rescale), the rescaled violation differences reduce to the raw violation differences, and violDiff_independence provides the independence hypothesis.
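That the rescaling Ĉₖ = −log hₖ(Cₖ) is the identity when hₖ = exp(−·) is a one-line computation. The sketch below assumes Mathlib and uses an illustrative variable `c` for a single violation count; it is not the statement of meSeparable_rescale, whose exact form is not shown here.

```lean
import Mathlib

-- Rescaling a violation count through h = exp(-·) returns it unchanged.
example (c : ℝ) : -Real.log (Real.exp (-c)) = c := by
  rw [Real.log_exp, neg_neg]
```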