Linglib.Phenomena.PhonologicalAlternation.Studies.Magri2025

@cite{magri-2025}: Constraint Interaction in Probabilistic Phonology

Replication of @cite{magri-2025} "Constraint Interaction in Probabilistic Phonology: Deducing Maximum Entropy Grammars from Hayes and Zuraw's Shifted Sigmoids Generalization" (Linguistic Inquiry, Early Access).

Main result

Within harmony-based probabilistic phonology, an n-ary harmony function predicts the shifted-sigmoids generalization of Hayes and Zuraw (@cite{zuraw-hayes-2017}; @cite{hayes-2022}) if and only if the harmony is separable — it decomposes as ∏ₖ hₖ(Cₖ)^{wₖ}. Since MaxEnt harmony is separable (each hₖ = exp(−·)), ME predicts HZ as a corollary. And since any separable harmony can be construed as ME through constraint rescaling Ĉₖ = −log hₖ(Cₖ), the characterization is complete.
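The claim that MaxEnt harmony is separable can be illustrated for a single constraint. The following is a minimal sketch, not part of the study file: `w` and `c` are hypothetical stand-ins for one weight and one violation count, and the proof assumes Mathlib's `Real.exp_mul` (it has not been checked against a specific Mathlib version).

```lean
import Mathlib

-- One factor of the MaxEnt harmony exp(−∑ₖ wₖ·Cₖ), written in the
-- separable form hₖ(Cₖ)^{wₖ} with hₖ = exp(−·).
-- `w` and `c` are illustrative names, not identifiers from Linglib.
example (w c : ℝ) : Real.exp (-(w * c)) = Real.exp (-c) ^ w := by
  -- Real.exp_mul : Real.exp (x * y) = Real.exp x ^ y
  rw [← Real.exp_mul]
  congr 1
  ring
```

Read in the other direction, the same identity is the rescaling step: taking Ĉ = −log h(C) turns a factor h(C)^w into exp(−w·Ĉ).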

Formalization

This study file instantiates @cite{magri-2025}'s theory with the Tagalog nasal substitution case study from the paper's §2–3, verifying:

The six constraints satisfy ConstraintIndependence (§2.3, Figure 3)
The violation differences inherit independence (ViolDiffIndependence)
ME predicts HZ's constant logit-rate difference identity (§3.6, eq. 22)
The identity holds for any weight assignment (not just specific values)

The constraint data comes from Fragments.Tagalog.Phonology.

C₁ = *NC is insensitive to the prefix (row dimension): the violation is 1 for NO and 0 for YES regardless of prefix.

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.constraint_independence (o : Fragments.Tagalog.Phonology.NasalSubOutput) :

Theories.Phonology.HarmonicGrammar.ConstraintIndependence (fun (k : Fin 6) (x : Fragments.Tagalog.Phonology.NasalSubInput) => Fragments.Tagalog.Phonology.constraints k (x, o)) Fragments.Tagalog.Phonology.nasalSubSquare

Constraint independence (§2.3): for each fixed output, the six constraints satisfy ConstraintIndependence on the nasal substitution square.

C₁–C₄ (markedness) are insensitive to row (prefix); C₅–C₆ (faithfulness) are insensitive to column (stem obstruent).

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.violDiff_consistent (k : Fin 6) (x : Fragments.Tagalog.Phonology.NasalSubInput) :

Fragments.Tagalog.Phonology.violDiffProfile k x = ↑(Fragments.Tagalog.Phonology.constraints k (x, Fragments.Tagalog.Phonology.NasalSubOutput.no)) - ↑(Fragments.Tagalog.Phonology.constraints k (x, Fragments.Tagalog.Phonology.NasalSubOutput.yes))

The violation differences are consistent with the raw constraint profiles: Δₖ(x) = Cₖ(x, NO) − Cₖ(x, YES).

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.me_predicts_hz_tagalog (w : Fin 6 → ℝ) :

Theories.Phonology.HarmonicGrammar.ConstantLogitDiff (fun (x : Fragments.Tagalog.Phonology.NasalSubInput) => ∑ k : Fin 6, w k * Fragments.Tagalog.Phonology.deltaR k x) Fragments.Tagalog.Phonology.nasalSubSquare

ME predicts HZ for Tagalog nasal substitution (§3.6): for any weight assignment w : Fin 6 → ℝ, the MaxEnt logit rates of nasal substitution satisfy the constant-difference identity.

LR(/maŋb/) − LR(/maŋk/) = LR(/paŋb/) − LR(/paŋk/)

This is a direct instantiation of me_predicts_hz with the Tagalog violation differences and their verified independence.
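The reason independence yields the identity can be sketched on a generic 2×2 square. The example below is an illustration under an assumption, not the library's proof: it supposes each logit rate splits into a row-only part and a column-only part, and the names `aMang`, `aPang`, `bB`, `bK` are hypothetical.

```lean
import Mathlib

-- Assumption: LR(row, col) = a(row) + b(col), where a collects the
-- weighted violation differences that depend only on the prefix and
-- b those that depend only on the stem obstruent.
example (aMang aPang bB bK : ℝ) :
    (aMang + bB) - (aMang + bK) = (aPang + bB) - (aPang + bK) := by
  -- The row-only part cancels inside each row difference.
  ring
```

Both sides reduce to `bB - bK`, so the row-only contribution never reaches the difference.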

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.logitRate_mang_b (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_b) = w 0 - w 4

LR(/maŋb/) = w₁ − w₅

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.logitRate_mang_k (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_k) = w 0 + w 1 - w 2 - w 3 - w 4

LR(/maŋk/) = w₁ + w₂ − w₃ − w₄ − w₅

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.logitRate_pang_b (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_b) = w 0 - w 5

LR(/paŋb/) = w₁ − w₆

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.logitRate_pang_k (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_k) = w 0 + w 1 - w 2 - w 3 - w 5

LR(/paŋk/) = w₁ + w₂ − w₃ − w₄ − w₆

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.hz_constant_value (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_b) - ∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_k) = -w 1 + w 2 + w 3

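For every weight assignment, the logit-rate difference in the top row (/maŋb/ versus /maŋk/) equals −w₂ + w₃ + w₄, written w 1, w 2, w 3 in the 0-indexed Lean statement; hz_constant_value' shows the same value for the bottom row. This follows from the insensitivity structure of the six constraints (§2.3).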
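The value can be rechecked by hand from the four logit-rate formulas LR(/maŋb/), LR(/maŋk/), LR(/paŋb/), and LR(/paŋk/) stated earlier. The sketch below uses six free rational variables as hypothetical stand-ins for the weights, 1-indexed as in the prose, rather than the library's `w : Fin 6 → ℚ`.

```lean
import Mathlib

-- Top row:    LR(/maŋb/) − LR(/maŋk/)
-- Bottom row: LR(/paŋb/) − LR(/paŋk/)
-- w₁ … w₆ are illustrative variables, not Linglib identifiers.
example (w₁ w₂ w₃ w₄ w₅ w₆ : ℚ) :
    (w₁ - w₅) - (w₁ + w₂ - w₃ - w₄ - w₅) = -w₂ + w₃ + w₄ ∧
    (w₁ - w₆) - (w₁ + w₂ - w₃ - w₄ - w₆) = -w₂ + w₃ + w₄ := by
  constructor <;> ring
```

The faithfulness weights w₅ and w₆ cancel within each row, and w₁ cancels because it appears in every cell.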

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.hz_constant_value' (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_b) - ∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_k) = -w 1 + w 2 + w 3

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.hz_identity_concrete (w : Fin 6 → ℚ) :

∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_b) - ∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.mang_k) = ∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_b) - ∑ k : Fin 6, w k * ↑(Fragments.Tagalog.Phonology.violDiffProfile k Fragments.Tagalog.Phonology.NasalSubInput.pang_k)

The HZ identity verified concretely: both row-differences are equal.

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.rate_pos (x : Fragments.Tagalog.Phonology.NasalSubInput) :

0 < Fragments.Tagalog.Phonology.nasalSubRate x

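Each observed nasal substitution rate lies strictly between 0 and 1 (this theorem gives the lower bound; rate_lt_one gives the upper bound), so its logit is well defined.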

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.rate_lt_one (x : Fragments.Tagalog.Phonology.NasalSubInput) :

Fragments.Tagalog.Phonology.nasalSubRate x < 1

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.top_row_odds_ratio :

Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.mang_b * (1 - Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.mang_k) / (Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.mang_k * (1 - Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.mang_b)) = 6412 / 83412

Odds ratio for the top row: (916/1000)·(7/1000) / ((993/1000)·(84/1000)) = 916·7 / (993·84) = 6412 / 83412.

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.bottom_row_odds_ratio :

Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.pang_b * (1 - Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.pang_k) / (Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.pang_k * (1 - Fragments.Tagalog.Phonology.nasalSubRate Fragments.Tagalog.Phonology.NasalSubInput.pang_b)) = 39494 / 514494

Odds ratio for the bottom row: (434/1000)·(91/1000) / ((909/1000)·(566/1000)) = 434·91 / (909·566) = 39494 / 514494.
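The arithmetic behind both ratios can be replayed on rational literals. This is a sketch that copies the numbers from the two docstrings instead of unfolding `nasalSubRate`, so it checks the arithmetic only, not the library definitions.

```lean
import Mathlib

-- Top row: the factors of 1000 cancel, leaving 916·7 / (993·84).
example : ((916 : ℚ) / 1000) * (7 / 1000) / ((993 / 1000) * (84 / 1000))
    = 6412 / 83412 := by
  norm_num

-- Bottom row: likewise 434·91 / (909·566).
example : ((434 : ℚ) / 1000) * (91 / 1000) / ((909 / 1000) * (566 / 1000))
    = 39494 / 514494 := by
  norm_num
```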

theorem Phenomena.PhonologicalAlternation.Studies.Magri2025.odds_ratios_close :

6412 * 514494 = 3298935528 ∧ 39494 * 83412 = 3294273528

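The two odds ratios are close: 6412/83412 ≈ 0.0769 and 39494/514494 ≈ 0.0768. Their cross-products, 3298935528 and 3294273528, differ by about 0.14%, which is consistent with HZ's empirical observation. Exact equality of the ratios would mean logit(R(tl)) − logit(R(tr)) = logit(R(bl)) − logit(R(br)), where tl, tr, bl, and br are the top-left, top-right, bottom-left, and bottom-right cells of the square.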
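How close the two ratios are can be made explicit with a bound. The sketch below is an added illustration rather than a theorem from the study file; the bound 1/5000 is chosen here for convenience.

```lean
import Mathlib

-- The top-row odds ratio exceeds the bottom-row one by less than 1/5000.
-- The gap is 4662000 / (83412 · 514494), where 4662000 is the difference
-- of the two cross-products in odds_ratios_close.
example : (0 : ℚ) < 6412 / 83412 - 39494 / 514494 ∧
    (6412 : ℚ) / 83412 - 39494 / 514494 < 1 / 5000 := by
  constructor <;> norm_num
```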

ME predicts HZ at the probability level: the log-probability-ratio log(P(YES|x)/P(NO|x)) under ME satisfies HZ's constant-difference identity for Tagalog nasal substitution, for any weight assignment.

This instantiates separable_predicts_hz with meSeparable and the Tagalog constraints. Since ME rescaling is the identity (meSeparable_rescale), the rescaled violation differences reduce to the raw violation differences, and violDiff_independence provides the independence hypothesis.