@cite{magri-2025}: Constraint Interaction in Probabilistic Phonology
Replication of @cite{magri-2025} "Constraint Interaction in Probabilistic Phonology: Deducing Maximum Entropy Grammars from Hayes and Zuraw's Shifted Sigmoids Generalization" (Linguistic Inquiry, Early Access).
Main result
Within harmony-based probabilistic phonology, an n-ary harmony function
predicts the shifted-sigmoids generalization of Hayes and Zuraw
(HZ; @cite{zuraw-hayes-2017}; @cite{hayes-2022}) if and only
if the harmony is separable, that is, it decomposes as ∏ₖ hₖ(Cₖ)^{wₖ}.
Since the MaxEnt (ME) harmony is separable (each hₖ = exp(−·)), ME predicts HZ
as a corollary. And since any separable harmony can be construed as ME
through the constraint rescaling Ĉₖ = −log hₖ(Cₖ), the characterization
is complete.
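The two directions of this argument can be checked numerically. The following is a minimal Python sketch, not part of the formalization: the function names (`separable_harmony`, `maxent_harmony`, `rescale`) and the example factor `alt_h` are hypothetical and chosen only for illustration.

```python
import math

def separable_harmony(h_funcs, weights, violations):
    """Separable harmony: the product over k of h_k(C_k) ** w_k."""
    return math.prod(h(c) ** w for h, w, c in zip(h_funcs, weights, violations))

def maxent_harmony(weights, violations):
    """MaxEnt harmony: exp of minus the weighted sum of violations."""
    return math.exp(-sum(w * c for w, c in zip(weights, violations)))

def rescale(h, c):
    """Constraint rescaling from the text: C_hat = -log h(C)."""
    return -math.log(h(c))

# Illustrative weights and violation counts (assumed values, not from the paper).
weights = [1.5, 0.25, 2.0]
violations = [2, 0, 1]

# Direction 1: with h_k = exp(-.), the separable form is the MaxEnt harmony.
me_h = lambda c: math.exp(-c)
assert math.isclose(
    separable_harmony([me_h] * 3, weights, violations),
    maxent_harmony(weights, violations),
)

# Direction 2: a different separable factor (hypothetical: h(c) = 1 / (1 + c))
# becomes MaxEnt once the violations are rescaled.
alt_h = lambda c: 1.0 / (1.0 + c)
rescaled = [rescale(alt_h, c) for c in violations]
assert math.isclose(
    separable_harmony([alt_h] * 3, weights, violations),
    maxent_harmony(weights, rescaled),
)
```

The second assertion holds because hₖ(Cₖ)^{wₖ} = exp(−wₖ · (−log hₖ(Cₖ))), which is the rescaling identity stated above.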
Formalization
This study file instantiates @cite{magri-2025}'s theory with the Tagalog nasal substitution case study from the paper's §2–3, verifying:
- The six constraints satisfy ConstraintIndependence (§2.3, Figure 3)
- The violation differences inherit independence (ViolDiffIndependence)
- ME predicts HZ's constant logit-rate difference identity (§3.6, eq. 22)
- The identity holds for any weight assignment (not just specific values)
The constraint data comes from Fragments.Tagalog.Phonology.
C₁ = *NC is insensitive to the prefix (row dimension): the violation is 1 for NO and 0 for YES regardless of prefix.
C₂ = *NC̥ is insensitive to the prefix.
C₃ = *[stem] is insensitive to the prefix.
C₄ = *[stem]/n is insensitive to the prefix.
C₅ = UNIF(maŋ) is insensitive to the stem-initial obstruent (column).
C₆ = UNIF(paŋ) is insensitive to the stem-initial obstruent.
Constraint independence (§2.3): for each fixed output, the six
constraints satisfy ConstraintIndependence on the nasal substitution
square.
C₁–C₄ (markedness) are insensitive to row (prefix); C₅–C₆ (faithfulness) are insensitive to column (stem obstruent).
The violation differences are consistent with the raw constraint
profiles: Δₖ(x) = Cₖ(x, NO) − Cₖ(x, YES).
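As an illustration, the independence pattern can be checked directly on the violation differences. In the Python sketch below, the table `DELTA` is an assumption: its entries are read off the coefficients of the logit-rate expressions for the four cells given in this file, not taken from Fragments.Tagalog.Phonology. The helper names are hypothetical.

```python
PREFIXES = ("maŋ", "paŋ")   # rows of the nasal substitution square
OBSTRUENTS = ("b", "k")     # columns of the square

# Violation differences (Δ₁, ..., Δ₆) per cell, inferred from the
# logit-rate expressions (assumption, see lead-in).
DELTA = {
    ("maŋ", "b"): (1, 0, 0, 0, -1, 0),
    ("maŋ", "k"): (1, 1, -1, -1, -1, 0),
    ("paŋ", "b"): (1, 0, 0, 0, 0, -1),
    ("paŋ", "k"): (1, 1, -1, -1, 0, -1),
}

def row_insensitive(k):
    """True if Δₖ does not change when the prefix (row) changes."""
    return all(DELTA[("maŋ", o)][k] == DELTA[("paŋ", o)][k] for o in OBSTRUENTS)

def column_insensitive(k):
    """True if Δₖ does not change when the obstruent (column) changes."""
    return all(DELTA[(p, "b")][k] == DELTA[(p, "k")][k] for p in PREFIXES)

# Δ₁–Δ₄ (indices 0..3) ignore the row; Δ₅–Δ₆ (indices 4..5) ignore the column.
assert all(row_insensitive(k) for k in range(4))
assert all(column_insensitive(k) for k in (4, 5))

# Independence in this sketch: every difference ignores at least one dimension.
assert all(row_insensitive(k) or column_insensitive(k) for k in range(6))
```

Note that Δ₁ is insensitive to both dimensions in this table, while Δ₅ and Δ₆ do depend on the row.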
ME predicts HZ for Tagalog nasal substitution (§3.6):
for any weight assignment w : Fin 6 → ℝ, the MaxEnt logit rates
of nasal substitution satisfy the constant-difference identity:
LR(/maŋb/) − LR(/maŋk/) = LR(/paŋb/) − LR(/paŋk/)
This is a direct instantiation of me_predicts_hz with the
Tagalog violation differences and their verified independence.
LR(/maŋb/) = w₁ − w₅
LR(/maŋk/) = w₁ + w₂ − w₃ − w₄ − w₅
LR(/paŋb/) = w₁ − w₆
LR(/paŋk/) = w₁ + w₂ − w₃ − w₄ − w₆
The constant logit-rate difference equals −w₂ + w₃ + w₄
for both rows, regardless of weights. This follows from the
insensitivity structure of the six constraints (§2.3).
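A quick numerical check of this identity, as a Python sketch rather than a proof: it samples arbitrary real weights (negative ones included) and compares both row differences against −w₂ + w₃ + w₄. The table `DELTA` encodes the coefficients of the four logit-rate expressions above; the function names are hypothetical.

```python
import math
import random

# Coefficients of w₁..w₆ in the four logit-rate expressions.
DELTA = {
    ("maŋ", "b"): (1, 0, 0, 0, -1, 0),
    ("maŋ", "k"): (1, 1, -1, -1, -1, 0),
    ("paŋ", "b"): (1, 0, 0, 0, 0, -1),
    ("paŋ", "k"): (1, 1, -1, -1, 0, -1),
}

def logit_rate(weights, cell):
    """LR(cell) as the weighted sum of violation differences."""
    return sum(w * d for w, d in zip(weights, DELTA[cell]))

def row_difference(weights, prefix):
    """LR(/prefix+b/) − LR(/prefix+k/) for one row of the square."""
    return logit_rate(weights, (prefix, "b")) - logit_rate(weights, (prefix, "k"))

rng = random.Random(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    w = [rng.uniform(-5.0, 5.0) for _ in range(6)]
    expected = -w[1] + w[2] + w[3]  # −w₂ + w₃ + w₄ (0-based indexing)
    assert math.isclose(row_difference(w, "maŋ"), expected, abs_tol=1e-9)
    assert math.isclose(row_difference(w, "paŋ"), expected, abs_tol=1e-9)
```

The row-specific weights w₅ and w₆ cancel inside each row difference, and w₁ cancels because it appears in every cell, which is why only w₂, w₃, and w₄ survive.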
The HZ identity verified concretely: both row-differences are equal.
The four rates lie strictly in (0, 1), so their logits are well defined.
Odds ratio for the top row: (916/1000)·(7/1000) / ((993/1000)·(84/1000)) = 916·7 / (993·84) = 6412 / 83412.
Odds ratio for the bottom row: (434/1000)·(91/1000) / ((909/1000)·(566/1000)) = 434·91 / (909·566) = 39494 / 514494.
The two odds ratios are close: 6412/83412 ≈ 0.0769 and
39494/514494 ≈ 0.0768 — a remarkable match confirming HZ's
empirical observation. Equality of these ratios would mean
logit(R(tl)) − logit(R(tr)) = logit(R(bl)) − logit(R(br))
exactly.
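The arithmetic can be reproduced with exact rationals. The sketch below is in Python and makes one assumption that the text does not settle: it reads the four cell rates as R(tl) = 916/1000, R(tr) = 993/1000, R(bl) = 434/1000, and R(br) = 909/1000 (the complementary reading gives the same odds ratios). The names `RATES`, `odds`, and `odds_ratio` are hypothetical.

```python
import math
from fractions import Fraction

# Cell rates (assumed assignment of rates to cells, see lead-in).
RATES = {
    "tl": Fraction(916, 1000),
    "tr": Fraction(993, 1000),
    "bl": Fraction(434, 1000),
    "br": Fraction(909, 1000),
}

def odds(rate):
    """Odds r / (1 − r) of a rate strictly between 0 and 1."""
    return rate / (1 - rate)

def odds_ratio(left, right):
    """Odds of the left cell divided by odds of the right cell."""
    return odds(left) / odds(right)

top = odds_ratio(RATES["tl"], RATES["tr"])
bottom = odds_ratio(RATES["bl"], RATES["br"])

# Exact agreement with the fractions stated in the text.
assert top == Fraction(6412, 83412)
assert bottom == Fraction(39494, 514494)

# The ratios are close but not equal, so the logit-rate differences
# agree only approximately on these rates.
gap = math.log(top) - math.log(bottom)
assert top != bottom
assert abs(gap) < 0.01
```

Taking logs turns each odds ratio into a difference of logits, so `gap` measures how far these rates are from satisfying the identity exactly.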
ME predicts HZ at the probability level: the log-probability-ratio
log(P(YES|x)/P(NO|x)) under ME satisfies HZ's constant-difference
identity for Tagalog nasal substitution, for any weight assignment.
This instantiates separable_predicts_hz with meSeparable and the
Tagalog constraints. Since ME rescaling is the identity
(meSeparable_rescale), the rescaled violation differences reduce to
the raw violation differences, and violDiff_independence provides
the independence hypothesis.
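The probability-level statement can also be illustrated by computing actual ME probabilities over the two candidates. The Python sketch below rests on labeled assumptions: the raw violation profiles are hypothetical, reconstructed from the violation differences by charging positive differences to NO and negative ones to YES (this matches the description of C₁, but the other profiles are not taken from Fragments.Tagalog.Phonology), and the weights are arbitrary.

```python
import math

# Violation differences Δₖ(x) = Cₖ(x, NO) − Cₖ(x, YES), inferred from the
# logit-rate expressions in this file (assumption).
DELTA = {
    ("maŋ", "b"): (1, 0, 0, 0, -1, 0),
    ("maŋ", "k"): (1, 1, -1, -1, -1, 0),
    ("paŋ", "b"): (1, 0, 0, 0, 0, -1),
    ("paŋ", "k"): (1, 1, -1, -1, 0, -1),
}

def raw_profiles(delta):
    """Hypothetical raw profiles (yes, no) whose difference no − yes is delta."""
    yes = tuple(max(-d, 0) for d in delta)
    no = tuple(max(d, 0) for d in delta)
    return yes, no

def maxent_probs(weights, yes, no):
    """ME probabilities of YES and NO: normalized exp(−Σ wₖ·Cₖ)."""
    h_yes = math.exp(-sum(w * c for w, c in zip(weights, yes)))
    h_no = math.exp(-sum(w * c for w, c in zip(weights, no)))
    z = h_yes + h_no
    return h_yes / z, h_no / z

def log_prob_ratio(weights, cell):
    """log(P(YES|x) / P(NO|x)) for one cell of the square."""
    p_yes, p_no = maxent_probs(weights, *raw_profiles(DELTA[cell]))
    return math.log(p_yes / p_no)

weights = [0.7, 1.3, 0.4, 0.5, 2.1, 0.2]  # arbitrary illustrative weights
top = log_prob_ratio(weights, ("maŋ", "b")) - log_prob_ratio(weights, ("maŋ", "k"))
bottom = log_prob_ratio(weights, ("paŋ", "b")) - log_prob_ratio(weights, ("paŋ", "k"))
assert math.isclose(top, bottom, abs_tol=1e-9)
```

The normalizing constant cancels in each ratio, so the log-probability ratio depends on the profiles only through their differences; any other raw profiles with the same differences would give the same result.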