Documentation

Linglib.Theories.Phonology.HarmonicGrammar.Separability

Separable Harmonies and HZ's Generalization @cite{magri-2025} #

@cite{magri-2025} "Constraint Interaction in Probabilistic Phonology: Deducing Maximum Entropy Grammars from Hayes and Zuraw's Shifted Sigmoids Generalization" (Linguistic Inquiry, Early Access).

Overview #

Hayes and Zuraw (@cite{zuraw-hayes-2017}; @cite{hayes-2022}) observe that the rates of application of variable phonological processes governed by independent factors can be fit onto shifted sigmoids at shared abscissas. @cite{magri-2025} reformulates this as a constant logit-rate difference condition and proves a biconditional: within harmony-based probabilistic phonology, a harmony function predicts HZ's generalization if and only if it is separable — it decomposes into a product of powers of unary functions, each fed by a single constraint.

Key definitions #

ConstraintIndependence: the formal constraint condition (§2.4) — each constraint is insensitive to at least one of the two dimensions of a 2×2 "square" of underlying forms
ConstantLogitDiff: HZ's generalization restated as constant differences of logit rates along each dimension (eq. 13)
SeparableHarmony: a harmony function that decomposes as H(v) = ∏ₖ (hₖ(vₖ))^{wₖ} (eq. 30)
meSeparable: ME harmony is separable (eq. 29), a corollary of exp(sum) = prod(exp)
separable_predicts_hz: the forward direction — separable harmonies predict HZ's generalization. Follows from logit_uniformity + constraint rescaling
separable_eq_me_rescaled: any separable harmony is ME under constraint rescaling Ĉₖ = −log hₖ(Cₖ) (eq. 34)

Connection to existing infrastructure #

The forward direction leverages logit_uniformity (in NoisyHG.lean) and maxent_logit_harmony, which already prove that MaxEnt log-odds equal harmony score differences. @cite{magri-2025}'s contribution is showing this is the only mode of constraint interaction with this property (the backward direction).

Bridge to List-based API (§11) #

The bridge theorems connect the Fin-indexed separability theory to the List-based MaxEnt API (harmonyScore/harmonyScoreR from Basic.lean):

harmonyScoreR_as_finsum: converts List-based harmony to Fin-indexed sum
exp_harmonyScoreR_eq_me_separable: exp(harmonyScoreR) = meSeparable.eval
maxent_logit_as_finsum: MaxEnt logit = weighted violation difference sum

These enable applying separability results (independence → HZ, rescaling) to any WeightedConstraint list without re-proving in Fin-indexed form.

structure Theories.Phonology.HarmonicGrammar.Square (X : Type) :

A square of four underlying forms indexed by two binary factors (row = top/bottom, column = left/right). This is's eq. (12): the four forms x^{TL}, x^{TR}, x^{BL}, x^{BR} arranged so that rows and columns correspond to independent phonological dimensions (e.g., prefix identity × stem-initial obstruent quality).

Row and Col are the two binary factors.

tl : X
Top-left form (e.g., /maŋ+b/).
tr : X
Top-right form (e.g., /maŋ+k/).
bl : X
Bottom-left form (e.g., /paŋ+b/).
br : X
Bottom-right form (e.g., /paŋ+k/).

Instances For

def Theories.Phonology.HarmonicGrammar.InsensitiveToRow {X : Type} (C : X → ℕ) (sq : Square X) :

A constraint C is insensitive to the row dimension of a square: it assigns the same violation count to forms that share a column. Cf. Figure 4a.

Equations

Theories.Phonology.HarmonicGrammar.InsensitiveToRow C sq = (C sq.tl = C sq.bl ∧ C sq.tr = C sq.br)

Instances For

def Theories.Phonology.HarmonicGrammar.InsensitiveToCol {X : Type} (C : X → ℕ) (sq : Square X) :

A constraint C is insensitive to the column dimension of a square: it assigns the same violation count to forms that share a row. Cf. Figure 4b.

Equations

Theories.Phonology.HarmonicGrammar.InsensitiveToCol C sq = (C sq.tl = C sq.tr ∧ C sq.bl = C sq.br)

Instances For

def Theories.Phonology.HarmonicGrammar.ConstraintIndependence {n : ℕ} {X : Type} (constraints : Fin n → X → ℕ) (sq : Square X) :

Constraint independence (§2.4): the rows and columns of the square are independent dimensions relative to a constraint set — each constraint is insensitive to at least one dimension (row or column). No constraint can encode an interaction between the two dimensions.

Equations

One or more equations did not get rendered due to their size.

Instances For

def Theories.Phonology.HarmonicGrammar.ConstantLogitDiff {X : Type} (LR : X → ℝ) (sq : Square X) :

HZ's generalization (eq. 13): the difference between logit rates of application for two underlying forms in the same row (or column) does not depend on the choice of row (or column).

Equivalently: for a function LR giving logit rates, LR(x^TL) − LR(x^TR) = LR(x^BL) − LR(x^BR).

Equations

Theories.Phonology.HarmonicGrammar.ConstantLogitDiff LR sq = (LR sq.tl - LR sq.tr = LR sq.bl - LR sq.br)

Instances For

def Theories.Phonology.HarmonicGrammar.ViolDiffIndependence {n : ℕ} {X : Type} (Δ : Fin n → X → ℝ) (sq : Square X) :

Independence of violation differences: if a constraint is insensitive to a dimension, its violation differences are too.

The violation difference Δₖ(x) = Cₖ(x, NO) − Cₖ(x, YES) inherits insensitivity from the raw constraint, because both YES and NO get the same violation count for forms in the same row (or column).

Equations

Theories.Phonology.HarmonicGrammar.ViolDiffIndependence Δ sq = ∀ (k : Fin n), Δ k sq.tl = Δ k sq.bl ∧ Δ k sq.tr = Δ k sq.br ∨ Δ k sq.tl = Δ k sq.tr ∧ Δ k sq.bl = Δ k sq.br

Instances For

theorem Theories.Phonology.HarmonicGrammar.me_predicts_hz {n : ℕ} {X : Type} (w : Fin n → ℝ) (Δ : Fin n → X → ℝ) (sq : Square X) (hind : ViolDiffIndependence Δ sq) :

ConstantLogitDiff (fun (x : X) => ∑ k : Fin n, w k * Δ k x) sq

ME predicts HZ (§3.6, eq. 22): when the violation differences satisfy independence (inherited from constraint independence), the weighted sum of violation differences — which equals the ME logit probability by maxent_logit_harmony — satisfies the constant-difference identity.

The proof follows eq. (18): split the sum by constraint, and for each k, independence ensures the k-th term contributes equally to both sides of the identity.

structure Theories.Phonology.HarmonicGrammar.SeparableHarmony (n : ℕ) :

An n-ary harmony function H is separable if it decomposes into a product of powers of unary functions, each attending to a single constraint (eq. 30):

H(C₁(x,y), …, Cₙ(x,y)) = ∏ₖ (hₖ(Cₖ(x,y)))^{wₖ}

Each hₖ must be positive, normalized (hₖ(0) = 1), and decreasing (more violations → lower harmony).

w : Fin n → ℝ
Constraint weights.
h : Fin n → ℕ → ℝ
Per-constraint rescaling functions.
h_pos (k : Fin n) (v : ℕ) : 0 < self.h k v
Each hₖ is positive.
h_norm (k : Fin n) : self.h k 0 = 1
Normalized: hₖ(0) = 1.
h_decreasing (k : Fin n) (a b : ℕ) : a < b → self.h k b < self.h k a
Decreasing: more violations → lower score.

Instances For

noncomputable def Theories.Phonology.HarmonicGrammar.SeparableHarmony.eval {n : ℕ} (H : SeparableHarmony n) (v : Fin n → ℕ) :

Evaluate a separable harmony on a violation profile.

Equations

H.eval v = ∏ k : Fin n, H.h k (v k) ^ H.w k

Instances For

theorem Theories.Phonology.HarmonicGrammar.SeparableHarmony.eval_zero {n : ℕ} (H : SeparableHarmony n) :

(H.eval fun (x : Fin n) => 0) = 1

The separable harmony at the zero profile is 1 (normalization).

noncomputable def Theories.Phonology.HarmonicGrammar.meSeparable (n : ℕ) (w : Fin n → ℝ) :

SeparableHarmony n

The ME separable harmony: each hₖ(x) = exp(−x) (the exponential-of-opposite function from Figure 5a). This gives H_ME(v) = ∏ₖ (exp(−vₖ))^{wₖ} = exp(−Σ wₖvₖ).

Equations

Theories.Phonology.HarmonicGrammar.meSeparable n w = { w := w, h := fun (x : Fin n) (v : ℕ) => Real.exp (-↑v), h_pos := ⋯, h_norm := ⋯, h_decreasing := ⋯ }

Instances For

theorem Theories.Phonology.HarmonicGrammar.me_separable_eval {n : ℕ} (w : Fin n → ℝ) (v : Fin n → ℕ) :

(meSeparable n w).eval v = Real.exp (-∑ k : Fin n, w k * ↑(v k))

The ME separable harmony agrees with the standard ME harmony score (up to positive scaling by exp).

meSeparable.eval(v) = exp(−Σₖ wₖ · vₖ)

Proof: ∏ₖ (exp(−vₖ))^{wₖ} = ∏ₖ exp(−wₖvₖ) = exp(−Σₖ wₖvₖ), using rpow = exp(w · log(exp(−v))) = exp(−wv).

noncomputable def Theories.Phonology.HarmonicGrammar.SeparableHarmony.rescale {n : ℕ} (H : SeparableHarmony n) (k : Fin n) (v : ℕ) :

Constraint rescaling (eq. 33): given a separable harmony with unary functions hₖ, the rescaled constraint is Ĉₖ = −log(hₖ(Cₖ)).

The rescaled constraints are nonneg (since hₖ ∈ (0,1] for v ≥ 0) and preserve the ordering on violation profiles.

Equations

H.rescale k v = -Real.log (H.h k v)

Instances For

theorem Theories.Phonology.HarmonicGrammar.SeparableHarmony.h_le_one {n : ℕ} (H : SeparableHarmony n) (k : Fin n) (v : ℕ) :

H.h k v ≤ 1

hₖ(v) ≤ 1 for all v: follows from decreasingness and hₖ(0) = 1.

theorem Theories.Phonology.HarmonicGrammar.SeparableHarmony.rescale_nonneg {n : ℕ} (H : SeparableHarmony n) (k : Fin n) (v : ℕ) :

0 ≤ H.rescale k v

Rescaled constraints are nonneg: since hₖ(v) ∈ (0, 1], −log(hₖ(v)) ≥ 0.

theorem Theories.Phonology.HarmonicGrammar.SeparableHarmony.rescale_zero {n : ℕ} (H : SeparableHarmony n) (k : Fin n) :

H.rescale k 0 = 0

Rescaled constraints at 0 are 0: Ĉₖ(0) = −log(1) = 0.

theorem Theories.Phonology.HarmonicGrammar.meSeparable_rescale {n : ℕ} (w : Fin n → ℝ) (k : Fin n) (v : ℕ) :

(meSeparable n w).rescale k v = ↑v

ME rescaling is the identity: since hₖ = exp(−·), Ĉₖ(v) = −log(exp(−v)) = v.

theorem Theories.Phonology.HarmonicGrammar.separable_eq_me_rescaled {n : ℕ} (H : SeparableHarmony n) (v : Fin n → ℕ) :

H.eval v = Real.exp (-∑ k : Fin n, H.w k * H.rescale k (v k))

Any separable harmony is ME under rescaling ( eq. 34): H(C₁, …, Cₙ) = H_ME(Ĉ₁, …, Ĉₙ) where Ĉₖ = −log hₖ(Cₖ).

This is the key insight: the choice of hₖ only affects how constraints are rescaled, not how they interact. All separable harmonies have the same mode of constraint interaction as ME.

theorem Theories.Phonology.HarmonicGrammar.separable_predicts_hz {n : ℕ} {X : Type} (H : SeparableHarmony n) (C_yes C_no : Fin n → X → ℕ) (sq : Square X) (hind : ViolDiffIndependence (fun (k : Fin n) (x : X) => H.rescale k (C_no k x) - H.rescale k (C_yes k x)) sq) :

ConstantLogitDiff (fun (x : X) => Real.log ((H.eval fun (k : Fin n) => C_yes k x) / H.eval fun (k : Fin n) => C_no k x)) sq

Separable harmonies predict HZ (§5.4): for any separable harmony H, if the rescaled violation differences Δ̂ₖ(x) = Ĉₖ(Cₖ(x,NO)) − Ĉₖ(Cₖ(x,YES)) satisfy independence on a square, then the logit rate log(H(v_YES)/H(v_NO)) satisfies HZ's constant-difference identity.

The proof composes two results:

separable_eq_me_rescaled: H(v) = exp(−Σ wₖĈₖ(vₖ))
me_predicts_hz: weighted sums with independent differences satisfy constant logit-rate differences

Since log(exp(a)/exp(b)) = a − b, the logit rate is a weighted sum of rescaled violation differences, and me_predicts_hz applies.

noncomputable def Theories.Phonology.HarmonicGrammar.inverseFunction (x : ℝ) :

The inverse function h(x) = 1/(1+x) used in the non-separable harmony H(v) = 1 / (1 + Σ wₖvₖ) (eq. 27). Like ME's exp(−x), it is positive, normalized, and decreasing — but the resulting harmony is not separable.

Equations

Theories.Phonology.HarmonicGrammar.inverseFunction x = 1 / (1 + x)

Instances For

theorem Theories.Phonology.HarmonicGrammar.inverseFunction_pos {x : ℝ} (hx : 0 ≤ x) :

0 < inverseFunction x

The inverse function is positive for nonneg arguments.

theorem Theories.Phonology.HarmonicGrammar.inverseFunction_zero :

inverseFunction 0 = 1

The inverse function is normalized: h(0) = 1.

theorem Theories.Phonology.HarmonicGrammar.inverseFunction_strictAntiOn :

StrictAntiOn inverseFunction (Set.Ici 0)

The inverse function is strictly decreasing on [0, ∞).

noncomputable def Theories.Phonology.HarmonicGrammar.nonSeparableInverseHarmony {n : ℕ} (w : Fin n → ℝ) (v : Fin n → ℕ) :

The non-separable harmony using the inverse function: H(v) = 1 / (1 + Σₖ wₖ · vₖ) (eq. 27).

This has the form H = h(Σ wₖCₖ) (eq. 26) with h = inverseFunction, which is not separable because the single h sees the sum of all weighted violations, not each constraint individually.

Equations

Theories.Phonology.HarmonicGrammar.nonSeparableInverseHarmony w v = Theories.Phonology.HarmonicGrammar.inverseFunction (∑ k : Fin n, w k * ↑(v k))

Instances For

theorem Theories.Phonology.HarmonicGrammar.inverse_not_always_hz :

2 * 4 * 3 * 3 ≠ 2 * 3 * 2 * 5

Counterexample (§4.4, online appendix D.1): the non-separable inverse harmony does not predict HZ's generalization in general.

For the inverse harmony H(v) = 1/(1 + Σ wₖvₖ), the logit probability is log((1 + S_NO(x)) / (1 + S_YES(x))) where S_y(x) = Σₖ wₖ Cₖ(x,y). HZ holds iff the cross-product (1+S_NO(tl))(1+S_YES(tr))(1+S_YES(bl))(1+S_NO(br)) equals (1+S_YES(tl))(1+S_NO(tr))(1+S_NO(bl))(1+S_YES(br)).

With Tagalog constraints and w₅ ≠ w₆ (weights 1,1,1,1,1,2), the cross-products are 72 and 60, disproving HZ.

theorem Theories.Phonology.HarmonicGrammar.harmonyScoreR_as_finsum {C : Type} (constraints : List (WeightedConstraint C)) (c : C) :

harmonyScoreR constraints c = -∑ i : Fin constraints.length, ↑(constraints.get i).weight * ↑((constraints.get i).eval c)

harmonyScoreR as a Fin-indexed weighted sum: the List-based harmony score equals a negated Finset.sum over Fin-indexed constraint weights and violations.

This is the key List→Fin conversion for applying separability theory to concrete WeightedConstraint lists.

theorem Theories.Phonology.HarmonicGrammar.exp_harmonyScoreR_eq_me_separable {C : Type} (constraints : List (WeightedConstraint C)) (c : C) :

Real.exp (harmonyScoreR constraints c) = (meSeparable constraints.length fun (i : Fin constraints.length) => ↑(constraints.get i).weight).eval fun (i : Fin constraints.length) => (constraints.get i).eval c

MaxEnt unnormalized probability is separable harmony: exp(harmonyScoreR constraints c) = meSeparable.eval v where weights and violations are drawn from the constraint list.

Since meSeparable.eval v = exp(-∑ wₖvₖ) (me_separable_eval) and harmonyScoreR c = -∑ wₖvₖ (harmonyScoreR_as_finsum), the exponential of the List-based harmony is exactly the Fin-indexed separable harmony evaluation.

This makes all separability theory — constraint independence → HZ (separable_predicts_hz), constraint rescaling (separable_eq_me_rescaled) — directly applicable to any WeightedConstraint list.

theorem Theories.Phonology.HarmonicGrammar.maxent_logit_as_finsum {C : Type} [Fintype C] [Nonempty C] (constraints : List (WeightedConstraint C)) (a b : C) :

Real.log (Core.softmax (harmonyScoreR constraints) 1 a / Core.softmax (harmonyScoreR constraints) 1 b) = -∑ i : Fin constraints.length, ↑(constraints.get i).weight * (↑((constraints.get i).eval a) - ↑((constraints.get i).eval b))

MaxEnt logit rate as Fin-indexed weighted violation differences: the logit of MaxEnt probabilities is a negated weighted sum of violation differences, expressed as a Finset.sum over Fin indices.

This bridges logit_uniformity (NoisyHG.lean) with me_predicts_hz: since the logit is a weighted sum, it satisfies HZ whenever the violation differences satisfy ViolDiffIndependence.

Composition: logit_uniformity → harmonyScoreR_as_finsum → algebra.

theorem Theories.Phonology.HarmonicGrammar.constantLogitDiff_mono_consistent {X : Type} (d : X → ℝ) (f : ℝ → ℝ) (hf : StrictMono f) (sq : Square X) (hcld : ConstantLogitDiff d sq) (hne : d sq.tl ≠ d sq.bl) :

(f (d sq.tl) - f (d sq.bl)) * (f (d sq.tr) - f (d sq.br)) > 0

Consistent ordering from constant logit-rate differences: if a score function d satisfies ConstantLogitDiff and f is strictly monotone, then the f-transformed scores exhibit across-the-board consistency — the product of same-column differences is positive.

This is the mathematical core of why HG produces sigmoid families (@cite{zuraw-hayes-2017} §2.5): any strictly monotone probability transform (MaxEnt's softmax, NHG's normal CDF, Normal MaxEnt's probit) applied to scores with constant logit-rate differences preserves the "across-the-board" ordering pattern. Differences may compress at floor and ceiling (producing sigmoids rather than claws), but they never change sign.