MaxEnt → OT Limit #
@cite{smolensky-legendre-2006} @cite{prince-smolensky-1993}
As the rationality parameter α → ∞, MaxEnt Harmonic Grammar recovers Optimality Theory's categorical optimization. OT is the "infinite-temperature" limit of MaxEnt.
The argument has two components:
HG–OT agreement (@cite{smolensky-legendre-2006} ch. 14): with exponentially separated weights, the Harmonic Grammar winner (argmax of harmony score) equals the OT winner (lexicographic comparison of violation profiles). The key: if weight wₖ exceeds M · Σᵢ₍ᵢ>ₖ₎ wᵢ (where M bounds violation counts), then a single violation difference on constraint k outweighs any combination of violations on lower-ranked constraints.
MaxEnt concentration (
softmax_argmax_limit): as α → ∞, the MaxEnt distribution softmax(α·H) concentrates on the argmax of H — i.e., the HG winner. This is proved inCore.Agent.RationalAction.
Together: MaxEnt(α → ∞) → HG winner = OT winner.
Definitions #
otToWeighted: convert OT ranking + violation bound to weighted constraintsLexStrictlyBetter: violation vector a lexicographically dominates bExponentiallySeparated: weight separation condition for HG–OT agreementlex_imp_harmony: key lemma (lex dominance ⟹ higher harmony)maxent_ot_limit: main limit theorem
Convert an OT constraint ranking to weighted constraints.
Each constraint at rank position i (0 = highest) receives weight
(M+1)^(n−1−i), where n is the number of constraints and M is a
violation count bound. This exponential spacing ensures HG–OT agreement.
The eval function is preserved: the weighted constraint evaluates
candidates identically to the original named constraint.
Equations
- One or more equations did not get rendered due to their size.
Instances For
The weighted constraints have the same length as the ranking.
Each weighted constraint preserves the original eval function.
Candidate a (violation vector va) lexicographically beats candidate b (vb): at the first position where they differ, a has strictly fewer violations.
This mirrors Core.ConstraintEvaluation.lexLT but on Fin n → Nat
rather than List Nat, enabling Finset-based reasoning.
Equations
Instances For
Weights are exponentially separated with violation bound M: each weight exceeds M times the sum of all lower-ranked weights.
This ensures that no combination of lower-constraint violations can override a single higher-constraint violation difference, matching OT's strict ranking semantics.
Equations
Instances For
Exponential weights are positive.
Exponential weights are exponentially separated.
Ganging: two constraints with individual weights w₁, w₂ each weaker than a third weight w₃, but jointly stronger.
This is the hallmark of weighted constraint interaction that distinguishes MaxEnt/HG from OT (@cite{hayes-wilson-2008}). In OT (strict ranking), a lower-ranked constraint can never override a higher-ranked one regardless of how many violations accumulate. In MaxEnt, constraint effects are additive, so multiple weak constraints can "gang up" to outweigh a strong one.
Equations
Instances For
Ganging is achievable: weights (2, 2, 3) exhibit ganging.
Ganging is incompatible with exponential separation: if weights are exponentially separated, no pair of lower constraints can gang up against any higher constraint.
Weighted violation sum (the positive part of harmony:
harmonyScore = -weightedViolations).
Equations
- Theories.Phonology.HarmonicGrammar.weightedViolations w v = ∑ i : Fin n, w i * ↑(v i)
Instances For
HG–OT agreement lemma (@cite{smolensky-legendre-2006}): with exponentially separated weights and bounded violations, lexicographic dominance implies strictly lower weighted violations.
Since harmonyScore = -weightedViolations, this means the
lexicographically better candidate has strictly higher harmony.
Proof sketch: decompose the violation-difference sum at the first differing position k.
- For i < k: terms cancel (va(i) = vb(i) by
hlex) - At i = k: wₖ · (vb(k) − va(k)) ≥ wₖ (since vb(k) > va(k))
- For i > k: |wᵢ · (vb(i) − va(i))| ≤ wᵢ · M (by
hM) - Net: ≥ wₖ − M · Σᵢ₍ᵢ>ₖ₎ wᵢ > 0 (by
hw)
HG–OT agreement for a concrete candidate type: if candidate a
lexicographically beats b on the violation profile induced by ranking,
then a has strictly higher harmony under otToWeighted ranking M,
provided M bounds all violation counts.
MaxEnt concentration on HG winner: as α → ∞, MaxEnt probability concentrates on the candidate with the highest harmony score.
This is softmax_argmax_limit instantiated with harmony scores.
The interesting content is in the hypotheses: showing that the
HG winner equals the OT winner (§4).
MaxEnt → OT limit (@cite{smolensky-legendre-2006}): as α → ∞, MaxEnt probability concentrates on the OT winner.
Given a constraint ranking with violation bound M and a candidate c_opt
that lexicographically beats all competitors, the MaxEnt probability
softmax(α · H)(c_opt) → 1 as α → ∞.
The proof combines:
ot_lex_imp_higher_harmony: lex-better ⟹ higher harmony (HG–OT agreement)softmax_argmax_limit: MaxEnt concentrates on harmony maximizer