Documentation

Linglib.Theories.Syntax.DependencyGrammar.Formal.HarmonicOrder

Harmonic Word Order via Dependency Length Minimization #

@cite{dryer-1992} @cite{futrell-gibson-2020} @cite{gibson-2025} @cite{greenberg-1963}

@cite{gibson-2025} argues that dependency length minimization (DLM) explains the head-direction generalization: languages overwhelmingly prefer consistent head direction across construction types. The argument:

  1. In recursive structures, consistent direction (HH or FF) keeps spine dependencies local — each dependent is adjacent to its head.
  2. Mixed direction (HF or FH) forces intervening subtree material between a head and its spine dependent, inflating dependency lengths.
  3. For single-word dependents (leaves), direction is irrelevant to DLM because there is no subtree to intervene.

Structure #

Chain Total Dependency Length #

For a sequence of positions p[0], p[1],..., p[k] representing a chain of dependencies (head → dep₁ → dep₂ →...), the total dependency length is the sum of |p[i+1] - p[i]| for all consecutive pairs.

Key insight: monotone sequences (all ascending or all descending) achieve the minimum TDL = span = max - min. Non-monotone sequences must exceed the span because each direction reversal adds distance.

Total dependency length of a chain of positions: sum of consecutive absolute differences |p[i+1] - p[i]|.

Equations
Instances For

    The span of a list: max - min. For a chain of k+1 positions spanning k dependencies, this is the minimum possible TDL.

    Equations
    Instances For

      Triangle inequality for chains: total dep length ≥ span. By induction: |a-b| + span(b::rest) ≥ span(a::b::rest) reduces to arithmetic on max/min given m ≤ b ≤ M (head is between min and max of the tail).

      Monotone (ascending) chain achieves TDL = span. For ascending lists, each step contributes exactly b - a and the span decomposes as (b - a) + span(tail).

      The consecutive sequence [0, 1,..., k] has chainTDL = k.

      Intervening Material #

      The structural heart of Gibson's argument: dependency length between a head h and dependent d equals 1 + the number of subtree members of d that fall between h and d in linear order. In a projective tree, all subtree members of d are on the same side of h as d, so they may intervene.

      When direction is consistent (harmonic), recursive spine dependents are adjacent to their heads. When direction is mixed (disharmonic), subtree material from one dependent intervenes between the next spine dependent and its head.

      Collect all transitive descendants of a node (excluding the node itself). Uses projection from Core/Basic.lean.

      Equations
      Instances For

        Count nodes from subtree(d) that fall between positions h and d in linear order. These are the "intervening" nodes that inflate dependency length beyond 1.

        Equations
        • One or more equations did not get rendered due to their size.
        Instances For

          Dependency length ≥ 1 + intervening subtree nodes.

          depLength counts ALL positions between head and dependent (= |h − d|). interveningSubtreeNodes counts only dep's subtree members in that interval. The gap is filled by siblings' subtrees placed between h and d.

          The original equality depLength = 1 + interveningSubtreeNodes (@cite{gibson-2025}, Ch. 5.3) holds only when d's subtree is the SOLE occupant of the interval (h, d) — i.e., no sibling subtrees intervene. That special case is exactly the harmonic-order scenario demonstrated concretely in §4 below.

          Single-Word (Leaf) Exception #

          When a dependent is a leaf (no children), its subtree is empty, so there are no intervening nodes regardless of direction. This is why single-word dependents (adjectives, demonstratives, intensifiers, negation markers) can freely violate the head-direction generalization without DLM cost.

          A node is a leaf if it has no dependents.

          Equations
          Instances For

            A leaf has no subtree members.

            theorem DepGrammar.HarmonicOrder.leaf_no_intervening (t : DepTree) (headPos depPos : ) (h : isLeaf t depPos = true) :
            interveningSubtreeNodes t headPos depPos = 0

            A leaf has 0 intervening nodes for any head position.

            Since leaves have no subtree, no subtree members can intervene. Direction of a leaf dependent is irrelevant to DLM.

            theorem DepGrammar.HarmonicOrder.leaf_direction_irrelevant_bridge (h d : ) :
            DependencyLength.depLength { headIdx := h, depIdx := d, depType := UD.DepRel.amod } = DependencyLength.depLength { headIdx := d, depIdx := h, depType := UD.DepRel.amod }

            Bridge to single_dep_direction_irrelevant from DependencyLength.lean: a single (leaf) dependency has the same length regardless of direction. This is the formal basis for Gibson's Table 4 exceptions.

            Harmonic vs Disharmonic Trees #

            @cite{gibson-2025} pattern: a recursive embedding where the verb takes a clausal complement. With 3 levels of embedding:

            V₁ — S₁ — V₂ — S₂ — V₃ — O₃
            

            In head-initial (HH): V₁ S₁ V₂ S₂ V₃ O₃ — each V is adjacent to its S In head-final (FF): O₃ V₃ S₂ V₂ S₁ V₁ — same TDL, mirror image In mixed (HF): V₁ S₁ S₂ V₃ O₃ V₂ — V₂ separated from V₁ by intervening material In mixed (FH): S₁ V₁ V₂ O₃ V₃ S₂ — similar inflation

            We build concrete DepTrees for each and verify TDL.

            Consistent head-initial (HH) tree: V₁(0) S₁(1) V₂(2) S₂(3) V₃(4) O₃(5) Dependencies: V₁→S₁ (nsubj, 1), V₁→V₂ (ccomp, 2), V₂→S₂ (nsubj, 1), V₂→V₃ (ccomp, 2), V₃→O₃ (obj, 1)

            Equations
            • One or more equations did not get rendered due to their size.
            Instances For

              Consistent head-final (FF) tree: O₃(0) V₃(1) S₂(2) V₂(3) S₁(4) V₁(5) Mirror image of HH. Same TDL.

              Equations
              • One or more equations did not get rendered due to their size.
              Instances For

                Disharmonic HF tree: verb-initial main clause but verb-final embedding. V₁(0) S₁(1) S₂(2) O₃(3) V₃(4) V₂(5) The ccomp V₁→V₂ now spans the entire sentence (length 5).

                Equations
                • One or more equations did not get rendered due to their size.
                Instances For

                  Disharmonic FH tree: verb-final main clause but verb-initial embedding. S₁(0) V₂(1) S₂(2) V₃(3) O₃(4) V₁(5) The ccomp V₁→V₂ again spans far.

                  Equations
                  • One or more equations did not get rendered due to their size.
                  Instances For

                    DLM predicts harmonic order is cheaper.

                    Equations
                    • One or more equations did not get rendered due to their size.
                    Instances For