
Linglib.Phenomena.WordOrder.Studies.Shieber1985

Shieber (1985) @cite{shieber-1985}

Evidence against the Context-Freeness of Natural Language. Linguistics and Philosophy, 8(3), 333–343.

Core Argument

@cite{shieber-1985} proves that Swiss German is not weakly context-free, using a purely string-based argument that makes no assumptions about constituent structure or semantics. The proof rests on four empirical claims about Swiss German subordinate clauses, plus the closure of context-free languages under homomorphism and intersection with regular languages.

The Four Claims

  1. Swiss German subordinate clauses have structures where all Vs follow all NPs.
  2. Among such sentences, those with all DAT-NPs before all ACC-NPs, and all DAT-Vs before all ACC-Vs, are grammatical.
  3. The number of DAT-Vs must equal the number of DAT-NPs (and similarly for ACC).
  4. An arbitrary number of Vs can occur (subject to performance).

The Proof

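Define a homomorphism f mapping Swiss German words to an abstract alphabet containing a, b, c, d (for the case-marked NPs and verbs) and w, x, y (for the fixed material around them).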

Intersect f(L) with the regular language r = w a* b* x c* d* y. By Claims 1–4, f(L) ∩ r = {w aᵐ bⁿ x cᵐ dⁿ y}.

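A further homomorphism erasing w, x, y yields {aᵐ bⁿ cᵐ dⁿ}, which contains the diagonal {aⁿ bⁿ cⁿ dⁿ} (setting m = n). {aⁿ bⁿ cⁿ dⁿ} is not context-free (anbncndn_not_pumpable). Containment alone does not transfer non-context-freeness, but the same pumping argument applied to the diagonal strings shows that {aᵐ bⁿ cᵐ dⁿ} is not context-free either. Since CFLs are closed under homomorphism and under intersection with regular languages, Swiss German is not context-free.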

Contrast with @cite{bresnan-etal-1982}

@cite{bresnan-etal-1982}'s earlier argument for Dutch non-context-freeness relied on linguistic assumptions about constituent structure, which @cite{gazdar-pullum-1982} contested. @cite{shieber-1985}'s argument is purely formal — it rests entirely on the string set of Swiss German and the case-marking facts, making no claims about phrase structure.

A Swiss German subordinate clause token, abstracting over specific lexical items to their role in the cross-serial construction.

@cite{shieber-1985}'s proof only needs to distinguish NPs and Vs by case.

  • datNP : Token

    Dative NP (e.g., em Hans)

  • accNP : Token

    Accusative NP (e.g., d'chind, de Hans)

  • datV : Token

    Dative-subcategorizing verb (e.g., hälfe "help")

  • accV : Token

    Accusative-subcategorizing verb (e.g., lönd "let", aastriiche "paint")
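The four constructors listed above could be declared roughly as follows. This is a minimal sketch reconstructed from the constructor list, not the module's actual source: the namespace `TokenSketch` and the helper `Token.isNP` are hypothetical names introduced only for illustration.

```lean
namespace TokenSketch  -- hypothetical namespace, not linglib's

/-- The four token kinds the proof needs to distinguish. -/
inductive Token where
  | datNP  -- dative NP, e.g. "em Hans"
  | accNP  -- accusative NP, e.g. "d'chind"
  | datV   -- dative-subcategorizing verb, e.g. "hälfe"
  | accV   -- accusative-subcategorizing verb, e.g. "lönd"
  deriving DecidableEq, Repr

/-- Hypothetical helper: `true` exactly for the two NP kinds. -/
def Token.isNP : Token → Bool
  | .datNP | .accNP => true
  | .datV  | .accV  => false

-- Both checks hold by unfolding the definition.
example : Token.isNP Token.datNP = true := rfl
example : Token.isNP Token.accV = false := rfl

end TokenSketch
```

Deriving `DecidableEq` keeps later counting and equality checks computable, which is convenient when small examples are verified by evaluation.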


A cross-serial clause: a sequence of NPs followed by a sequence of Vs. This encodes Claims 1 and 2.


Claim 3 (case matching): the number of dative verbs equals the number of dative NPs, and similarly for accusative.
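Case matching can be illustrated as a Boolean check over a flat token list. The sketch below is an assumption-laden stand-in rather than the module's definition: `countTok` and `caseMatched` are hypothetical names, and a plain `List Token` replaces whatever clause structure the module actually uses. The first two checks encode the case profiles of Examples (1) and (5) documented on this page.

```lean
namespace CaseMatchSketch  -- hypothetical namespace

inductive Token where
  | datNP | accNP | datV | accV
  deriving DecidableEq, Repr

/-- Number of occurrences of token `t` in `ts`. -/
def countTok (t : Token) (ts : List Token) : Nat :=
  (ts.filter (fun u => u == t)).length

/-- Claim 3 as a check: dative NPs match dative verbs, and
    accusative NPs match accusative verbs, by count. -/
def caseMatched (ts : List Token) : Bool :=
  countTok Token.datNP ts == countTok Token.datV ts &&
  countTok Token.accNP ts == countTok Token.accV ts

-- Example (1): em Hans, es huus, hälfed, aastriiche.
example : caseMatched
    [Token.datNP, Token.accNP, Token.datV, Token.accV] = true := by decide

-- Example (5): d'chind, em Hans, es huus, lönd, hälfe, aastriiche.
example : caseMatched
    [Token.accNP, Token.datNP, Token.accNP,
     Token.accV, Token.datV, Token.accV] = true := by decide

-- A dative NP with no dative verb fails the check.
example : caseMatched
    [Token.datNP, Token.accNP, Token.accV, Token.accV] = false := by decide

end CaseMatchSketch
```

The check is deliberately order-insensitive: it captures only the counting constraint of Claim 3, leaving the NP-before-V ordering of Claims 1 and 2 to the clause structure.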


A grammatical cross-serial clause satisfies case matching.


Claim 4: any combination of dative and accusative verb counts can occur (we can produce a GrammaticalClause for any m, n).
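One way to picture the construction behind Claim 4 is a function that builds the case-sorted token sequence for any pair of counts. This is a sketch under stated assumptions: `mkClause` is a hypothetical name, and it returns a bare list rather than the module's `GrammaticalClause`, whose fields are not shown on this page.

```lean
namespace Claim4Sketch  -- hypothetical namespace

inductive Token where
  | datNP | accNP | datV | accV
  deriving DecidableEq, Repr

/-- `m` dative NPs, `n` accusative NPs, then `m` dative verbs and
    `n` accusative verbs: all NPs precede all Vs, sorted by case. -/
def mkClause (m n : Nat) : List Token :=
  List.replicate m Token.datNP ++ List.replicate n Token.accNP ++
  List.replicate m Token.datV ++ List.replicate n Token.accV

-- One dative and two accusative dependencies, as in Example (5)
-- after case sorting.
example : mkClause 1 2 =
    [Token.datNP, Token.accNP, Token.accNP,
     Token.datV, Token.accV, Token.accV] := rfl

-- The degenerate case with no dependencies is the empty sequence.
example : mkClause 0 0 = [] := rfl

end Claim4Sketch
```

Because the NP and verb blocks are built from the same `m` and `n`, every sequence produced this way satisfies the counting constraint of Claim 3 by construction.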


The string image of a grammatical clause under the homomorphism.


Setting m = n in the clause image gives aⁿbⁿcⁿdⁿ.

The diagonal clause images are in {aⁿbⁿcⁿdⁿ}.
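The diagonal step can be sketched over an abstract four-letter alphabet. The names `Sym`, `image`, `diag`, and `image_diag` are hypothetical and chosen for illustration; the sketch works with the image after w, x, y have been erased and does not reproduce the module's own definitions.

```lean
namespace ImageSketch  -- hypothetical namespace

/-- Abstract alphabet for the image, with w, x, y already erased. -/
inductive Sym where
  | a | b | c | d
  deriving DecidableEq, Repr

/-- Image of the clause with counts `m` and `n`: aᵐ bⁿ cᵐ dⁿ. -/
def image (m n : Nat) : List Sym :=
  List.replicate m Sym.a ++ List.replicate n Sym.b ++
  List.replicate m Sym.c ++ List.replicate n Sym.d

/-- The diagonal word aⁿ bⁿ cⁿ dⁿ. -/
def diag (n : Nat) : List Sym :=
  List.replicate n Sym.a ++ List.replicate n Sym.b ++
  List.replicate n Sym.c ++ List.replicate n Sym.d

/-- Setting `m = n` in the image gives the diagonal word. -/
theorem image_diag (n : Nat) : image n n = diag n := rfl

-- A small off-diagonal instance: a b b c d d.
example : image 1 2 = [Sym.a, Sym.b, Sym.b, Sym.c, Sym.d, Sym.d] := rfl

end ImageSketch
```

In this formulation `image_diag` holds by unfolding alone, which matches the remark that the diagonal images lie in {aⁿbⁿcⁿdⁿ} by construction.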

CFL closure (contrapositive). If a language L can be mapped by a homomorphism f and intersected with a regular language r to produce a non-context-free language, then L is not context-free.

This is the contrapositive of the standard closure theorem for CFLs (Hopcroft & Ullman 1979, pp. 130–135): CFLs are closed under homomorphism and under intersection with regular languages.

We state this as a proposition rather than proving it from first principles, since linglib does not formalize the full theory of CFLs. The pumping-lemma proof for the specific non-CFL witness ({aⁿbⁿcⁿdⁿ}) is fully verified.
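A statement of this shape can be written down without any theory of grammars by treating "context-free" and "regular" as abstract predicates. The following is only a sketch of that idea: `Lang`, `applyHom`, `image`, `ClosureContrapositive`, and the predicate parameters `IsCFSrc`, `IsCFTgt`, `IsReg` are all hypothetical names, and the module's actual proposition may be phrased differently.

```lean
namespace ClosureSketch  -- hypothetical namespace

/-- A language over alphabet `α`, as a predicate on words. -/
abbrev Lang (α : Type) := List α → Prop

/-- Extend a symbol-to-word map `f` to whole words. -/
def applyHom {α β : Type} (f : α → List β) : List α → List β
  | [] => []
  | x :: xs => f x ++ applyHom f xs

/-- The image `f(L)` of a language under the homomorphism. -/
def image {α β : Type} (f : α → List β) (L : Lang α) : Lang β :=
  fun w => ∃ u, L u ∧ applyHom f u = w

/-- Contrapositive closure statement: if `f(L) ∩ r` is not context-free
    for some regular `r`, then `L` is not context-free. The predicates
    are parameters, so nothing about CFLs is proved here. -/
def ClosureContrapositive {α β : Type}
    (IsCFSrc : Lang α → Prop) (IsCFTgt : Lang β → Prop)
    (IsReg : Lang β → Prop) : Prop :=
  ∀ (L : Lang α) (f : α → List β) (r : Lang β),
    IsReg r → ¬ IsCFTgt (fun w => image f L w ∧ r w) → ¬ IsCFSrc L

-- Using the proposition as a hypothesis is plain function application.
example {α β : Type}
    (IsCFSrc : Lang α → Prop) (IsCFTgt : Lang β → Prop)
    (IsReg : Lang β → Prop)
    (h : ClosureContrapositive IsCFSrc IsCFTgt IsReg)
    (L : Lang α) (f : α → List β) (r : Lang β)
    (hr : IsReg r)
    (hn : ¬ IsCFTgt (fun w => image f L w ∧ r w)) :
    ¬ IsCFSrc L :=
  h L f r hr hn

end ClosureSketch
```

Keeping the predicates abstract makes the dependency explicit: any downstream result that uses the closure statement takes it as a hypothesis rather than as a verified theorem.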


Main result. The image of Swiss German cross-serial clauses under @cite{shieber-1985}'s homomorphism contains {aⁿbⁿcⁿdⁿ}, which is not context-free. Combined with CFL closure properties, this proves Swiss German is not context-free.

The conjunction packages the two independently verified facts:

  1. The homomorphism maps Swiss German data to {aⁿbⁿcⁿdⁿ} (by construction)
  2. {aⁿbⁿcⁿdⁿ} violates the CFL pumping property (by proof)

Cross-serial dependencies with case-marking require at least mildly context-sensitive power, the same classification used by Phenomena.WordOrder.CrossSerial.

Corollary: Swiss German is not strongly context-free either.

@cite{shieber-1985} §3: "As a trivial corollary, Swiss German is not strongly context-free either, regardless of one's view as to the appropriate structures for the language." Since strong context-freeness implies weak context-freeness, weak non-context-freeness implies strong non-context-freeness.

The formal–processing dissociation: crossed dependencies are formally harder (not context-free) but psycholinguistically easier to process than nested dependencies.

@cite{shieber-1985} establishes the formal side; the processing side is in Phenomena.WordOrder.Studies.BachBrownMarslenWilson1986.

Example (1): mer em Hans es huus hälfed aastriiche "we helped Hans paint the house"

  • em Hans (DAT) → hälfed (DAT-verb "helped")
  • es huus (ACC) → aastriiche (ACC-verb "paint")


Example (5): triply embedded cross-serial clause: mer d'chind em Hans es huus lönd hälfe aastriiche "we let the children help Hans paint the house"

  • d'chind (ACC) → lönd (ACC-verb "let")
  • em Hans (DAT) → hälfe (DAT-verb "help")
  • es huus (ACC) → aastriiche (ACC-verb "paint")

With case sorting: 1 DAT-NP, 2 ACC-NPs, 1 DAT-V, 2 ACC-Vs
