Linglib.Phenomena.WordOrder.Studies.ColeHermon2008

VP Raising in a VOS Language @cite{cole-hermon-2008} #

@cite{cole-hermon-2008} argue that VOS word order in Toba Batak derives from VP-raising to Spec,TP (or more precisely, VoiceP-raising to Spec,FP in their full analysis), rather than from rightward subject shift or base-generation. Three lines of evidence converge:

  1. Word order: VOS and the positions of IOs and adverbials follow from VP/VoiceP raising + remnant movement. The subject is stranded below the fronted predicate.

  2. Extraction restrictions: Direct objects and passive agents cannot be Ā-extracted (frozen inside the raised VoiceP), while subjects, indirect objects, and adverbials can (they escape VoiceP before it raises). This freezing effect is the paper's central novel prediction.

  3. Binding asymmetries: In actives, the subject c-commands the direct object (can bind a reflexive DO) but the DO cannot c-command the subject (cannot bind a reflexive subject). In passives, reconstruction allows the passive agent to bind a reflexive passive subject — a pattern unexplained by a purely thematic hierarchy.

Simplification #

The derivation here follows the simplified tree (2) from p. 146 of the paper, where VP raises to Spec,TP. The paper's full analysis (§4, trees 50–65) uses VoiceP raising to Spec,FP with remnant movement (IO/Adv escape before VoiceP fronts), a Voice head for mang-/di- morphology, and a richer functional sequence. The simplified derivation suffices for the word-order and c-command predictions; the extraction and binding predictions are formalized separately using the paper's empirical generalizations.

Toba Batak VOS via VP-raising to Spec,TP.

Steps (bottom-up):

  1. EM-R Obj → [VP V Obj]
  2. EM-L v → [v' v VP]
  3. EM-L Subj → [vP Subj [v' v VP]]
  4. EM-L T → [TP T [vP Subj [v' v VP]]]
  5. IM VP → [TP VP [T' T [vP Subj [v' v tVP]]]]
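The five steps above can be traced in a short Python sketch. It is illustrative only and is not the Linglib Lean formalization: the `Leaf`/`Node` classes, the `internal_merge` helper, and the placeholder items `V`, `Obj`, `Subj` are assumptions introduced here, as is the choice to treat v and T as unpronounced.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass(frozen=True)
class Leaf:
    form: str
    silent: bool = False  # assumption: null heads and traces are unpronounced

@dataclass(frozen=True)
class Node:
    label: str
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

def name(t: Tree) -> str:
    return t.form if isinstance(t, Leaf) else t.label

def replace(t: Tree, target: Tree, new: Tree) -> Tree:
    """Copy of t in which the subtree equal to target is swapped for new."""
    if t == target:
        return new
    if isinstance(t, Leaf):
        return t
    return Node(t.label, replace(t.left, target, new), replace(t.right, target, new))

def internal_merge(tree: Node, mover: Tree, bar_label: str) -> Node:
    """Re-merge mover as specifier, leave a silent trace, relabel the old root."""
    rest = replace(tree, mover, Leaf("t" + name(mover), silent=True))
    return Node(tree.label, mover, Node(bar_label, rest.left, rest.right))

def bracket(t: Tree) -> str:
    if isinstance(t, Leaf):
        return t.form
    return f"[{t.label} {bracket(t.left)} {bracket(t.right)}]"

def surface(t: Tree) -> List[str]:
    """Pronounced leaves, left to right."""
    if isinstance(t, Leaf):
        return [] if t.silent else [t.form]
    return surface(t.left) + surface(t.right)

V, OBJ, SUBJ = Leaf("V"), Leaf("Obj"), Leaf("Subj")
LITTLE_V, T = Leaf("v", silent=True), Leaf("T", silent=True)

big_vp = Node("VP", V, OBJ)                    # 1. EM-R Obj
v_bar = Node("v'", LITTLE_V, big_vp)           # 2. EM-L v
little_vp = Node("vP", SUBJ, v_bar)            # 3. EM-L Subj
stage4 = Node("TP", T, little_vp)              # 4. EM-L T
final = internal_merge(stage4, big_vp, "T'")   # 5. IM VP

print(bracket(final))  # [TP [VP V Obj] [T' T [vP Subj [v' v tVP]]]]
print(surface(final))  # ['V', 'Obj', 'Subj']
```

Before step 5 the same structure linearizes as Subj V Obj, so in this sketch the verb-initial order comes entirely from the final movement step.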

    English SVO via subject-raising to Spec,TP.

    Same base as Toba Batak, but the subject (not VP) moves to Spec,TP.


      VP-raising yields Verb-Object-Subject surface order.

      Subject-raising yields Subject-Verb-Object surface order.

      Both derivations have the same tree shape before the movement step (stage 4). The only parametric difference is what moves in step 5.

      After VP-raising, the VP c-commands the subject in the derived tree.

      The derived tree is [TP [VP V Obj] [T' T [vP Subj [v' v tVP]]]]. VP is the left daughter of TP; its sister T' dominates the subject.

      @cite{cole-hermon-2008} use this c-command relation to explain:

      • Freezing: the raised VP is a moved constituent in specifier position, making it an island for extraction. Elements inside VP (including the direct object) are frozen and cannot Ā-extract.
      • Subject accessibility: the subject, outside the raised VP, is stranded and remains accessible for further extraction.

      Note: this does NOT establish "backward binding" by the object into the subject. The paper explicitly shows that active DOs cannot bind a reflexive subject (Table 1, Type C: ill-formed for all speakers). VP c-commanding the subject is a phrasal c-command relation; it does not entail that the DO (properly contained within VP) individually c-commands the subject.
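The contrast between phrasal and individual c-command can be checked mechanically. The following Python sketch assumes a sister-containment definition of c-command (a node c-commands whatever its sister contains) and encodes the derived tree as nested tuples. The function names are hypothetical and are not Linglib's, and nodes are identified by label, which works here only because every label in the tree is unique.

```python
# Derived tree [TP [VP V Obj] [T' T [vP Subj [v' v tVP]]]] as nested tuples:
# a branching node is (label, left, right); a leaf is a plain string.
DERIVED = ("TP", ("VP", "V", "Obj"),
           ("T'", "T", ("vP", "Subj", ("v'", "v", "tVP"))))

def label(t):
    return t[0] if isinstance(t, tuple) else t

def subtrees(t):
    """Yield t and every subtree below it."""
    yield t
    if isinstance(t, tuple):
        for child in t[1:]:
            yield from subtrees(child)

def contains(t, lab):
    """True if some node in t (including t itself) carries the label lab."""
    return any(label(s) == lab for s in subtrees(t))

def c_commands(tree, a, b):
    """a c-commands b iff the sister of a contains b."""
    for node in subtrees(tree):
        if isinstance(node, tuple):
            _, left, right = node
            if label(left) == a and contains(right, b):
                return True
            if label(right) == a and contains(left, b):
                return True
    return False

print(c_commands(DERIVED, "VP", "Subj"))   # True: T' is VP's sister and contains Subj
print(c_commands(DERIVED, "Obj", "Subj"))  # False: Obj's sister is V
```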

      Freezing under VP-raising #

      @cite{cole-hermon-2008} §4: the VP-raising analysis predicts extraction restrictions via freezing. The raised VP/VoiceP is a moved constituent in specifier position, which makes it an island (following the Sentential Subject Constraint / Condition on Extraction Domain). The predictions: material contained in the raised constituent (the direct object) cannot be Ā-extracted, while the subject, stranded outside it, can.

      These predictions match the Toba Batak extraction data formalized in Fragments.TobaBatak.Basic and verified in Phenomena.FillerGap.TobaBatak.

      The direct object is contained within the fronted VP. In the VP-raising analysis, this means the DO is frozen — trapped inside a moved constituent — and cannot be further Ā-extracted.

      This is verified computationally: n_biangi is contained in vp.

      The subject is NOT contained within the fronted VP. It is stranded outside the moved constituent and remains accessible for extraction.

      This is the structural basis for the pivot-only extraction restriction: only the subject (= pivot) survives VP-raising in a position where Ā-extraction is possible.

      Extraction prediction: the VP-raising analysis predicts exactly the extraction pattern found in Toba Batak.

      For DP arguments in actor voice:

      • Agent (= subject/pivot, outside VP): grammatical
      • Patient (= DO, inside VP): ungrammatical

      This matches Fragments.TobaBatak.avAgentExtraction (grammatical) and Fragments.TobaBatak.avPatientExtraction (ungrammatical).
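As a rough illustration of how containment yields the extraction pattern, the Python sketch below marks an argument as extractable exactly when it is not inside the fronted VP. The tuple encoding, the helper names, and the role-to-position mapping are assumptions made for this example. They do not reproduce the Linglib definitions.

```python
# Derived VOS tree: (label, left, right) for branching nodes, strings for leaves.
DERIVED = ("TP", ("VP", "V", "Obj"),
           ("T'", "T", ("vP", "Subj", ("v'", "v", "tVP"))))

def leaves(t):
    if isinstance(t, tuple):
        return [w for child in t[1:] for w in leaves(child)]
    return [t]

def find(t, lab):
    """First subtree (preorder) whose label is lab, or None."""
    if isinstance(t, tuple):
        if t[0] == lab:
            return t
        for child in t[1:]:
            hit = find(child, lab)
            if hit is not None:
                return hit
    return None

def frozen(tree, moved_label, item):
    """An item is frozen if it sits inside the moved constituent."""
    return item in leaves(find(tree, moved_label))

# Actor voice: the agent is the subject (pivot), the patient is the direct object.
ROLE_POSITION = {"agent": "Subj", "patient": "Obj"}

predicted = {role: not frozen(DERIVED, "VP", pos)
             for role, pos in ROLE_POSITION.items()}
print(predicted)  # {'agent': True, 'patient': False}
```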

      Binding data from Table 1 #

      @cite{cole-hermon-2008} §3.4–§5 present binding data that bear on the choice between a c-command analysis and the Semantic Hierarchy Condition of Schachter (1984b) and Sugamoto (1984). The key data from Table 1:

      Antecedent      | Reflexive       | Acceptability
      ----------------+-----------------+--------------------------
      Active subject  | Direct object   | Type A (fully acceptable)
      Passive agent   | Passive subject | Type A (fully acceptable)
      Passive subject | Passive agent   | Type B (intermediate)
      Active DO       | Active subject  | Type C (ill-formed)

      Type A follows from c-command in the base structure (pre-movement). Type C follows from the absence of c-command: the DO does not c-command the subject at any derivational stage. Type B requires reconstruction: the passive subject can be interpreted in its base VP-internal position, where the passive agent c-commands it.

      The VP-raising analysis correctly predicts all four types. The Semantic Hierarchy Condition alone fails to distinguish Types B and C (it predicts both should be ill-formed, since in both cases the patient antecedes the agent reflexive).
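The point about the Semantic Hierarchy Condition can be made concrete with a small Python sketch. The record type and the one-line `shc_allows` function are hypothetical simplifications (the condition is reduced here to "the antecedent must be the agent"). Only the four rows and their judgments come from Table 1 as summarized above.

```python
from dataclasses import dataclass
from enum import Enum

class Acceptability(Enum):
    TYPE_A = "fully acceptable"
    TYPE_B = "intermediate"
    TYPE_C = "ill-formed"

@dataclass(frozen=True)
class BindingDatum:
    voice: str
    antecedent: str
    antecedent_role: str  # thematic role of the antecedent
    reflexive: str
    judgment: Acceptability

TABLE_1 = [
    BindingDatum("active", "subject", "agent", "direct object", Acceptability.TYPE_A),
    BindingDatum("passive", "passive agent", "agent", "passive subject", Acceptability.TYPE_A),
    BindingDatum("passive", "passive subject", "patient", "passive agent", Acceptability.TYPE_B),
    BindingDatum("active", "direct object", "patient", "subject", Acceptability.TYPE_C),
]

def shc_allows(d: BindingDatum) -> bool:
    """Simplified hierarchy condition (assumption): the antecedent must be the agent."""
    return d.antecedent_role == "agent"

# Judgments attested among the rows the simplified condition rules out.
rejected = {d.judgment for d in TABLE_1 if not shc_allows(d)}
print(sorted(j.name for j in rejected))  # ['TYPE_B', 'TYPE_C']
```

Because the simplified condition rejects both patient-antecedent rows, it places the intermediate Type B and the ill-formed Type C in the same class, which is the gap that the structural account is meant to close.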

      Binding acceptability from Table 1 of @cite{cole-hermon-2008}.


          A binding datum: which NP is the antecedent, which is the reflexive, in which voice, and the acceptability judgment.


              Active subject antecedes DO reflexive: Type A. Example: "Si-Bunga mang-ida [dirina sandiri]" (Bunga saw herself.)


                Passive agent antecedes passive subject reflexive: Type A. Example: "Di-ida si-Torus dirina natoari" (Himself was seen by Torus yesterday.)


                  Passive subject antecedes passive agent reflexive: Type B. Example: "Di-ida [dirina sandiri] si-John" (John was seen by himself.)


                    Active DO antecedes active subject reflexive: Type C. Example: "*[Dirina sandiri] pa-ias-hon dakdanak-i" (*Himself cleaned the child.)


                      All binding data from Table 1.


                        The direct object does not c-command the subject (Boolean check).

                        The DO's inability to bind the subject (Type C) follows from the derivation: the DO is inside VP, and VP c-commands the subject (proved in vp_ccommands_subject), but the DO itself does not c-command the subject. C-command is not inherited by sub-constituents.

                        In the derived tree [TP [VP V DO] [T' T [vP Subj ...]]]:

                        • VP c-commands Subj ✓ (VP's sister T' dominates Subj)
                        • DO does NOT c-command Subj ✗ (DO's sister is V, which does not dominate Subj; no node inside VP has a sister outside VP)

                        This asymmetry is why Types A and C differ: the subject (in Spec,vP) c-commands into VP (can bind a reflexive DO), but the DO (inside VP) cannot c-command out past VP (cannot bind a reflexive subject).

                        Connecting to Toba Batak extraction infrastructure #

                        The VP-raising analysis's extraction predictions are independently formalized in Fragments.TobaBatak.Basic (empirical extraction data) and Phenomena.FillerGap.TobaBatak (verification theorems). This section bridges the derivational analysis to that data.

                        The EPP strategy for Toba Batak is VP-raising, which is the derivational mechanism that produces the freezing effect responsible for the extraction restriction.

                        The extraction profile marks only the subject position as extractable. This is exactly the position that VP-raising strands outside the fronted predicate: the pivot remains in Spec,vP while the predicate occupies Spec,TP (or Spec,FP in the full analysis).

                        The VOS Hypothesis #

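                        @cite{cole-hermon-2008} §5: SVO order is common in Toba Batak (about one third of sentences). Of the two competing analyses, the VOS Hypothesis derives SVO from VOS by raising the subject past the fronted VP.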

                        The data confirm the VOS Hypothesis: extraction from SVO clauses shows the same freezing effects as VOS (examples 85–88). Direct objects cannot be wh-fronted regardless of surface word order.

                        The derivation extends tobaBatakVOS with one more step: subject raises to Spec,FP (a higher functional projection), past the fronted VP.

                        This analysis connects to the claim in §6 that linear order within Merge is irrelevant — only c-command matters. This is precisely the content of the Linear Correspondence Axiom (LCA) formalized in Theories.Syntax.Minimalism.Formal.Linearization.LCA.

                        Toba Batak SVO via the VOS Hypothesis.

                        Steps 1–5 are identical to tobaBatakVOS (yielding VOS at stage 5). Then:

                          6. EM-L F → [FP F [TP VP [T' T [vP Subj [v' v tVP]]]]]
                          7. IM Subj → [FP Subj [F' F [TP VP [T' T [vP tSubj [v' v tVP]]]]]]

                        The subject raises past the fronted VP, yielding S-V-O surface order.


                          The VOS Hypothesis derives SVO surface order.

                          SVO goes through VOS: at stage 5 (before subject-raising), the intermediate tree has VOS order — the same as tobaBatakVOS.final.

                          The VOS Hypothesis predicts identical extraction restrictions for SVO: the DO is still inside the fronted VP, regardless of whether the subject subsequently raises past it.

                          SVO requires two movement steps (VP-raising + subject-raising).
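The four claims above (SVO surface order, a VOS intermediate stage, the DO still inside the fronted VP, and two movement steps) can be illustrated with a compact Python sketch. The nested-tuple encoding, the `move` helper, and the set of silent items are assumptions made for this example and are not the Linglib definitions.

```python
def label(t):
    return t[0] if isinstance(t, tuple) else t

def replace(t, target, new):
    """Copy of t with the subtree equal to target swapped for new."""
    if t == target:
        return new
    if isinstance(t, tuple):
        return (t[0],) + tuple(replace(c, target, new) for c in t[1:])
    return t

def move(tree, mover, bar):
    """Internal Merge: mover becomes the specifier, a trace is left behind,
    and the old root is relabeled as the bar level."""
    rest = replace(tree, mover, "t" + label(mover))
    return (tree[0], mover, (bar,) + rest[1:])

TRACES = {"tVP", "tSubj"}
SILENT = {"v", "T", "F"} | TRACES  # assumption: functional heads are null

def leaves(t):
    return [w for c in t[1:] for w in leaves(c)] if isinstance(t, tuple) else [t]

def surface(t):
    return [w for w in leaves(t) if w not in SILENT]

def has_subtree(t, sub):
    return t == sub or (isinstance(t, tuple)
                        and any(has_subtree(c, sub) for c in t[1:]))

vp = ("VP", "V", "Obj")
stage4 = ("TP", "T", ("vP", "Subj", ("v'", "v", vp)))
stage5 = move(stage4, vp, "T'")      # step 5: IM VP    (VOS)
stage6 = ("FP", "F", stage5)         # step 6: EM-L F
final = move(stage6, "Subj", "F'")   # step 7: IM Subj  (SVO)

print(surface(stage5))                                   # ['V', 'Obj', 'Subj']
print(surface(final))                                    # ['Subj', 'V', 'Obj']
print(has_subtree(final, vp) and "Obj" in leaves(vp))    # True: DO still in fronted VP
print(sum(w in TRACES for w in leaves(final)))           # 2 movement steps
```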

                          English passives and the agent-as-adjunct analysis #

                          @cite{cole-hermon-2008} §7 extends the VP-raising analysis to English passives, predicting why English and Toba Batak differ on passive binding.

                          The key structural difference: in TB, the passive agent is an argument generated in Spec,vP (high position, c-commands patient in VP). In English, the passive agent is an adjunct (by-phrase, low position inside VP, does not c-command patient).

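                          Consequence: a TB passive agent can bind a reflexive passive subject, whereas an English by-phrase agent cannot bind a reflexive in subject position.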

                          We model the English passive with the agent as a low complement of V (representing the by-phrase adjunct) and the patient as specifier of VP (following @cite{larson-1988}), with no external argument in Spec,vP.

                          The passive VP: [VP patient [V' V agent-PP]].


                            English passive derivation (trees 97–100 of the paper).

                            Steps:

                            1. EM-R agent-PP → [V' V agent]
                            2. EM-L patient → [VP patient [V' V agent]]
                            3. EM-L v → [v' v VP] (no external argument — passive)
                            4. EM-L T → [TP T [vP v VP]]
                            5. IM patient → [TP patient [T' T [vP v [VP t [V' V agent]]]]]
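The passive steps can be traced the same way. This Python sketch is a hedged illustration: the tuple encoding, the `move` helper, and the placeholder leaves `patient`, `V`, and `agent` are assumptions, and v and T are again assumed to be silent.

```python
def replace(t, target, new):
    """Copy of t with the subtree equal to target swapped for new."""
    if t == target:
        return new
    if isinstance(t, tuple):
        return (t[0],) + tuple(replace(c, target, new) for c in t[1:])
    return t

def move(tree, mover, bar, trace):
    """Internal Merge: mover becomes the specifier and leaves `trace` behind."""
    rest = replace(tree, mover, trace)
    return (tree[0], mover, (bar,) + rest[1:])

def bracket(t):
    if isinstance(t, tuple):
        return "[" + " ".join([t[0]] + [bracket(c) for c in t[1:]]) + "]"
    return t

SILENT = {"v", "T", "t"}  # assumption: null heads and the trace are unpronounced

def surface(t):
    if isinstance(t, tuple):
        return [w for c in t[1:] for w in surface(c)]
    return [] if t in SILENT else [t]

v_bar = ("V'", "V", "agent")                       # 1. EM-R agent-PP
big_vp = ("VP", "patient", v_bar)                  # 2. EM-L patient
little_vp = ("vP", "v", big_vp)                    # 3. EM-L v (no external argument)
stage4 = ("TP", "T", little_vp)                    # 4. EM-L T
final = move(stage4, "patient", "T'", trace="t")   # 5. IM patient

print(bracket(final))  # [TP patient [T' T [vP v [VP t [V' V agent]]]]]
print(surface(final))  # ['patient', 'V', 'agent']
```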

                              English passive yields patient-verb-agent surface order.

                              TB vs English passive binding #

                              The same theory (c-command based binding + optional reconstruction) with different structural parameters (agent-as-argument vs agent-as-adjunct) correctly predicts the cross-linguistic contrast:

                              Pattern                       | Toba Batak | English
                              ------------------------------+------------+--------
                              Agent antecedes patient refl. | ✓ (Type A) | ✗ (96)
                              Patient antecedes agent refl. | ✓ (Type B) | ✓ (95)

                              Both predictions follow from c-command in the derived tree:

                              The formalization verifies the c-command predictions computationally using cCommandsInB over the derived trees.

                              In the English passive, the patient (raised to Spec,TP) c-commands the by-phrase agent. This is why "The boy was injured by himself" is grammatical: the patient can bind a reflexive in the agent position.

                              In the English passive, the by-phrase agent does NOT c-command the patient. This is why "*Himself was injured by the boy" is ungrammatical: the agent (low adjunct inside VP) cannot bind a reflexive in subject position.

                              In the TB active base structure (pre-movement, stage 4), the subject c-commands the object. This is the structural basis for Type A binding (active subject → DO refl).

                              Binding is evaluated at the pre-movement stage: the base tree is [TP T [vP Subj [v' v [VP V Obj]]]], where Subj's sister (v') contains the object. After VP-raising, the object is carried along inside the fronted VP and the subject no longer c-commands it; reconstruction restores the base c-command.

                              Cross-linguistic contrast verified: same c-command theory, different structural parameters, different binding predictions.

                              The conjunction links four c-command checks across two languages:

                              1. TB active (base): subject c-commands object (Type A binding)
                              2. TB active (derived): object does not c-command subject (Type C)
                              3. English passive (derived): patient c-commands agent (ex. 95)
                              4. English passive (derived): agent does not c-command patient (ex. 96)
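The four checks can be reproduced over hand-written trees in a few lines of Python. This is a sketch under stated assumptions: c-command is defined by sister containment, nodes are identified by their (unique) labels, and the function below is a stand-in written for this example, not the `cCommandsInB` function of the formalization.

```python
def label(t):
    return t[0] if isinstance(t, tuple) else t

def subtrees(t):
    yield t
    if isinstance(t, tuple):
        for child in t[1:]:
            yield from subtrees(child)

def contains(t, lab):
    return any(label(s) == lab for s in subtrees(t))

def c_commands(tree, a, b):
    """a c-commands b iff the sister of a contains b."""
    for node in subtrees(tree):
        if isinstance(node, tuple):
            _, left, right = node
            if label(left) == a and contains(right, b):
                return True
            if label(right) == a and contains(left, b):
                return True
    return False

# Trees copied from the bracketings in the text; (label, left, right) tuples.
TB_BASE = ("TP", "T", ("vP", "Subj", ("v'", "v", ("VP", "V", "Obj"))))
TB_DERIVED = ("TP", ("VP", "V", "Obj"),
              ("T'", "T", ("vP", "Subj", ("v'", "v", "tVP"))))
EN_PASSIVE = ("TP", "patient",
              ("T'", "T", ("vP", "v", ("VP", "t", ("V'", "V", "agent")))))

checks = [
    c_commands(TB_BASE, "Subj", "Obj"),              # 1. Type A
    not c_commands(TB_DERIVED, "Obj", "Subj"),       # 2. Type C
    c_commands(EN_PASSIVE, "patient", "agent"),      # 3. ex. 95
    not c_commands(EN_PASSIVE, "agent", "patient"),  # 4. ex. 96
]
print(all(checks))  # True
```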