
Linglib.Phenomena.Reference.Studies.KehlerRohde2013

@cite{kehler-rohde-2013} #

@cite{hobbs-1979} @cite{kehler-2002}

A Probabilistic Reconciliation of Coherence-Driven and Centering-Driven Theories of Pronoun Interpretation. Theoretical Linguistics 39(1-2), 1–37.

Core Argument #

Two theories make seemingly irreconcilable claims about pronoun interpretation. @cite{hobbs-1979}: it is a by-product of coherence establishment; grammatical form is irrelevant. Centering (Grosz, Joshi & Weinstein 1995): it is driven by information structure and grammatical roles; world knowledge is irrelevant.

The reconciliation is a Bayesian decomposition (eq. 13):

P(referent | pronoun) ∝ P(pronoun | referent) × P(referent)

The two terms have different conditioning: P(pronoun | referent) is a production bias conditioned on topichood (the centering side), while P(referent) is a next-mention prior conditioned on coherence relations (the Hobbs side).

Five experiments with transfer-of-possession verbs and IC verbs confirm that these two components are empirically dissociable.
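The decomposition in eq. (13) can be sketched in a few lines of Python (the numbers here are illustrative only, not the paper's estimates; the library's own formalization is the Lean declaration this page documents):

```python
# Sketch of eq. (13): P(referent | pronoun) ∝ P(pronoun | referent) × P(referent).
# Illustrative numbers, not data from the paper.

def posterior(likelihood: dict, prior: dict) -> dict:
    """Normalize P(pronoun | ref) × P(ref) over the candidate referents."""
    unnorm = {ref: likelihood[ref] * prior[ref] for ref in prior}
    z = sum(unnorm.values())
    return {ref: v / z for ref, v in unnorm.items()}

# Toy case: the subject is pronominalized more often (centering side),
# but the next-mention prior favors the non-subject (coherence side).
p = posterior(likelihood={"subject": 0.8, "nonsubject": 0.2},
              prior={"subject": 0.3, "nonsubject": 0.7})
```

The two factors pull in opposite directions here, which is exactly the tension the experiments below exploit.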

Key Findings #

1. Imperfective → more Source interpretations than perfective (§3)
2. Coherence relations strongly condition next-mention bias (§4)
3. Shifting P(CR) via instructions shifts interpretation (§5)
4. P(referent | CR) stable across conditions (§6)
5. Pronoun prompt shifts CR distribution bidirectionally (§7)
6. Voice affects next-mention but not pronominalization per position (§8)
7. Passive subject → more pronominalization than active subject (§8)
8. Bayesian predictions match actual interpretation biases (§8)
9. Contiguity class splits: Occasion → Goal, Elaboration → Source (§9)

Independence Hypothesis #

P(pronoun | referent) is conditioned by topichood/subjecthood, while P(referent) is conditioned by coherence relations. These two components are independent: coherence-driven semantic biases affect next-mention but NOT pronominalization rate.

Prompt type in passage completion experiments.


Instruction condition (transfer-of-possession experiments).


          Eq. (9): coherence-marginalized next-mention bias.

          P(referent) = Σ_CR P(CR) × P(referent | CR)

          The prior probability of a referent being mentioned next is a mixture of CR-specific biases weighted by the prior over coherence relations. This is the coherence-driven "top-down" component.
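A minimal Python sketch of this mixture, assuming a toy CR inventory (the labels and probabilities below are placeholders, not Table 2's full distribution):

```python
# Sketch of eq. (9): P(referent) = Σ_CR P(CR) × P(referent | CR).
# CR labels and numbers are illustrative placeholders.

def marginal_next_mention(p_cr: dict, p_ref_given_cr: dict) -> float:
    """Mixture of CR-specific next-mention biases weighted by the CR prior."""
    assert abs(sum(p_cr.values()) - 1.0) < 1e-9
    return sum(p_cr[cr] * p_ref_given_cr[cr] for cr in p_cr)

# Two opposed relations plus a neutral remainder:
p_source = marginal_next_mention(
    p_cr={"Occasion": 0.4, "Elaboration": 0.3, "Other": 0.3},
    p_ref_given_cr={"Occasion": 0.2, "Elaboration": 1.0, "Other": 0.5},
)
```

Even this toy case shows how a moderate overall bias can mask strongly opposed CR-conditioned biases.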


            Topichood level, determined by grammatical construction.

            Passive subjects signal stronger topichood than active subjects: using a marked construction to place an entity in subject position is a stronger indicator that the speaker treats it as the sentence topic (Davison 1984). This is the centering-driven "bottom-up" component of the model.

            The P(pronoun | referent) term in eq. (13) tracks this level, not grammatical role per se.


                Table 1: Source interpretation rate by aspect. Imperfective focuses on ongoing event (Source still central); perfective focuses on end state (Goal = endpoint of transfer).


                  Coherence relation frequency and bias data from Table 2 (perfective condition, transfer-of-possession verbs). "Violated Expectation" in the paper = CoherenceRelation.contrast.


                      The overall ~57/43 Source/Goal split masks strong CR-conditioned biases. Occasion is most common (.38) and Goal-biased (.18 Source); Elaboration is second (.28) and strongly Source-biased (.98).

                      Instantiate the perfective-condition next-mention model with Table 2 data. Downstream study files can reference these CR biases.


                        Table 3: "What happened next?" → Occasion-dominated; "Why?" → Explanation-dominated. Instructions shift P(CR) without changing the stimuli.


                          Table 5: Source interpretation by instruction condition (perfective). Shifting P(CR) shifts P(referent), as predicted by eq. (9).


The instruction effect is 48 percentage points on identical stimuli. No morphosyntactic heuristic can account for this.

                            Table 4: P(Source | CR) is stable across the original experiment and the instruction manipulation, supporting the structural claim that CR-conditioned biases are properties of the coherence relation itself, not the experimental context.


                                Table 6: CR distribution by prompt type. The mere presence of an ambiguous pronoun shifts coherence expectations toward Source-biased relations. This bidirectionality — coreference affects coherence, not just vice versa — is predicted by Bayes (eq. 12) but not by Hobbs (pronouns are inert free variables) or Centering (does not model coherence).


                                                        Voice affects next-mention in pronoun condition: active (.77) vs passive (.42). Passivization moves the causally-implicated referent out of subject position — same proposition, different bias.

                                                        In the no-pronoun condition the pattern reverses: passive (.76) > active (.59). By-phrases are optional in English, so their inclusion signals the referent will be re-mentioned.

                                                        Voice affects coherence in pronoun condition: active produces more Explanations than passive. Since propositions are identical, this is mediated by the shift in pronominal reference — demonstrating bidirectional coherence–coreference dependency.

                                                        Central topichood prediction: passive subjects are pronominalized more than active subjects (87% vs 62%).

                                                        This is NOT explicable by grammatical role alone — both are subjects. It reflects the stronger topichood signal of the passive: using a marked syntactic form to place an entity in subject position is a stronger indicator of topic status. This is the key evidence that P(pronoun | referent) tracks TOPICHOOD, not subjecthood.

                                                        Non-subject pronominalization is invariant across voice (24% vs 23%). At the same topichood level (low), the voice manipulation — which changes coherence expectations dramatically — has no effect on pronominalization rate. This is the Independence Hypothesis in action: P(pronoun | referent) does not depend on coherence-driven factors.

                                                        Subjects are pronominalized more than non-subjects in both voices. This subject advantage is the centering-derived component.

Topichood monotonically predicts pronominalization: strong (passive subject, 87%) > default (active subject, 62%) > low (non-subject, ~24%).

                                                        Bayesian predictions are directionally correct: active > passive in both predicted and actual biases.

                                                        Compute the coherence-marginalized Source bias from a NextMentionModel. This IS equation (9): P(Source) = Σ_CR P(CR) × P(Source | CR). Result is in basis points (×10000); divide by 100 for percentage.


                                                              Structural invariant: the two instruction models share the same CR-conditioned biases. The instruction manipulation changes P(CR) while holding P(ref|CR) constant. This is the structural content of Table 4.

Eq. (9) derivation: the "Why?" mixture exceeds the "What next?" mixture. This is DERIVED from the model, not read off Table 5. The proof computes:

Why: 1×27 + 91×82 + 8×100 + 1×74 + 0×9 + 0×50 = 8363
What next: 71×27 + 1×82 + 5×100 + 8×74 + 5×9 + 10×50 = 3636

and verifies 8363 > 3636. The direction follows from Explanation (Source-biased at 82%) dominating the Why mixture at 91%.

                                                              The computed mixtures are consistent with Table 5: Why → ~84% Source, What-next → ~36% Source (vs observed 82% and 34%). The small discrepancy is from integer rounding and the "Other" CR category.
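The arithmetic can be replayed in a short Python script. Assigning CR names to the six terms is my reading of the derivation (Explanation = 82% Source is stated; the rest of the mapping is an assumption), but the arithmetic depends only on the numbers:

```python
# Replays the eq. (9) mixture arithmetic in basis points (×10000).
# CR labels beyond Explanation are inferred, not stated in the source.

P_SOURCE_GIVEN_CR = {  # P(Source | CR), in %
    "Occasion": 27, "Explanation": 82, "Elaboration": 100,
    "Result": 74, "ViolatedExpectation": 9, "Other": 50,
}

P_CR_WHY = {"Occasion": 1, "Explanation": 91, "Elaboration": 8,
            "Result": 1, "ViolatedExpectation": 0, "Other": 0}
P_CR_WHAT_NEXT = {"Occasion": 71, "Explanation": 1, "Elaboration": 5,
                  "Result": 8, "ViolatedExpectation": 5, "Other": 10}

def mixture_bp(p_cr: dict) -> int:
    """Σ_CR P(CR)% × P(Source | CR)%  →  basis points."""
    return sum(p_cr[cr] * P_SOURCE_GIVEN_CR[cr] for cr in p_cr)

why_bp = mixture_bp(P_CR_WHY)              # 8363 basis points ≈ 84% Source
what_next_bp = mixture_bp(P_CR_WHAT_NEXT)  # 3636 basis points ≈ 36% Source
```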

                                                              Compute P(Subject | pronoun) via Bayes' rule (eq. 13). Takes P(Subject next-mentioned) from no-pronoun data and P(pronoun | position) from pronominalization rates. Result is a percentage (0–100).


                                                                Eq. (13) derivation: active voice. From:

• P(Subject) = 59% (Table 7, no-pronoun, causal ref = subject)
• P(pronoun | Subject) = 62% (Table 9)
• P(pronoun | NonSubject) = 24% (Table 9)

Bayes' rule yields: 62×59 / (62×59 + 24×41) = 3658/4642 ≈ 79%. The paper reports 81% (from unrounded data); the direction matches.

                                                                Eq. (13) derivation: passive voice. From:

• P(Subject) = 100 − 76 = 24% (Table 7: 76% mention the causal ref, who is the NON-subject in passive)
• P(pronoun | Subject) = 87% (Table 9)
• P(pronoun | NonSubject) = 23% (Table 9)

Bayes' rule yields: 87×24 / (87×24 + 23×76) = 2088/3836 ≈ 54%.

                                                                Central Bayesian prediction: Bayes' rule correctly derives that active > passive for P(Subject | pronoun), even though passive subjects are more likely to be pronominalized (87% vs 62%). The prior P(Subject) is much lower in passive (24% vs 59%), and this dominates. Production bias alone would predict passive > active; the Bayesian model correctly reverses this.
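Both eq. (13) derivations can be checked with a few lines of Python, using the percentages quoted above (this mirrors the library's Lean computation, it is not the library's code):

```python
# Checks the two eq. (13) derivations; all inputs are percentages (0–100).

def p_subject_given_pronoun(p_subj: float, p_pron_subj: float,
                            p_pron_nonsubj: float) -> float:
    """Bayes' rule over subject vs non-subject referents, in %."""
    num = p_pron_subj * p_subj
    den = num + p_pron_nonsubj * (100 - p_subj)
    return 100 * num / den

active = p_subject_given_pronoun(59, 62, 24)   # ≈ 78.8% (paper: 81%)
passive = p_subject_given_pronoun(24, 87, 23)  # ≈ 54.4%
```

The low passive prior (24%) overwhelms the higher passive pronominalization rate (87%), so active > passive comes out of the model, as the text argues.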

                                                                Explanation is Source-biased and selects for causes (backward causal). For transfer verbs, the Source/initiator is the cause. For IC verbs, the stimulus is the cause — this is the bridge to IC bias studies.

                                                                Key insight: the contiguity class does NOT uniformly predict bias. Occasion (18% Source) and Elaboration (98% Source) are both contiguity relations but have opposite biases. Occasion focuses on the END STATE (Goal); Elaboration redescribes the SAME EVENT (Source/initiator). The bias is determined by the specific relation, not the class.