
Linglib.Phenomena.Reference.Studies.WaldonDegen2021

@cite{waldon-degen-2021}: Continuous-Incremental RSA (CI-RSA)

@cite{cohn-gordon-goodman-potts-2019} @cite{degen-etal-2020}

Waldon, B. & Degen, J. (2021). Modeling cross-linguistic production of referring expressions. Proceedings of the Society for Computation in Linguistics (SCiL) 4, 206–215.

The Model

CI-RSA synthesizes two RSA extensions:

  1. Incremental RSA (@cite{cohn-gordon-goodman-potts-2019}): Word-by-word production via the chain rule S1(u|r) = ∏ₖ S1(wₖ | [w₁,...,wₖ₋₁], r)
  2. Continuous semantics (@cite{degen-etal-2020}): Noisy adjective reliability L^C(r, i) = v^i if i true of r, else 1 - v^i
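
The chain rule in item 1 can be illustrated with a minimal Python sketch. The function name `trajectory_prob`, the lookup table, and its probabilities are hypothetical and chosen only for illustration; they are not taken from the paper or from the Lean formalization.

```python
from math import prod

def trajectory_prob(utterance, referent, s1_word):
    """Chain rule: S1(u | r) = product over k of S1(w_k | w_1..w_{k-1}, r)."""
    return prod(
        s1_word(word, tuple(utterance[:k]), referent)
        for k, word in enumerate(utterance)
    )

# Hypothetical per-word production probabilities, keyed by (prefix, next word).
TOY_TABLE = {
    ((), "small"): 0.6,
    (("small",), "blue"): 0.5,
    (("small", "blue"), "pin"): 1.0,
    (("small", "blue", "pin"), "stop"): 1.0,
}

def toy_s1_word(word, prefix, referent):
    # The toy table ignores the referent; unseen continuations get probability 0.
    return TOY_TABLE.get((prefix, word), 0.0)

p = trajectory_prob(["small", "blue", "pin", "stop"], "small_blue", toy_s1_word)
print(p)
```

Any continuation missing from the table contributes a factor of 0, so a trajectory with an unsupported word has probability 0 overall.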

The incremental meaning function averages continuous semantics over grammatical completions of the current prefix:

X^C(c, i, r) = Σ_{u ⊒ c+i} ⟦u⟧^C(r) / |{u : u ⊒ c+i}|

The utterance set is scene-filtered: only utterances Boolean-true of at least one scene member are included (Figure 1).

Formalization

This builds on RSAConfig's sequential infrastructure (following @cite{cohn-gordon-goodman-potts-2019}), adding:

The three predictions are trajectory probability comparisons across different RSAConfig instances (language × scene).

Predictions

| # | Prediction | Status |
|---|------------|--------|
| 1 | English color/size asymmetry: SS > CS | rsa_predict |
| 2 | Cross-linguistic: English SS > Spanish SS | rsa_predict |
| 3 | Spanish flip: CS > SS for redundant size | rsa_predict |
| 4 | Overall: English total > Spanish total | rsa_predict |

Connections

Words available to the incremental speaker: two color adjectives, two size adjectives, a noun ("pin"), and an explicit stop token. The stop token models the speaker's choice to end the utterance; without it, postnominal word orders have no way to represent the stopping decision after the noun (cf. English, where "pin" naturally terminates utterances).

Referents in the 2×2 reference game: big/small × blue/red.

Whether a word is veridically true of a referent.

Continuous lexical interpretation L^C(r, i). Returns v^i if word i is true of referent r, and 1 - v^i otherwise.

Continuous utterance meaning ⟦u⟧^C(r) = ∏_{w ∈ u} L^C(r, w).
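
As a rough illustration of L^C and ⟦u⟧^C, the following Python sketch computes both for a referent given as a (size, color) pair. The color reliability 0.95 is the value quoted in this documentation; the size reliability and the treatment of "pin" and "stop" as always true with reliability 1 are placeholders assumed for the example, not values confirmed by the source.

```python
from math import prod

# Reliability v^i per word. Only the color value (0.95) comes from this
# documentation; every other entry is a hypothetical placeholder.
RELIABILITY = {
    "blue": 0.95, "red": 0.95,
    "big": 0.8, "small": 0.8,   # hypothetical size reliability
    "pin": 1.0, "stop": 1.0,    # hypothetical: noun and stop never misfire
}

def word_true(word, referent):
    """Boolean truth of a word for a referent given as (size, color)."""
    size, color = referent
    if word in ("big", "small"):
        return word == size
    if word in ("blue", "red"):
        return word == color
    return True  # assumption: "pin" and "stop" hold of every referent

def lex_continuous(word, referent):
    """L^C(r, i): v^i if the word is true of the referent, else 1 - v^i."""
    v = RELIABILITY[word]
    return v if word_true(word, referent) else 1 - v

def utterance_meaning(utterance, referent):
    """Continuous utterance meaning: product of L^C over the words."""
    return prod(lex_continuous(w, referent) for w in utterance)

target = ("small", "blue")
matching = utterance_meaning(("small", "blue", "pin", "stop"), target)
mismatching = utterance_meaning(("big", "blue", "pin", "stop"), target)
print(matching, mismatching)
```

Because a false word contributes 1 - v^i rather than 0, a mismatching utterance keeps a small positive meaning instead of being ruled out.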

All grammatical English (prenominal) utterances, each terminated by .stop. In English the noun always comes last before stop, so "pin" naturally precedes the stopping decision.

All grammatical Spanish (postnominal) utterances, each terminated by .stop. The stop token is critical here: after [pin, blue], S1 chooses between .stop (ending the 2-word non-redundant utterance) and .small (continuing to the 3-word redundant utterance). Without .stop, the model forces continuation whenever valid extensions exist.

Scene-filtered utterances: only those Boolean-true of at least one scene member (Figure 1). This yields 7 utterances per scene.

Incremental continuous meaning: the average of the continuous semantics over all grammatical completions of the prefix c extended by word i.

X^C(c, i, r) = Σ_{u ⊒ c+i} ⟦u⟧^C(r) / |{u : u ⊒ c+i}|
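
The averaging in X^C can be sketched in Python as follows. The three-utterance postnominal set, the size reliability, and the encoding of a referent as the set of words true of it are all assumptions made for this example; returning 0 when no completion exists is likewise a choice of the sketch, not something the source specifies.

```python
from math import prod

# Hypothetical reliabilities; only the color value 0.95 is quoted in this page.
RELIABILITY = {"blue": 0.95, "small": 0.8, "big": 0.8, "pin": 1.0, "stop": 1.0}

# A referent is encoded as the set of words that are true of it (assumption).
TARGET = frozenset({"small", "blue", "pin", "stop"})
DISTRACTOR = frozenset({"big", "blue", "pin", "stop"})

# Toy postnominal utterance set, not the full set used in the formalization.
UTTERANCES = [
    ("pin", "stop"),
    ("pin", "blue", "stop"),
    ("pin", "blue", "small", "stop"),
]

def lex_continuous(word, referent):
    v = RELIABILITY[word]
    return v if word in referent else 1 - v

def utterance_meaning(utterance, referent):
    return prod(lex_continuous(w, referent) for w in utterance)

def incremental_meaning(prefix, word, referent, utterances):
    """X^C(c, i, r): mean utterance meaning over completions of c + i."""
    extended = tuple(prefix) + (word,)
    completions = [u for u in utterances if u[:len(extended)] == extended]
    if not completions:
        return 0.0  # assumption: no grammatical completion means meaning 0
    total = sum(utterance_meaning(u, referent) for u in completions)
    return total / len(completions)

x_blue = incremental_meaning(("pin",), "blue", TARGET, UTTERANCES)
x_stop = incremental_meaning(("pin", "blue"), "stop", TARGET, UTTERANCES)
x_small = incremental_meaning(("pin", "blue"), "small", TARGET, UTTERANCES)
print(x_blue, x_stop, x_small)
```

Here `x_blue` averages over two completions ("pin blue" and "pin blue small"), while `x_stop` and `x_small` each have exactly one completion.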

Real-valued continuous meaning (for RSAConfig).

CI-RSA configuration parameterized by utterance set and scene.

• L0 uses extension-based continuous meaning, returning 0 for referents outside the scene
• S1 uses rpow-based scoring with α = 7 and per-word cost C(i)
• S1(i|c,r) ∝ L0(r|c,i)^α · exp(−α · C(i)) (Section 4)

Note: v^color = 0.95 here, matching the paper's fitted values. This differs from the @cite{degen-etal-2020} value of v^color = 0.99 used in RSA.Core.Noise, because the two papers fit different experimental datasets.
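
The L0 and S1 rules listed above can be sketched in Python. This is an illustration under stated assumptions, not the RSAConfig implementation: the function names are hypothetical, L0 is assumed to normalize over scene members with a uniform prior, and the numeric inputs are made up. Only α = 7 and the scoring form L0^α · exp(−α · C(i)) come from this documentation.

```python
from math import exp

ALPHA = 7.0  # rationality parameter stated in this documentation

def literal_listener(meanings, scene):
    """L0(r | c, i): normalized meaning over the scene, 0 outside the scene.

    `meanings` maps referent -> X^C(c, i, r). A uniform prior over scene
    members is an assumption of this sketch.
    """
    masked = {r: (m if r in scene else 0.0) for r, m in meanings.items()}
    total = sum(masked.values())
    return {r: (m / total if total > 0 else 0.0) for r, m in masked.items()}

def s1_word_distribution(l0_scores, costs, alpha=ALPHA):
    """S1(i | c, r) proportional to L0(r | c, i)^alpha * exp(-alpha * C(i)).

    `l0_scores` maps candidate next word -> L0(r | c, i) for a fixed target r.
    """
    unnorm = {
        w: (l0 ** alpha) * exp(-alpha * costs[w])
        for w, l0 in l0_scores.items()
    }
    total = sum(unnorm.values())
    return {w: (s / total if total > 0 else 0.0) for w, s in unnorm.items()}

# Hypothetical inputs, for illustration only.
l0 = literal_listener({"A": 0.76, "B": 0.19, "C": 0.5}, scene={"A", "B"})
dist = s1_word_distribution(
    {"stop": 0.5, "small": 0.6},
    costs={"stop": 0.0, "small": 0.0},
)
print(l0, dist)
```

With α = 7, a modest difference in listener probability (0.6 versus 0.5) is amplified into a speaker preference of roughly 3.6 to 1.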

English (prenominal) CI-RSA in size-sufficient scene.

English (prenominal) CI-RSA in color-sufficient scene.

Spanish (postnominal) CI-RSA in size-sufficient scene.

Spanish (postnominal) CI-RSA in color-sufficient scene.

Color adjectives have higher reliability than size adjectives. This asymmetry drives the redundant modification predictions.

All semantic values are positive (required for valid probability).

lexContinuousQ is an instance of the unified noise channel from RSA.Core.Noise. The continuous lexical semantics L^C(r, i) is exactly the noise channel with onMatch = v^i, onMismatch = 1 - v^i, and b = 1 if item i is true of referent r and b = 0 otherwise.

This connects @cite{waldon-degen-2021} to the @cite{degen-etal-2020} parameterization where mismatch = 1 - match.
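
The correspondence with the noise channel can be checked numerically in a small Python sketch. The linear form `on_match * b + on_mismatch * (1 - b)` is an assumption about how the channel in RSA.Core.Noise is defined; this page only states the parameter correspondence, so treat the function below as hypothetical.

```python
def noise_channel(on_match, on_mismatch, b):
    # Assumed channel form: interpolate between match and mismatch values.
    return on_match * b + on_mismatch * (1 - b)

def lex_continuous(v, is_true):
    """L^C(r, i) given reliability v and the Boolean truth of i for r."""
    return v if is_true else 1 - v

# Compare the two definitions on a few illustrative reliabilities.
agreement = all(
    abs(noise_channel(v, 1 - v, 1.0 if is_true else 0.0)
        - lex_continuous(v, is_true)) < 1e-12
    for v in (0.95, 0.8, 0.5)
    for is_true in (True, False)
)
print(agreement)
```

Under the assumed form, setting b to the Boolean truth value selects onMatch or onMismatch outright, which is why the two definitions coincide.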