Thurstone's Theory of Discriminal Processes @cite{luce-1959} #
@cite{luce-1959} §2.D (pp. 53-60): @cite{thurstone-1927}'s Case V model of paired comparison, and the logistic approximation that connects it to the Luce choice rule.
Thurstone Case V #
Each stimulus a evokes a random discriminal process — a Gaussian random
variable with mean u(a) (the scale value) and standard deviation σ. When
a subject compares a and b, they sample one discriminal process for each
stimulus. The probability of choosing a over b is the probability that the
sample for a exceeds that for b:
P(a,b) = Φ((u(a) - u(b)) / (σ√2))
where Φ is the standard normal CDF. Case V assumes equal variances across
all stimuli — the "simplest nontrivial case" in Thurstone's taxonomy.
The Logistic Approximation (pp. 58-59) #
Luce observes that the logistic function 1/(1 + exp(-x)) closely approximates
the rescaled normal CDF Φ(x · √3/π), where the factor √3/π matches the
variances of the two distributions. The maximum absolute deviation between
the two is approximately 0.023. This means Thurstone's Case V is approximately
a special case of the Luce model:
P(a,b) ≈ 1/(1 + exp(-k(u(a) - u(b))))
for k = π / (σ√6). This logistic approximation is what connects Case V to
Luce's ratio-scale framework (§2.A) and hence to softmax (§2).
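To make the link to the ratio-scale form concrete, here is a minimal Python sketch. It assumes the Luce scale is taken to be v(x) = exp(k·u(x)), which is an illustrative choice rather than something stated in the source; the names `logistic` and `luce_pair` are hypothetical helpers. Under that assumption the logistic expression above and the ratio v(a)/(v(a)+v(b)) are the same number.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def luce_pair(u_a: float, u_b: float, k: float) -> float:
    """Luce ratio rule v(a) / (v(a) + v(b)), assuming v(x) = exp(k * u(x))."""
    v_a = math.exp(k * u_a)
    v_b = math.exp(k * u_b)
    return v_a / (v_a + v_b)


# Illustrative values only (not taken from the source).
sigma = 1.5
k = math.pi / (sigma * math.sqrt(6))  # variance-matched constant
u_a, u_b = 2.0, 0.5

p_logistic = logistic(k * (u_a - u_b))
p_luce = luce_pair(u_a, u_b, k)
print(p_logistic, p_luce)  # the two values coincide up to rounding
```

The two-alternative ratio rule with exponential scales is exactly a two-item softmax, which is why the logistic form is the natural bridge between the models.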
Strong Stochastic Transitivity #
Thurstone Case V satisfies strong stochastic transitivity: if u(a) > u(b) > u(c),
then P(a,c) > max(P(a,b), P(b,c)). This is stronger than weak stochastic
transitivity, which requires only that P(a,c) ≥ 1/2 whenever P(a,b) ≥ 1/2 and
P(b,c) ≥ 1/2.
Thurstone's Case V model (@cite{thurstone-1927}; @cite{luce-1959}, §2.D).
Each stimulus has a scale value scale(a) and all stimuli share a common
discriminal dispersion sigma > 0. The choice probability is determined
by the normal CDF applied to the standardized scale difference.
- scale : Stimulus → ℝ
The scale value (mean of the discriminal process) for each stimulus.
- sigma : ℝ
The common discriminal dispersion (standard deviation).
The dispersion is strictly positive.
Choice probability under Thurstone Case V:
P(a,b) = Φ((u(a) - u(b)) / (σ√2)).
This is the probability that the discriminal process for a exceeds
that for b, when both are independent Gaussians with means u(a), u(b)
and common variance σ². The difference is Gaussian with mean
u(a) - u(b) and variance 2σ², hence standard deviation σ√2.
Equations
- m.choiceProb a b = Core.normalCDF ((m.scale a - m.scale b) / (m.sigma * √2))
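The closed form can be sanity-checked by simulation. The following Python sketch is not part of the formalization; `normal_cdf`, `choice_prob`, and `simulate_choice` are hypothetical names, and the parameter values are arbitrary. It draws one independent Gaussian sample per stimulus and counts how often the sample for a exceeds the sample for b.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def choice_prob(u_a: float, u_b: float, sigma: float) -> float:
    """Closed form P(a,b) = Phi((u(a) - u(b)) / (sigma * sqrt(2)))."""
    return normal_cdf((u_a - u_b) / (sigma * math.sqrt(2)))


def simulate_choice(u_a: float, u_b: float, sigma: float,
                    trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate: fraction of trials where a's sample beats b's."""
    rng = random.Random(seed)
    wins = sum(rng.gauss(u_a, sigma) > rng.gauss(u_b, sigma)
               for _ in range(trials))
    return wins / trials


exact = choice_prob(1.0, 0.0, 1.0)
estimate = simulate_choice(1.0, 0.0, 1.0, trials=200_000)
print(f"closed form: {exact:.4f}, simulated: {estimate:.4f}")
```

With 200,000 trials the sampling error is on the order of 0.001, so the two numbers should agree to about two decimal places.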
When u(a) = u(b), the choice probability is 1/2 (indifference).
Complementarity: P(a,b) + P(b,a) = 1.
If u(a) > u(b), then P(a,b) > 1/2 — the higher-scale stimulus
is chosen more often than chance.
The left half of strong stochastic transitivity (Thurstone Case V):
if u(a) > u(b) > u(c), then P(a,c) > P(a,b). The "big gap" comparison
a vs. c is easier than the "small gap" comparison a vs. b.
Proof: u(a) - u(c) > u(a) - u(b), so after dividing by σ√2 > 0,
the argument to Φ is larger, and Φ is strictly monotone.
The right half of strong stochastic transitivity:
if u(a) > u(b) > u(c), then P(a,c) > P(b,c).
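The properties listed above (indifference, complementarity, and both halves of strong stochastic transitivity) can be spot-checked numerically. This is only a sketch on one arbitrary set of scale values, not a substitute for the proofs; `normal_cdf`, `choice_prob`, and `p` are hypothetical helper names.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def choice_prob(u_a: float, u_b: float, sigma: float) -> float:
    """Thurstone Case V: Phi((u(a) - u(b)) / (sigma * sqrt(2)))."""
    return normal_cdf((u_a - u_b) / (sigma * math.sqrt(2)))


# Arbitrary example scale values with u(a) > u(b) > u(c).
sigma = 1.0
u = {"a": 2.0, "b": 1.0, "c": 0.0}


def p(x: str, y: str) -> float:
    """Choice probability between two named stimuli."""
    return choice_prob(u[x], u[y], sigma)


indifference = choice_prob(1.0, 1.0, sigma)         # equal scale values
complement_sum = p("a", "b") + p("b", "a")          # should be 1
left_half = p("a", "c") > p("a", "b")               # P(a,c) > P(a,b)
right_half = p("a", "c") > p("b", "c")              # P(a,c) > P(b,c)
print(indifference, complement_sum, left_half, right_half)
```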
Thurstone Case V and the Luce Model #
Set d = u(a) - u(b) and k = π / (σ · √6). Then the exact identity:
d / (σ√2) = k · d · (√3/π)
rewrites the Thurstone formula as:
P_T(a,b) = Φ(d / (σ√2)) = Φ(k·d · √3/π)
Since Φ(y · √3/π) ≈ logistic(y) numerically (max error ~0.023 with
variance matching; see @cite{luce-1959} §2.D.2, Table 3), this gives:
P_T(a,b) ≈ logistic(k·d) = 1/(1 + exp(-k·(u(a) - u(b))))
The constant k = π/(σ√6) arises from matching variances: a logistic
distribution with scale β has variance π²β²/3, while the difference of the
two discriminal processes (independent Gaussians with common variance σ²)
has variance 2σ². Setting π²β²/3 = 2σ²
gives β = σ√6/π, so k = 1/β = π/(σ√6).
The Gumbel-Luce model (GumbelLuce.lean) gives exactly logistic(d/β)
by McFadden's theorem, with no approximation. The Thurstone model gives
exactly Φ(d/(σ√2)). The two agree only up to the approximation
Φ(y·√3/π) ≈ logistic(y), which is a purely numerical fact
(~0.023 max error with variance matching, ~0.009
with the optimal constant 1.702).
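The quoted error figures can be reproduced with a simple grid scan. The sketch below is an assumption-laden check, not the source's method: the grid range [-8, 8], the step of 0.001, and the helper name `max_abs_error` are all illustrative choices. It compares logistic(y) with Φ(y·s) for the variance-matched factor s = √3/π and for s = 1/1.702.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def max_abs_error(scale: float, lo: float = -8.0, hi: float = 8.0,
                  steps: int = 16_001) -> float:
    """Largest |logistic(y) - Phi(y * scale)| over an evenly spaced grid."""
    width = (hi - lo) / (steps - 1)
    return max(
        abs(logistic(lo + i * width) - normal_cdf((lo + i * width) * scale))
        for i in range(steps)
    )


err_matched = max_abs_error(math.sqrt(3) / math.pi)  # variance matching
err_1702 = max_abs_error(1.0 / 1.702)                # constant 1.702
print(f"variance-matched: {err_matched:.4f}, constant 1.702: {err_1702:.4f}")
```

On this grid the variance-matched error comes out near 0.023 and the 1.702 error just under 0.01, in line with the figures quoted above.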
The scaling constant connecting Thurstone and Luce:
k = π / (σ · √6) so that (u(a)-u(b))/(σ√2) = k·(u(a)-u(b))·(√3/π).
Thurstone–Luce identity (@cite{luce-1959}, §2.D): the Thurstone
choice probability equals normalCDF evaluated at the variance-matched
Luce argument scaled by √3/π.
P_T(a,b) = Φ(d/(σ√2)) = Φ(k·d·√3/π)
where k = π/(σ√6) and d = u(a) - u(b). Since Φ(y·√3/π) ≈ logistic(y)
numerically, this gives P_T(a,b) ≈ logistic(k·d) — the Luce model.
The approximation Φ(y·√3/π) ≈ logistic(y) has max error ~0.023
(variance matching) and is a numerical fact without analytical proof.
Theorem 7: Luce and Thurstone Diverge for Three or More Alternatives #
@cite{luce-1959} Theorem 7 (§2.D.3): for pairwise comparisons (n = 2),
the Luce and Thurstone models are approximately equivalent
(thurstone_luce_identity). For n ≥ 3 alternatives, they are
fundamentally incompatible: no independent Thurstone discriminal
processes can generate both the "choose best" and "choose worst"
probabilities predicted by the Luce model.
The proof has two steps:
1. Thurstone integral identity: for independent discriminal processes,
   P_best(x) - P_worst(x) = P(x,y) + P(x,z) - 1 (expanding the product of CDFs).
2. Algebraic contradiction: under axiom 1,
   P_best(x) = v(x)/Σv and P_worst(x) = (1/v(x))/Σ(1/v). Setting the axiom 1
   difference equal to P(x,y) + P(x,z) - 1 forces P(x,y)·P(y,x)·P(z,x) = 0,
   contradicting non-degeneracy.
The algebraic core (step 2) is formalized below.
Luce–Thurstone incompatibility (@cite{luce-1959}, Theorem 7):
for three alternatives with positive Luce scales, the axiom 1
"best-worst difference" does NOT equal P(x,y) + P(x,z) - 1
(the value predicted by independent Thurstone processes).
Specifically, axiom 1 gives:
- P_best(0) = v₀ / (v₀ + v₁ + v₂)
- P_worst(0) = v₁v₂ / (v₀v₁ + v₀v₂ + v₁v₂)
- P(0,1) + P(0,2) - 1 = v₀/(v₀+v₁) + v₀/(v₀+v₂) - 1
If P_best(0) - P_worst(0) equals P(0,1) + P(0,2) - 1 (as the Thurstone
integral identity requires), and P(0,1) + P(0,2) ≠ 1 (Luce's hypothesis ii,
equivalent to v₀² ≠ v₁v₂), then v₀v₁v₂ = 0, contradicting positivity.
The proof clears denominators, factors out (v₀² - v₁v₂), and
shows the remaining factor is -v₀v₁v₂ ≠ 0.
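The algebra can be checked in exact rational arithmetic. The Python sketch below is an illustration on hand-picked scale values, not the formal proof; the function names are hypothetical, and the closed form in `factored_gap` is my own expansion of the factorization described above (common denominator (v₀+v₁+v₂)(v₀v₁+v₀v₂+v₁v₂)(v₀+v₁)(v₀+v₂)).

```python
from fractions import Fraction


def luce_and_thurstone_sides(v0, v1, v2):
    """Return (P_best - P_worst under axiom 1, P(0,1) + P(0,2) - 1)."""
    v0, v1, v2 = Fraction(v0), Fraction(v1), Fraction(v2)
    p_best = v0 / (v0 + v1 + v2)
    p_worst = v1 * v2 / (v0 * v1 + v0 * v2 + v1 * v2)
    pairwise = v0 / (v0 + v1) + v0 / (v0 + v2) - 1
    return p_best - p_worst, pairwise


def factored_gap(v0, v1, v2):
    """Difference of the two sides, written as (v0^2 - v1*v2) * (-v0*v1*v2) / D."""
    v0, v1, v2 = Fraction(v0), Fraction(v1), Fraction(v2)
    total = v0 + v1 + v2
    pair_sum = v0 * v1 + v0 * v2 + v1 * v2
    denom = total * pair_sum * (v0 + v1) * (v0 + v2)
    return (v0 * v0 - v1 * v2) * (-v0 * v1 * v2) / denom


# Generic case: v0^2 != v1*v2, so the two sides must differ.
lhs, rhs = luce_and_thurstone_sides(1, 2, 3)
print(lhs, rhs, lhs - rhs, factored_gap(1, 2, 3))

# Excluded case: v0^2 == v1*v2 (hypothesis ii fails), and both sides are 0.
lhs_deg, rhs_deg = luce_and_thurstone_sides(2, 1, 4)
print(lhs_deg, rhs_deg)
```

The second example shows why hypothesis (ii) is needed: when v₀² = v₁v₂ both sides vanish, so no contradiction can be derived from that triple.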
Luce–Thurstone incompatibility (general): the n = 3 result extends to any n ≥ 3 alternatives by restricting to a 3-element subset.
By IIA (Luce's axiom 1), the pairwise probabilities P(i,j) = vᵢ/(vᵢ+vⱼ)
are the same regardless of choice set. The Thurstone integral identity
only needs to hold for one triple {i, j, k} to derive a
contradiction. So the incompatibility between Luce and independent
Thurstone processes holds whenever n ≥ 3 and any non-degenerate
triple exists.
This is why Luce states Theorem 7 for |T| = 3 — the general case follows immediately from IIA + the base case.