The Taxi Cab Problem: Why 80% Reliable Witnesses Are Usually Wrong


A cab was involved in a hit-and-run accident at night. Two companies operate in the city: Green cabs (85% of all cabs) and Blue cabs (15%). A witness identified the cab as Blue. The court tested the witness under identical conditions and found an 80% correct identification rate (20% error rate). What is the probability the cab was actually Blue?

Most people — including attorneys and judges — immediately answer 80%. The correct answer is 41.4%. It is more likely (58.6%) that the cab was Green, despite a credible witness swearing it was Blue. This is the Base Rate Fallacy, documented by Kahneman and Tversky as one of the most durable cognitive errors in human reasoning.

The Arithmetic Intuition

Consider what happens across 10,000 accidents:

8,500 involve Green cabs → in 1,700 of them the witness incorrectly says "Blue" (20% error)
1,500 involve Blue cabs → in 1,200 of them the witness correctly says "Blue" (80% accuracy)

Total "Blue" claims: 2,900
Actually Blue: 1,200 → 1,200 / 2,900 = 41.4%

The false positives (1,700) outnumber the true positives (1,200) because Green cabs are so prevalent that a 20% error rate applied to them generates more "Blue" identifications than an 80% hit rate applied to the far smaller Blue fleet.
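
The counting argument above can be reproduced in a few lines. This is a minimal sketch, not part of the original analysis: the function name natural_frequencies and its parameter names are illustrative, and it assumes the witness is equally accurate for both cab colors.

```python
def natural_frequencies(total=10_000, blue_share=0.15, accuracy=0.80):
    """Split `total` accidents into counts, assuming equal accuracy for both colors."""
    blue_cabs = round(total * blue_share)
    green_cabs = total - blue_cabs
    # Blue cab, witness correctly says "Blue"
    true_blue = round(blue_cabs * accuracy)
    # Green cab, witness incorrectly says "Blue"
    false_blue = round(green_cabs * (1 - accuracy))
    said_blue = true_blue + false_blue
    return {
        "blue_cabs": blue_cabs,
        "green_cabs": green_cabs,
        "true_blue": true_blue,
        "false_blue": false_blue,
        "said_blue": said_blue,
        "posterior": true_blue / said_blue,
    }

counts = natural_frequencies()
print(counts["true_blue"], counts["false_blue"], counts["said_blue"])
print(f"{counts['posterior']:.3f}")
```

Working with whole counts rather than probabilities keeps the comparison concrete: the posterior is simply the share of "Blue" reports that come from Blue cabs.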

The Bayesian Proof

Let B = cab is Blue, G = cab is Green, W_B = witness says "Blue".

P(B) = 0.15, P(G) = 0.85
P(W_B | B) = 0.80, P(W_B | G) = 0.20

P(B | W_B) = P(W_B | B) × P(B) / [P(W_B | B) × P(B) + P(W_B | G) × P(G)]
           = (0.80 × 0.15) / [(0.80 × 0.15) + (0.20 × 0.85)]
           = 0.12 / (0.12 + 0.17)
           = 0.12 / 0.29
           ≈ 0.4138 (41.4%)
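
The same calculation can be checked with exact rational arithmetic. The sketch below is illustrative only: posterior_blue and its argument names are assumptions introduced here, and Fraction is used so that no rounding enters the result. It also evaluates the formula at a hypothetical 50/50 base rate to show that the intuitive 80% answer is correct only when both companies are equally common.

```python
from fractions import Fraction

def posterior_blue(prior_blue, p_say_blue_given_blue, p_say_blue_given_green):
    """Bayes' theorem for P(B | W_B), using the notation defined above."""
    numerator = p_say_blue_given_blue * prior_blue
    evidence = numerator + p_say_blue_given_green * (1 - prior_blue)
    return numerator / evidence

hit = Fraction(80, 100)          # P(W_B | B)
false_alarm = Fraction(20, 100)  # P(W_B | G)

# Base rate from the problem: 15% of cabs are Blue.
exact = posterior_blue(Fraction(15, 100), hit, false_alarm)
print(exact, f"{float(exact):.4f}")   # 12/29 0.4138

# Hypothetical equal base rates: the posterior equals the witness accuracy.
print(posterior_blue(Fraction(1, 2), hit, false_alarm))   # 4/5
```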

Python Simulation (1,000,000 Trials)

import random

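def taxi_cab_simulation(n=1_000_000):
    """Estimate P(Blue | witness says "Blue") by Monte Carlo simulation."""
    witness_said_blue = 0  # trials in which the witness reports "Blue"
    actually_blue = 0      # subset of those trials where the cab really is Blue
    for _ in range(n):
        # Draw the true cab color from the base rates (15% Blue, 85% Green).
        cab_is_blue = random.random() < 0.15
        # The witness identifies the color correctly 80% of the time.
        correct = random.random() < 0.80
        # A correct witness reports the true color; an incorrect one reports the other.
        witness_says_blue = cab_is_blue if correct else not cab_is_blue
        if witness_says_blue:
            witness_said_blue += 1
            if cab_is_blue:
                actually_blue += 1
    # Condition on the testimony: keep only the "Blue" reports.
    return actually_blue / witness_said_blue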

random.seed(42)
print(f"P(Blue | witness says Blue): {taxi_cab_simulation():.4f} (exact: 0.4138)")
# Output: P(Blue | witness says Blue): 0.4142 (exact: 0.4138)

Litigation Applications

The Base Rate Fallacy governs how juries evaluate conditional evidence in every domain where rare events are tested by imperfect instruments:

Eyewitness testimony: An 80% accurate identification rate does not mean 80% probability of guilt. The posterior depends critically on how many individuals in the suspect pool could plausibly have committed the crime.
Breathalyzer evidence: A 95% accurate instrument applied to a population where the base rate of impairment is low yields many false positives relative to true positives. The probability of actual impairment given a positive reading can therefore be substantially lower than 95%.
eDiscovery / Technology-Assisted Review: A classifier that is 90% accurate on both responsive and non-responsive documents, applied to a corpus where 5% of documents are responsive, generates about twice as many false positives as true positives (9.5% versus 4.5% of the corpus), so only about 32% of flagged documents are actually responsive.
Financial fraud detection: A fraud model with 99% specificity applied to billions of transactions still flags tens of millions of legitimate ones annually.
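
The examples above all reduce to one formula for the probability that a positive result is a true positive. The following is a rough sketch under stated assumptions: positive_predictive_value is a name introduced here, "X% accurate" is read as X% sensitivity and X% specificity, the 2% impairment base rate is hypothetical, and the 2 billion legitimate transactions figure is a hypothetical stand-in for "billions".

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """P(condition is present | test is positive), via Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# eDiscovery: 5% of documents responsive, classifier assumed 90%/90%.
ediscovery_ppv = positive_predictive_value(0.05, 0.90, 0.90)

# Breathalyzer: instrument assumed 95%/95%, HYPOTHETICAL 2% base rate of impairment.
breathalyzer_ppv = positive_predictive_value(0.02, 0.95, 0.95)

# Fraud detection: 99% specificity over a HYPOTHETICAL 2 billion legitimate transactions.
flagged_legitimate = round((1 - 0.99) * 2_000_000_000)

print(f"eDiscovery PPV:     {ediscovery_ppv:.3f}")    # about 0.321
print(f"Breathalyzer PPV:   {breathalyzer_ppv:.3f}")  # about 0.279
print(f"Legitimate flagged: {flagged_legitimate:,}")  # 20,000,000
```

Changing the base_rate argument shows how quickly the posterior moves; the accuracy figures alone do not determine it.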

Attorneys who quantify the posterior probability — not just cite the accuracy statistic — can neutralize evidence that appears overwhelming on its face. Gut instinct on conditional probability is demonstrably, mathematically broken.
