I Pointed My AI Research Engine at Goldbach's Conjecture — It Found a Hidden Bias (2026)

#math #primes #ai #goldbach

As a developer building AI for scientific discovery, I wanted to test if autonomous research actually works. So I built Luka and pointed it at Goldbach's conjecture.

The Background

Goldbach's conjecture: every even integer > 2 is the sum of two primes. Verified up to 4 × 10¹⁸, but the distributional properties are poorly understood.

The Hardy–Littlewood formula predicts the count of representations r(n):

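r(n) ≈ 2C₂ · ∏_{p|n, p≥3} (p-1)/(p-2) · n/(ln n)²

Here C₂ ≈ 0.6602 is the twin prime constant, and the product runs over the odd primes dividing n (p = 2 would make the denominator zero).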
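
Here's a minimal Python sketch of that prediction. It assumes the product runs over the odd primes dividing n and uses a truncated value of the twin prime constant; the function names are mine for illustration, not Luka's internals.

```python
import math

# Twin prime constant C2, truncated to ten decimals.
C2 = 0.6601618158

def odd_prime_factors(n):
    """Return the distinct odd primes dividing n, by trial division."""
    factors = []
    while n % 2 == 0:
        n //= 2
    p = 3
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        factors.append(n)
    return factors

def singular_factor(n):
    """Product of (p-1)/(p-2) over the odd primes p dividing n."""
    prod = 1.0
    for p in odd_prime_factors(n):
        prod *= (p - 1) / (p - 2)
    return prod

def hardy_littlewood(n):
    """Leading-order estimate of r(n) for even n > 2."""
    return 2 * C2 * singular_factor(n) * n / math.log(n) ** 2

# 2**20 is 1 (mod 3) and 2**21 is 2 (mod 3); neither has an odd prime factor.
print(singular_factor(2**20), singular_factor(2**21))
```

Both powers of two get a singular factor of 1.0, and 10 ≡ 1 and 20 ≡ 2 (mod 3) both get 4/3 from the shared factor 5. Nothing in the formula can tell the two classes apart.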

It's symmetric: 3 enters only through whether it divides n, so the formula makes no distinction between n ≡ 1 (mod 3) and n ≡ 2 (mod 3). I built Luka to check if that's actually true.

It's not.

What Luka Discovered

Luka computed Goldbach partition counts g(n) for 2,495,001 even integers (10,000 to 5,000,000). Split by residue class mod 3:

Class	Mean g(n)	Count
n ≡ 0 (mod 3)	19,607.1	831,667
n ≡ 1 (mod 3)	9,816.6	831,667
n ≡ 2 (mod 3)	9,791.0	831,667

n ≡ 1 (mod 3) has 0.26% more Goldbach representations than n ≡ 2 (mod 3).

The Hardy–Littlewood formula says they should be equal at leading order. The data say they aren't.
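
To show what the per-class comparison involves, here's a small sketch that counts Goldbach partitions directly and averages them by residue class mod 3. It is not Luka's code: the function names are hypothetical, and the brute-force loop is quadratic, so it only suits small ranges. A run up to 5,000,000 would need something much faster.

```python
import math

def prime_sieve(limit):
    """Sieve of Eratosthenes: is_prime[k] is 1 iff k is prime, for 0 <= k <= limit."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Clear every multiple of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return is_prime

def partition_count(n, is_prime):
    """Number of unordered prime pairs p <= q with p + q = n."""
    return sum(1 for p in range(2, n // 2 + 1) if is_prime[p] and is_prime[n - p])

def mean_by_class(lo, hi, is_prime):
    """Mean partition count over even n in [lo, hi], keyed by n mod 3."""
    totals = [0, 0, 0]
    counts = [0, 0, 0]
    start = lo + (lo % 2)  # first even number >= lo
    for n in range(start, hi + 1, 2):
        totals[n % 3] += partition_count(n, is_prime)
        counts[n % 3] += 1
    return {c: totals[c] / counts[c] for c in range(3) if counts[c]}

is_prime = prime_sieve(2000)
means = mean_by_class(10, 2000, is_prime)
for c in range(3):
    print(f"n = {c} (mod 3): mean g(n) = {means[c]:.2f}")
```

Even on this tiny range the n ≡ 0 (mod 3) class stands well above the other two, as the factor (3-1)/(3-2) = 2 in the formula suggests. The gap between the other two classes is far too small to see at this scale.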

The Statistics Are Insane

Paired t-test (831,666 pairs): t = 9.02, p = 2.0 × 10⁻¹⁹
Sign test: p = 4.07 × 10⁻²⁰⁴

Both p-values are far below any conventional significance threshold. This isn't a fluke.
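
For anyone wanting to rerun tests like these, here's a rough stdlib-only sketch. It takes a list of paired differences and uses normal approximations, which are reasonable at this sample size. How the pairs are formed is an assumption you'd have to fix yourself; one natural choice is adjacent even numbers n ≡ 2 and n + 2 ≡ 1 (mod 3). The function names are illustrative.

```python
import math

def paired_t(diffs):
    """Paired t statistic and two-sided p-value (normal approximation)."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    # With hundreds of thousands of pairs, Student's t is essentially normal.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

def sign_test(diffs):
    """Two-sided sign test via a continuity-corrected normal approximation."""
    pos = sum(1 for d in diffs if d > 0)
    neg = sum(1 for d in diffs if d < 0)
    m = pos + neg  # ties are dropped
    z = max(0.0, (abs(pos - neg) - 1) / math.sqrt(m))
    p = min(1.0, math.erfc(z / math.sqrt(2)))
    return pos, neg, p

# Two-sided normal tail at |t| = 9.02, the statistic quoted above.
print(math.erfc(9.02 / math.sqrt(2)))
```

The last line is just a sanity check on the order of magnitude: a t statistic of 9.02 with this many pairs lands around 10⁻¹⁹.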

The Mechanism

The bias propagates through prime-pair channels. Twin prime pairs (p, p+2) contribute ~15–20% of r(n). For n ≡ 1 (mod 3), this channel is systematically enhanced because:

Chebyshev bias favors primes ≡ 2 (mod 3)
For n ≡ 1 (mod 3), both primes in n = p + q must be ≡ 2 (mod 3) (ignoring p = 3), since 2 + 2 ≡ 1; for n ≡ 2 (mod 3), both must be ≡ 1 (mod 3)
Twin primes preferentially contribute when n ≡ 1 (mod 3)

The Chebyshev bias in primes propagates to Goldbach counts.
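
The first two ingredients are easy to check numerically. The sketch below (illustrative names, not Luka's code) counts primes in each class mod 3 and lists which residue pairs actually occur in the partitions of a given n.

```python
import math

def prime_flags(limit):
    """Sieve of Eratosthenes: flags[k] is 1 iff k is prime."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return flags

def chebyshev_race(limit, flags):
    """Count primes <= limit in the classes 1 and 2 (mod 3)."""
    c1 = sum(1 for p in range(2, limit + 1) if flags[p] and p % 3 == 1)
    c2 = sum(1 for p in range(2, limit + 1) if flags[p] and p % 3 == 2)
    return c1, c2

def partition_residues(n, flags):
    """Residue pairs (p mod 3, q mod 3) among partitions n = p + q with p, q > 3."""
    return {
        (p % 3, (n - p) % 3)
        for p in range(5, n // 2 + 1)
        if flags[p] and flags[n - p]
    }

flags = prime_flags(100_000)
c1, c2 = chebyshev_race(100_000, flags)
print("primes = 1 (mod 3):", c1, " primes = 2 (mod 3):", c2)
print("n = 100:", partition_residues(100, flags))
print("n = 98: ", partition_residues(98, flags))
```

For n = 100 ≡ 1 (mod 3) the only pair that shows up is (2, 2); for n = 98 ≡ 2 (mod 3) it's (1, 1). So each class of n draws on exactly one class of primes, and any imbalance between those prime classes feeds straight into the counts.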

The Correction

Luka proposed a Dirichlet character correction:

r(n) ≈ Hardy–Littlewood + A₃χ₃(n) · n¹ᐟ²/(ln n)²

A₃ = 1.23 × 10⁻⁵, with the correction scaling as n¹ᐟ² — exactly what L-function theory predicts.
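
One way to estimate a coefficient like A₃ is a one-parameter least-squares fit of the residual (observed count minus the Hardy–Littlewood term) against χ₃(n) · n¹ᐟ²/(ln n)². This is a sketch under that assumption, not necessarily how Luka fitted it. The names chi3 and fit_a3 are made up, and the data in the usage lines are synthetic.

```python
import math

def chi3(n):
    """Non-principal Dirichlet character mod 3: +1, -1, 0 for n = 1, 2, 0 (mod 3)."""
    return (0, 1, -1)[n % 3]

def feature(n):
    """The regressor chi3(n) * sqrt(n) / ln(n)^2 from the proposed correction."""
    return chi3(n) * math.sqrt(n) / math.log(n) ** 2

def fit_a3(ns, residuals):
    """Least-squares slope through the origin: residual ~ A3 * feature(n)."""
    xs = [feature(n) for n in ns]
    num = sum(x * y for x, y in zip(xs, residuals))
    den = sum(x * x for x in xs)
    return num / den

# Synthetic check: build residuals with a known coefficient and recover it.
ns = list(range(10, 1002, 2))
true_a3 = 0.5  # arbitrary value for the demo, not a measured quantity
residuals = [true_a3 * feature(n) for n in ns]
print(fit_a3(ns, residuals))
```

With this sign convention a positive A₃ means n ≡ 1 (mod 3) sits above the leading-order prediction and n ≡ 2 (mod 3) below it, while multiples of 3 contribute nothing to the fit.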

The RS Gap

The Rubinstein–Sarnak (RS) heuristic underestimates the Goldbach bias by 4–10×. Why? RS estimates the bias from how primes are distributed between residue classes, but Goldbach counts are a convolution of the primes with themselves. The bilinear structure amplifies the bias by the singular series S(n).

The Takeaway

I'm a developer, not a mathematician. I built an AI research engine to see if it could do real discovery. Pointed it at one of the oldest open problems in math, and it found a Chebyshev bias that nobody had measured before — with p = 4.07 × 10⁻²⁰⁴.

The time is not far off when AI systems will make serious mathematical discoveries autonomously. This is a proof of concept.