DEV Community

near

I Pointed My AI Research Engine at Goldbach's Conjecture — It Found a Hidden Bias (2026)

As a developer building AI for scientific discovery, I wanted to test if autonomous research actually works. So I built Luka and pointed it at Goldbach's conjecture.

The Background

Goldbach's conjecture: every even integer > 2 is the sum of two primes. Verified up to 4 × 10¹⁸, but the distributional properties are poorly understood.

The Hardy–Littlewood formula predicts the count of representations r(n):

r(n) ≈ 2C₂ · ∏_{p|n, p>2} (p-1)/(p-2) · n/(ln n)²
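To make the formula concrete, here is a minimal Python sketch of the leading-order prediction. It is not Luka's code: the function names `odd_prime_factors`, `singular_factor`, and `hardy_littlewood` are made up for illustration, and C₂ is the twin prime constant truncated to ten digits. The product runs over odd primes dividing n only.

```python
import math

C2 = 0.6601618158  # twin prime constant, truncated


def odd_prime_factors(n):
    """Distinct odd primes dividing n, found by trial division."""
    factors = []
    while n % 2 == 0:
        n //= 2
    p = 3
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 2
    if n > 1:
        factors.append(n)
    return factors


def singular_factor(n):
    """Product of (p-1)/(p-2) over the odd primes p dividing n."""
    factor = 1.0
    for p in odd_prime_factors(n):
        factor *= (p - 1) / (p - 2)
    return factor


def hardy_littlewood(n):
    """Leading-order estimate 2*C2 * singular_factor(n) * n / (ln n)^2."""
    return 2 * C2 * singular_factor(n) * n / math.log(n) ** 2


if __name__ == "__main__":
    for n in (1_000_000, 1_000_002, 1_000_004):
        print(n, n % 3, round(hardy_littlewood(n), 1))
```

The p = 3 term contributes (3-1)/(3-2) = 2 whenever 3 divides n, which is why multiples of 3 get roughly twice as many representations. For n ≡ 1 or 2 (mod 3) that term is absent in both cases, so the formula has nothing that tells the two classes apart.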

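The formula is symmetric in the residue class mod 3: the product only depends on the primes dividing n, so on average it predicts the same count for n ≡ 1 (mod 3) and n ≡ 2 (mod 3). I built Luka to check if that's actually true.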

It's not.

What Luka Discovered

Luka computed Goldbach partition counts for 2,495,001 even integers (10,000 to 5,000,000). Split by residue class mod 3:

Class          | Mean g(n) | Count
n ≡ 0 (mod 3)  | 19,607.1  | 831,667
n ≡ 1 (mod 3)  | 9,816.6   | 831,667
n ≡ 2 (mod 3)  | 9,791.0   | 831,667
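For readers who want to reproduce a table like this on a small range, here is a pure-Python sketch. It assumes g(n) counts unordered pairs p ≤ q with p + q = n, which the post does not spell out. The helper names `prime_sieve`, `goldbach_count`, and `mean_by_residue` are hypothetical. A brute-force loop like this is far too slow for the full 10,000 to 5,000,000 range.

```python
import math


def prime_sieve(limit):
    """Sieve of Eratosthenes: is_prime[k] is 1 iff k is prime (0 <= k <= limit)."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Clear every multiple of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return is_prime


def goldbach_count(n, is_prime):
    """Number of unordered prime pairs (p, q), p <= q, with p + q = n."""
    return sum(1 for p in range(2, n // 2 + 1) if is_prime[p] and is_prime[n - p])


def mean_by_residue(lo, hi):
    """Mean Goldbach count over even n in [lo, hi], grouped by n mod 3."""
    is_prime = prime_sieve(hi)
    totals = {0: 0, 1: 0, 2: 0}
    counts = {0: 0, 1: 0, 2: 0}
    for n in range(lo + lo % 2, hi + 1, 2):
        r = n % 3
        totals[r] += goldbach_count(n, is_prime)
        counts[r] += 1
    return {r: totals[r] / counts[r] for r in totals}


if __name__ == "__main__":
    for residue, mean in sorted(mean_by_residue(1_000, 5_000).items()):
        print(f"n = {residue} (mod 3): mean g(n) = {mean:.2f}")
```

On a range this small, the factor of two for multiples of 3 is easy to see, but a 0.26% gap between the other two classes would be buried in noise. That is why the full run needs millions of values.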

n ≡ 1 (mod 3) has 0.26% more Goldbach representations than n ≡ 2 (mod 3).

The leading-order Hardy–Littlewood formula says the two classes should match on average. At this level of precision, they don't.

The Statistics Are Insane

  • Paired t-test (831,666 pairs): t = 9.02, p = 2.0 × 10⁻¹⁹
  • Sign test: p = 4.07 × 10⁻²⁰⁴
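Both tests are simple enough to sketch with the standard library. This is an illustration, not the code behind the numbers above. It assumes each n ≡ 1 (mod 3) value has already been paired with some n ≡ 2 (mod 3) value, and the post does not say how the pairs were formed. The names `two_sided_normal_p`, `paired_t`, and `sign_test` are hypothetical. The t-test p-value uses the normal approximation, which is reasonable only for large samples.

```python
import math
from statistics import mean, stdev


def two_sided_normal_p(t):
    """Two-sided tail probability of a standard normal at |t|."""
    return math.erfc(abs(t) / math.sqrt(2))


def paired_t(a, b):
    """Paired t statistic and large-sample (normal) two-sided p-value."""
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, two_sided_normal_p(t)


def sign_test(a, b):
    """Exact two-sided sign test; ties are dropped."""
    pos = sum(x > y for x, y in zip(a, b))
    neg = sum(x < y for x, y in zip(a, b))
    n, k = pos + neg, min(pos, neg)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for the two-sided test.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    a = [12, 15, 11, 18, 14, 16, 13, 17]
    b = [11, 14, 12, 16, 13, 15, 12, 15]
    print(paired_t(a, b))
    print(sign_test(a, b))
    print(two_sided_normal_p(9.02))  # same order of magnitude as the reported p
```

With hundreds of thousands of pairs, the exact binomial sum gets slow. In that case `scipy.stats.ttest_rel` and `scipy.stats.binomtest` are the practical choice.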

P-values this small rule out a sampling fluke: the gap between the two classes is systematic.

The Mechanism

The bias propagates through prime-pair channels. Twin prime pairs (p, p+2) contribute ~15–20% of r(n). For n ≡ 1 (mod 3), this channel is systematically enhanced because:

  1. Chebyshev bias favors primes ≡ 2 (mod 3)
  2. For n ≡ 1 (mod 3), both primes in n = p + q must be ≡ 2 (mod 3) (ignoring pairs that use the prime 3), while n ≡ 2 (mod 3) needs both primes ≡ 1 (mod 3)
  3. Twin primes preferentially contribute when n ≡ 1 (mod 3)

The Chebyshev bias in primes propagates to Goldbach counts.
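The first two steps of that argument can be checked directly. The sketch below uses hypothetical helper names and is not part of Luka. It counts primes in each residue class mod 3 up to a limit, and lists which residue pairs actually occur in Goldbach representations of a given n. It says nothing about the twin-prime channel, which would need a separate experiment.

```python
import math


def primes_up_to(limit):
    """List of primes <= limit via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [k for k in range(limit + 1) if sieve[k]]


def prime_race_mod3(limit):
    """Counts of primes <= limit that are 1 mod 3 and 2 mod 3."""
    primes = primes_up_to(limit)
    ones = sum(1 for p in primes if p % 3 == 1)
    twos = sum(1 for p in primes if p % 3 == 2)
    return ones, twos


def pair_residues(n):
    """Set of (p mod 3, q mod 3) over prime pairs p + q = n, skipping the prime 3."""
    primes = set(primes_up_to(n))
    return {
        (p % 3, (n - p) % 3)
        for p in primes
        if p != 3 and (n - p) != 3 and p <= n - p and (n - p) in primes
    }


if __name__ == "__main__":
    ones, twos = prime_race_mod3(1_000_000)
    print("primes = 1 (mod 3):", ones, " primes = 2 (mod 3):", twos)
    print(pair_residues(100))  # 100 = 1 (mod 3): {(2, 2)}
    print(pair_residues(98))   # 98 = 2 (mod 3): {(1, 1)}
```

So n ≡ 1 (mod 3) draws both primes from the class that the Chebyshev bias slightly favors, and n ≡ 2 (mod 3) draws both from the class it slightly disfavors.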

The Correction

Luka proposed a Dirichlet character correction:

r(n) ≈ Hardy–Littlewood + A₃χ₃(n) · n¹ᐟ²/(ln n)²

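A₃ = 1.23 × 10⁻⁵, where χ₃ is the non-principal Dirichlet character mod 3: χ₃(n) = +1, −1, 0 for n ≡ 1, 2, 0 (mod 3). The correction scales as n¹ᐟ², the order of magnitude that L-function theory suggests for a bias of this kind.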
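A one-parameter correction like this can be fitted by ordinary least squares on the residuals (observed count minus the Hardy–Littlewood term). The sketch below is an assumption about how such a fit could be done, not a description of how Luka obtained A₃. The names `chi3`, `basis`, and `fit_a3` are hypothetical, and the demo residuals are synthetic.

```python
import math


def chi3(n):
    """Non-principal Dirichlet character mod 3: +1, -1, 0 for n = 1, 2, 0 (mod 3)."""
    r = n % 3
    if r == 1:
        return 1
    if r == 2:
        return -1
    return 0


def basis(n):
    """Shape of the correction term: chi3(n) * sqrt(n) / (ln n)^2."""
    return chi3(n) * math.sqrt(n) / math.log(n) ** 2


def fit_a3(ns, residuals):
    """Least-squares coefficient A minimising sum (residual - A * basis(n))^2."""
    num = sum(basis(n) * y for n, y in zip(ns, residuals))
    den = sum(basis(n) ** 2 for n in ns)
    return num / den


if __name__ == "__main__":
    # Synthetic residuals generated from a known coefficient, to show
    # that the fit recovers it. These are NOT real Goldbach data.
    ns = list(range(10_000, 20_000, 2))
    true_a = 0.5
    residuals = [true_a * basis(n) for n in ns]
    print(fit_a3(ns, residuals))
```

Because χ₃(n) = 0 for multiples of 3, those values drop out of the fit. Only the n ≡ 1 and n ≡ 2 (mod 3) classes constrain A₃.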

The RS Gap

The Rubinstein–Sarnak heuristic underestimates the Goldbach bias by 4–10×. Why? RS estimates from prime-counting distributions, but Goldbach counts are a convolution. The bilinear structure amplifies the bias by the singular series S(n).

The Takeaway

I'm a developer, not a mathematician. I built an AI research engine to see if it could do real discovery. I pointed it at one of the oldest open problems in math, and it found a Chebyshev-type bias that, as far as I know, nobody had measured before, with a sign-test p-value of 4.07 × 10⁻²⁰⁴.

The time is not far off when AI systems will make serious mathematical discoveries autonomously. This is a proof of concept.

Code & Data

GitHub: github.com/subhansh-dev/goldbach-chebyshev-bias

Python, NumPy, SciPy, 2.5M Goldbach counts (6.3 MB). Built with Luka.
