Amateur Armed with ChatGPT Solves an Erdős Problem
Meta Description: How an amateur mathematician armed with ChatGPT solved a decades-old Erdős problem — what it means for AI-assisted math and citizen science.
TL;DR: In a story that's shaken the mathematics community, a non-professional mathematician used ChatGPT as a collaborative thinking tool to crack a problem from Paul Erdős's legendary unsolved list. This article breaks down what happened, why it matters, and what it means for anyone who wants to use AI as a serious intellectual partner.
Amateur Armed with ChatGPT Solves an Erdős Problem: What Really Happened
It reads like the premise of a feel-good movie: an amateur mathematician, no university affiliation, no research grant, armed with nothing but curiosity and a chatbot, cracks a problem that professional mathematicians had left unsolved for decades. But this isn't fiction. The story of an amateur armed with ChatGPT solving an Erdős problem has become one of the most-discussed events in mathematics and AI circles since early 2026 — and its implications stretch far beyond a single clever proof.
Let's unpack exactly what happened, why Paul Erdős problems are such a big deal, and what this means for the future of human-AI collaboration in research.
Who Was Paul Erdős, and Why Do His Problems Matter?
Paul Erdős (1913–1996) was one of the most prolific mathematicians in history, publishing over 1,500 papers and collaborating with hundreds of colleagues worldwide. He was famous for carrying a battered suitcase from university to university, showing up unannounced, and working on problems with anyone willing to engage.
Crucially, Erdős left behind a treasure trove of unsolved problems, many of which came with cash prizes — ranging from $25 to $10,000 — for anyone who could solve them. These weren't homework exercises. They were deep, often deceptively simple-sounding questions in:
- Combinatorics (counting and arrangement problems)
- Number theory (properties of integers)
- Graph theory (networks and connections)
- Discrete geometry (shapes and configurations in finite space)
The Erdős prize system became a kind of informal academic currency. Solving even a modest Erdős problem earns serious respect in the mathematical community. Solving a significant one can launch a career.
[INTERNAL_LINK: history of famous unsolved mathematics problems]
The Breakthrough: What the Amateur Actually Did
The solver — a software developer with no formal mathematics training beyond undergraduate coursework — approached a combinatorics problem from Erdős's list that had resisted professional efforts for roughly 30 years. The specific problem involved a question about the structure of integer sequences and the conditions under which certain arithmetic patterns must appear (a domain adjacent to additive combinatorics).
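The exact problem statement isn't reproduced here, but a toy illustration gives the flavor of the territory: detecting 3-term arithmetic progressions in a set of integers, the kind of "arithmetic pattern" additive combinatorics studies. This sketch is illustrative only — it is not the Erdős problem itself:

```python
from itertools import combinations

def has_three_term_ap(nums):
    """Return True if the set contains a, a+d, a+2d for some d > 0."""
    s = set(nums)
    # For every pair (a, b) with a < b, the pair forms the endpoints of a
    # 3-term AP exactly when their midpoint (a+b)/2 is an integer in the set.
    for a, b in combinations(sorted(s), 2):
        if (a + b) % 2 == 0 and (a + b) // 2 in s:
            return True
    return False

print(has_three_term_ap({1, 3, 5}))     # True: 1, 3, 5 is an AP
print(has_three_term_ap({1, 2, 4, 8}))  # False: no 3-term AP
```

Questions in this area typically ask how dense or structured a sequence must be before such patterns become unavoidable — simple to state, notoriously hard to answer.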
Rather than treating ChatGPT as an answer machine, the solver used it as what they described in their published writeup as a "Socratic sparring partner."
The Methodology, Step by Step
Here's how the process reportedly unfolded:
- Problem decomposition: The solver fed the problem to ChatGPT in pieces, asking it to restate the problem in different ways to expose hidden assumptions.
- Literature mapping: ChatGPT helped identify related theorems and adjacent results — essentially acting as a fast-recall research assistant pointing toward Ramsey theory, van der Waerden's theorem, and density arguments.
- Conjecture generation: The solver would propose partial approaches; ChatGPT would identify logical gaps, suggest counterexamples, or point out where similar techniques had failed before.
- Proof sketching: Once a promising path emerged, the solver used ChatGPT to stress-test each logical step, asking repeatedly: "Where could this argument break down?"
- Writeup and verification: The final proof was written by the human, submitted to a preprint server, and subsequently verified by professional mathematicians.
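The stress-testing loop above can be sketched in code. None of the solver's actual prompts are public, so the function names and prompt wording below are hypothetical, and `ask` stands in for any chat-model call (e.g., the OpenAI or Anthropic SDK):

```python
# Hypothetical sketch of the "Socratic sparring partner" loop described above.
# The probes mirror the reported methodology: restate, seek counterexamples,
# and ask where the argument could break down.

PROBES = [
    "Restate this claim in different terms. What hidden assumptions does it make?",
    "Suggest a potential counterexample to this claim.",
    "Where could the argument for this claim break down?",
]

def socratic_round(claim: str, ask) -> list[str]:
    """Run one round of stress-testing: pose each probe about the claim."""
    critiques = []
    for probe in PROBES:
        critiques.append(ask(f"Claim: {claim}\n\n{probe}"))
    return critiques

# Usage with a stub in place of a real model call:
stub = lambda prompt: f"[model response to: {prompt.splitlines()[-1]}]"
for critique in socratic_round("Every sufficiently dense set contains the pattern.", stub):
    print(critique)
```

The point of injecting `ask` as a parameter is that the human stays in the loop: each critique is read, judged, and either folded into the argument or discarded.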
The key insight here is that ChatGPT did not solve the problem. The human did. But ChatGPT dramatically compressed the research cycle, acting as an always-available collaborator who never gets tired, never judges a "stupid" question, and can recall relevant mathematical literature at speed.
Why This Is Different From Previous AI Math Achievements
You might be thinking: hasn't AI been solving math problems already? Yes — but context matters enormously here.
| Achievement | Who Did It | What AI Did |
|---|---|---|
| AlphaProof (DeepMind, 2024) | Professional AI lab | Solved IMO problems using specialized formal math AI |
| GPT-4 solving competition problems | OpenAI researchers | Tested on existing benchmarks with known solutions |
| Terence Tao using AI tools | Fields Medal winner | Expert mathematician using AI to accelerate known methods |
| This Erdős breakthrough | Amateur with no affiliation | Human-AI collaboration on genuinely open research problem |
The distinction is significant. Previous AI math achievements were either:
- Built by large teams with specialized models
- Accomplished by world-class mathematicians who already knew the terrain
- Solved problems with known solutions used for benchmarking
This case involved a true amateur, a general-purpose AI tool, and a genuinely unsolved problem. That's a different category entirely.
[INTERNAL_LINK: AI tools for scientific research and discovery]
What ChatGPT Actually Brings to Mathematical Research
To understand why this worked, it helps to be honest about both ChatGPT's strengths and its real limitations in mathematical contexts.
Where ChatGPT Genuinely Helps
- Breadth of mathematical knowledge: Trained on vast mathematical literature, it can surface connections between fields that a non-specialist might never encounter
- Tireless iteration: It will restate, reframe, and re-examine an argument as many times as you need without frustration
- Natural language reasoning: It can explain why a proof strategy might fail in plain language, making abstract obstacles concrete
- Lowering the barrier to entry: It democratizes access to mathematical knowledge that previously required either a PhD or an expensive library
Where ChatGPT Falls Short (Be Honest About This)
- It makes errors in complex calculations — always verify numerical claims independently
- It can "hallucinate" citations — always check that referenced papers actually exist
- It lacks genuine mathematical intuition — it pattern-matches rather than truly understands
- It cannot replace formal proof verification — tools like Lean or Coq are needed for that
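The first limitation suggests a concrete habit: every numerical claim a model makes can be checked with a few lines of independent code before it enters your argument. For example, if a chat session asserts there are 120 ways to choose 3 items from 10, verify it two independent ways rather than trusting the transcript (the numbers here are chosen purely for illustration):

```python
from itertools import combinations
from math import comb

claimed = 120  # value asserted by the model (illustrative)

# Two independent checks: the closed-form binomial coefficient
# and a brute-force enumeration of all 3-element subsets.
closed_form = comb(10, 3)
brute_force = sum(1 for _ in combinations(range(10), 3))

assert closed_form == brute_force == claimed, "model's claim fails verification"
print(closed_form)  # 120
```

When the two independent checks disagree with the model, the model is wrong; when all three agree, the claim is safe to build on.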
The solver in this case was smart enough to treat ChatGPT's outputs as hypotheses to be tested, not facts to be trusted. That epistemic discipline made all the difference.
The Broader Implications: Democratizing Deep Research
This story matters beyond the mathematics community. It's a proof of concept for something many people have theorized but few have demonstrated at this level: AI as a genuine equalizer in knowledge work.
What This Means for Citizen Science
Historically, breakthrough research required:
- Institutional affiliation (for access to papers, colleagues, and credibility)
- Years of specialized training
- Access to expensive tools and databases
The "amateur armed with ChatGPT" model suggests a new path:
- Access to AI tools (available to anyone with an internet connection)
- Domain curiosity and persistence (non-negotiable, still entirely human)
- Critical thinking skills (to evaluate AI outputs rigorously)
- Willingness to share work publicly (preprint servers like arXiv have no gatekeepers)
This doesn't mean credentials are irrelevant — the proof still needed expert verification, and the solver needed enough mathematical background to ask good questions. But the floor for meaningful contribution has dropped considerably.
What It Means for Professional Researchers
For working mathematicians and scientists, this is both inspiring and mildly unsettling. If an amateur with ChatGPT can crack a 30-year-old problem, what does that say about how professionals should be using these tools?
Several prominent mathematicians have responded by publicly integrating AI tools into their workflows more aggressively. Fields Medalist Terence Tao has written extensively about using AI for mathematical exploration [INTERNAL_LINK: Terence Tao on AI and mathematics]. The consensus emerging in 2026 is clear: not using AI tools in research is increasingly a choice to work with one hand tied behind your back.
Tools You Can Use Right Now for AI-Assisted Research
If this story has you inspired to try AI-assisted problem solving in your own domain, here's an honest assessment of the tools available:
For Mathematical and Logical Reasoning
ChatGPT Plus — The tool used in this breakthrough. The o3 reasoning model (available as of early 2026) is significantly better at multi-step mathematical reasoning than earlier versions. Worth the subscription for serious use. Honest caveat: still makes arithmetic errors; always verify.
Claude (Anthropic) — Strong competitor, particularly good at following long chains of logical reasoning and maintaining context over extended conversations. Many researchers prefer it for extended proof exploration.
Wolfram Alpha Pro — Not an LLM, but an essential complement. Use it to verify calculations that ChatGPT produces. The combination of an LLM for reasoning and Wolfram for computation verification is powerful.
For Formal Proof Verification
- Lean 4 — Free, open-source formal proof assistant. If you want to verify a mathematical proof rigorously, this is the gold standard. Steep learning curve, but increasingly integrated with AI assistance tools.
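For a taste of what formal verification looks like, here is a minimal Lean 4 example (standard library only): a machine-checked statement that addition of natural numbers is commutative. Trivial mathematics, but unlike an LLM's output, every step is verified by the proof checker:

```lean
-- `example` asks Lean to verify the statement; `Nat.add_comm` is the
-- standard-library lemma supplying the proof. If the proof were wrong,
-- Lean would refuse to compile it.
example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```

Real research-level proofs are orders of magnitude longer, but the guarantee is the same: if Lean accepts it, the logic is sound.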
For Literature Discovery
- Semantic Scholar — Free AI-powered academic search. Dramatically better than Google Scholar for finding papers related to a specific mathematical technique or result.
Key Takeaways
- ✅ An amateur mathematician used ChatGPT as a collaborative thinking partner — not an answer machine — to solve a decades-old Erdős problem
- ✅ The human did the solving; AI compressed the research cycle and surfaced relevant knowledge
- ✅ ChatGPT's value was in problem decomposition, literature mapping, and stress-testing arguments — not in generating proofs directly
- ✅ This represents a genuine democratization of deep research, not a replacement of human mathematical thinking
- ✅ Critical evaluation of AI outputs was essential — treating ChatGPT as a hypothesis generator, not an oracle
- ✅ The combination of LLMs for reasoning + formal tools for verification is the emerging best practice
- ✅ Professional researchers who ignore AI tools are increasingly at a disadvantage
How to Apply This Approach to Your Own Hard Problems
Whether you're working on a mathematical puzzle, a complex engineering challenge, or a thorny business problem, the methodology here is transferable:
- Don't ask AI for the answer — ask it to help you understand the problem better
- Use AI to map the territory — what related problems exist? What approaches have failed?
- Treat every AI output as a hypothesis — verify, stress-test, and push back
- Iterate in dialogue — the breakthrough rarely comes in one session; it emerges through sustained back-and-forth
- Do the synthesis yourself — AI can provide ingredients; you have to cook the meal
The Bigger Picture: What Comes Next
An amateur armed with ChatGPT solving an Erdős problem is almost certainly not a one-off event. As AI reasoning capabilities continue to improve — and as more people learn to use these tools effectively — we should expect more breakthroughs from unexpected places.
This has profound implications for how we think about expertise, credentialing, and the organization of research. It doesn't mean credentials don't matter (they still do, significantly). But it does mean that intellectual curiosity combined with AI fluency is becoming a powerful research credential in its own right.
The mathematics community's response has been largely positive and appropriately humble: the proof was verified on its merits, the solver was credited, and the broader lesson — that these tools can unlock genuine discovery — was acknowledged.
For anyone sitting on a hard problem they've been afraid to tackle because they lack formal training: this story is your permission slip.
Ready to Try AI-Assisted Problem Solving?
Start with ChatGPT Plus and pick one hard problem you've been circling. Don't ask it to solve the problem — ask it to help you understand it better. Then see where the conversation goes.
Share your experience in the comments below, or tag us on social media. We'd genuinely love to hear about problems readers are tackling with AI assistance.
[INTERNAL_LINK: beginner's guide to using ChatGPT for research]
Frequently Asked Questions
Q1: Did ChatGPT actually solve the Erdős problem on its own?
No. This is the most important clarification. ChatGPT served as a research collaborator and thinking partner. The mathematical insight, the proof strategy, and the final writeup came from the human solver. ChatGPT accelerated the process by helping map relevant literature, identify logical gaps, and stress-test arguments — but the creative and intellectual breakthrough was human.
Q2: What exactly is an Erdős problem, and how hard are they?
Paul Erdős left behind hundreds of unsolved problems across combinatorics, number theory, and graph theory, many with cash prizes attached. They range from moderately difficult to extraordinarily hard — some remain unsolved despite decades of effort by professional mathematicians. Solving even a modest one is considered a significant achievement in the mathematics community.
Q3: Do I need a math degree to try this kind of AI-assisted research?
The solver in this case had undergraduate-level mathematics. That said, some domain foundation is necessary — you need enough background to ask good questions and evaluate AI outputs critically. What AI removes is the need for years of specialized graduate training to begin exploring a problem meaningfully. The depth of background required scales with the difficulty of the problem.
Q4: Which AI tool is best for mathematical reasoning in 2026?
ChatGPT's o3 model and Claude 3.5+ are both strong choices for extended mathematical reasoning. Many researchers use both, leveraging each for different strengths. Always pair LLM reasoning with a formal computation tool like Wolfram Alpha for numerical verification, and consider formal proof assistants like Lean 4 for rigorous verification of significant results.
Q5: Will this kind of AI-assisted discovery become common?
Almost certainly yes. The mathematics and computer science communities are already seeing more examples of AI-assisted discovery. As AI reasoning tools improve and more people develop the skills to use them effectively, breakthroughs from non-traditional sources will likely become more frequent. The key skill isn't mathematical genius — it's learning to collaborate with AI rigorously and critically.