Delafosse Olivier

Posted on Feb 10 • Originally published at coreprose.com

NeurIPS 2025's Hallucinated Citations: How 100+ Fake References Slipped into Elite AI Research


Originally published on CoreProse KB-incidents

In 2025, NeurIPS – the world’s flagship machine learning conference – quietly crossed a new frontier in AI risk: its own proceedings.

After the conference, GPTZero scanned 4,841 accepted papers and uncovered hundreds of hallucinated citations that had survived peer review, live presentation, and publication. At least 100 hallucinations across 51–53 papers were confirmed.

These were not fringe submissions. Teams from Google, Harvard, Meta, and the University of Cambridge all had papers implicated, marking the first documented case of hallucinated citations entering the final record of a major AI venue.

With an acceptance rate of 24.52%, every flawed paper had beaten more than 15,000 rejected submissions, yet still cited research that does not exist. This is more than an embarrassment; it exposes a structural weakness in how AI research is produced and vetted.

What GPTZero Found Inside NeurIPS 2025

GPTZero’s Hallucination Check tool flagged “100s of hallucinated citations” in the 4,841 accepted NeurIPS 2025 papers, then manually validated 100 of them across just over 50 papers.

📊 Key numbers

4,841 accepted papers scanned
100+ hallucinated citations confirmed
51–53 papers affected (≈1.1% of the program)
24.52% acceptance rate; >15,000 papers rejected

Additional context:

Affected work came from top industrial labs and universities, indicating a systemic vulnerability, not isolated misconduct.
Fortune reported that a scan of “more than 4,000” accepted NeurIPS papers turned up hundreds of hallucinated citations across at least 53 papers, none of which reviewers had caught.
NeurIPS treats hallucinated citations as grounds for rejection or revocation, equating them with fabrication.

Despite each paper receiving three or more reviews, hallucinations passed through submission, review, and publication. Given NeurIPS’s role in shaping research agendas, hiring, and funding, even a 1% contamination rate can distort the field for years.

Why Hallucinated Citations Are a Systemic Threat

GPTZero’s analysis shows hallucinations arise through multiple patterns that are hard to catch under time pressure.

Common patterns:

Fully invented works: Fake authors, titles, venues, or dead URLs.
Blended references: Fragments of several real papers fused into one fake citation.
Subtly corrupted citations: Real works with altered authors, titles, or details that break searchability.
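
As a rough illustration of how these three patterns could be told apart automatically, here is a minimal Python sketch. It assumes a small in-memory index stands in for a real bibliographic database; `TRUSTED_INDEX`, `classify`, and the 0.9 / 0.6 similarity thresholds are illustrative assumptions, not GPTZero's actual method.

```python
from difflib import SequenceMatcher

# Hypothetical trusted index (author surnames, lowercased). A real checker
# would query a bibliographic database instead of a hard-coded list.
TRUSTED_INDEX = [
    {"title": "Attention Is All You Need",
     "authors": {"vaswani", "shazeer", "parmar", "uszkoreit",
                 "jones", "gomez", "kaiser", "polosukhin"}},
    {"title": "Deep Residual Learning for Image Recognition",
     "authors": {"he", "zhang", "ren", "sun"}},
]


def title_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(citation, index=TRUSTED_INDEX, strong=0.9, weak=0.6):
    """Label a citation as verified, corrupted, blended, or not_found.

    The thresholds are arbitrary assumptions chosen for illustration.
    """
    cited = {name.lower() for name in citation["authors"]}
    best = max(index, key=lambda e: title_similarity(citation["title"], e["title"]))
    score = title_similarity(citation["title"], best["title"])

    if score < weak:
        return "not_found"   # nothing similar: possibly fully invented
    if score < strong:
        return "corrupted"   # near-miss title: possibly subtly altered
    if cited and cited <= best["authors"]:
        return "verified"    # title matches and all cited authors are real
    # Title matches one work, but the authors belong to a different one.
    if any(cited & e["authors"] for e in index if e is not best):
        return "blended"
    return "corrupted"       # right title, unknown authors


print(classify({"title": "Attention Is All You Need", "authors": ["He", "Zhang"]}))
```

A label other than `verified` is only a signal for manual review: a genuine paper that is missing from the index would also come back as `not_found`.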

Likely mechanisms:

Models complete partial prompts (e.g., a fragmentary title) by fabricating plausible-looking BibTeX entries.
The resulting errors look plausible enough to pass a quick “looks right” check, which is often all an overworked reviewer has time for.
Authors increasingly offload bibliography drafting to LLMs, then fail to verify outputs.

GPTZero estimates that roughly half of the NeurIPS papers with hallucinated citations showed strong signs of AI-generated drafting or heavy AI assistance. When authors trust LLMs for references and reviewers assume bibliographies are mostly correct, hallucinations bypass both defenses.

Scale amplifies this risk:

NeurIPS submissions grew from 9,467 in 2020 to 21,575 in 2025, roughly a 2.3-fold increase (about 128% growth).
Reviewer pools expanded, diluting topical expertise and oversight.

In this environment, bibliographic fabrication becomes effectively invisible until it enters the literature.

Because modern AI research is often hard to fully reproduce, citations now function as “foundational” anchors for continuity and verification. Polluting this layer means:

Literature reviews inherit phantom prior work.
Meta-analyses absorb fabricated data points.
Early-career researchers follow misleading citation trails.

Over time, the map of the field drifts away from reality.

From Damage Control to a New Integrity Standard

NeurIPS is also a global recruiting marketplace where a strong paper can directly yield offers from OpenAI, Anthropic, and other top labs. Hallucinated citations therefore distort both knowledge and career outcomes.

GPTZero argues the incident threatens the reputations of researchers, institutions, and the conference, especially under “publish or perish” incentives that reward rapid, AI-assisted drafting over careful verification. Fixing this requires standardized safeguards, not one-off cleanups.

💼 Integrity response in motion

GPTZero is working with ICLR and other publishers to integrate hallucination checks as a formal publication step, aiming for “0 hallucinations” in print.
An earlier scan of ICLR 2026 submissions had already surfaced 50 hallucinated citations, showing NeurIPS is not unique.
Some experts call for strong sanctions: retracting all affected NeurIPS papers and temporarily banning their authors to reset norms.

Emerging consensus points toward:

Automated hallucination checks for references,
Transparent disclosure of LLM use in drafting, and
Aggressive post-publication corrections and retractions.

These must become core elements of scientific practice in AI, not optional add-ons.
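
To make the first of those measures concrete, an automated reference check could start with something as simple as a pre-screen that flags bibliography entries carrying no resolvable identifier. The Python sketch below is one possible shape for that step, not any publisher's actual tooling; the regular expressions, the `prescreen` function, and the sample entries (with placeholder identifiers) are all assumptions made for illustration.

```python
import re

# Loose patterns, assumed for illustration; real identifier syntax has more cases.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"{},]+", re.IGNORECASE)
ARXIV_RE = re.compile(r"arxiv[:./\s]*\d{4}\.\d{4,5}", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s\"{},]+")
KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def split_entries(bibtex):
    """Naively slice a BibTeX string into entries at each '@type{key,' header."""
    starts = [m.start() for m in KEY_RE.finditer(bibtex)]
    bounds = starts + [len(bibtex)]
    return [bibtex[s:e] for s, e in zip(starts, bounds[1:])]


def prescreen(bibtex):
    """Map each citation key to the identifier types found in its entry."""
    report = {}
    for entry in split_entries(bibtex):
        key = KEY_RE.match(entry).group(1)
        found = []
        if DOI_RE.search(entry):
            found.append("doi")
        if ARXIV_RE.search(entry):
            found.append("arxiv")
        if URL_RE.search(entry):
            found.append("url")
        report[key] = found
    return report


# Placeholder entries; the identifiers are deliberately fake.
SAMPLE = """
@inproceedings{entry_with_doi,
  title = {Placeholder Title One},
  doi = {10.0000/example.1}
}
@article{entry_with_arxiv,
  title = {Placeholder Title Two},
  journal = {arXiv preprint arXiv:0000.00000}
}
@article{entry_without_id,
  title = {Placeholder Title Three},
  year = {2099}
}
"""

report = prescreen(SAMPLE)
needs_manual_check = [key for key, ids in report.items() if not ids]
print(needs_manual_check)
```

This is triage only. A fabricated entry can carry a fake DOI, so an identifier being present does not show that it resolves to the cited work; that still requires a lookup against the registry or a human check.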

Conclusion: Rebuilding Trust in AI Research

The discovery of 100+ hallucinated citations across more than 50 NeurIPS 2025 papers shows that even elite venues can no longer assume bibliographies are reliable in the age of large language models.

Conference organizers, reviewers, and authors now need institutionalized integrity measures: systematic hallucination scanning, explicit LLM-usage documentation, and prompt corrections or retractions when AI-assisted fabrication is found. The credibility of AI research depends on whether this episode is treated as a brief scandal or as the turning point that forced a new standard of rigor.



Top comments (1)

Hollow House Institute • Apr 10

This looks like a hallucination problem on the surface, but structurally it’s a Decision Boundary failure.

The system is allowed to generate references without enforcement at the point of creation.

Once that happens, everything downstream relies on verification catching it.

Decision Boundary:
If a citation cannot be resolved to a verifiable source, it should not be allowed into the document.

Intervention Threshold:
When reference generation occurs without validation, escalation should trigger immediately before submission.

Stop Authority:
If citations remain unverified, the paper should not proceed to review or publication.

Right now the system is:

generate → trust → review → publish

Instead of:

generate → validate → enforce → proceed

That’s why this passes through peer review.

The failure isn’t just model behavior.

It’s that verification is optional instead of enforced at runtime.
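
A minimal Python sketch of the generate → validate → enforce → proceed flow, for anyone who wants to picture it. Everything here is hypothetical: `Citation`, `UnverifiedCitationError`, `submit`, and the injected `resolver` callable are names invented for illustration, and how a resolver actually confirms a source is left open.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Citation:
    key: str
    title: str


class UnverifiedCitationError(Exception):
    """Raised when at least one citation could not be resolved."""

    def __init__(self, unresolved: List[Citation]):
        self.unresolved = list(unresolved)
        keys = ", ".join(c.key for c in self.unresolved)
        super().__init__(f"unresolved citations: {keys}")


def validate(citations: Iterable[Citation],
             resolver: Callable[[Citation], bool]) -> List[Citation]:
    """Return the citations the resolver could not confirm."""
    return [c for c in citations if not resolver(c)]


def enforce(unresolved: List[Citation]) -> None:
    """Stop authority: block the pipeline if anything is unresolved."""
    if unresolved:
        raise UnverifiedCitationError(unresolved)


def submit(citations, resolver, proceed):
    """generate -> validate -> enforce -> proceed; 'proceed' runs only if clean."""
    citations = list(citations)
    enforce(validate(citations, resolver))
    return proceed(citations)


# Stub resolver for demonstration: only titles in this set count as verifiable.
known_titles = {"A Known Title"}
resolver = lambda c: c.title in known_titles

accepted = submit([Citation("a", "A Known Title")], resolver, proceed=len)

try:
    submit([Citation("b", "A Title Nobody Can Find")], resolver, proceed=len)
except UnverifiedCitationError as err:
    blocked = [c.key for c in err.unresolved]
```

The point of the shape is that `proceed` is unreachable while any citation is unresolved, so verification stops being an optional downstream step.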