AI-Generated Citations: The Dark Side of NeurIPS 2025 | Exposing Fake References (2026)

Imagine discovering that more than 100 citations in papers from one of the world’s most prestigious AI conferences were completely fabricated. That is what a recent report claims happened at NeurIPS 2025, sending shockwaves through the academic and AI communities. These AI-generated citations, known as 'hallucinations,' slipped past reviewers and into the official record of a conference whose 2025 acceptance rate was just 24.52%. How did this happen, and what does it mean for the future of AI research? Let’s dive in.

NeurIPS, short for Neural Information Processing Systems, held its 39th annual meeting in San Diego in December 2025, attracting tens of thousands of submissions and participants. What began as a niche academic gathering has become a high-stakes arena for top AI labs, where a standout paper can catapult researchers into lucrative careers. Being selected for a live presentation at NeurIPS is a badge of honor, marking the presenter as part of the field’s elite.

But this elite status is now under scrutiny. Canadian startup GPTZero analyzed over 4,000 papers accepted for NeurIPS 2025 and uncovered a startling issue: at least 53 papers contained AI-hallucinated citations, ranging from entirely fabricated references to subtly altered ones. These weren’t minor errors: some included nonexistent authors, fake journals, and URLs leading to dead ends. Nor were these papers mere submissions; they were accepted and presented, meaning they had passed peer review.

NeurIPS reviewers were explicitly instructed to flag hallucinations in 2025, yet these citations still made it through. The conference’s board responded by acknowledging the evolving role of Large Language Models (LLMs) in research and emphasizing that even if 1.1% of papers contain incorrect references, their core content isn’t necessarily invalid. For instance, authors might have used an LLM to generate a formatted citation from a partial description. Still, the question remains: if citations are meant to anchor research in existing work, what happens when those anchors are imaginary?

Edward Tian, cofounder and CEO of GPTZero, highlighted the gravity of the situation. Just weeks earlier, his team had uncovered 50 hallucinated citations in papers under review for ICLR, another top AI conference. In that case, the papers hadn’t been accepted yet, but the citations had already fooled peer reviewers. ICLR has since hired GPTZero to screen future submissions, but the NeurIPS findings are more alarming because the errors appeared in accepted papers. In a field where ‘publish or perish’ is the mantra, even a single fabricated citation should, in theory, disqualify a paper. Yet these papers not only survived review but were accepted ahead of more than 15,000 rejected submissions.

Tian pointed out that roughly half of the papers with hallucinated citations were likely AI-generated or heavily AI-assisted. But GPTZero’s focus wasn’t on identifying AI-written text, a task often criticized for false positives; it was on verifying citations. The company’s tool searches the open web and academic databases to confirm whether a cited paper exists, and GPTZero says it is more than 99% accurate. For the NeurIPS analysis, every flagged citation was also manually reviewed by a human expert.

Alex Cui, GPTZero’s CTO, explained how the tool works: it scans a paper, verifies each citation’s authors, title, publication venue, and link, and flags anything that doesn’t match. This catches not only fully fabricated citations but also subtle errors, like adding nonexistent authors to real papers. ‘These are mistakes no human would reasonably make,’ Cui noted.
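To make the checking steps Cui describes more concrete, here is a minimal sketch in Python. It is not GPTZero’s implementation, which has not been published in this article. Everything in it is an assumption for illustration: the `Citation` record, the `check_citation` function, the 0.9 title-similarity threshold, and the small in-memory `INDEX` that stands in for a real search of the web and academic databases. The entries are placeholder data, not real publications.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Citation:
    authors: Tuple[str, ...]
    title: str
    venue: str
    url: str


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def find_record(cited: Citation, index: List[Citation],
                min_ratio: float = 0.9) -> Optional[Citation]:
    """Return the index entry with the most similar title, or None if none is close enough."""
    best, best_ratio = None, 0.0
    for record in index:
        ratio = SequenceMatcher(None, _norm(cited.title), _norm(record.title)).ratio()
        if ratio > best_ratio:
            best, best_ratio = record, ratio
    return best if best_ratio >= min_ratio else None


def check_citation(cited: Citation, index: List[Citation]) -> List[str]:
    """Compare a cited reference with the closest known record and list any mismatches."""
    record = find_record(cited, index)
    if record is None:
        return ["no matching publication found"]
    problems = []
    known_authors = {_norm(a) for a in record.authors}
    extra = [a for a in cited.authors if _norm(a) not in known_authors]
    if extra:
        problems.append("authors not on the record: " + ", ".join(extra))
    if _norm(cited.venue) != _norm(record.venue):
        problems.append("venue does not match the record")
    if _norm(cited.url) != _norm(record.url):
        problems.append("link does not match the record")
    return problems


# Hypothetical stand-in for a bibliographic database (placeholder data only).
INDEX = [
    Citation(("A. Author", "B. Author"), "An Example Paper on Widgets",
             "Example Conference 2020", "https://example.org/widgets"),
]

# A real paper cited with an extra, nonexistent author gets flagged.
padded = Citation(("A. Author", "B. Author", "C. Ghost"), "An Example Paper on Widgets",
                  "Example Conference 2020", "https://example.org/widgets")
for problem in check_citation(padded, INDEX):
    print(problem)
```

An empty result means every field agreed with the record. A production system would also have to handle name variants, preprint versus published versions, and redirected links, which is presumably part of why flagged citations were still reviewed by a human.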

But here’s the bigger question: Is AI making academic fraud easier, or are we just better at detecting it now? The sheer volume of submissions—21,575 in 2025 alone—makes deep scrutiny nearly impossible, even with thousands of volunteer reviewers. AI tools like GPTZero are stepping in to fill the gap, but the reputational risks remain. A flawed paper doesn’t just harm the authors; it undermines the credibility of the conference and the companies hiring based on those credentials.
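The figures quoted in this article can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above (21,575 submissions, a 24.52% acceptance rate, 53 flagged papers, a 1.1% share). The derived values are back-of-the-envelope estimates, not official counts, and the variable names are illustrative.

```python
submissions = 21_575        # submissions in 2025, as quoted above
acceptance_rate = 0.2452    # 24.52% acceptance rate
flagged_papers = 53         # papers GPTZero flagged
flagged_share = 0.011       # the 1.1% share cited by the conference board

# Estimated accepted and rejected papers implied by the acceptance rate.
accepted = round(submissions * acceptance_rate)
rejected = submissions - accepted

# Number of papers that would make 53 flagged papers equal to 1.1%.
implied_pool = round(flagged_papers / flagged_share)

print(accepted, rejected, implied_pool)
```

On these numbers, about 5,290 papers were accepted and about 16,285 were not, which fits the description of accepted papers beating out more than 15,000 others. The 1.1% share implies a pool of roughly 4,800 papers, in line with the "over 4,000" that GPTZero analyzed.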

Citations are particularly critical in AI research, where reproducibility is a persistent challenge. Hallucinated citations don’t just mislead readers—they erode trust in the scientific process. As Tian put it, ‘This is a big moment.’ But what’s the solution? Stricter review processes? Better AI detection tools? Or a fundamental shift in how we approach research in the age of AI?

What do you think? Is this a minor hiccup or a sign of deeper issues in AI research? Let us know in the comments—we’d love to hear your take on this controversial topic.

Author: Twana Towne Ret
