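Article Abstract
GPTZero found 100 new fabrications in papers accepted to NeurIPS 2025, including forged author lists and citations whose titles match a real article but whose content differs. For example, the entry for the "Webvoyager" paper has falsified author information and an arXiv link that points to a different article; another paper cites a nonexistent IEEE journal article. These findings highlight the credibility problem in AI-generated papers.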
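Article Summary
GPTZero finds 100 AI-hallucinated citations in papers accepted to NeurIPS 2025
Core Findings
The AI-detection platform GPTZero scanned papers accepted to NeurIPS 2025 and identified 100 problematic citations across 4,841 papers. These "hallucinated citations" typically show the following features:
1. Fabricated author information: real scholars' names combined with invented surnames (e.g. "Samuel LeCun Jackson")
2. Spliced or rewritten titles: several real paper titles blended into a plausible-looking new one
3. Fake identifiers: forged DOIs, arXiv IDs, or conference page numbers (e.g. a nonexistent CVPR 2023 paper)
4. Mismatched venue information: correct authors paired with the wrong journal (e.g. a Nature paper altered to appear in "Science & Nature")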
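Typical Cases
- One paper cites a fabricated "Deep Learning for Hyper-Realistic Avatar Creation" (CVPR 2022); no such paper appears in the CVPR proceedings
- Citation chains contain incomplete preprint identifiers such as arXiv:2401.XXXXX
- Real research teams (e.g. Tri Dao et al.) are wrongly credited with work that is not theirs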
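Placeholder identifiers like the arXiv:2401.XXXXX case above can be caught with a purely syntactic check before any online lookup. The following is a minimal sketch, not GPTZero's method: the function name `is_well_formed_arxiv_id` is hypothetical, and the pattern covers only new-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN with an optional version suffix), so old-style identifiers such as `hep-th/9901001` are rejected by design.

```python
import re

# New-style arXiv identifiers: YYMM followed by a 4- or 5-digit sequence
# number, optionally followed by a version suffix such as "v2".
ARXIV_ID = re.compile(r"(\d{2})(\d{2})\.(\d{4,5})(v\d+)?")


def is_well_formed_arxiv_id(identifier: str) -> bool:
    """Return True if the string has the shape of a new-style arXiv ID.

    This checks the format only; it does not confirm the paper exists.
    """
    text = identifier.strip()
    # Accept an optional "arXiv:" prefix in any letter case.
    if text.lower().startswith("arxiv:"):
        text = text[len("arxiv:"):]
    match = ARXIV_ID.fullmatch(text)
    if match is None:
        return False
    # The second pair of digits is a month and must be 01-12.
    month = int(match.group(2))
    return 1 <= month <= 12


if __name__ == "__main__":
    for candidate in ["arXiv:2401.XXXXX", "2401.12345v2", "2413.12345"]:
        print(candidate, is_well_formed_arxiv_id(candidate))
```

A well-formed identifier is only a necessary condition: a citation that passes this check still needs to be resolved against arXiv itself and compared with the cited title and authors.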
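Technical Background
GPTZero's detection targets what it calls "vibe citing", flagging anomalies along the following dimensions:
- Online source verification (a 99% figure is reported; the source labels it a false-detection rate, which is more plausibly a detection rate)
- Author-publication association analysis
- Consistency checks on citation metadata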
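The author-publication and metadata-consistency dimensions above can be illustrated by comparing a cited entry against a record retrieved from a bibliographic source. This is a sketch under stated assumptions, not GPTZero's implementation: the dictionary keys (`title`, `authors`, `venue`), the function `check_citation`, and both thresholds are hypothetical, and the online lookup itself is left out, so `record` is simply passed in (or `None` when nothing was found).

```python
from difflib import SequenceMatcher
from typing import Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def title_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def author_overlap(cited: list, found: list) -> float:
    """Jaccard overlap between two sets of normalized author names."""
    cited_set = {normalize(name) for name in cited}
    found_set = {normalize(name) for name in found}
    if not cited_set or not found_set:
        return 0.0
    return len(cited_set & found_set) / len(cited_set | found_set)


def check_citation(
    cited: dict,
    record: Optional[dict],
    title_threshold: float = 0.9,   # assumed cutoff, not a published value
    author_threshold: float = 0.5,  # assumed cutoff, not a published value
) -> list:
    """Return a list of human-readable issues; an empty list means consistent."""
    if record is None:
        return ["no matching record found online"]
    issues = []
    if title_similarity(cited["title"], record["title"]) < title_threshold:
        issues.append("title mismatch")
    if author_overlap(cited["authors"], record["authors"]) < author_threshold:
        issues.append("author mismatch")
    if normalize(cited["venue"]) != normalize(record["venue"]):
        issues.append("venue mismatch")
    return issues


if __name__ == "__main__":
    # Made-up placeholder data, not taken from any real paper.
    cited = {"title": "An Example Paper Title",
             "authors": ["A. Author", "B. Writer"],
             "venue": "Science & Nature"}
    record = {"title": "An Example Paper Title",
              "authors": ["C. Someone"],
              "venue": "Nature"}
    print(check_citation(cited, record))
```

Exact string comparison on venues is deliberately strict here; a practical checker would need to map abbreviations and proceedings names to a canonical form before comparing.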
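Industry Impact
This finding follows GPTZero's earlier discovery of more than 50 citation problems in ICLR 2026 papers. Academic publishing now faces new integrity challenges from AI-assisted writing, including:
- "Plausible but false" references generated automatically by LLMs
- Synthetic citations that traditional plagiarism checkers cannot detect
- Subtle errors that take 3 to 5 reviewers cross-checking to catch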
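Solutions
GPTZero has released a tool called Hallucination Check, which offers:
- Scanning of references for online verifiability
- Three-way validation of author, title, and publication venue
- Joint analysis with AI-text detection
The team is currently working with conferences such as ICLR to integrate the detection workflow into paper review. According to the reported data, using the tool saves roughly US$5,000 per paper in expert re-verification costs.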
Comment Summary
1. The harm of AI-generated papers
- Main point: AI-generated papers will worsen fraud in research and lower the credibility of published work.
- Evidence:
  - "There is already a problem with papers falsifying data/samples/etc, LLMs being able to put out plausible papers is just going to make it worse." (Comment 1)
  - "AI generated hypothesis -> AI produces code to implement and execute the hypothesis -> AI generates paper based on the hypothesis and the code." (Comment 16)
2. The failure of peer review
- Main point: Peer review did not catch the AI-generated fabrications, exposing gaps in the review process.
- Evidence:
  - "It's very concerning that these hallucinations passed through peer review... reviewers did not check all references and noticed clearly bogus ones." (Comment 7)
  - "These clearly aren’t being peer-reviewed, so there’s no natural check on LLM usage." (Comment 9)
3. Distorted research incentives
- Main point: The current "publish or perish" system rewards low-quality research, and AI makes this worse.
- Evidence:
  - "Getting papers published is now more about embellishing your CV versus a sincere desire to present new research." (Comment 22)
  - "The sheer volume of papers makes it nearly impossible to properly decide which papers have merit." (Comment 14)
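4. Solutions and suggestions
- Main point: Stronger oversight, a better review process, or a thorough overhaul of how research is evaluated is needed.
- Evidence:
  - "There needs to be a serious amount of education done on what these tools can and cannot do." (Comment 23)
  - "Doing this should get you barred from research. It won’t." (Comment 28)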
5. Controversy over AI-detection tools
- Main point: AI-detection tools can themselves misjudge and should be used with caution.
- Evidence:
  - "What repercussion does GPTZero get when their bullshit AI detection hallucinates a student using AI?" (Comment 17)
  - "Should be extremely easy for AI to successfully detect hallucinated references." (Comment 18)
6. Criticism of the AI field
- Main point: AI/ML research is full of hype and bogus work, and the problem predates AI-generated content.
- Evidence:
  - "A lot of research in AI/ML seems to me to be 'fake it and never make it'." (Comment 10)
  - "Machine learning has been the go-to field for scammers and grifters." (Comment 20)
7. Neutral or optimistic views
- Main point: AI-generated content is not necessarily worthless; the use case matters.
- Evidence:
  - "Even if 1.1% of the papers have incorrect references... the content of the papers themselves are not necessarily invalidated." (Comment 13)
  - "On the bright side, maybe this will get the scientific community to finally take reproducibility more seriously." (Comment 1)
Key Quotes
- Negative views:
  - "Yuck, this is going to really harm scientific research." (Comment 1)
  - "It would be great if those scientists who use AI without disclosing it get fucked for life." (Comment 2)
- Proposed solutions:
  - "Why not just incorporate some kind of screening into the early stages of peer review?" (Comment 3)
  - "All papers proved to have used a LLM beyond writing improvement should be automatically retracted." (Comment 27)
- Criticism of the field:
  - "Machine learning is basically a few real ideas... and then millions of fad chasers and scammers." (Comment 20)
  - "Getting a paper published anywhere is a checkbox in completing your resume." (Comment 22)
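Summary: Commenters are broadly worried about the impact of AI-generated papers on research integrity. They criticize failing peer review and distorted incentives, and propose remedies such as stronger oversight and better tooling; some argue the problem reflects wider issues in the academic ecosystem.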