Hacker News Chinese Digest

GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers

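Article abstract

GPTZero found 100 new fabricated items in papers accepted to NeurIPS 2025, including invented authors and citations whose titles match a real article while the content differs. In one case, the author information given for the "Webvoyager" paper is fabricated and the arXiv link points to a different article; another paper cites a nonexistent IEEE journal article. These findings highlight the credibility problems of AI-generated content in current papers.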

Article summary

GPTZero finds 100 AI-hallucinated citations in NeurIPS 2025 accepted papers

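Key findings
GPTZero, an AI-detection platform, scanned the papers accepted to NeurIPS 2025 and identified 100 problematic citations across 4,841 papers. These "hallucinated citations" share the following typical features:
1. Fabricated author information: a real scholar's name combined with an invented surname (e.g. "Samuel LeCun Jackson")
2. Spliced and reworded titles: titles of several real papers blended into a plausible-looking new title
3. Fake bibliographic identifiers: forged DOIs, arXiv IDs, or conference page numbers (e.g. a nonexistent CVPR 2023 paper)
4. Misplaced journal information: correct authors paired with the wrong journal (e.g. a Nature paper altered to appear in "Science & Nature")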

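Typical cases
- One paper cites a fictitious "Deep Learning for Hyper-Realistic Avatar Creation" (CVPR 2022); no such paper appears in the CVPR proceedings
- Citation chains contain incomplete preprint numbers such as arXiv:2401.XXXXX
- Real research teams (e.g. Tri Dao et al.) are wrongly credited with work that is not theirs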
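
The incomplete preprint number in the second case is the kind of error a purely syntactic check can catch. The following is a minimal sketch, not GPTZero's method: the function name `is_well_formed_arxiv_id` is hypothetical, and the pattern covers only the new-style arXiv scheme (YYMM followed by a 4- or 5-digit sequence number). A well-formed identifier can still point to a paper that does not exist, so this only filters obviously broken entries.

```python
import re

# New-style arXiv identifiers look like YYMM.NNNN or YYMM.NNNNN,
# optionally prefixed with "arXiv:" and followed by a version such as v2.
ARXIV_ID = re.compile(r"^(?:arXiv:)?(\d{2})(\d{2})\.(\d{4,5})(v\d+)?$", re.IGNORECASE)


def is_well_formed_arxiv_id(identifier: str) -> bool:
    """Return True if the string is a syntactically complete new-style arXiv ID.

    This checks the format only; it does not confirm that the paper exists.
    """
    match = ARXIV_ID.match(identifier.strip())
    if match is None:
        return False
    month = int(match.group(2))
    return 1 <= month <= 12


if __name__ == "__main__":
    for candidate in ["arXiv:2401.XXXXX", "arXiv:2401.12345", "2413.12345"]:
        print(candidate, is_well_formed_arxiv_id(candidate))
```

A format check like this is cheap enough to run on every reference before any online lookup is attempted.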

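Technical background
GPTZero applies detection aimed at "vibe citing" and flags anomalies along the following dimensions:
- Online source verification (stated false-detection rate: 99%)
- Author-publication association analysis
- Consistency checks on bibliographic metadata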

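Industry impact
This finding follows the more than 50 citation problems GPTZero previously found in ICLR 2026 papers. Academic publishing now faces new research-integrity challenges from AI-assisted writing, including:
- "Plausible but false" references generated automatically by LLMs
- Synthetic citations that traditional plagiarism checkers cannot detect
- Subtle errors that take 3 to 5 reviewers cross-checking to catch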

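Solution
GPTZero has released a "Hallucination Check" tool that offers:
- A scan of whether each reference can be verified online
- A three-way check of author, title, and publication
- Joint analysis with AI-text detection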
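
To make the author-title-publication check concrete, here is a minimal sketch under stated assumptions. It is not GPTZero's implementation: the `Reference` type, the `check_reference` function, the 0.9 similarity threshold, and the sample data are all hypothetical, and the trusted record is assumed to have already been fetched from some bibliographic source.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Reference:
    title: str
    authors: tuple[str, ...]
    venue: str


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so formatting differences are ignored.
    return " ".join(text.lower().split())


def _similar(a: str, b: str, threshold: float = 0.9) -> bool:
    # Fuzzy string match; the threshold is an arbitrary illustrative choice.
    return SequenceMatcher(None, _normalize(a), _normalize(b)).ratio() >= threshold


def check_reference(cited: Reference, record: Reference) -> dict[str, bool]:
    """Compare a cited reference with a trusted record, field by field."""
    cited_authors = {_normalize(name) for name in cited.authors}
    record_authors = {_normalize(name) for name in record.authors}
    return {
        "title": _similar(cited.title, record.title),
        # Every cited author must appear in the record; a shortened
        # author list ("et al.") is tolerated, an invented name is not.
        "authors": bool(cited_authors) and cited_authors <= record_authors,
        "venue": _similar(cited.venue, record.venue),
    }


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    record = Reference("A Hypothetical Paper Title", ("Jane Doe", "John Smith"), "Nature")
    cited = Reference(
        "A hypothetical  paper title",
        ("Jane Doe", "John Smith Jackson"),
        "Science & Nature",
    )
    print(check_reference(cited, record))
```

Reporting one flag per field, rather than a single pass/fail, mirrors the error types listed earlier: a citation can have a correct title while carrying an invented author or the wrong venue.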

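The team is now working with conferences such as ICLR to integrate the check into the paper review process. According to the figures cited, using the tool can save about US$5,000 per paper in expert re-verification costs.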

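(Note: this summary keeps the original article's core data and detection methodology, trims repeated case listings, and emphasizes the warning about this new form of academic misconduct.)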

Comment roundup

Summary of comments

1. Harm from AI-generated papers

  • Main point: AI-generated papers will worsen fraud in research and lower the credibility of published work.
  • Evidence
    • "There is already a problem with papers falsifying data/samples/etc, LLMs being able to put out plausible papers is just going to make it worse." (comment 1)
    • "AI generated hypothesis -> AI produces code to implement and execute the hypothesis -> AI generates paper based on the hypothesis and the code." (comment 16)

2. Failure of peer review

  • Main point: Peer review did not catch the AI-generated fabrications, which exposes gaps in the review process.
  • Evidence
    • "It's very concerning that these hallucinations passed through peer review... reviewers did not check all references and noticed clearly bogus ones." (comment 7)
    • "These clearly aren’t being peer-reviewed, so there’s no natural check on LLM usage." (comment 9)

3. Distorted research incentives

  • Main point: The current "publish or perish" system rewards low-quality research, and AI makes this worse.
  • Evidence
    • "Getting papers published is now more about embellishing your CV versus a sincere desire to present new research." (comment 22)
    • "The sheer volume of papers makes it nearly impossible to properly decide which papers have merit." (comment 14)

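4. Solutions and suggestions

  • Main point: Stronger oversight, a better review process, or a complete overhaul of how research is evaluated is needed.
  • Evidence
    • "There needs to be a serious amount of education done on what these tools can and cannot do." (comment 23)
    • "Doing this should get you barred from research. It won’t." (comment 28)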

5. Disputes over AI detection tools

  • Main point: AI detection tools can themselves misjudge and should be used with caution.
  • Evidence
    • "What repercussion does GPTZero get when their bullshit AI detection hallucinates a student using AI?" (comment 17)
    • "Should be extremely easy for AI to successfully detect hallucinated references." (comment 18)

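6. Criticism of the AI field

  • Main point: The AI/ML field has long had a large amount of hype and fake research; the problem predates AI-generated content.
  • Evidence
    • "A lot of research in AI/ML seems to me to be 'fake it and never make it'." (comment 10)
    • "Machine learning has been the go-to field for scammers and grifters." (comment 20)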

7. Neutral or optimistic views

  • Main point: AI-generated content does not necessarily invalidate a paper; the use case matters.
  • Evidence
    • "Even if 1.1% of the papers have incorrect references... the content of the papers themselves are not necessarily invalidated." (comment 13)
    • "On the bright side, maybe this will get the scientific community to finally take reproducibility more seriously." (comment 1)

Key quotes retained

  • Negative views
    • "Yuck, this is going to really harm scientific research." (comment 1)
    • "It would be great if those scientists who use AI without disclosing it get fucked for life." (comment 2)
  • Proposed solutions
    • "Why not just incorporate some kind of screening into the early stages of peer review?" (comment 3)
    • "All papers proved to have used a LLM beyond writing improvement should be automatically retracted." (comment 27)
  • Criticism of the field
    • "Machine learning is basically a few real ideas... and then millions of fad chasers and scammers." (comment 20)
    • "Getting a paper published anywhere is a checkbox in completing your resume." (comment 22)

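Summary: Commenters are broadly worried about the impact of AI-generated papers on research integrity and criticize failing peer review and distorted incentives. They also propose remedies such as oversight and better tooling, and some argue that the problem reflects wider issues in the academic ecosystem.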