Hacker News Chinese Digest

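Article Abstract

The article argues that open-source communities are being flooded with AI-generated spam. Using its own GitHub repository as the case study, it describes AI bots producing large volumes of low-quality comments and hostile content, which disrupts normal developer discussion and degrades the environment for open-source collaboration. The author sees this as a sign that the traditional open-source model is changing fundamentally.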

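Article Summary

When Open Source Meets the AI Flood: Archestra Fights Back

The AI Crisis in Open-Source Communities

While GitHub proudly announces how much AI contributes to its platform metrics, it overlooks the decline in contribution quality. The Archestra team first noticed something was wrong after posting a $900 bounty: it attracted real developers to the discussion, but the thread was soon buried under 253 AI-generated spam comments, including meaningless "implementation plans" and even attacks on the maintainers.

The flood of AI accounts spread quickly:
- A request simply to add x.ai support received 27 untested PRs
- The team spends half a day each week cleaning up AI spam
- Discussion from real contributors such as @ethanwater was completely buried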

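Three Rounds of Countermeasures

Reputation scoring
Built the "London-Cat" bot, which rates contributor reputation using metrics such as merged PRs (an example is in the related issue). This treated the symptoms, not the cause.
AI sheriff
A bot that automatically closes suspicious PRs (see the example PR), but with a high false-positive rate.
The nuclear option
A five-step onboarding process (Figure 1), which includes:
- Acknowledging the ethical-AI rules on the website, plus a CAPTCHA
- Automatically recording the contributor's details in EXTERNAL_CONTRIBUTORS.md
- Binding the identity through the GitHub API (the detailed code flow is in the technical section of the original article)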
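
The step that records a contributor in EXTERNAL_CONTRIBUTORS.md could be sketched roughly as follows. This is a minimal illustration, not the Archestra implementation: the helper name `add_contributor` and the `- @username` line format are assumptions, since the summary does not say how the file is structured.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: record a contributor in the allow-list file.
# Appends "- @USERNAME" only if that exact line is not already present,
# so re-running onboarding for the same user does not create duplicates.
# The "- @USERNAME" format is an assumption made for this sketch.
add_contributor() {
  local file="$1" username="$2"
  local entry="- @${username}"
  touch "$file"
  # -x: match the whole line, -F: fixed string (no regex), -q: quiet
  if ! grep -qxF -- "$entry" "$file"; then
    printf '%s\n' "$entry" >> "$file"
  fi
}

# Demo against a throwaway file rather than a real repository.
demo_file="$(mktemp)"
add_contributor "$demo_file" "some-user"
add_contributor "$demo_file" "some-user"   # second call is a no-op
cat "$demo_file"
```

Keeping the append idempotent means the onboarding job can be retried safely if a later step, such as the commit or push, fails.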

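The Implementation Trick

The approach relies on GitHub's "Limit to prior contributors" interaction limit (Figure 2). A maintainer makes a commit with the external contributor set as the commit author via the --author flag (Figure 3), so that the account counts as a prior contributor:

    # Look up the contributor's numeric GitHub user ID
    gh api users/their-username --jq '.id'
    # Commit with the contributor as author; ID is the number returned above
    git commit --author="their-username <ID+their-username@users.noreply.github.com>" -m "Add contributor"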
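
As a small sketch of how the author string passed to --author might be assembled, the function below formats a numeric user ID and a login into GitHub's `ID+username@users.noreply.github.com` address form. The function name `noreply_author` and the sample values are hypothetical; only the address format and the `gh api` lookup come from the commands above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a Git author string from a numeric GitHub user ID and a login,
# using GitHub's "ID+username@users.noreply.github.com" address format.
noreply_author() {
  local id="$1" username="$2"
  # Reject non-numeric IDs early so a failed lookup does not yield a bogus author.
  case "$id" in
    ''|*[!0-9]*) echo "user ID must be numeric: '$id'" >&2; return 1 ;;
  esac
  printf '%s <%s+%s@users.noreply.github.com>\n' "$username" "$id" "$username"
}

# Illustrative values only; in practice the ID would come from
#   gh api users/<username> --jq '.id'
noreply_author 12345 some-user
# prints: some-user <12345+some-user@users.noreply.github.com>
```

The result can then be passed to `git commit --author="$(noreply_author "$id" "$username")"`. Validating the ID first avoids committing with a malformed author if the lookup returned an error message instead of a number.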

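A Warning for Open-Source Communities

Security risk: the LiteLLM repository experienced an incident in which AI bots were used to steer a malicious attack
Quality crisis: GitHub's growth metrics contain a large amount of AI noise
Community consensus: open source needs content-filtering mechanisms suited to the AI era

(Note: secondary details from the original article, such as the hiring test task, have been condensed; the core problem description and the technical solution are retained.)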

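Comment Summary

The comments are summarized below:

Support for the solution
- The approach described by the article's author, ildari, of combining the "require prior contribution" setting with a CAPTCHA to cut down AI spam PRs was well received; it blocked at least 500 bots in the first week.
  "Worked really well and we were able to block at least 500 bots in the first week."
- petterroea considers it a clever solution and also credits GitHub for the tools it provides.
  "What I see is a (clever) hack, and GitHub continuing to provide good tools to its users."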
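Doubts and suggested improvements
- captn3m0 points out a loophole: a malicious user could satisfy the prior-contributor requirement simply by getting a trivial change merged.
  "A malicious user could meet this requirement by getting a simple typo or other innocuous change accepted by a maintainer."
- zzzeek suggests using hooks to automatically reject users who have not completed onboarding, rather than relying on a temporary GitHub feature.
  "why not use hooks to automatically reject issue comments / PRs etc. from users that didnt go through onboarding?"
- optionalsquid argues the problem has not been solved, only moved from pull requests into the commit history.
  "It has just been moved from pull requests to commits."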
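Alternative proposals
- arecsu proposes an ELO-style rating system that scores users by the quality of their contributions and filters accordingly.
  "Issues and PRs could be sorted and filtered by their ELO score."
- silverwind suggests GitHub temporarily block accounts whose PRs are rejected at a very high rate.
  "Maybe GitHub should temporarily block accounts from raising PRs if like 95%+ of them are getting rejected."
- infinitifall half-jokingly suggests "catgirls" as the verification mechanism.
  "Is the solution to everything simply more catgirls?"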
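Root-cause analysis
- jart argues that monetary rewards harm open-source communities and that respect and recognition should take their place.
  "This is great example of the toxic effect money has on open source."
- embedding-shape notes that part of the problem came from the bounty issue being poorly specified.
  "attaching $900 USD bounty to a very under-specified issue...Sounds like they got exactly what they'd been asking for."
- thih9 spots signs of AI writing in the project's own documentation and finds the measures inadequate.
  "The writing style in their onboarding doc has common AI tells...it all feels like inadequate half measures to me."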
Other observations
- hiccuphippo notes the irony of the project using a .ai domain.
  "The irony of the .ai domain."
- ramon156 praises the way the article uses dashes.
  "See, this is an article that uses dashes correctly."

Original title: We stopped AI bot spam in our GitHub repo using Git's --author flag