Hacker News Digest

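Article abstract

Anthropic has released a new AI model, Fable, and its heavy restrictions on cybersecurity-related queries have frustrated experts. To keep the model from being used to develop malware, Anthropic gave it strict guardrails that refuse even harmless requests. The company restricts biology and cybersecurity topics for safety reasons, but professionals say the restrictions are arbitrary and get in the way of normal use. The earlier Mythos model is likewise available only to selected organizations.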

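Article summary

Anthropic's new AI model Fable sparks a dispute over safety restrictions

The AI company Anthropic released its latest model, Fable, this Tuesday. It is positioned as a publicly available, limited version of Mythos, the company's closely watched cybersecurity model. Its strict safety restrictions, however, have drawn widespread complaints from cybersecurity researchers.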

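Several security experts complained on social media that Fable's guardrails are oversensitive to anything that touches on cybersecurity. Valentina "Chompie" Palmiotti, a well-known security researcher at IBM X-Force, said that even a harmless request such as reading a blog post is refused if it has the slightest connection to cybersecurity. When a guardrail is triggered, Fable interrupts the conversation with a notice that its safety system has detected that the message involves cybersecurity or biotechnology topics.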

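The restrictions stem from a long-standing concern at Anthropic: preventing AI from being used to develop malware or attack systems. The parallel restrictions on biotechnology are meant to prevent the development of biological weapons. In April of this year, Anthropic opened Mythos to a limited number of companies and organizations through "Project Glasswing", and last week it extended access to hundreds of organizations in 15 countries.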

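Cybersecurity experts nevertheless consider the restrictions arbitrary. Veteran security expert Matt Suiche noted that when the model is asked to write secure code, it classifies the request as cybersecurity work rather than as software engineering best practice, and its capabilities are downgraded as a result. Once a guardrail is triggered, Fable automatically switches to Claude Opus 4.8. Suiche suspects that the guardrails may be keyword-triggered.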

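Despite the dispute, Suiche said he understands the approach: it is still early days, and frontier model companies such as Anthropic need time to find their footing with a new generation of cybersecurity companies. In his view, strict limits are necessary at the start and can be relaxed gradually later. Another researcher complained on X that even a basic task such as requesting a code review triggers the guardrails.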

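Beyond the guardrails built into the model, Anthropic also asks cybersecurity professionals to apply to a "cyber verification program"; approved users get broader permissions when using Claude for security research. OpenAI runs a similar "Trusted Access for Cyber" program.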

Comment summary

1. Support for strict guardrails

These commenters see the guardrails as necessary to limit potential risks, particularly in cybersecurity and biosecurity.
- "It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time." (Suiche, comment 10)
- "AI can in principle help both the ‘good guys’ and the ‘bad guys’," (Dario Amodei, comment 19)

2. Opposition to excessive guardrails

These commenters find the current guardrails too strict, hurting legitimate research and productivity, especially in cybersecurity and code auditing.
- "DeepSeek is the only one that I can directly ask about vulnerabilities and it will give me a PoC. The rest have guard rails that are so heavy, it makes them almost useless for cybersecurity." (jazz9k, comment 1)
- "Fable 5 rejects the vast majority of my prompts to analyze and improve the software that I’ve written. It’s bleak." (Sephr, comment 22)

3. A double standard in the guardrails

These commenters argue that the guardrails treat attackers and researchers unequally: attackers can work around the restrictions, while researchers are blocked.
- "So a determined attacker rewrites the prompt and gets through, and the IBM X-Force researcher trying to read a blog post gets blocked." (outageroom, comment 2)
- "If the frontier models get locked down so that they flat refuse to do this kind of work, but Chinese and open models aren’t, then a lot of large enterprise orgs will be left twisting in the wind." (jiggawatts, comment 19)

4. Doubts about the real purpose of the guardrails

These commenters suspect the guardrails serve data collection or marketing rather than a genuine safety need.
- "These guardrails are solely a reason for using your data for training purposes. Every flagged message can be used for training." (Iamtiberius, comment 4)
- "I can’t help but think that gimping itself for 'security' is a marketing ruse." (luxuryballs, comment 25)

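5. Dissatisfaction with the model's transparency

These commenters criticize the lack of transparency when the model downgrades a request, which they say may even deliberately mislead users.
- "It won’t just reject ML research, it will sabotage it silently by using a worse model without revealing it is doing so." (daedrdev, comment 3)
- "It’s just an insane level of deception and trust destruction." (daedrdev, comment 3)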

6. Optimism about market competition

These commenters expect overly restricted models to lose out in the market to more open products.
- "It’s a marketplace. Someone else will outdo this inferior product." (rdiddly, comment 20)

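Overall, the comments range from acceptance that safety guardrails are needed to strong criticism that the current measures are overly restrictive, opaque, and possibly commercially motivated. Several commenters expect more open alternatives to emerge.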

Original title: Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable