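Article Abstract
Anthropic has released Claude Opus 4.8, which improves on Opus 4.7 in benchmark results and collaboration. The new version adds an adjustable effort level and dynamic workflows for large-scale problems, and its fast mode runs up to 2.5x faster at one third of its previous cost. Early testers report that it is more reliable and precise on agentic tasks.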
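Article Summary
Title: Claude Opus 4.8 officially released
Core content:
- Version upgrade
- Builds on Opus 4.7 and scores higher across benchmarks
- Standard pricing is unchanged ($5 per million input tokens, $25 per million output tokens)
- Fast mode pricing drops to one third of its previous level ($10 per million input tokens, $50 per million output tokens)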
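- Main improvements
- Better judgment: proactively flags problems and makes fewer unsupported assertions
- Greater honesty: lets about 75% fewer code defects go unreported than its predecessor
- Stronger collaboration: keeps context across longer sessions
- Speed: fast mode runs up to 2.5x faster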
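- New features
- Dynamic workflows: run hundreds of sub-agents in parallel on large tasks
- Effort control: users can adjust how much effort the model puts into a response
- API update: system instructions can be updated dynamically mid-task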
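- Tester feedback
- Stronger reasoning in law, finance, programming, and other fields
- Markedly faster handling of multi-step questions in Databricks' Genie agent
- Token cost for processing unstructured content down 61%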
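- Future plans
- A lower-cost model with comparable performance is in development
- A Mythos-class model, more capable than Opus, is coming soon
- A Milan office will open soon to support European business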
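(Note: the many enterprise tester quotes and image references in the original have been condensed, keeping the most representative descriptions of the improvements. Office staffing appointments, religion-related remarks, and other content unrelated to the core announcement are not included in this summary.)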
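The per-token prices quoted in the version-upgrade item can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the `Pricing` class, the `estimate_cost` helper, and the sample token counts are hypothetical, and only the four dollar figures come from the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per million tokens (hypothetical helper type)."""
    input_per_mtok: float
    output_per_mtok: float


# Dollar figures as quoted in the summary; everything else is illustrative.
STANDARD = Pricing(input_per_mtok=5.0, output_per_mtok=25.0)
FAST = Pricing(input_per_mtok=10.0, output_per_mtok=50.0)


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    total = (input_tokens * pricing.input_per_mtok
             + output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens, 20k output tokens.
    for name, pricing in (("standard", STANDARD), ("fast", FAST)):
        cost = estimate_cost(pricing, 200_000, 20_000)
        print(f"{name}: ${cost:.2f}")
```

At the quoted prices, fast mode costs twice as much as standard mode for the same token counts, so the choice comes down to whether the speedup is worth that premium for a given workload.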
Comment Summary
The following summarizes the comments, balancing the different viewpoints and preserving key quotes:
- Doubts about the size of the upgrade (mostly negative)
- Sees the upgrade as small: "seems like a really minor upgrade?" (mincer_ray)
- Compares it to iPhone updates: "I can't help but think of Iphone updates since about 2018...mostly the same" (pbmango)
- Key quote: "Disappointed to say the least." (McDownloads)
- Discussion of the "honesty" improvement (opinions divided)
- In favor: "The honesty improvement is the part I actually care about" (ashtondev101)
- Skeptical: "Crazy they bring up honest, when Claude models are literally known for straight up lying" (impulser_)
- Key quote: "When companies say their model is more 'aligned', I automatically think they mean it's more censored." (behnamoh)
- Expectations for real-world performance (neutral)
- Cautiously optimistic: "Numbers looking good. We'll see how it actually performs." (plumocracy)
- Concern about writing style: "I hope the writing quality has returned to the Opus 4.5 level" (lostdog)
- Key quote: "Would be awesome if true" (james_marks)
- Feedback on feature changes (mixed)
- Positive: "Really appreciate the ability to select effort level again" (rumblefrog)
- Negative: "this feels like a step backwards" (DGAP)
- Key quote: "you can now turn off adaptive thinking...which is great" (colonCapitalDee)
- Criticism of market strategy
- Neglect of the low end: "the market for less-capable but cheaper models seems to be completely ignored" (skysthelimitt)
- Doubts about the release cadence: "Seems like from now on the updates will be a minor upgrade" (worldsavior)
- Key quote: "Anthropic has now upgraded their Claude slot machine to version 4.8" (rvz)
- Anticipation for the Mythos model
- "Probably more interesting than the 4.8 release" (northern-lights)
- "Excited to see what this model looks like" (rsanek)
- Shifts in user sentiment
- "the bloviating and language...are starting to wear thin on my patience" (SimianSci)
- Key quote: "This is a refreshing attitude!" (colonCapitalDee)