Hacker News Chinese Digest

Anthropic's original take-home assignment open sourced

Article Abstract

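Anthropic has open-sourced its original performance take-home project, original_performance_takehome, which is now public on GitHub for developers to try. The project may have been used to evaluate system performance or as a technical interview exercise.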

Article Summary

Anthropic open-sources its performance take-home: a challenge to beat Claude Opus 4.5

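Main points:

  1. Project background
  • Anthropic has published its original performance take-home project, "original_performance_takehome"
  • The project was originally used to evaluate the performance of the Claude AI models
  • It is now open to the public as a challenge to beat Claude Opus 4.5's result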

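  2. Performance benchmarks
  • Lists the results of several Claude versions under different conditions (measured in clock cycles; lower is better):
    • Claude Opus 4: 2164 cycles (after many test runs)
    • Claude Opus 4.5: 1790 cycles (2-hour test)
    • Best result: 1363 cycles (under improved test conditions)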
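  3. Challenge invitation
  • Developers are encouraged to try optimizing the performance themselves
  • Anyone whose optimized result comes in below 1487 cycles (beating Claude Opus 4.5's best result at launch) can send their code and résumé to performance-recruiting@anthropic.com
  • Strong submissions may lead to an interview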
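  4. Project information
  • The repository consists of Python (88.7%) and HTML (11.3%) code
  • It provides a test script, tests/submission_tests.py, for validating optimized results
  • It had 284 stars and 53 forks when this summary was written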
  5. Contact
  • Performance-related questions can be sent to performance-recruiting@anthropic.com
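
The cycle counts quoted above are easy to compare by hand, but a short script makes the arithmetic explicit. The following is a minimal sketch, not part of the repository: the names `REPORTED_CYCLES`, `THRESHOLD_CYCLES`, `qualifies`, and `speedup` are hypothetical helpers introduced here for illustration, and the numbers are only the ones listed in this summary. It assumes a result must be strictly below the threshold to count.

```python
# Cycle counts quoted in this summary (lower is better).
# These labels are illustrative; they are not identifiers from the repository.
REPORTED_CYCLES = {
    "Claude Opus 4": 2164,
    "Claude Opus 4.5 (2-hour test)": 1790,
    "Best reported": 1363,
}

# Threshold quoted in the challenge invitation.
THRESHOLD_CYCLES = 1487


def qualifies(cycles: int, threshold: int = THRESHOLD_CYCLES) -> bool:
    """Return True if a result is strictly below the threshold (an assumption)."""
    return cycles < threshold


def speedup(baseline_cycles: int, new_cycles: int) -> float:
    """Ratio of baseline to new cycle count; values above 1.0 mean faster."""
    return baseline_cycles / new_cycles


if __name__ == "__main__":
    baseline = REPORTED_CYCLES["Claude Opus 4"]
    for label, cycles in REPORTED_CYCLES.items():
        print(
            f"{label}: {cycles} cycles, "
            f"{speedup(baseline, cycles):.2f}x vs Claude Opus 4, "
            f"below threshold: {qualifies(cycles)}"
        )
```

To check your own result, pass the cycle count reported by your test run to `qualifies`. The authoritative check remains the repository's own tests/submission_tests.py.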

(Note: GitHub page navigation menus, footer information, and other material unrelated to the core content have been omitted.)

Comment Summary

  1. Confusion about the task description
  • Main point: the task description is unclear and lacks concrete requirements and grading criteria
  • Key quotes: "What is the actual assignment here? The README only gives numbers without any information on what you're supposed to do or how you are rated." (koolba) "This is a knowledge test of GPU architecture?" (greesil)
  2. Criticism of the hiring approach
  • Main point: a time-limited optimization test is too narrow to assess a candidate's abilities as a whole
  • Key quotes: "Seems like they're trying to hire nerds who know a lot about hardware or compiler optimizations...hiring for creativity is a lot harder." (jackblemming) "It shocks me that anyone supposedly good enough for anthropic would subject themselves to such a one sided waste of time." (zeroCalories)
  3. Positive views of the technical challenge
  • Main point: the challenge is an interesting learning opportunity, especially for optimization techniques
  • Key quotes: "Having recently learned more about SIMD, PTX and optimization techniques, this is a nice little challenge to learn even more." (sureglymop) "It's pretty interesting how close this assignment looks to demoscene golf." (avaer)
  4. Doubts about the company's attitude
  • Main point: dissatisfaction with the tone and manner of Anthropic's recruiting
  • Key quotes: "The snarky writing...is really something, innit?" (tucnak) "I suspect this was released by Anthropic as a DDOS attack on other AI companies." (pvalue005)
  5. Discussion of AI performance
  • Main point: interest in how AI performs in programming competitions and what that implies
  • Key quotes: "The oAI 2nd place at the atcoder world championship competition was the first one...Sakana also got 1st place in another atcoder competition a few weeks ago." (NitpickLawyer) "Was the screening format here that this problem was sent out, and candidates had to reply with a solution within 2 hours?" (Maro)
  6. Speculation about other companies
  • Main point: speculation about whether OpenAI will do something similar
  • Key quotes: "I wonder if OpenAI follows suit." (dhruv3006)