Hacker News Chinese Digest

Article Summary

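xAI announced Grok 4, which it bills as the world's most powerful AI model, and invited users to watch a live demo stream. The post was published on July 10, 2025, and has drawn 9.1 million views and 3,100 replies.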

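The main content of the article is as follows:

Summary: xAI posted the worldwide launch announcement for Grok 4 on X with a link to the livestream; the rest of the page consists of sign-up and engagement prompts, the company avatar, and legal notices.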

The comments are summarized below, covering the main viewpoints and the evidence for each:

Viewpoint: Some commenters think Grok performs strongly on several benchmarks, particularly "Humanity's Last Exam", where it outscores other models such as Gemini and Claude.
Evidence:
- "Honestly if it actually does score 44.4% on Humanity's Last Exam, that would be super impressive as Gemini 2.5 Pro and o3 with tools only score 26.9% and 24.9%." (Comment 3)
- "Seems like it is indeed the new SOTA model, with significantly better scores than o3, Gemini, and Claude in Humanity's Last Exam, GPQA, AIME25, HMMT25, USAMO 2025, LiveCodeBench, and ARC-AGI 1 and 2." (Comment 6)

Viewpoint: Grok Heavy's technique of running multiple agents in parallel and comparing their results is seen as an interesting and logical step forward.
Evidence:
- "The trick they announce for Grok Heavy is running multiple agents in parallel and then having them compare results at the end, with impressive benchmarks across the board. This is a neat idea!" (Comment 4)
- "Interested to see how it all works out. Elon has been using a lot of smoke and mirrors lately, but this seems like an area where they can genuinely make progress." (Comment 9)
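
To make the idea in Comment 4 concrete, here is a minimal sketch of running several agents in parallel and reconciling their answers. It is not xAI's implementation: the thread does not say how Grok Heavy compares results, so the sketch assumes a plain majority vote. The agents (`agent_a`, `agent_b`, `agent_c`) are hypothetical stand-ins that return fixed strings instead of calling a model.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# An "agent" here is any function that maps a question to an answer string.
Agent = Callable[[str], str]


def run_agents_in_parallel(agents: Sequence[Agent], question: str) -> list[str]:
    """Run every agent on the same question concurrently and collect the answers."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map returns results in the same order as `agents`.
        return list(pool.map(lambda agent: agent(question), agents))


def pick_by_majority(answers: Sequence[str]) -> str:
    """Assumed comparison step: return the most common answer.

    On a tie, the answer that appeared first wins.
    """
    return Counter(answers).most_common(1)[0][0]


# Hypothetical stand-in agents; a real system would query a model here.
def agent_a(question: str) -> str:
    return "42"


def agent_b(question: str) -> str:
    return "42"


def agent_c(question: str) -> str:
    return "41"


answers = run_agents_in_parallel([agent_a, agent_b, agent_c], "What is 6 * 7?")
final_answer = pick_by_majority(answers)
print(answers, final_answer)
```

Majority voting is only one possible comparison step; a system could also have the agents critique each other's answers or hand all of them to a separate judge, and nothing in the thread indicates which approach Grok Heavy uses.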

Viewpoint: Some commenters doubt Grok's trustworthiness, arguing that its CEO's conduct and the absence of a published model card undermine its credibility.
Evidence:
- "benchmarks are very impressive but their CEO just eroded any trust in those benchmarks although some such as ARC are corroborated externally." (Comment 8)
- "They also have not released a model card, and I suspect they never will." (Comment 8)

Viewpoint: Some commenters criticize how closely Grok is tied to Elon Musk's image, arguing that its branding and ethical problems make broad acceptance difficult.
Evidence:
- "Soooo... are we just going to collectively pretend that this is a Totally Normal Productivity Tool and not something that was calling itself MechaHitler and advocating genocide literally days ago?" (Comment 12)
- "No comment on the AI, I wouldn’t have bought a VW during WWII, but I used to love using the word grok and now it’s attached to this disgusting product and man." (Comment 10)

Viewpoint: Some commenters are skeptical about Grok's practical use and adoption, and do not yet see it as a mainstream choice.
Evidence:
- "Serious question who in their right mind would choose to integrate Grok into anything at this point?" (Comment 1)
- "Out of interest, has anyone ever integrated with Grok? I've done so many LLM integrations in the last few years, but never heard of anyone choosing Grok." (Comment 13)

Viewpoint: Some users praise Grok's voice mode but also suggest improvements, such as an option to turn off automatic turn detection.
Evidence:
- "Grok's updated voice mode is indeed impressive. I wish there was a way to disable automatic turn detection, so that it wouldn't treat silence as an end of the response." (Comment 17)
- "I was pleasantly surprised that Grok even supports (to some degree) Lithuanian in voice mode, which is a quite niche language." (Comment 17)

Viewpoint: Some commenters are unhappy with Grok's pricing and access, finding it too expensive and hard to obtain.
Evidence:
- "How do I use grok 4 heavy? SuperGrok is $3000 a year!! I can't find an option in openrouter either." (Comment 15)
- "Did they mention availability of the model for users?" (Comment 5)

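Summary: The comments give Grok some credit for its technical capability and innovation, but they also question its trustworthiness, ethics, practical use, and availability. Some users endorse its voice mode while suggesting improvements. Grok's high price and limited access also drew complaints.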