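Hacker News Summary

Article Abstract

Deep Think in Gemini 2.5 is now available. It uses extended parallel thinking and novel reinforcement learning techniques to substantially improve problem solving, and users can try it in the Gemini app.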

Article Summary

Title: Try Deep Think in the Gemini app

Main points:

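Google announced that it is rolling out Deep Think in the Gemini app, available now to Google AI Ultra subscribers. Deep Think is part of Gemini 2.5 and uses extended parallel thinking and novel reinforcement learning techniques to substantially improve problem solving. A version of the model reached gold-medal standard at this year's International Mathematical Olympiad (IMO). That competition model takes hours to work through complex math problems; the version released here is faster for everyday use and still reaches bronze-level performance on the 2025 IMO benchmark.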

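With parallel thinking, Gemini can generate several ideas at once and weigh them together, and can even revise or combine them before settling on the best answer. In addition, extending inference time, or "thinking time", lets Gemini explore different hypotheses and arrive at creative solutions to complex problems.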
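The general idea of generating several candidate answers in parallel and then settling on one can be illustrated with a toy sketch. This is not how Deep Think is implemented; the summary gives no such detail. Everything here is an assumption made for illustration: the function `parallel_think`, the three stand-in "solvers" (plain Python functions rather than model calls), and the use of a simple majority vote in place of whatever revision or combination step the real system performs.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def solve_by_formula(n: int) -> int:
    """Hypothetical solver 1: closed-form sum of 1..n."""
    return n * (n + 1) // 2


def solve_by_loop(n: int) -> int:
    """Hypothetical solver 2: explicit summation of 1..n."""
    return sum(range(1, n + 1))


def solve_off_by_one(n: int) -> int:
    """Hypothetical solver 3: deliberately buggy, sums 1..n-1."""
    return sum(range(1, n))


def parallel_think(
    problem: int,
    solvers: List[Callable[[int], int]],
    max_workers: int = 4,
) -> Tuple[int, List[int]]:
    """Run every solver on the problem concurrently, then pick the
    most common answer (majority vote, an assumed aggregation rule)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the solvers list.
        candidates = list(pool.map(lambda solve: solve(problem), solvers))
    winner, _count = Counter(candidates).most_common(1)[0]
    return winner, candidates


if __name__ == "__main__":
    answer, candidates = parallel_think(
        10, [solve_by_formula, solve_by_loop, solve_off_by_one]
    )
    print("candidates:", candidates)
    print("chosen answer:", answer)
```

Majority voting is only the simplest way to reconcile candidates; it discards the minority answer instead of revising or merging it, which is the part the article describes as distinctive.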

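Deep Think does well on tasks that call for creativity, strategic planning, and step-by-step improvement, such as iterative development and design, scientific and mathematical discovery, and algorithm development and coding. Across several benchmarks, Gemini 2.5 Deep Think performs strongly in coding, science, knowledge, and reasoning.

Google says it continues to focus on safety and responsibility throughout Gemini's training and deployment. Deep Think shows improved content safety and objectivity, but in testing it also showed a higher tendency to refuse benign requests.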

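Google AI Ultra subscribers can now use Deep Think in the Gemini app by selecting the 2.5 Pro model and toggling "Deep Think" in the prompt bar. Deep Think works automatically with tools such as code execution and Google Search, and can produce longer responses.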

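Google also plans to release Deep Think, with and without tools, to a set of trusted testers through the Gemini API in the coming weeks, to better understand how usable it is for developer and enterprise use cases.

Conclusion: The launch of Deep Think marks a significant step in Google's effort to build more capable and more helpful AI, and pushes Gemini further toward use at the frontier of human knowledge.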

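Comment Summary

Pricing and subscription:
- Deep Think is currently limited to ULTRA subscribers at $250 per month, with no way to try it first, which drew complaints from some users.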
  - "At the moment, Deep Think is only available with the ULTRA subscription ($250 per month)."
  - "139.99€/month for something you can't even test first, lol"
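Performance and comparisons:
- Commenters consider Deep Think roughly comparable to Grok 4 Heavy and o3 Pro, but question why the benchmarks compare it against Grok 4 rather than Grok 4 Heavy.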
  - "Grok 4 heavy, o3 pro and Gemini Deep Think all are equivalent. I wonder how they compare?"
  - "Great results, though it would be more fair for the benchmark comparisons to be against Grok 4 Heavy rather than Grok 4 (the fast, single-agent model)."
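Usage limits and transparency:
- Users are unhappy with Deep Think's usage limits, which they see as out of line with the high price, and with the company not stating what the limits are.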
  - "I started doing some experimentation with this new Deep Think agent, and after five prompts I reached my daily usage limit. For $250 USD/mo that’s what you’ll be getting folks."
  - "Why are these companies not upfront about what the limits are?"
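Technical approach and outlook:
- Deep Think's approach is seen as similar to Grok 4 Heavy's, with several "reasoning" agents working in parallel. Users are taking a wait-and-see attitude toward where it goes next.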
  - "Approach is analogous to Grok 4 Heavy: use multiple 'reasoning' agents in parallel and then compare answers before coming back with a single response, taking ~30 minutes."
  - "I'm wondering if 'slow AI' like this is a temporary bridge, or a whole new category we need to get used to."
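User feedback and experience:
- Some users were disappointed with Deep Think in practice, finding its performance out of proportion to the price and citing the usage limits and lack of transparency.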
  - "Performance-wise. So far, I couldn’t even tell. I provided it with a challenging organizational problem that my business was facing, with the relevant context, and it proposed a lucid and well-thought-out solution that was consistent with our internal discussions on the matter."
  - "Upgraded and quickly hit my limit. And find that they have limits, I just wish that they were more transparent."
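Technical alternatives:
- Some users suggested reproducing something like Deep Think with open-source tools and plugins, arguing that this is more flexible and more cost-effective.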
  - "You can spin up a version of this at home using simonw's LLM cli with the llm-consortium plugin."
  - "Bonus 1: Use any combination of models. Mix n match models from any lab."
