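Article Abstract

The article examines why the DSPy framework, despite being technically advanced, sees little adoption. It notes that AI system development usually moves through three stages (shipping quickly, tweaking prompts, and optimizing output quality) and suggests DSPy may not have caught on because of a steep learning curve or limited applicability.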

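Article Summary

Title: Why is DSPy praised but rarely adopted? The predicament of an AI engineering framework and what it teaches

Core argument:

Although DSPy systematically addresses core pain points in AI engineering (model switching, prompt optimization, evaluation, and so on), its monthly downloads (4.7 million) are far below LangChain's (222 million). The gap arises because DSPy asks developers to think in abstractions up front, whereas most teams prefer to refactor reactively once the pain sets in.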

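Key comparisons:

DSPy users: companies such as JetBlue and Databricks report clear benefits
→ Fast model switching, strong maintainability, and a focus on business logic rather than low-level implementation
The reality: developers end up implementing DSPy's core patterns themselves, usually at a higher cost
→ The article cites "Khattab's Law": any complex AI system eventually contains a bug-ridden, half-finished version of DSPy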

Typical evolution of an AI system (using a company-name extraction task as the example):

Early stage
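```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Call the OpenAI API directly
response = client.chat.completions.create(model="gpt-5.2", messages=[...])
```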
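Mid-stage patches
- Prompts moved into a database → version-management problems
- Pydantic structured output added → to handle format errors
- Retry logic added → to cope with API failures
Late-stage complexity
- RAG introduced → coordination problems across multiple prompts
- Evaluation system built → data-drift challenges
- Switch to a Claude model → end-to-end refactor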
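
The mid-stage patches (output validation plus retries) tend to accumulate into something like the sketch below. This is a minimal illustration and not code from the article: `fake_model` is a hypothetical stand-in for a real API call, and a stdlib dataclass is used in place of a Pydantic model so the example runs without third-party dependencies.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class CompanyResult:
    company_name: str


def parse_result(raw: str) -> CompanyResult:
    """Validate the model's raw text; raise ValueError on any format problem."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {raw!r}") from exc
    name = data.get("company_name") if isinstance(data, dict) else None
    if not isinstance(name, str) or not name.strip():
        raise ValueError(f"missing or empty company_name: {raw!r}")
    return CompanyResult(company_name=name.strip())


def extract_with_retry(
    call_model: Callable[[str], str], text: str, max_attempts: int = 3
) -> CompanyResult:
    """Call the model, retrying whenever the output fails validation."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_result(call_model(text))
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


# Hypothetical fake model: the first reply is malformed, the second is valid JSON.
_replies = iter(["Acme Corp", '{"company_name": "Acme Corp"}'])


def fake_model(text: str) -> str:
    return next(_replies)


result = extract_with_retry(fake_model, "Acme Corp announced record earnings.")
```

Even in this small form, validation, retry policy, and the model call are already three separate concerns, which is the kind of plumbing the article argues teams end up rebuilding by hand.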

DSPy's paradigm shift (the same functionality, reimplemented):

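```python
import dspy

# Declarative signature (replaces a hand-written prompt)
class CompanyExtraction(dspy.Signature):
    text: str = dspy.InputField()
    company_name: str = dspy.OutputField()

# Modular pipeline (retrieval and chain-of-thought are built in)
class CompanyExtractor(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.extract = dspy.ChainOfThought(CompanyExtraction)

# One-line model switch and automatic optimization
dspy.configure(lm=dspy.LM("anthropic/claude-sonnet-4"))
# MIPROv2 needs a metric and a training set; `metric` and `trainset` are
# placeholders that the original snippet does not define.
optimized = dspy.MIPROv2(metric=metric).compile(CompanyExtractor(), trainset=trainset)
```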

The underlying tension:

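| Traditional development mindset | DSPy design philosophy |
|---------|----------|
| Instant gratification (get the model running first) | Up-front design (type system, modularity) |
| Prompt as code (logic mixed in) | Declarative signatures (separation of concerns) |
| Evaluation deferred (patch problems as they appear) | Evaluation-driven (metrics built early) |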
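
As a rough illustration of the "evaluation-driven" column, the sketch below defines a metric and a scoring loop before any real model is involved. Everything in it is an assumption made for illustration: the labeled examples are invented, and `first_two_words` is a deliberately naive baseline rather than anything from the article or from DSPy.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical labeled examples: (input text, expected company name).
EXAMPLES = [
    ("Acme Corp announced record earnings.", "Acme Corp"),
    ("Shares of Globex rose 4% on Tuesday.", "Globex"),
    ("Initech is hiring engineers.", "Initech"),
]


def exact_match(predicted: str, expected: str) -> bool:
    """Metric: case-insensitive exact match after trimming whitespace."""
    return predicted.strip().lower() == expected.strip().lower()


def evaluate(
    extractor: Callable[[str], str], examples: Sequence[Tuple[str, str]]
) -> float:
    """Return the fraction of examples the extractor gets right."""
    hits = sum(exact_match(extractor(text), gold) for text, gold in examples)
    return hits / len(examples)


def first_two_words(text: str) -> str:
    """Naive baseline: assume the company name is the first two words."""
    return " ".join(text.split()[:2])


score = evaluate(first_two_words, EXAMPLES)
```

With a harness like this in place, any later change (a new prompt, a different model) can be judged by whether `score` moves, instead of by spot-checking outputs.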

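Practical recommendations:

Aggressive option: adopt DSPy wholesale and push through the learning curve
Incremental option: borrow its core patterns:
- Strictly typed inputs and outputs
- Prompts decoupled from code
- Composable test units
- An abstraction layer over model calls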
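
The incremental patterns in this list can be combined without adopting any framework. The sketch below is one possible shape, under stated assumptions: `ChatBackend`, `CannedBackend`, `ExtractionInput`, `ExtractionOutput`, and `EXTRACT_PROMPT` are hypothetical names invented for this example, and a real backend would wrap a vendor SDK behind the same interface.

```python
from dataclasses import dataclass
from typing import List, Protocol

# The prompt text lives apart from the calling code, so it can be versioned or swapped.
EXTRACT_PROMPT = "Extract the company name from the text below.\n\nText: {text}"


class ChatBackend(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


@dataclass(frozen=True)
class ExtractionInput:
    text: str


@dataclass(frozen=True)
class ExtractionOutput:
    company_name: str


class CannedBackend:
    """Hypothetical test double: returns a fixed reply and records each prompt."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: List[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


def extract_company(backend: ChatBackend, request: ExtractionInput) -> ExtractionOutput:
    """Business logic depends only on the ChatBackend interface, not on a vendor SDK."""
    prompt = EXTRACT_PROMPT.format(text=request.text)
    return ExtractionOutput(company_name=backend.complete(prompt).strip())


backend = CannedBackend(reply=" Acme Corp ")
output = extract_company(backend, ExtractionInput("Acme Corp announced record earnings."))
```

Because `extract_company` only sees the interface, switching providers means writing one new backend class, and the canned backend doubles as a composable test unit.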

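Final takeaway:

"DSPy's predicament is not that it is wrong, but that it is too far ahead. Until the pain starts, people assume they do not need a painkiller."
History suggests, though, that every successful AI system eventually converges on a similar architecture; the only difference is whether the team plans for it or pays down the technical debt afterward.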

Comment Summary

The comments are summarized below.

Positive views

DSPy is well regarded, but adoption is low
- "I consistently hear great things from Dspy users. At the same time, it feels like adoption is always low." (sbpayne)
- "The real killer feature is the prompt compilation... But good evals are hard and the really fancy algorithms will burn a lot of tokens to optimize your prompts." (ijk)
Advantages of automatic prompt optimization
- "The absolute biggest time sink and 'here be dragons' of using LLMs is poke and hope prompt 'engineering' without proper evaluation metrics." (deepsquirrelnet)
- "They know that manual prompt engineering is brittle, and want a prompt that's optimized and robust against a model they're invoking, which DSPy offers." (sethkim)

Negative views

Complex to use and not flexible enough
- "The fact that you have to bundle input+output signatures and everything is dynamically typed... just make it annoying to use in codebases that have type annotations everywhere." (ndr)
- "I ended up removing it from our production codebase because... it didn't quite work as effectively as just using Pydantic and so forth." (ijk)
Little practical value, or overmarketed
- "A lot of these ideas Dspy and RLM (from the same people IIRC) are more marketing than solving a real problem." (tinyhouse)
- "I used dspy in production, then reverted the bloat as it literally gave me nothing of added value in practice but a lot of friction." (CraftingLinks)
Evaluation datasets are hard to build
- "You have to really think carefully on how to build up a training and evaluation dataset... This takes a ton of upfront work and careful thinking." (memothon)
- "Outside of programming, most things where LLMs deliver actual value are very nondeterministic with no right answer." (deaux)

Other views

Alternatives exist
- "And LiteLLM or ai (Vercel), the actually most used packages, aren't?" (deaux)
- "https://www.tensorzero.com/docs has similar abstractions but doesn't require Python and doesn't require committing to the framework." (panelcu)
Weak documentation and product experience
- "A problem to DSPy is that they don't know the concept of THE WHOLE PRODUCT... Look at https://mastra.ai/ to see how more inviting their pages look." (giorgioz)
- "The comments on this post immediately make clear that the biggest differentiator of DSPy is the prompt optimization. Yet this article doesn't mention that at all?" (deaux)

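Summary

DSPy is recognized for its automatic prompt optimization, but commenters question its complexity, limited flexibility, and practical value. They attribute its low adoption to a high barrier to entry, the difficulty of building evaluation datasets, and the availability of lighter-weight alternatives. Weak documentation and product experience also hold back its uptake.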

Hacker News Chinese Digest

If DSPy is so great, why isn't anyone using it?