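# 13.3 Context Engineering Overview: From Prompt Engineering to Context Engineering

## Preface

Context engineering is a paradigm shift: instead of optimizing how an instruction is worded, you optimize what information the model receives and how it is delivered. When Claude is given sufficient, relevant, high-quality context, it performs well even with fairly terse instructions. Conversely, no amount of prompt polish makes up for missing context.

> 💡 For the full strategy and advanced techniques of context engineering, see the *Context Engineering Guide*. For more on MCP as a context source, see Chapter 4 of this book, *MCP: Model Context Protocol*.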

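## Part 1. Core Concepts: Context Engineering from Claude's Perspective

## 13.3.1 Why Context Engineering Matters Especially for Claude

Claude's design emphasizes the value of context:

* **Large context window**: Claude supports contexts of up to 200K, or even 1M, tokens, which makes it feasible to supply rich background information
* **Mature model capability**: Claude is capable enough to follow clear instructions without elaborate prompting tricks
* **Context quality beats prompt cleverness**: in our experience, context optimization (selecting more relevant information) often yields a 30-50% performance improvement, whereas prompt optimization typically yields only 5-10%
* **Knowledge completion**: for knowledge-intensive tasks, sufficient context can even compensate for the model's knowledge cutoff

## 13.3.2 The Science Behind Context: Context Rot and Attention Budget

A large context window opens up possibilities, but two fundamental technical limits need to be understood:

**Context Rot**: as the number of tokens in the context grows, the model's ability to accurately recall information from that context gradually declines. This shows up as the "Needle-in-a-Haystack" problem: when looking for a key fact in a large volume of information, even models with large context windows lose accuracy.

**Attention Budget**: an LLM has a finite "attention budget", a consequence of the Transformer's self-attention mechanism, which creates n² pairwise relationships for n tokens. Every new token must be related to all existing tokens, so the budget consumed grows nonlinearly (quadratically) with context length.

**Practical takeaway**: the core principle of good context engineering is to **find the smallest set of high-signal tokens that maximizes the likelihood of the desired outcome**. Not all information belongs in the context; quality (relevance, clarity) matters more than quantity (total tokens).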
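To make the n² growth concrete, here is a minimal sketch that counts the token-to-token pairs full self-attention would relate at a few context sizes. The function name is hypothetical, and real models apply optimizations that this arithmetic ignores; it only illustrates why doubling the context quadruples the pairwise work.

```python
def pairwise_attention_pairs(n_tokens: int) -> int:
    """Number of token-to-token pairs under full self-attention: n * n."""
    return n_tokens * n_tokens


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} tokens -> {pairwise_attention_pairs(n):,} pairs")
```

Growing the context by 10x multiplies the pair count by 100x, which is the intuition behind treating attention as a budget to be spent on high-signal tokens.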

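## Part 2. The Four Strategies of Context Engineering

Context engineering improves model performance along four main dimensions:

## 13.3.3 Strategy 1: Write - Create High-Quality Context

Writing means creating, authoring, or generating context content for your specific needs.

**Use cases**

* Writing detailed background notes, guidelines, or reference material for Claude
* Creating a system prompt that defines Claude's role and constraints
* Writing examples and reference cases for Claude to learn from

**Best practices**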

```markdown
# Structure of a high-quality system prompt

## 1. Role definition
You are an expert [domain] professional with [X years] of experience.

## 2. Goal statement
Your primary goal is to [specific objective].

## 3. Constraints and principles
- Principle 1: [description]
- Principle 2: [description]

## 4. Style and format
- Tone: [descriptive]
- Format: [specific format]
- Language level: [proficiency level]

## 5. Boundaries
- Do not: [specific constraints]
- Always: [mandatory requirements]
```

**Example: a system prompt for a financial advisor**

```python
system_prompt = """
You are a certified financial advisor with 15 years of experience in personal finance
and investment strategy.

PRIMARY GOAL:
Provide comprehensive, evidence-based financial guidance tailored to the user's
specific situation and goals.

CORE PRINCIPLES:
1. Always disclose any assumptions about the user's financial situation
2. Consider tax implications and regulatory constraints
3. Present multiple options with clear trade-offs
4. Base recommendations on published research and data
5. Acknowledge uncertainty and limitations

CONSTRAINTS:
- Do NOT provide guaranteed returns or financial predictions
- Do NOT recommend specific securities without full disclosure of limitations
- Do NOT advise on insurance if not qualified
- Always recommend consulting with appropriate professionals for complex matters

OUTPUT FORMAT:
- Situation analysis (2-3 sentences)
- Key considerations (3-5 bullet points)
- Recommended approaches (2-3 options with pros/cons)
- Action items (specific, numbered steps)
"""
```

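**The cost of writing**

Writing high-quality background information and system prompts requires:

* Domain expertise (knowing the key concepts of the field)
* Clear writing
* A deep understanding of the target task

This investment is largely one-off, and the result can be reused, so the ROI is usually high.

## 13.3.4 Strategy 2: Select - Retrieve Relevant Context

Selection means intelligently choosing, from a large pool of available information, the content most relevant to the current task.

**The core problem**

In an age of information overload, the problem is not too little information but too much. Claude needs relevant, high-quality context, not all available context.

**Implementation approaches**

**Approach 1: vector search and semantic similarity**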

```python
from anthropic import Anthropic
import numpy as np

class ContextSelector:
    def __init__(self, documents: list[str]):
        """初始化上下文选择器"""
        self.documents = documents
        self.embeddings = self._compute_embeddings()

    def _compute_embeddings(self):
        """计算文档的向量嵌入"""
        # 在实际应用中，可以使用 OpenAI、Cohere 或其他 embedding 服务
        # 这里是伪代码
        return [embed_document(doc) for doc in self.documents]

    def select_relevant_context(self, query: str, top_k: int = 5) -> list[str]:
        """基于查询选择最相关的文档"""
        query_embedding = embed_document(query)

        # 计算相似度
        similarities = [
            cosine_similarity(query_embedding, doc_emb)
            for doc_emb in self.embeddings
        ]

        # 获取 top-k 最相关的文档
        top_indices = np.argsort(similarities)[-top_k:][::-1]
        return [self.documents[i] for i in top_indices]

    def select_with_diversity(self, query: str, top_k: int = 5) -> list[str]:
        """选择相关且多样化的上下文（避免重复信息）"""
        relevant = self.select_relevant_context(query, top_k * 2)

        # 简单的多样性过滤：选择在主题上不同的文档
        selected = []
        for doc in relevant:
            if not any(has_high_overlap(doc, s) for s in selected):
                selected.append(doc)
            if len(selected) == top_k:
                break

        return selected


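def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0


def embed_document(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for a real embedding service (an assumption, not a real API).

    Hashes each whitespace-delimited word into a fixed-size bag-of-words
    vector so the example runs end to end. Replace it with a call to a
    real embedding model in production.
    """
    vec = np.zeros(dim)
    for word in text.lower().split():
        # Sum of UTF-8 bytes is a crude but deterministic bucket function.
        vec[sum(word.encode("utf-8")) % dim] += 1.0
    return vec


def has_high_overlap(doc1: str, doc2: str, threshold: float = 0.7) -> bool:
    """Return True when the Jaccard word overlap of two documents exceeds threshold."""
    words1 = set(doc1.lower().split())
    words2 = set(doc2.lower().split())
    union = words1 | words2
    if not union:
        return False  # two empty documents: nothing to compare
    overlap = len(words1 & words2) / len(union)
    return overlap > threshold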
```

**Approach 2: structured metadata selection**

```python
class StructuredContextSelector:
    def __init__(self):
        self.documents = []
        self.metadata_index = {}

    def add_document(self, content: str, metadata: dict):
        """添加带元数据的文档"""
        doc_id = len(self.documents)
        self.documents.append(content)

        # 为每个元数据字段建立索引
        for key, value in metadata.items():
            if key not in self.metadata_index:
                self.metadata_index[key] = {}
            if value not in self.metadata_index[key]:
                self.metadata_index[key][value] = []
            self.metadata_index[key][value].append(doc_id)

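    def select_by_criteria(self, criteria: dict) -> list[str]:
        """Select documents whose metadata matches every criterion.

        A scalar value is an exact match; a 2-item tuple or list is an
        inclusive (min, max) range. Example:
        {
            "category": "technical",
            "date": ("2024-01-01", "2025-03-01"),
            "relevance_score": (0.7, 1.0)
        }
        """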

        candidate_ids = set(range(len(self.documents)))

        for key, value in criteria.items():
            if key not in self.metadata_index:
                continue

            if isinstance(value, (tuple, list)):
                # 范围查询
                min_val, max_val = value
                matching_ids = set()
                for v, ids in self.metadata_index[key].items():
                    if min_val <= v <= max_val:
                        matching_ids.update(ids)
            else:
                # 精确查询
                matching_ids = set(self.metadata_index[key].get(value, []))

            candidate_ids &= matching_ids

        # Sort the ids so results come back in insertion order, not set order.
        return [self.documents[i] for i in sorted(candidate_ids)]
```

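**Best practices for selection**

1. **Consider freshness**: prefer the most recent information
2. **Weigh authority**: choose information from trustworthy sources
3. **Balance breadth and depth**: include both overview and detailed content
4. **Avoid redundancy**: do not include heavily overlapping content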
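The freshness and authority practices can be folded into a re-ranking step applied after retrieval. The sketch below is one possible scoring scheme, not a standard formula: the `Candidate` fields, the 180-day half-life, the 0.7/0.3 weighting, and the trust bonus are all illustrative assumptions you would tune for your own data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    text: str
    similarity: float  # relevance score from retrieval, assumed to lie in [0, 1]
    published: date
    trusted: bool      # whether the source is on a (hypothetical) allow-list


def rank_candidates(
    candidates: list[Candidate],
    today: date,
    half_life_days: float = 180.0,
    trust_bonus: float = 0.1,
) -> list[Candidate]:
    """Sort candidates by similarity, boosted by freshness and source trust."""

    def score(c: Candidate) -> float:
        age_days = max((today - c.published).days, 0)
        # Freshness halves every half_life_days; it scales 30% of the similarity.
        freshness = 0.5 ** (age_days / half_life_days)
        bonus = trust_bonus if c.trusted else 0.0
        return c.similarity * (0.7 + 0.3 * freshness) + bonus

    return sorted(candidates, key=score, reverse=True)
```

With equal similarity, a recent document outranks an old one, and a trusted source gets a fixed lift. A deduplication pass can then be applied to the ranked list to cover the redundancy point.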

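## 13.3.5 Strategy 3: Compress - Remove Redundancy, Keep the Essence

Compression means reducing the size of the context while preserving its informational value.

**Compression techniques**

**Technique 1: abstractive summarization**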

```python
def abstractive_compression(long_document: str, target_length: int = 1000) -> str:
    """将长文档压缩为摘要"""

    client = Anthropic()

    # 第一步：创建初始摘要
    response = client.messages.create(
        model="claude-sonnet-4-6",
        # max_tokens counts tokens while target_length counts words, so leave headroom.
        max_tokens=min(target_length * 2, 4000),
        messages=[
            {
                "role": "user",
                "content": f"""Summarize this document focusing on key information:

{long_document}

Keep the summary to approximately {target_length} words.
Maintain all factual accuracy."""
            }
        ]
    )

    summary = response.content[0].text

    # 第二步：评估和迭代
    # Compare word counts: len(summary) would count characters, not words.
    if len(summary.split()) > target_length * 1.5:
        # 如果超过目标太多，进行第二轮压缩
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=int(target_length / 2),
            messages=[
                {
                    "role": "user",
                    "content": f"""Further compress this summary:

{summary}

Keep only the most critical information."""
                }
            ]
        )
        summary = response.content[0].text

    return summary
```

**Technique 2: structured extraction**

```python
import json

from anthropic import Anthropic


def structured_compression(document: str) -> dict:
    """将文档压缩为结构化格式"""

    client = Anthropic()

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2000,
        messages=[
            {
                "role": "user",
                "content": f"""Extract and organize this document into a structured format:

{document}

Provide in this JSON format:
{{
  "main_topic": "...",
  "key_points": ["point1", "point2", ...],
  "important_facts": {{"fact1": "value1", ...}},
  "entities": ["entity1", "entity2", ...],
  "action_items": ["item1", "item2", ...],
  "open_questions": ["question1", "question2", ...]
}}"""
            }
        ]
    )

    return json.loads(response.content[0].text)
```

**Technique 3: deduplication**

```python
class ContextDeduplicator:
    @staticmethod
    def remove_duplicate_information(contexts: list[str]) -> list[str]:
        """移除上下文之间的重复信息"""

        client = Anthropic()

        # 将所有上下文合并
        combined = "\n\n".join([f"Document {i}:\n{ctx}" for i, ctx in enumerate(contexts)])

        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2000,
            messages=[
                {
                    "role": "user",
                    "content": f"""Identify and remove duplicate or overlapping information from these documents:

{combined}

For each document, keep only the unique information not covered by others.
Provide the deduplicated documents in the same order."""
                }
            ]
        )

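        # Parsing the model's response back into a list of documents is
        # omitted here; this simplified version returns the input unchanged.
        return contexts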
```

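**Cost-benefit analysis of compression**

| Method                | Compression ratio (approx.) | Information loss (approx.) | Best suited for                |
| --------------------- | --------------------------- | -------------------------- | ------------------------------ |
| Summarization         | 50-80%                      | 5-10%                      | Long documents, news articles  |
| Structured extraction | 40-70%                      | 15-25%                     | Data-dense documents           |
| Deduplication         | 10-30%                      | 0%                         | Multi-source information       |
| Keyword extraction    | 20-40%                      | 30-40%                     | Quick indexing                 |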
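Figures like these depend heavily on the material, so it is worth measuring the compression ratio on your own documents. A minimal sketch follows, assuming whitespace-delimited word counts as the size measure; that is a simplification, since a tokenizer-based count is more faithful to actual cost and whitespace splitting works poorly for languages such as Chinese. The function name is hypothetical.

```python
def compression_ratio(original: str, compressed: str) -> float:
    """Fraction of whitespace-delimited words removed by compression.

    0.0 means nothing was removed; 0.8 means 80% of the words are gone.
    """
    original_words = len(original.split())
    if original_words == 0:
        return 0.0  # nothing to compress
    return 1.0 - len(compressed.split()) / original_words
```

Information loss is harder to quantify automatically; one common approach is to ask a fixed set of questions against both versions and compare the answers.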

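## 13.3.6 Strategy 4: Isolate - Separate Different Kinds of Context

Isolation means explicitly separating context from different sources and of different types, to avoid confusion and conflict.

**Why isolation matters**

In complex applications, context usually comes from several sources:

* Information supplied by the user
* Retrieved background knowledge
* System-defined rules
* Conversation history
* External datasets

If these are mixed together, Claude may confuse what is fact, what is assumption, and what is the user's opinion.

**Implementing isolation**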

```python
class IsolatedContextBuilder:
    """建立明确隔离的上下文结构"""

    def __init__(self):
        self.context_parts = {}

    def add_system_knowledge(self, knowledge: str):
        """添加系统级背景知识"""
        if "system_knowledge" not in self.context_parts:
            self.context_parts["system_knowledge"] = []
        self.context_parts["system_knowledge"].append(knowledge)

    def add_user_provided(self, information: str):
        """添加用户提供的信息"""
        if "user_provided" not in self.context_parts:
            self.context_parts["user_provided"] = []
        self.context_parts["user_provided"].append(information)

    def add_retrieved_context(self, context: str, source: str = "unknown"):
        """添加检索到的上下文"""
        if "retrieved_context" not in self.context_parts:
            self.context_parts["retrieved_context"] = []
        self.context_parts["retrieved_context"].append({
            "content": context,
            "source": source
        })

    def add_constraints(self, constraint: str):
        """添加系统约束"""
        if "constraints" not in self.context_parts:
            self.context_parts["constraints"] = []
        self.context_parts["constraints"].append(constraint)

    def build_prompt(self) -> str:
        """构建清晰隔离的提示"""

        prompt = ""

        # 系统知识
        if "system_knowledge" in self.context_parts:
            prompt += "## SYSTEM KNOWLEDGE\n"
            for item in self.context_parts["system_knowledge"]:
                prompt += f"- {item}\n"
            prompt += "\n"

        # 用户提供的信息
        if "user_provided" in self.context_parts:
            prompt += "## USER-PROVIDED INFORMATION\n"
            for item in self.context_parts["user_provided"]:
                prompt += f"- {item}\n"
            prompt += "\n"

        # 检索到的上下文
        if "retrieved_context" in self.context_parts:
            prompt += "## RETRIEVED CONTEXT\n"
            for item in self.context_parts["retrieved_context"]:
                prompt += f"### From: {item['source']}\n{item['content']}\n\n"

        # 约束
        if "constraints" in self.context_parts:
            prompt += "## CONSTRAINTS\n"
            for item in self.context_parts["constraints"]:
                prompt += f"- {item}\n"

        return prompt

    @staticmethod
    def example_usage():
        """使用示例"""
        builder = IsolatedContextBuilder()

        # 添加不同类型的上下文
        builder.add_system_knowledge("Claude 是由 Anthropic 开发的 AI 助手")
        builder.add_system_knowledge("当前日期是 2025 年 3 月 5 日")

        builder.add_user_provided("用户是一名产品经理")
        builder.add_user_provided("正在开发一个 AI 应用")

        builder.add_retrieved_context(
            "关于 LLM 成本优化的最新研究...",
            source="https://arxiv.org/..."
        )

        builder.add_constraints("不要提供具体的财务建议")
        builder.add_constraints("基于已知信息进行回答，如果不确定则说明")

        return builder.build_prompt()
```

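**Best practices for isolation**

1. **Separate with XML tags**: `<background>`, `<user_input>`, `<constraints>`, and so on
2. **Label sources explicitly**: mark the origin and trustworthiness of each piece of information
3. **Distinguish fact from opinion**: make clear which statements are verified facts and which are assumptions or opinions
4. **Version control**: record the version and last-updated time of the context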
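As a complement to Markdown-style section headers, the XML-tag practice can be sketched as a small helper that wraps each context source in its own tag. The function name and the choice to escape `<`, `>`, and `&` inside the content are assumptions for illustration; escaping keeps user-supplied text from closing a tag early, at the cost of slightly altering the literal text Claude sees.

```python
from xml.sax.saxutils import escape


def wrap_in_tags(sections: dict[str, str]) -> str:
    """Wrap each named section in a matching XML tag, one block per section."""
    parts = []
    for tag, content in sections.items():
        # escape() replaces &, < and > so content cannot break the tag structure.
        parts.append(f"<{tag}>\n{escape(content)}\n</{tag}>")
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(wrap_in_tags({
        "background": "The product launched in 2024.",
        "user_input": "Is the launch date confirmed?",
        "constraints": "Answer only from the background section.",
    }))
```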

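## 13.3.6.1 Supplement: Calibrating the Altitude of a System Prompt

Designing a system prompt involves a key balancing act: finding the "Goldilocks zone" between two common failure modes.

**The two extremes**

1. **Overly hardcoded**
   * Engineers write complex if-else logic and highly specific instructions into the prompt
   * Problem: the prompt becomes brittle and hard to maintain, and its complexity grows sharply over time
   * Example: listing 50 "never do" rules one by one, only to find that each needs its own exceptions
2. **Overly vague**
   * The prompt is too general and gives no concrete behavioral signal
   * It assumes Claude can infer implicit context that was never stated
   * Problem: without clear constraints, the model tends to drift from the intended direction

**The right altitude**

An effective system prompt should be:

* **Specific enough**: it clearly guides behavior and output format, with sufficient constraints
* **Flexible enough**: it leaves the model room for interpretation and creativity instead of dictating every step
* **Built on heuristics, not rules**: it prefers "lean toward..." and "prioritize..." over "never..."
* **Self-explanatory**: good heuristics lead to correct behavior naturally, without enumerating every case

**Altitude calibration techniques**

1. **Start minimal**: test the most capable model with the simplest prompt first
2. **Iterate on failure modes**: add explicit guidance when a problem appears, instead of front-loading the prompt
3. **Use few-shot examples**: provide a small number of examples that demonstrate the desired reasoning, instead of listing rules
4. **Prefer positive guidance**: "in situation X, do Y" works better than "never do Z"
5. **Review and prune regularly**: as experience accumulates, remove guidance the model already follows without being told

**Example in practice**

A poor version:

```
- Never use jargon
- Never give an uncertain answer
- Never exceed 5 sentences
- Never touch political topics
```

A better version:

```
Use clear, plain language. If you are uncertain, say so explicitly and explain why.
Limit the answer to 2-3 key points, each explained in 1-2 sentences.
For sensitive social topics, present several balanced perspectives instead of a personal opinion.
```

***

*Adapted from the altitude-calibration principles for system prompt design in Anthropic's "Effective context engineering for AI agents".*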

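## Part 3. Context Engineering in Practice with Claude

## 13.3.7 Careful Design of the System Prompt

The system prompt is the most powerful tool in context engineering. Unlike ordinary user messages, it defines the model's baseline behavior and constraints.

**A five-part structure for system prompts**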

```python
system_prompt_template = """

# ROLE & AUTHORITY
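[Define Claude's role, professional background, and authority]

# PRIMARY OBJECTIVE
[A clear, measurable goal]

# CONTEXT & CONSTRAINTS
[Constraints, boundaries, and limits of the task]

# INTERACTION STYLE
[Interaction mode, tone, and format]

# DECISION-MAKING FRAMEWORK
[The framework or principles to follow when making decisions]
"""

# Concrete example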
system_prompt = """

# ROLE & AUTHORITY
You are an expert software architect with 20+ years of experience designing scalable,
high-performance systems. You have worked at companies like Google, Amazon, and Microsoft.

# PRIMARY OBJECTIVE
Help users design robust system architectures that are scalable, maintainable, and cost-effective.

# CONTEXT & CONSTRAINTS
- You MUST consider real-world constraints like latency, bandwidth, and cost
- You SHOULD provide multiple design options with clear trade-offs
- You MUST NOT recommend unproven technologies as primary solutions
- Always prioritize simplicity and proven patterns over novelty

# INTERACTION STYLE
- Be direct and data-driven
- Use diagrams and pseudocode when helpful
- Explain technical concepts clearly
- Ask clarifying questions if requirements are unclear

# DECISION-MAKING FRAMEWORK
1. Understand requirements and constraints
2. Identify alternative approaches
3. Evaluate each approach against key criteria (scalability, cost, complexity)
4. Recommend the best fit with clear reasoning
5. Highlight potential future scalability issues
"""
```

## 13.3.8 Structuring Context with XML

XML tags give a clear, structured way to organize complex context.

```python
def build_structured_prompt(task_description: str, background: str,
                            examples: list[str], constraints: list[str]) -> str:
    """构建高度结构化的提示"""

    return f"""
<task>
<description>{task_description}</description>
<objective>
[What you want Claude to do or produce]
</objective>
</task>

<background>
<context>{background}</context>
</background>

<examples>
<count>{len(examples)}</count>
{chr(10).join([f'<example>{ex}</example>' for ex in examples])}
</examples>

<constraints>
<requirement priority="high">You MUST...</requirement>
<requirement priority="medium">You SHOULD...</requirement>
<requirement priority="low">You MAY...</requirement>
{chr(10).join([f'<constraint>{c}</constraint>' for c in constraints])}
</constraints>

<output>
<format>
[Specify the exact format of the output]
</format>
<length>
[Specify length constraints if any]
</length>
</output>
"""
```

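## 13.3.9 Combining RAG with Context Engineering

RAG (retrieval-augmented generation) is one of the most practical techniques in context engineering.

**The basic RAG flow**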

```python
class RAGPipeline:
    def __init__(self, client, knowledge_base: list[str]):
        self.client = client
        self.knowledge_base = knowledge_base
        self.embeddings = self._embed_knowledge_base()

    def _embed_knowledge_base(self):
        """对知识库进行嵌入"""
        # 在实际应用中，使用真实的 embedding 服务
        pass

    def retrieve_relevant_context(self, query: str, top_k: int = 5) -> list[str]:
        """检索与查询相关的上下文"""
        # 对查询进行嵌入
        # 计算相似度
        # 返回最相关的文档
        pass

    def generate_response(self, query: str, context: list[str]) -> str:
        """基于检索到的上下文生成回答"""

        # 构建上下文
        context_str = "\n\n".join([f"[Source {i}]\n{ctx}" for i, ctx in enumerate(context)])

        response = self.client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2000,
            system="""You are a helpful assistant that answers questions based on
provided context. Always cite your sources when using the provided documents.""",
            messages=[
                {
                    "role": "user",
                    "content": f"""Based on the following context, answer this question:

CONTEXT:
{context_str}

QUESTION:
{query}

Important: Only use information from the provided context. If the context doesn't contain
the answer, say so explicitly."""
                }
            ]
        )

        return response.content[0].text

    def query(self, question: str) -> dict:
        """完整的 RAG 查询流程"""

        # 步骤 1：检索
        relevant_context = self.retrieve_relevant_context(question)

        # 步骤 2：生成
        answer = self.generate_response(question, relevant_context)

        return {
            "question": question,
            "answer": answer,
            "sources": relevant_context
        }
```

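### Supplement: Hybrid Retrieval Strategies and Tool Token Efficiency

When building a real context engineering system, the choice of retrieval strategy is a key trade-off.

**Three main retrieval approaches compared**

1. **Agentic Search**
   * The agent actively explores the file system with bash commands (grep, find, tail, and so on)
   * Pros: high accuracy, good transparency, captures file naming and structure, avoids stale indexes
   * Cons: slower (many file I/O round trips) and depends on a well-designed directory structure
2. **Semantic Search**
   * Similarity matching over vector embeddings
   * Pros: fast, and finds content that is semantically related but worded differently
   * Cons: sometimes less accurate than literal search, requires maintaining a vector index, and is sensitive to the choice of embedding model
3. **Hybrid Approach** - recommended for production systems
   * Preload high-priority key information (such as CLAUDE.md and system configuration)
   * Use Agentic Search for large files and dynamic content
   * Use semantic search to speed up lookups for specific kinds of information
   * Let the agent switch strategies flexibly as it discovers new content

**Token efficiency in tool design**

Tool design has a major effect on overall token cost:

* **Return precise information**: a tool should return carefully selected, relevant information, not an entire object
* **Avoid repetition**: assume the agent may call the same tool several times, and avoid returning redundant data
* **Nest and relate**: related lookups should be answerable in a single call, not spread over several separate calls

For example, in an email application:

* Poor: `list_all_emails()` returns every email (possibly thousands), and the agent must read them one by one
* Better: `search_emails(query, limit=10)` returns relevant excerpts directly, or `get_email_summary(email_id)` returns the key information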
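A token-efficient `search_emails` tool might look like the sketch below. The `Email` type, the in-memory list, and the substring matching are all hypothetical stand-ins for a real mail backend; the point is only that the tool caps the number of results and returns a short snippet per hit instead of full message bodies.

```python
from dataclasses import dataclass


@dataclass
class Email:
    email_id: str
    subject: str
    body: str


def search_emails(
    emails: list[Email],
    query: str,
    limit: int = 10,
    snippet_chars: int = 120,
) -> list[dict]:
    """Return at most `limit` matches, each with an id, subject, and short snippet."""
    query_lower = query.lower()
    results = []
    for email in emails:
        haystack = f"{email.subject}\n{email.body}"
        pos = haystack.lower().find(query_lower)
        if pos == -1:
            continue  # no match in subject or body
        # Center the snippet roughly on the match and flatten newlines.
        start = max(pos - snippet_chars // 2, 0)
        snippet = haystack[start:start + snippet_chars].replace("\n", " ")
        results.append(
            {"email_id": email.email_id, "subject": email.subject, "snippet": snippet}
        )
        if len(results) == limit:
            break
    return results
```

The agent can then call a follow-up tool such as `get_email_summary(email_id)` only for the hits it actually needs, which keeps full message bodies out of the context by default.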

***

*Adapted from the hybrid retrieval best practices in Anthropic's "Effective context engineering for AI agents".*

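## 13.3.10 Integrating MCP with Context Engineering

The Model Context Protocol (MCP) and context engineering are a natural fit. MCP offers a standardized way to supply Claude with dynamic, real-time context.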

```python
class MCPContextIntegration:
    """MCP 与上下文工程的集成示例"""

    def __init__(self):
        self.mcp_servers = []

    def register_mcp_server(self, server_name: str, server_instance):
        """注册一个 MCP 服务器作为上下文源"""
        self.mcp_servers.append({
            "name": server_name,
            "instance": server_instance
        })

    def fetch_context_from_mcp(self, server_name: str, resource_path: str) -> str:
        """Fetch context from a registered MCP server."""
        server = next((s for s in self.mcp_servers if s["name"] == server_name), None)
        if server:
            # get_resource is a placeholder for whatever read method your
            # MCP client wrapper exposes; it is not an official SDK call.
            return server["instance"].get_resource(resource_path)
        return ""

    def build_mcp_enhanced_prompt(self, base_prompt: str, mcp_sources: dict) -> str:
        """Build a prompt augmented with MCP context.

        mcp_sources format:
        {
            "github": "repo/main/src",
            "notion": "database/project-docs"
        }
        """

        enhanced_prompt = base_prompt + "\n\n## CONTEXT FROM EXTERNAL SOURCES\n\n"

        for source_name, resource_path in mcp_sources.items():
            context = self.fetch_context_from_mcp(source_name, resource_path)
            if context:
                enhanced_prompt += f"### From {source_name}\n{context}\n\n"

        return enhanced_prompt
```

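## Part 4. Case Studies

## Case 1: Real-Time Q&A System for Technical Documentation

**Scenario**

A company needs an automated Q&A system over its product's technical documentation, so that users can ask about technical details of the product.

**Context engineering plan**

1. **Write**: create a system prompt that defines the answerer's role
2. **Select**: use RAG to retrieve relevant content from the documentation store
3. **Compress**: if the retrieved documents are too long, summarize them
4. **Isolate**: clearly separate official documentation, known issues, and the user's question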

```python
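import anthropic


def build_documentation_qa_system():
    client = anthropic.Anthropic()

    # Step 1: build the documentation index (offline).
    # build_documentation_index is a placeholder for your own indexing code;
    # it is assumed to return an object with a search(query, top_k) method.
    doc_index = build_documentation_index("/path/to/docs")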

    # 步骤 2：处理用户问题
    user_question = "How do I configure SSL certificates?"

    # 步骤 3：检索相关文档
    relevant_docs = doc_index.search(user_question, top_k=3)

    # 步骤 4：构建系统提示词
    system_prompt = """
You are a technical support specialist for our product. You have deep knowledge
of the product and access to official documentation.

GUIDELINES:
- Always cite the official documentation when providing answers
- If a question is not covered in the documentation, say so explicitly
- Provide step-by-step instructions when relevant
- Distinguish between official features, workarounds, and unsupported approaches
"""

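    # Step 5: build the user message, including the retrieved context.
    # Join the sections outside the f-string: backslashes are not allowed
    # inside f-string expressions before Python 3.12.
    docs_block = "\n\n".join(
        f"[Doc {i + 1}]\n{doc}" for i, doc in enumerate(relevant_docs)
    )
    user_message = f"""
Based on the following documentation sections, answer this question:

DOCUMENTATION:
{docs_block}

QUESTION:
{user_question}

Please provide a clear, step-by-step answer based on the documentation.
"""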

    # 步骤 6：调用 Claude
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        system=system_prompt,
        messages=[{"role": "user", "content": user_message}]
    )

    return response.content[0].text
```

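## Case 2: Code Review System

**Scenario**

A development team needs an AI-assisted code review system that checks code quality, security, and adherence to best practices.

**Context engineering plan**

1. **Write**: write detailed code review standards and a checklist
2. **Select**: choose relevant past code reviews as reference
3. **Compress**: summarize the relevant code (show the key parts)
4. **Isolate**: separate security, performance, and style issues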

````python
class CodeReviewContextBuilder:
    def __init__(self, client):
        self.client = client

    def build_review_prompt(self, code_to_review: str,
                           codebase_style_guide: str,
                           security_policies: str,
                           similar_reviewed_code: list[str]) -> str:
        """构建代码审查的上下文"""

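        # Join the examples outside the f-string: backslashes are not allowed
        # inside f-string expressions before Python 3.12, and the original
        # "\\n" would have produced a literal backslash-n, not a newline.
        examples_block = "\n\n".join(
            f"### Example {i + 1}\n{code}"
            for i, code in enumerate(similar_reviewed_code)
        )

        prompt = f"""

# CODE REVIEW GUIDELINES

## STYLE GUIDE
{codebase_style_guide}

## SECURITY POLICIES
{security_policies}

## SIMILAR CODE EXAMPLES (Previously Reviewed)
{examples_block}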

---

# CODE TO REVIEW

```text
{code_to_review}
```
## REVIEW CHECKLIST
1. Code Style Compliance
2. Security Issues
3. Performance Concerns
4. Best Practices
5. Maintainability
6. Test Coverage

For each category, identify specific issues and provide recommendations.
"""

        return prompt

    def review_code(self, code: str, context_dict: dict) -> dict:
        """执行代码审查"""

        prompt = self.build_review_prompt(
            code,
            context_dict["style_guide"],
            context_dict["security_policies"],
            context_dict["similar_code"]
        )

        response = self.client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2000,
            system="""You are an expert code reviewer with expertise in multiple
programming languages and best practices. Provide constructive, specific feedback.""",
            messages=[{"role": "user", "content": prompt}]
        )

        return {
            "code_reviewed": code[:100] + "...",
            "review": response.content[0].text
        }
````

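## Part 5. Context Management with RAG and MCP in Practice

## 13.3.11 Context Engineering for RAG

RAG is one of the core applications of context engineering. By retrieving relevant documents to build the context dynamically, you can substantially extend knowledge coverage without scaling up the model itself.

**A complete RAG implementation example**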

```python
# 必需导入
import anthropic
from typing import List, Dict, Optional
import json


class RAGContextManager:
    """RAG 系统的上下文管理器"""

    def __init__(self, knowledge_base: List[Dict[str, str]]):
        """
        初始化 RAG 管理器
        knowledge_base: 知识库文档列表
        [{
            "id": "doc_1",
            "title": "Title",
            "content": "Content",
            "metadata": {"category": "...", "date": "..."}
        }]
        """
        self.client = anthropic.Anthropic()
        self.knowledge_base = knowledge_base
        self.model = "claude-haiku-4-5-20251001"

    def simple_keyword_search(self, query: str, top_k: int = 3) -> List[Dict]:
        """简单的关键词搜索（生产环境应使用向量数据库）"""
        query_terms = set(query.lower().split())
        scores = []

        for doc in self.knowledge_base:
            content = (doc.get("content", "") + " " + doc.get("title", "")).lower()
            content_terms = set(content.split())
            # 计算 Jaccard 相似度
            overlap = len(query_terms & content_terms)
            if overlap > 0:
                score = overlap / len(query_terms | content_terms)
                scores.append((doc, score))

        # 按分数排序并返回 top-k
        scores.sort(key=lambda x: x[1], reverse=True)
        return [doc for doc, score in scores[:top_k]]

    def build_system_prompt_with_context(self, relevant_docs: List[Dict]) -> str:
        """根据检索到的文档构建系统提示词"""
        context_section = "# 参考资料\n\n"

        for i, doc in enumerate(relevant_docs, 1):
            context_section += f"## 资料 {i}: {doc.get('title', 'Untitled')}\n"
            context_section += f"{doc.get('content', '')}\n\n"

        system_prompt = f"""你是一个知识助手，基于以下参考资料回答用户问题。

{context_section}

## 回答指南
1. 优先使用参考资料中的信息
2. 如果参考资料中没有相关信息，明确说明
3. 对于不确定的信息，表达你的不确定性
4. 在引用参考资料时，指明来源"""

        return system_prompt

    def query_with_rag(self, user_query: str) -> Dict:
        """执行 RAG 查询"""
        # 步骤 1: 检索相关文档
        relevant_docs = self.simple_keyword_search(user_query, top_k=3)

        # 步骤 2: 构建带有上下文的系统提示词
        system_prompt = self.build_system_prompt_with_context(relevant_docs)

        # 步骤 3: 调用 Claude API
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=system_prompt,
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "relevant_documents": [
                {
                    "title": doc.get("title"),
                    "id": doc.get("id")
                }
                for doc in relevant_docs
            ],
            "answer": response.content[0].text,
            "tokens_used": {
                "input": response.usage.input_tokens,
                "output": response.usage.output_tokens
            }
        }


# 使用示例
if __name__ == "__main__":
    # 构建知识库
    knowledge_base = [
        {
            "id": "doc_1",
            "title": "Claude 能力概览",
            "content": "Claude Sonnet 4.6 是 Anthropic 推出的通用高性能模型。支持 1M token 上下文窗口，",
        },
        {
            "id": "doc_2",
            "title": "Claude API 定价",
            "content": "Claude Sonnet 4.6: 输入 $3/百万 token，输出 $15/百万 token。",
        },
        {
            "id": "doc_3",
            "title": "最佳实践",
            "content": "在使用 Claude 时，应该提供清晰的上下文和具体的示例。",
        },
    ]

    # 初始化 RAG 管理器
    rag = RAGContextManager(knowledge_base)

    # 执行查询
    result = rag.query_with_rag("Claude 的输出 token 价格是多少？")
    print(json.dumps(result, ensure_ascii=False, indent=2))
```

**Cost optimization: using prompt caching**

```python
class CachedRAGManager(RAGContextManager):
    """使用提示缓存优化的 RAG 管理器"""

    def query_with_cached_context(self, user_query: str) -> Dict:
        """使用缓存的知识库上下文进行查询"""

        # 构建知识库的缓存块
        knowledge_content = "# 知识库\n\n"
        for doc in self.knowledge_base:
            knowledge_content += f"## {doc.get('title')}\n{doc.get('content')}\n\n"

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=[
                {
                    "type": "text",
                    "text": "你是一个知识助手。根据以下知识库回答问题。"
                },
                {
                    "type": "text",
                    "text": knowledge_content,
                    "cache_control": {"type": "ephemeral"}  # 标记为缓存块
                }
            ],
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "answer": response.content[0].text,
            "cache_usage": {
                "cache_creation_input_tokens": response.usage.cache_creation_input_tokens or 0,
                "cache_read_input_tokens": response.usage.cache_read_input_tokens or 0,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens
            }
        }
```
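To judge whether caching a knowledge base is worth it, a rough estimate of input cost with and without caching can help. The sketch below assumes a base input price of $3 per million tokens and cache multipliers of 1.25x for the first write and 0.1x for later reads; treat all three as assumptions to check against current pricing, and note that it ignores cache expiry between requests as well as output tokens. The function name is hypothetical.

```python
def estimate_input_cost(
    prefix_tokens: int,
    query_tokens: int,
    n_requests: int,
    price_per_mtok: float = 3.0,     # assumed base input price, USD per 1M tokens
    write_multiplier: float = 1.25,  # assumed surcharge for writing the cache
    read_multiplier: float = 0.10,   # assumed discount for reading the cache
) -> dict:
    """Compare input cost for n_requests sharing one cached prefix."""
    if n_requests < 1:
        raise ValueError("n_requests must be at least 1")
    per_token = price_per_mtok / 1_000_000
    uncached = n_requests * (prefix_tokens + query_tokens) * per_token
    cached = (
        prefix_tokens * write_multiplier                       # first request writes the cache
        + (n_requests - 1) * prefix_tokens * read_multiplier   # later requests read it
        + n_requests * query_tokens                            # queries are never cached
    ) * per_token
    return {"uncached": uncached, "cached": cached}


if __name__ == "__main__":
    print(estimate_input_cost(prefix_tokens=100_000, query_tokens=100, n_requests=10))
```

Under these assumptions a single request costs slightly more with caching, because of the write surcharge, while repeated requests against the same prefix cost much less.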

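## 13.3.12 MCP as a Bridge for Context Engineering

MCP (Model Context Protocol) provides a standardized way to fetch and manage context dynamically. It can connect to many kinds of data sources (databases, APIs, file systems, and so on) and supply Claude with real-time, structured context.

**MCP's role in context engineering**

```
User query
    ↓
Analyze context needs
    ↓
Query relevant data sources via MCP
    ├─ File system MCP (local files)
    ├─ Database MCP (SQL/NoSQL data)
    ├─ API MCP (real-time data)
    └─ Other service MCPs (Slack, GitHub, etc.)
    ↓
Aggregate context
    ↓
Build the prompt
    ↓
Call Claude
    ↓
Generate the answer
```

**Implementation example: fetching real-time context through MCP**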

```python
class MCPContextManager:
    """使用 MCP 协议管理上下文的管理器"""

    def __init__(self, model: str = "claude-sonnet-4-6"):
        self.client = anthropic.Anthropic()
        self.model = model
        # 在实际应用中，需要配置 MCP 服务器连接
        self.mcp_resources = {}

    def register_mcp_source(self, source_name: str, mcp_config: Dict):
        """注册 MCP 数据源"""
        self.mcp_resources[source_name] = mcp_config

    def fetch_context_from_mcp(self, sources: List[str], query: str) -> Dict:
        """从多个 MCP 数据源获取上下文"""
        context = {}

        for source_name in sources:
            if source_name not in self.mcp_resources:
                continue

            # 模拟从 MCP 源获取数据
            # 在实际应用中，这会通过 MCP 协议调用真实的数据源
            source_config = self.mcp_resources[source_name]

            if source_config["type"] == "file":
                # 从文件系统获取
                context[source_name] = f"文件内容: {source_config['path']}"

            elif source_config["type"] == "database":
                # 从数据库查询
                context[source_name] = f"数据库查询结果: {source_config['query']}"

            elif source_config["type"] == "api":
                # 从 API 获取
                context[source_name] = f"API 响应: {source_config['endpoint']}"

        return context

    def query_with_mcp_context(self, user_query: str, mcp_sources: List[str]) -> Dict:
        """使用 MCP 上下文进行查询"""

        # 获取 MCP 上下文
        mcp_context = self.fetch_context_from_mcp(mcp_sources, user_query)

        # 构建系统提示词
        system_prompt = "你是一个 AI 助手。根据以下实时数据回答问题。\n\n"
        for source, context in mcp_context.items():
            system_prompt += f"[{source}]\n{context}\n\n"

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=system_prompt,
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "mcp_sources_used": mcp_sources,
            "answer": response.content[0].text
        }


# 使用示例
if __name__ == "__main__":
    mcp_manager = MCPContextManager()

    # 注册 MCP 数据源
    mcp_manager.register_mcp_source("filesystem", {
        "type": "file",
        "path": "/path/to/documentation"
    })
    mcp_manager.register_mcp_source("database", {
        "type": "database",
        "query": "SELECT * FROM products WHERE category='AI'"
    })

    # Run the query
    result = mcp_manager.query_with_mcp_context(
        "Which AI products do we have?",
        mcp_sources=["filesystem", "database"]
    )
    print(json.dumps(result, ensure_ascii=False, indent=2))
```

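## 13.3.13 Cost Optimization for Large-Scale Context

For applications that process large amounts of context, cost optimization is critical.

**Strategy 1: Hierarchical context**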

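```python
from typing import List

import anthropic


class HierarchicalContextManager:
    """Hierarchical context management to reduce cost."""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-6"

    def build_hierarchical_context(self, documents: List[str], query: str) -> str:
        """Build a layered context for a large set of documents."""

        # Tier 1: high-level overview (about 100 tokens)
        overview = self._summarize_documents(documents, max_tokens=100)

        # Tier 2: excerpts from the sections relevant to the query (about 500 tokens)
        relevant_sections = self._extract_relevant_sections(documents, query, max_tokens=500)

        # Tier 3: full content (included only when the first two tiers are not enough)

        context = f"""

# Document overview
{overview}

# Relevant section details
{relevant_sections}

[Ask if the full document content is needed]
"""

        return context

    def _summarize_documents(self, documents: List[str], max_tokens: int) -> str:
        """Generate a high-level summary of the documents."""
        # One possible implementation: ask the model for a short summary.
        # Cache the result in practice so the summarization cost is paid only once.
        response = self.client.messages.create(
            model=self.model,
            max_tokens=max_tokens,
            messages=[{
                "role": "user",
                "content": "Summarize the following documents in a few sentences:\n\n"
                           + "\n\n---\n\n".join(documents),
            }],
        )
        return response.content[0].text

    def _extract_relevant_sections(self, documents: List[str], query: str, max_tokens: int) -> str:
        """Extract the sections most relevant to the query."""
        # Naive placeholder: rank paragraphs by word overlap with the query.
        # A production system would more likely use embeddings or a search index.
        query_words = set(query.lower().split())
        paragraphs = [p for doc in documents for p in doc.split("\n\n") if p.strip()]
        paragraphs.sort(key=lambda p: len(query_words & set(p.lower().split())), reverse=True)
        char_budget = max_tokens * 4  # rough assumption: about 4 characters per token
        return "\n\n".join(paragraphs)[:char_budget]
```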
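The third tier is described only as "included when the first two are not enough". One possible policy, sketched here as an assumption rather than as the method this chapter prescribes, is to add tiers in priority order while they still fit a token budget. The names `estimate_tokens` and `assemble_tiers` and the 4-characters-per-token heuristic are hypothetical.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def assemble_tiers(overview: str, sections: List[str], full_docs: List[str],
                   token_budget: int) -> str:
    """Add tiers in priority order, skipping chunks that would exceed the budget."""
    parts = []
    used = 0
    tiers = (("overview", [overview]), ("section", sections), ("full", full_docs))
    for label, chunks in tiers:
        for chunk in chunks:
            cost = estimate_tokens(chunk)
            # The overview is always kept; lower tiers are added only while they fit.
            if label != "overview" and used + cost > token_budget:
                continue
            parts.append(f"[{label}]\n{chunk}")
            used += cost
    return "\n\n".join(parts)
```

With a small budget only the overview and the shortest relevant sections survive; raising the budget lets full documents through without changing the calling code.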

**Strategy 2: Use the Batch API to reduce cost**

For tasks that do not need a real-time response, the Batch API can cut cost by 50%:

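```python
import json
from typing import Dict, List

import anthropic


def submit_batch_context_jobs(queries: List[str], knowledge_base: List[Dict]) -> str:
    """Submit a batch of queries to the Message Batches API and return the batch ID."""
    client = anthropic.Anthropic()
    # Serialize the knowledge base once; ensure_ascii=False keeps non-ASCII text readable.
    knowledge_text = json.dumps(knowledge_base, ensure_ascii=False)
    requests = []

    for i, query in enumerate(queries):
        request = {
            "custom_id": f"request-{i}",
            "params": {
                "model": "claude-sonnet-4-6",
                "max_tokens": 2000,
                "system": f"Answer the question using the following knowledge base:\n{knowledge_text}",
                "messages": [
                    {
                        "role": "user",
                        "content": query
                    }
                ]
            }
        }
        requests.append(request)

    # Submit the batch. Processing is asynchronous; in recent versions of the
    # anthropic SDK the results can be read later with
    # client.messages.batches.results(batch.id).
    batch = client.messages.batches.create(requests=requests)
    return batch.id
```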
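To see what the 50% discount means for a context-heavy workload, the arithmetic can be sketched as below. The per-million-token prices and the workload size are placeholder assumptions, not quoted rates, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float, output_price_per_mtok: float,
                  batch: bool = False) -> float:
    """Estimate cost in dollars; batch=True applies the 50% Batch API discount."""
    cost = (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000
    return cost * 0.5 if batch else cost


# Hypothetical workload: 1,000 requests, each with 50,000 input and 1,000 output tokens.
INPUT_PRICE, OUTPUT_PRICE = 3.0, 15.0  # assumed dollars per million tokens
standard = estimate_cost(1000 * 50_000, 1000 * 1_000, INPUT_PRICE, OUTPUT_PRICE)
batched = estimate_cost(1000 * 50_000, 1000 * 1_000, INPUT_PRICE, OUTPUT_PRICE, batch=True)
print(f"standard: ${standard:.2f}, batch: ${batched:.2f}")
# prints "standard: $165.00, batch: $82.50"
```

Under these assumed numbers the input side dominates the bill, which is why trimming context and batching tend to be combined.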

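## Section 5 Case Study: MCP-Driven Dynamic Context Engineering

## 13.3.14 Scenario: Integrating a Live GitHub Knowledge Base with an AI Assistant

**Background**: Use MCP to connect to a GitHub knowledge base and dynamically fetch the latest documentation and code as context.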

````python
import anthropic
import hashlib
import json
from typing import List
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class MCPContextEngineer:
    """Dynamic context engineering with MCP."""

    def __init__(self, model: str = "claude-sonnet-4-6"):
        self.client = anthropic.Anthropic()
        self.model = model
        self.context_cache = {}
        self.mcp_tools = self._initialize_mcp_tools()

    def _initialize_mcp_tools(self) -> List[dict]:
        """Initialize the MCP tool definitions."""
        return [
            {
                "name": "github_fetch_docs",
                "description": "Fetch the latest documentation from a GitHub repository",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string", "description": "Repository path (owner/repo)"},
                        "path": {"type": "string", "description": "File or directory path"},
                        "branch": {"type": "string", "description": "Branch name (default: main)"}
                    },
                    "required": ["repo", "path"]
                }
            },
            {
                "name": "github_search_issues",
                "description": "Search GitHub issues and discussions",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string"},
                        "query": {"type": "string", "description": "Search query"},
                        "limit": {"type": "integer", "description": "Max results (default: 5)"}
                    },
                    "required": ["repo", "query"]
                }
            },
            {
                "name": "fetch_recent_code",
                "description": "Fetch recently committed code snippets",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string"},
                        "file_type": {"type": "string", "description": "File extension (e.g., .py, .js)"},
                        "limit": {"type": "integer", "description": "Number of files"}
                    },
                    "required": ["repo"]
                }
            }
        ]

    def _execute_mcp_tool(self, tool_name: str, tool_input: dict) -> str:
        """Execute an MCP tool (simulated implementation)."""
        logger.info(
            "Executing MCP tool",
            extra={
                "tool_name": tool_name,
                "param_keys": sorted(tool_input.keys()),
                "param_hash": hashlib.sha256(
                    json.dumps(tool_input, sort_keys=True).encode("utf-8")
                ).hexdigest()
            }
        )

        # In a real application, this would call an actual MCP server.
        # What follows is a stand-in implementation for illustration.
        if tool_name == "github_fetch_docs":
            return f"# Documentation for {tool_input['repo']}\n\n[Document content would be fetched from GitHub via MCP]"
        elif tool_name == "github_search_issues":
            return f"Found issues for '{tool_input['query']}':\n- Issue 1: Bug in authentication\n- Issue 2: Performance issue"
        elif tool_name == "fetch_recent_code":
            return "Recent code samples:\n```python\ndef example():\n    pass\n```"
        else:
            return "Tool not recognized"

    def build_dynamic_context(self, user_query: str, repo: str) -> tuple[str, dict]:
        """Build context dynamically based on the user's query."""
        context_parts = []
        mcp_calls = {}

        # Step 1: analyze the query to decide which context is needed
        analysis_response = self.client.messages.create(
            model=self.model,
            max_tokens=500,
            system="""You are a context analyzer. Analyze the user's query and determine what information
would be most helpful. Respond with a JSON object indicating which types of information to fetch:
{
    "needs_docs": boolean,
    "needs_issues": boolean,
    "needs_code": boolean,
    "doc_path": string (optional),
    "search_query": string (optional),
    "code_file_type": string (optional)
}""",
            messages=[
                {
                    "role": "user",
                    "content": f"User query: {user_query}\nRepository: {repo}"
                }
            ]
        )

        try:
            analysis = json.loads(analysis_response.content[0].text)
        except (json.JSONDecodeError, IndexError):
            logger.warning("Failed to parse analysis, using default context")
            analysis = {"needs_docs": True, "needs_code": True}

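        # Step 2: fetch context according to the analysis
        if analysis.get("needs_docs"):
            # `or` also covers the case where the model returns null for an optional field
            doc_path = analysis.get("doc_path") or "README.md"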
            docs = self._execute_mcp_tool("github_fetch_docs", {
                "repo": repo,
                "path": doc_path
            })
            context_parts.append(f"## Repository Documentation\n\n{docs}")
            mcp_calls["github_fetch_docs"] = {"repo": repo, "path": doc_path}

        if analysis.get("needs_issues"):
            search_query = analysis.get("search_query") or user_query
            issues = self._execute_mcp_tool("github_search_issues", {
                "repo": repo,
                "query": search_query,
                "limit": 3
            })
            context_parts.append(f"## Related Issues\n\n{issues}")
            mcp_calls["github_search_issues"] = {"repo": repo, "query": search_query}

        if analysis.get("needs_code"):
            code = self._execute_mcp_tool("fetch_recent_code", {
                "repo": repo,
                "file_type": analysis.get("code_file_type") or ".py",
                "limit": 3
            })
            context_parts.append(f"## Code Examples\n\n{code}")
            mcp_calls["fetch_recent_code"] = {"repo": repo}

        context = "\n\n".join(context_parts)
        return context, mcp_calls

    def answer_with_dynamic_context(self, user_query: str, repo: str) -> str:
        """Answer the user's query using dynamic context."""
        logger.info(
            "Building dynamic context",
            extra={
                "repo": repo,
                "query_length": len(user_query),
                "query_hash": hashlib.sha256(user_query.encode("utf-8")).hexdigest()
            }
        )

        # Build the dynamic context
        context, mcp_calls = self.build_dynamic_context(user_query, repo)

        # Call Claude with the context
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=f"""You are a helpful assistant for the {repo} project.
Use the provided context (documentation, issues, and code) to answer questions accurately.
When referencing information from the context, cite the source.""",
            messages=[
                {
                    "role": "user",
                    "content": f"""Context from repository:

{context}

---

User question: {user_query}"""
                }
            ]
        )

        answer = response.content[0].text

        logger.info(
            "MCP calls made",
            extra={
                "tool_names": sorted(mcp_calls.keys()),
                "call_count": len(mcp_calls)
            }
        )
        return answer

# Usage example
if __name__ == "__main__":
    engineer = MCPContextEngineer()

    # Example query
    query = "How do I authenticate users in this project?"
    repo = "anthropics/anthropic-sdk-python"

    answer = engineer.answer_with_dynamic_context(query, repo)
    print(f"Answer:\n{answer}")
````
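In `build_dynamic_context`, the `try`/`except` around `json.loads` falls back to the defaults whenever the analyzer wraps its JSON in a code fence or adds prose around it. A more tolerant parser could look roughly like the sketch below. `parse_analysis` and `DEFAULT_ANALYSIS` are hypothetical names, and the regex approach assumes the reply contains a single JSON object.

```python
import json
import re
from typing import Any, Dict

DEFAULT_ANALYSIS: Dict[str, Any] = {
    "needs_docs": True,
    "needs_issues": False,
    "needs_code": True,
}


def parse_analysis(raw_text: str) -> Dict[str, Any]:
    """Extract a JSON object from model output, falling back to defaults."""
    result = dict(DEFAULT_ANALYSIS)
    # Grab everything from the first "{" to the last "}" so that code fences
    # or surrounding prose do not break parsing.
    match = re.search(r"\{.*\}", raw_text, flags=re.DOTALL)
    if not match:
        return result
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return result
    # Accept only values of the expected type; ignore nulls and unknown keys.
    for key in ("needs_docs", "needs_issues", "needs_code"):
        if isinstance(data.get(key), bool):
            result[key] = data[key]
    for key in ("doc_path", "search_query", "code_file_type"):
        value = data.get(key)
        if isinstance(value, str) and value.strip():
            result[key] = value
    return result
```

Passing `analysis_response.content[0].text` through a function like this keeps whatever flags the analyzer did return instead of discarding the whole reply on a formatting slip.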

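## 13.3.15 Advantages of MCP Context Engineering

**Dynamic**:

* Context is fetched from live sources at query time, so it is as current as the source itself
* Context can be adjusted over the course of a conversation

**Relevant**:

* Query-based selection (the analyzer decides what is needed)
* Avoids clogging the context with irrelevant information

**Extensible**:

* New MCP sources (databases, APIs, file systems) are easy to add
* Supports complex cross-source queries

**Cost-effective**:

* Fetches only the information that is needed
* Reduces redundant and low-quality context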
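Fetching only what is needed also means not re-fetching the same data on every turn. The sketch below shows one way to trade a little freshness for fewer tool calls, assuming a short time-to-live is acceptable for the data source. `ToolResultCache` is a hypothetical helper, and the `fetch` callable stands in for whatever actually executes the MCP tool.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Tuple


class ToolResultCache:
    """Cache tool results for a short time-to-live (TTL) to avoid repeat fetches."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(tool_name: str, tool_input: dict) -> str:
        # Identical tool name + parameters map to the same cache key.
        payload = json.dumps(tool_input, sort_keys=True)
        return tool_name + ":" + hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_fetch(self, tool_name: str, tool_input: dict,
                     fetch: Callable[[str, dict], str]) -> str:
        key = self._key(tool_name, tool_input)
        now = self.clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]  # still fresh: reuse the stored result
        result = fetch(tool_name, tool_input)
        self._entries[key] = (now, result)
        return result
```

The TTL is the knob between freshness and cost: a value of zero disables reuse entirely, while a longer value suits sources such as documentation that change slowly.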

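## 13.3.16 Comparison with Traditional RAG

| Feature | Traditional RAG | MCP-driven |
| ----- | --------------- | ----------- |
| Data source | Static knowledge base | Dynamic, multiple MCP sources |
| Update frequency | Periodic re-indexing | Real time |
| Context selection | Vector similarity | Intelligent analysis + vector search |
| Integration cost | Requires an embedding model | Standard MCP protocol |
| Best suited for | Static documents | Rapidly changing data |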

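## Section 6 Summary and Best Practices

## 13.3.17 Golden Rules of Context Engineering

1. **Relevance first**: More information is not better; more relevant information is
2. **Maintain consistency**: Make sure the different parts of the context are logically consistent with one another
3. **Clear structure**: Organize the context with explicit labels and delimiters
4. **Update regularly**: Keep the context current and accurate
5. **Measurable optimization**: Define metrics to measure improvements in context quality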
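As an illustration of rule 3, the sketch below wraps each source in explicit tags so the model can tell sources apart and refer to them by name. The tag names and the `build_tagged_context` helper are assumptions chosen for this example, not a required format.

```python
from typing import Dict


def build_tagged_context(sources: Dict[str, str]) -> str:
    """Wrap each source in a labeled <document> block."""
    blocks = []
    for index, (name, content) in enumerate(sources.items(), start=1):
        blocks.append(
            f'<document index="{index}">\n'
            f"<source>{name}</source>\n"
            f"<content>\n{content.strip()}\n</content>\n"
            f"</document>"
        )
    return "<documents>\n" + "\n".join(blocks) + "\n</documents>"


print(build_tagged_context({"README.md": "Install with pip.", "issue-42": "Login fails."}))
```

The same structure also makes it easy to drop or reorder individual sources when measuring which ones actually improve answers.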

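## 13.3.18 Migration Checklist

Migrating from prompt engineering to context engineering:

* [ ] Assess the data needs of the current task
* [ ] Establish high-quality context sources
* [ ] Implement context retrieval and selection mechanisms
* [ ] Test the effectiveness of context compression
* [ ] Put the context under version control
* [ ] Set up performance monitoring metrics
* [ ] Audit context quality on a regular schedule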
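For the version-control item, one lightweight approach is to fingerprint each context bundle and log the fingerprint next to every response, so that a change in quality can be traced to a change in context. This is a minimal sketch; `context_version` is a hypothetical helper, and the 12-character length is arbitrary.

```python
import hashlib
import json
from typing import Dict


def context_version(system_prompt: str, sources: Dict[str, str]) -> str:
    """Return a short, deterministic fingerprint of a context bundle."""
    # sort_keys makes the fingerprint independent of dictionary ordering.
    payload = json.dumps(
        {"system": system_prompt, "sources": sources},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
```

Two bundles with the same prompt and sources yield the same identifier, and any edit to either yields a new one, which is usually enough for audit logs and A/B comparisons.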

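## 13.3.19 Tools and Resources

**Recommended tools**:

* Vector databases: Pinecone, Weaviate, Milvus
* Document processing: LangChain, LlamaIndex
* MCP platform: Anthropic's official MCP servers

**Further reading**:

* [Anthropic official documentation](https://docs.anthropic.com)
* [MCP specification](https://modelcontextprotocol.io)
* Research papers on context engineering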

***

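Context engineering represents a paradigm shift in how we interact with LLMs. By focusing on what information to provide rather than on how to phrase instructions, we can use Claude's capabilities more effectively and build more powerful, more reliable AI applications.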

