13.3 Context Engineering 概览

Context Engineering 概览：从提示词工程到上下文工程

序言

如果说 2023 年是“提示词工程”（Prompt Engineering）的年代，那么 2024-2025 年的焦点已经逐步转向“上下文工程”（Context Engineering）。这个范式转变反映了一个深刻的认知：与其优化如何表达指令，不如优化向模型提供什么信息、以什么方式提供。本章深入探讨这一转变的原因、实践和方法论。

💡 深入学习：想深入了解上下文工程的战略和高级技巧？请参阅《大模型上下文工程权威指南》。想了解 MCP 作为上下文来源的更多细节？请参阅本书第 4 章《MCP 模型上下文协议》。

第一节核心概念：为什么上下文工程正在取代提示词工程

13.3.1 问题的演进

提示词工程时代（2022-2023）的瓶颈

在提示词工程的黄金时代，我们关注的是如何写得更好的指令：

更清晰的表述
更多的示例（Few-shot learning）
更明确的格式要求
更长的思维链（Chain-of-Thought）

虽然这些技术确实有效，但存在几个本质的限制：

边际收益递减：经过一定的优化后，提示词改进带来的性能提升开始平缓
模型能力瓶颈：再好的提示词也无法让模型完成它能力范围之外的任务
输出质量的天花板：模型的最终表现由其训练数据和能力所决定，而不仅由提示词

新的认知

随着 Claude 3.5 系列模型的推出和用户规模的扩大，我们发现了一个更深层的真相：

问题不在于如何问，而在于给模型什么信息。

当我们为 Claude 提供充分的、相关的、高质量的上下文信息时，即使使用相对简洁的指令，模型也能表现出色。相反，再精妙的提示词也无法弥补上下文信息的缺失。

13.3.2 上下文工程的定义

上下文工程（Context Engineering）是指系统地设计、选择、组织和呈现给 LLM 的输入信息（上下文），以优化模型的输出质量、准确性、可靠性和效率的实践和方法论。

核心特点：

以信息为中心：关注提供给模型的信息的质量、相关性和组织
动态和自适应：上下文不是静态的，而是根据任务动态选择和调整的
多维度优化：优化维度包括信息的准确性、完整性、相关性、结构清晰度、时间鲜度等
系统化方法：采用明确的策略和流程来进行上下文设计

13.3.3 两种范式的对比

维度

提示词工程

上下文工程

优化焦点

指令的表述

提供的信息

关键资源

优化的语言和模板

高质量的上下文数据

主要工作

写、修改提示词

数据收集、整理、选择

衡量指标

提示词的清晰度、详细度

上下文的相关性、覆盖率

扩展性

有限的（人工调整）

可扩展的（自动化选择）

成本驱动因素

迭代调整的时间

上下文处理的 token 数

典型工具

文本编辑器、提示库

向量数据库、检索系统

13.3.4 为什么现在转向上下文工程

技术进步的驱动

模型能力的成熟：Claude 等现代 LLM 已经足够强大，不需要复杂的提示技巧就能理解简单指令
上下文窗口的扩大：从 4K 到 200K 甚至 1M token，提供丰富上下文成为可能
检索技术的进步：向量搜索和 RAG 使得从大型数据库中智能提取相关信息成为可能
缓存和成本优化：提示缓存等技术使得包含大量上下文变得经济可行

实际应用的反馈

通过分析数百万个 Claude 的实际使用案例，我们发现：

包含充分背景信息的请求，即使提示词简洁，也往往得到高质量的输出
提示词优化的改进通常只能带来 5-10% 的性能提升
上下文优化（选择更相关的信息）往往能带来 30-50% 的性能提升
对于知识密集型任务，充分的上下文甚至可以弥补模型知识的时效性限制

第二节上下文工程的四大策略

上下文工程通过四个主要维度来优化模型的表现：

13.3.5 策略一：写入（Writing）- 制造高质量的上下文

写入指的是为你的特定需求创建、编写或生成上下文内容。

应用场景

为 Claude 编写详细的背景说明、指导原则或参考资料
创建系统提示词（System Prompt）来定义 Claude 的角色和约束
编写示例和参考案例供 Claude 学习

最佳实践

# 高质量系统提示词的结构

## 1. 角色定义
You are an expert [domain] professional with [X years] of experience.

## 2. 目标陈述
Your primary goal is to [specific objective].

## 3. 约束和原则
- Principle 1: [description]
- Principle 2: [description]

## 4. 风格和格式
- Tone: [descriptive]
- Format: [specific format]
- Language level: [proficiency level]

## 5. 边界条件
- Do not: [specific constraints]
- Always: [mandatory requirements]

示例：金融顾问系统提示词

system_prompt = """
You are a certified financial advisor with 15 years of experience in personal finance
and investment strategy.

PRIMARY GOAL:
Provide comprehensive, evidence-based financial guidance tailored to the user's
specific situation and goals.

CORE PRINCIPLES:
1. Always disclose any assumptions about the user's financial situation
2. Consider tax implications and regulatory constraints
3. Present multiple options with clear trade-offs
4. Base recommendations on published research and data
5. Acknowledge uncertainty and limitations

CONSTRAINTS:
- Do NOT provide guaranteed returns or financial predictions
- Do NOT recommend specific securities without full disclosure of limitations
- Do NOT advise on insurance if not qualified
- Always recommend consulting with appropriate professionals for complex matters

OUTPUT FORMAT:
- Situation analysis (2-3 sentences)
- Key considerations (3-5 bullet points)
- Recommended approaches (2-3 options with pros/cons)
- Action items (specific, numbered steps)
"""

写入的成本

编写高质量的背景信息和系统提示词需要投入：

专业知识（了解领域的关键概念）
清晰的表达能力
对目标任务的深刻理解

但这个投入是一次性的，之后可以重复使用，所以 ROI 通常很高。

13.3.6 策略二：选择（Selection）- 检索相关的上下文

选择指的是从大量可用信息中智能地选择与当前任务最相关的内容。

核心问题

在信息爆炸的时代，问题不是信息的缺乏，而是过多。Claude 需要的是相关的、高质量的上下文，而不是所有可用的上下文。

实现方法

方法 1：向量搜索和语义相似度

from anthropic import Anthropic
import numpy as np

class ContextSelector:
    def __init__(self, documents: list[str]):
        """初始化上下文选择器"""
        self.documents = documents
        self.embeddings = self._compute_embeddings()

    def _compute_embeddings(self):
        """计算文档的向量嵌入"""
        # 在实际应用中，可以使用 OpenAI、Cohere 或其他 embedding 服务
        # 这里是伪代码
        return [embed_document(doc) for doc in self.documents]

    def select_relevant_context(self, query: str, top_k: int = 5) -> list[str]:
        """基于查询选择最相关的文档"""
        query_embedding = embed_document(query)

        # 计算相似度
        similarities = [
            cosine_similarity(query_embedding, doc_emb)
            for doc_emb in self.embeddings
        ]

        # 获取 top-k 最相关的文档
        top_indices = np.argsort(similarities)[-top_k:][::-1]
        return [self.documents[i] for i in top_indices]

    def select_with_diversity(self, query: str, top_k: int = 5) -> list[str]:
        """选择相关且多样化的上下文（避免重复信息）"""
        relevant = self.select_relevant_context(query, top_k * 2)

        # 简单的多样性过滤：选择在主题上不同的文档
        selected = []
        for doc in relevant:
            if not any(has_high_overlap(doc, s) for s in selected):
                selected.append(doc)
            if len(selected) == top_k:
                break

        return selected


def cosine_similarity(a, b):
    """计算余弦相似度"""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def embed_document(text: str):
    """嵌入文本（实现取决于具体服务）"""
    # 示例代码
    pass


def has_high_overlap(doc1: str, doc2: str, threshold: float = 0.7) -> bool:
    """检查两个文档是否有高度重叠"""
    words1 = set(doc1.lower().split())
    words2 = set(doc2.lower().split())
    overlap = len(words1 & words2) / len(words1 | words2)
    return overlap > threshold

方法 2：结构化元数据选择

class StructuredContextSelector:
    def __init__(self):
        self.documents = []
        self.metadata_index = {}

    def add_document(self, content: str, metadata: dict):
        """添加带元数据的文档"""
        doc_id = len(self.documents)
        self.documents.append(content)

        # 为每个元数据字段建立索引
        for key, value in metadata.items():
            if key not in self.metadata_index:
                self.metadata_index[key] = {}
            if value not in self.metadata_index[key]:
                self.metadata_index[key][value] = []
            self.metadata_index[key][value].append(doc_id)

    def select_by_criteria(self, criteria: dict) -> list[str]:
        """基于元数据条件选择文档"""
        """
        criteria 示例：
        {
            "category": "technical",
            "date_range": ("2024-01-01", "2025-03-01"),
            "relevance_score": (0.7, 1.0)
        }
        """

        candidate_ids = set(range(len(self.documents)))

        for key, value in criteria.items():
            if key not in self.metadata_index:
                continue

            if isinstance(value, (tuple, list)):
                # 范围查询
                min_val, max_val = value
                matching_ids = set()
                for v, ids in self.metadata_index[key].items():
                    if min_val <= v <= max_val:
                        matching_ids.update(ids)
            else:
                # 精确查询
                matching_ids = set(self.metadata_index[key].get(value, []))

            candidate_ids &= matching_ids

        return [self.documents[i] for i in candidate_ids]

选择的最佳实践

考虑时间鲜度：优先选择最新的信息
关注权威性：选择来自可信源的信息
平衡覆盖面和深度：包括概览性和详细的内容
避免冗余：不要包含高度重叠的内容

13.3.7 策略三：压缩（Compression）- 去除冗余，保留精华

压缩指的是减少上下文的大小，同时保持其信息价值。

压缩技术

技术 1：抽象式总结

def abstractive_compression(long_document: str, target_length: int = 1000) -> str:
    """将长文档压缩为摘要"""

    client = Anthropic()

    # 第一步：创建初始摘要
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=min(target_length, 2000),
        messages=[
            {
                "role": "user",
                "content": f"""Summarize this document focusing on key information:

{long_document}

Keep the summary to approximately {target_length} words.
Maintain all factual accuracy."""
            }
        ]
    )

    summary = response.content[0].text

    # 第二步：评估和迭代
    if len(summary) > target_length * 1.5:
        # 如果超过目标太多，进行第二轮压缩
        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=int(target_length / 2),
            messages=[
                {
                    "role": "user",
                    "content": f"""Further compress this summary:

{summary}

Keep only the most critical information."""
                }
            ]
        )
        summary = response.content[0].text

    return summary

技术 2：结构化提取

def structured_compression(document: str) -> dict:
    """将文档压缩为结构化格式"""

    client = Anthropic()

    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=2000,
        messages=[
            {
                "role": "user",
                "content": f"""Extract and organize this document into a structured format:

{document}

Provide in this JSON format:
{{
  "main_topic": "...",
  "key_points": ["point1", "point2", ...],
  "important_facts": {{"fact1": "value1", ...}},
  "entities": ["entity1", "entity2", ...],
  "action_items": ["item1", "item2", ...],
  "open_questions": ["question1", "question2", ...]
}}"""
            }
        ]
    )

    return json.loads(response.content[0].text)

技术 3：去重和去冗余

class ContextDeduplicator:
    @staticmethod
    def remove_duplicate_information(contexts: list[str]) -> list[str]:
        """移除上下文之间的重复信息"""

        client = Anthropic()

        # 将所有上下文合并
        combined = "\n\n".join([f"Document {i}:\n{ctx}" for i, ctx in enumerate(contexts)])

        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=2000,
            messages=[
                {
                    "role": "user",
                    "content": f"""Identify and remove duplicate or overlapping information from these documents:

{combined}

For each document, keep only the unique information not covered by others.
Provide the deduplicated documents in the same order."""
                }
            ]
        )

        # 解析并返回去重后的上下文
        # 实现细节...
        return contexts  # 简化版本

压缩的成本-收益分析

压缩方法

压缩率

信息丧失

适用场景

摘要

50-80%

5-10%

长文档、新闻文章

结构化提取

40-70%

15-25%

数据密集文档

去重

10-30%

多源信息

关键词提取

20-40%

30-40%

快速索引

13.3.8 策略四：隔离（Isolation）- 分离不同类型的上下文

隔离指的是明确分离不同来源、不同类型的上下文，以避免混淆和冲突。

隔离的意义

在复杂的应用中，上下文通常来自多个源：

用户提供的信息
检索到的背景知识
系统定义的规则
历史对话记录
外部数据集

如果这些混在一起，Claude 可能会混淆哪些是事实、哪些是假设、哪些是用户的意见。

隔离的实现

class IsolatedContextBuilder:
    """建立明确隔离的上下文结构"""

    def __init__(self):
        self.context_parts = {}

    def add_system_knowledge(self, knowledge: str):
        """添加系统级背景知识"""
        if "system_knowledge" not in self.context_parts:
            self.context_parts["system_knowledge"] = []
        self.context_parts["system_knowledge"].append(knowledge)

    def add_user_provided(self, information: str):
        """添加用户提供的信息"""
        if "user_provided" not in self.context_parts:
            self.context_parts["user_provided"] = []
        self.context_parts["user_provided"].append(information)

    def add_retrieved_context(self, context: str, source: str = "unknown"):
        """添加检索到的上下文"""
        if "retrieved_context" not in self.context_parts:
            self.context_parts["retrieved_context"] = []
        self.context_parts["retrieved_context"].append({
            "content": context,
            "source": source
        })

    def add_constraints(self, constraint: str):
        """添加系统约束"""
        if "constraints" not in self.context_parts:
            self.context_parts["constraints"] = []
        self.context_parts["constraints"].append(constraint)

    def build_prompt(self) -> str:
        """构建清晰隔离的提示"""

        prompt = ""

        # 系统知识
        if "system_knowledge" in self.context_parts:
            prompt += "## SYSTEM KNOWLEDGE\n"
            for item in self.context_parts["system_knowledge"]:
                prompt += f"- {item}\n"
            prompt += "\n"

        # 用户提供的信息
        if "user_provided" in self.context_parts:
            prompt += "## USER-PROVIDED INFORMATION\n"
            for item in self.context_parts["user_provided"]:
                prompt += f"- {item}\n"
            prompt += "\n"

        # 检索到的上下文
        if "retrieved_context" in self.context_parts:
            prompt += "## RETRIEVED CONTEXT\n"
            for item in self.context_parts["retrieved_context"]:
                prompt += f"### From: {item['source']}\n{item['content']}\n\n"

        # 约束
        if "constraints" in self.context_parts:
            prompt += "## CONSTRAINTS\n"
            for item in self.context_parts["constraints"]:
                prompt += f"- {item}\n"

        return prompt

    def example_usage(self):
        """使用示例"""
        builder = IsolatedContextBuilder()

        # 添加不同类型的上下文
        builder.add_system_knowledge("Claude 是由 Anthropic 开发的 AI 助手")
        builder.add_system_knowledge("当前日期是 2025 年 3 月 5 日")

        builder.add_user_provided("用户是一名产品经理")
        builder.add_user_provided("正在开发一个 AI 应用")

        builder.add_retrieved_context(
            "关于 LLM 成本优化的最新研究...",
            source="https://arxiv.org/..."
        )

        builder.add_constraints("不要提供具体的财务建议")
        builder.add_constraints("基于已知信息进行回答，如果不确定则说明")

        return builder.build_prompt()

隔离的最佳实践

使用 XML 标签分隔：<background>, <user_input>, <constraints> 等
明确标注来源：标记每个信息的来源和可信度
区分事实和观点：明确哪些是验证的事实，哪些是假设或观点
版本控制：标记上下文的版本和更新时间

第三节 Claude 中的上下文工程实践

13.3.9 系统提示词的精心设计

系统提示词是上下文工程中最强大的工具。与普通的用户消息不同，系统提示词定义了模型的基本行为和约束。

系统提示词的三层结构

system_prompt_template = """
# ROLE & AUTHORITY
[定义 Claude 的角色、专业背景、权威性]

# PRIMARY OBJECTIVE
[明确的、可测量的目标]

# CONTEXT & CONSTRAINTS
[任务的约束条件、边界和限制]

# INTERACTION STYLE
[交互方式、语气、格式]

# DECISION-MAKING FRAMEWORK
[做出决策时应遵循的框架或原则]
"""

# 具体示例
system_prompt = """
# ROLE & AUTHORITY
You are an expert software architect with 20+ years of experience designing scalable,
high-performance systems. You have worked at companies like Google, Amazon, and Microsoft.

# PRIMARY OBJECTIVE
Help users design robust system architectures that are scalable, maintainable, and cost-effective.

# CONTEXT & CONSTRAINTS
- You MUST consider real-world constraints like latency, bandwidth, and cost
- You SHOULD provide multiple design options with clear trade-offs
- You MUST NOT recommend unproven technologies as primary solutions
- Always prioritize simplicity and proven patterns over novelty

# INTERACTION STYLE
- Be direct and data-driven
- Use diagrams and pseudocode when helpful
- Explain technical concepts clearly
- Ask clarifying questions if requirements are unclear

# DECISION-MAKING FRAMEWORK
1. Understand requirements and constraints
2. Identify alternative approaches
3. Evaluate each approach against key criteria (scalability, cost, complexity)
4. Recommend the best fit with clear reasoning
5. Highlight potential future scalability issues
"""

13.3.10 使用 XML 结构化上下文

XML 标签提供了清晰的结构化方式来组织复杂的上下文。

def build_structured_prompt(task_description: str, background: str,
                            examples: list[str], constraints: list[str]) -> str:
    """构建高度结构化的提示"""

    return f"""
<task>
<description>{task_description}</description>
<objective>
[What you want Claude to do or produce]
</objective>
</task>

<background>
<context>{background}</context>
</background>

<examples>
<count>{len(examples)}</count>
{chr(10).join([f'<example>{ex}</example>' for ex in examples])}
</examples>

<constraints>
<requirement priority="high">You MUST...</requirement>
<requirement priority="medium">You SHOULD...</requirement>
<requirement priority="low">You MAY...</requirement>
{chr(10).join([f'<constraint>{c}</constraint>' for c in constraints])}
</constraints>

<output>
<format>
[Specify the exact format of the output]
</format>
<length>
[Specify length constraints if any]
</length>
</output>
"""

13.3.11 RAG（检索增强生成）与上下文工程的结合

RAG 是上下文工程中最实用的技术之一。

RAG 的基本流程

class RAGPipeline:
    def __init__(self, client, knowledge_base: list[str]):
        self.client = client
        self.knowledge_base = knowledge_base
        self.embeddings = self._embed_knowledge_base()

    def _embed_knowledge_base(self):
        """对知识库进行嵌入"""
        # 在实际应用中，使用真实的 embedding 服务
        pass

    def retrieve_relevant_context(self, query: str, top_k: int = 5) -> list[str]:
        """检索与查询相关的上下文"""
        # 对查询进行嵌入
        # 计算相似度
        # 返回最相关的文档
        pass

    def generate_response(self, query: str, context: list[str]) -> str:
        """基于检索到的上下文生成回答"""

        # 构建上下文
        context_str = "\n\n".join([f"[Source {i}]\n{ctx}" for i, ctx in enumerate(context)])

        response = self.client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=2000,
            system="""You are a helpful assistant that answers questions based on
provided context. Always cite your sources when using the provided documents.""",
            messages=[
                {
                    "role": "user",
                    "content": f"""Based on the following context, answer this question:

CONTEXT:
{context_str}

QUESTION:
{query}

Important: Only use information from the provided context. If the context doesn't contain
the answer, say so explicitly."""
                }
            ]
        )

        return response.content[0].text

    def query(self, question: str) -> dict:
        """完整的 RAG 查询流程"""

        # 步骤 1：检索
        relevant_context = self.retrieve_relevant_context(question)

        # 步骤 2：生成
        answer = self.generate_response(question, relevant_context)

        return {
            "question": question,
            "answer": answer,
            "sources": relevant_context
        }

13.3.12 MCP 与上下文工程的集成

模型上下文协议（MCP）与上下文工程是天然的配合。MCP 提供了一种标准化的方式来为 Claude 提供动态的、实时的上下文。

class MCPContextIntegration:
    """MCP 与上下文工程的集成示例"""

    def __init__(self):
        self.mcp_servers = []

    def register_mcp_server(self, server_name: str, server_instance):
        """注册一个 MCP 服务器作为上下文源"""
        self.mcp_servers.append({
            "name": server_name,
            "instance": server_instance
        })

    def fetch_context_from_mcp(self, server_name: str, resource_path: str) -> str:
        """从 MCP 服务器获取上下文"""
        server = next((s for s in self.mcp_servers if s["name"] == server_name), None)
        if server:
            return server["instance"].get_resource(resource_path)
        return ""

    def build_mcp_enhanced_prompt(self, base_prompt: str, mcp_sources: dict) -> str:
        """构建增强了 MCP 上下文的提示"""
        """
        mcp_sources 格式:
        {
            "github": "repo/main/src",
            "notion": "database/project-docs"
        }
        """

        enhanced_prompt = base_prompt + "\n\n## CONTEXT FROM EXTERNAL SOURCES\n\n"

        for source_name, resource_path in mcp_sources.items():
            context = self.fetch_context_from_mcp(source_name, resource_path)
            if context:
                enhanced_prompt += f"### From {source_name}\n{context}\n\n"

        return enhanced_prompt

第四节实际案例分析

案例 1：技术文档的实时答疑系统

场景

一个公司需要为产品的技术文档构建一个自动化答疑系统，用户可以提问关于产品的技术细节。

上下文工程方案

写入：创建系统提示词，定义回答者的角色
选择：使用 RAG 从文档库中检索相关内容
压缩：如果检索到的文档太长，进行摘要压缩
隔离：明确区分官方文档、已知问题、用户问题

def build_documentation_qa_system():
    client = anthropic.Anthropic()

    # 步骤 1：建立文档索引（离线）
    doc_index = build_documentation_index("/path/to/docs")

    # 步骤 2：处理用户问题
    user_question = "How do I configure SSL certificates?"

    # 步骤 3：检索相关文档
    relevant_docs = doc_index.search(user_question, top_k=3)

    # 步骤 4：构建系统提示词
    system_prompt = """
You are a technical support specialist for our product. You have deep knowledge
of the product and access to official documentation.

GUIDELINES:
- Always cite the official documentation when providing answers
- If a question is not covered in the documentation, say so explicitly
- Provide step-by-step instructions when relevant
- Distinguish between official features, workarounds, and unsupported approaches
"""

    # 步骤 5：构建用户消息，包含上下文
    user_message = f"""
Based on the following documentation sections, answer this question:

DOCUMENTATION:
{chr(10).join([f"[Doc {i+1}]\n{doc}" for i, doc in enumerate(relevant_docs)])}

QUESTION:
{user_question}

Please provide a clear, step-by-step answer based on the documentation.
"""

    # 步骤 6：调用 Claude
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1000,
        system=system_prompt,
        messages=[{"role": "user", "content": user_message}]
    )

    return response.content[0].text

案例 2：代码审查系统

场景

一个开发团队需要一个 AI 辅助的代码审查系统，能够检查代码的质量、安全性和最佳实践。

上下文工程方案

写入：编写详细的代码审查标准和检查项清单
选择：选择相关的历史代码审查记录作为参考
压缩：对相关代码进行摘要（展示关键部分）
隔离：分离安全问题、性能问题、风格问题

class CodeReviewContextBuilder:
    def __init__(self, client):
        self.client = client

    def build_review_prompt(self, code_to_review: str,
                           codebase_style_guide: str,
                           security_policies: str,
                           similar_reviewed_code: list[str]) -> str:
        """构建代码审查的上下文"""

        prompt = f"""
# CODE REVIEW GUIDELINES

## STYLE GUIDE
{codebase_style_guide}

## SECURITY POLICIES
{security_policies}

## SIMILAR CODE EXAMPLES (Previously Reviewed)
{chr(10).join([f'### Example {i+1}\\n{code}' for i, code in enumerate(similar_reviewed_code)])}

---

# CODE TO REVIEW

{code_to_review}


## REVIEW CHECKLIST
1. Code Style Compliance
2. Security Issues
3. Performance Concerns
4. Best Practices
5. Maintainability
6. Test Coverage

For each category, identify specific issues and provide recommendations.
"""

        return prompt

    def review_code(self, code: str, context_dict: dict) -> dict:
        """执行代码审查"""

        prompt = self.build_review_prompt(
            code,
            context_dict["style_guide"],
            context_dict["security_policies"],
            context_dict["similar_code"]
        )

        response = self.client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=2000,
            system="""You are an expert code reviewer with expertise in multiple
programming languages and best practices. Provide constructive, specific feedback.""",
            messages=[{"role": "user", "content": prompt}]
        )

        return {
            "code_reviewed": code[:100] + "...",
            "review": response.content[0].text
        }

第四节（扩展）：实践应用 - RAG 与 MCP 的上下文管理

13.3.13 RAG（检索增强生成）的上下文工程

RAG 是上下文工程的核心应用场景之一。通过检索相关文档来动态构建上下文，可以在不扩大模型本身的情况下显著提升知识覆盖范围。

完整的 RAG 实现示例

# 必需导入
import anthropic
from typing import List, Dict, Optional
import json


class RAGContextManager:
    """RAG 系统的上下文管理器"""

    def __init__(self, knowledge_base: List[Dict[str, str]]):
        """
        初始化 RAG 管理器
        knowledge_base: 知识库文档列表
        [{
            "id": "doc_1",
            "title": "Title",
            "content": "Content",
            "metadata": {"category": "...", "date": "..."}
        }]
        """
        self.client = anthropic.Anthropic()
        self.knowledge_base = knowledge_base
        self.model = "claude-sonnet-4-5-20250929"

    def simple_keyword_search(self, query: str, top_k: int = 3) -> List[Dict]:
        """简单的关键词搜索（生产环境应使用向量数据库）"""
        query_terms = set(query.lower().split())
        scores = []

        for doc in self.knowledge_base:
            content = (doc.get("content", "") + " " + doc.get("title", "")).lower()
            content_terms = set(content.split())
            # 计算 Jaccard 相似度
            overlap = len(query_terms & content_terms)
            if overlap > 0:
                score = overlap / len(query_terms | content_terms)
                scores.append((doc, score))

        # 按分数排序并返回 top-k
        scores.sort(key=lambda x: x[1], reverse=True)
        return [doc for doc, score in scores[:top_k]]

    def build_system_prompt_with_context(self, relevant_docs: List[Dict]) -> str:
        """根据检索到的文档构建系统提示词"""
        context_section = "# 参考资料\n\n"

        for i, doc in enumerate(relevant_docs, 1):
            context_section += f"## 资料 {i}: {doc.get('title', 'Untitled')}\n"
            context_section += f"{doc.get('content', '')}\n\n"

        system_prompt = f"""你是一个知识助手，基于以下参考资料回答用户问题。

{context_section}

## 回答指南
1. 优先使用参考资料中的信息
2. 如果参考资料中没有相关信息，明确说明
3. 对于不确定的信息，表达你的不确定性
4. 在引用参考资料时，指明来源"""

        return system_prompt

    def query_with_rag(self, user_query: str) -> Dict:
        """执行 RAG 查询"""
        # 步骤 1: 检索相关文档
        relevant_docs = self.simple_keyword_search(user_query, top_k=3)

        # 步骤 2: 构建带有上下文的系统提示词
        system_prompt = self.build_system_prompt_with_context(relevant_docs)

        # 步骤 3: 调用 Claude API
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=system_prompt,
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "relevant_documents": [
                {
                    "title": doc.get("title"),
                    "id": doc.get("id")
                }
                for doc in relevant_docs
            ],
            "answer": response.content[0].text,
            "tokens_used": {
                "input": response.usage.input_tokens,
                "output": response.usage.output_tokens
            }
        }


# 使用示例
if __name__ == "__main__":
    # 构建知识库
    knowledge_base = [
        {
            "id": "doc_1",
            "title": "Claude 能力概览",
            "content": "Claude 3.5 Sonnet 是 Anthropic 推出的通用高性能模型。支持 200K token 上下文窗口，",
        },
        {
            "id": "doc_2",
            "title": "Claude API 定价",
            "content": "Claude 3.5 Sonnet: 输入 $3/百万 token，输出 $15/百万 token。",
        },
        {
            "id": "doc_3",
            "title": "最佳实践",
            "content": "在使用 Claude 时，应该提供清晰的上下文和具体的示例。",
        },
    ]

    # 初始化 RAG 管理器
    rag = RAGContextManager(knowledge_base)

    # 执行查询
    result = rag.query_with_rag("Claude 的输出 token 价格是多少？")
    print(json.dumps(result, ensure_ascii=False, indent=2))

成本优化：使用提示缓存

class CachedRAGManager(RAGContextManager):
    """使用提示缓存优化的 RAG 管理器"""

    def query_with_cached_context(self, user_query: str) -> Dict:
        """使用缓存的知识库上下文进行查询"""

        # 构建知识库的缓存块
        knowledge_content = "# 知识库\n\n"
        for doc in self.knowledge_base:
            knowledge_content += f"## {doc.get('title')}\n{doc.get('content')}\n\n"

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=[
                {
                    "type": "text",
                    "text": "你是一个知识助手。根据以下知识库回答问题。"
                },
                {
                    "type": "text",
                    "text": knowledge_content,
                    "cache_control": {"type": "ephemeral"}  # 标记为缓存块
                }
            ],
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "answer": response.content[0].text,
            "cache_usage": {
                "cache_creation_input_tokens": response.usage.cache_creation_input_tokens or 0,
                "cache_read_input_tokens": response.usage.cache_read_input_tokens or 0,
                "input_tokens": response.usage.input_tokens,
                "output_tokens": response.usage.output_tokens
            }
        }

13.3.14 MCP 作为上下文工程的桥梁

MCP（Model Context Protocol）提供了一种标准化的方式来动态地获取和管理上下文。MCP 可以连接到各种数据源（数据库、APIs、文件系统等），为 Claude 提供实时的、结构化的上下文。

MCP 在上下文工程中的角色

用户查询
    ↓
上下文需求分析
    ↓
通过 MCP 查询相关数据源
    ├─ 文件系统 MCP（本地文件）
    ├─ 数据库 MCP（SQL/NoSQL 数据）
    ├─ API MCP（实时数据）
    └─ 其他服务 MCP（Slack、GitHub 等）
    ↓
聚合上下文
    ↓
构建提示词
    ↓
调用 Claude
    ↓
生成答案

实现示例：使用 MCP 获取实时上下文

class MCPContextManager:
    """使用 MCP 协议管理上下文的管理器"""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-5-20250929"
        # 在实际应用中，需要配置 MCP 服务器连接
        self.mcp_resources = {}

    def register_mcp_source(self, source_name: str, mcp_config: Dict):
        """注册 MCP 数据源"""
        self.mcp_resources[source_name] = mcp_config

    def fetch_context_from_mcp(self, sources: List[str], query: str) -> Dict:
        """从多个 MCP 数据源获取上下文"""
        context = {}

        for source_name in sources:
            if source_name not in self.mcp_resources:
                continue

            # 模拟从 MCP 源获取数据
            # 在实际应用中，这会通过 MCP 协议调用真实的数据源
            source_config = self.mcp_resources[source_name]

            if source_config["type"] == "file":
                # 从文件系统获取
                context[source_name] = f"文件内容: {source_config['path']}"

            elif source_config["type"] == "database":
                # 从数据库查询
                context[source_name] = f"数据库查询结果: {source_config['query']}"

            elif source_config["type"] == "api":
                # 从 API 获取
                context[source_name] = f"API 响应: {source_config['endpoint']}"

        return context

    def query_with_mcp_context(self, user_query: str, mcp_sources: List[str]) -> Dict:
        """使用 MCP 上下文进行查询"""

        # 获取 MCP 上下文
        mcp_context = self.fetch_context_from_mcp(mcp_sources, user_query)

        # 构建系统提示词
        system_prompt = "你是一个 AI 助手。根据以下实时数据回答问题。\n\n"
        for source, context in mcp_context.items():
            system_prompt += f"[{source}]\n{context}\n\n"

        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=system_prompt,
            messages=[
                {
                    "role": "user",
                    "content": user_query
                }
            ]
        )

        return {
            "query": user_query,
            "mcp_sources_used": mcp_sources,
            "answer": response.content[0].text
        }


# 使用示例
if __name__ == "__main__":
    mcp_manager = MCPContextManager()

    # 注册 MCP 数据源
    mcp_manager.register_mcp_source("filesystem", {
        "type": "file",
        "path": "/path/to/documentation"
    })
    mcp_manager.register_mcp_source("database", {
        "type": "database",
        "query": "SELECT * FROM products WHERE category='AI'"
    })

    # 查询
    result = mcp_manager.query_with_mcp_context(
        "我们有哪些 AI 产品？",
        mcp_sources=["filesystem", "database"]
    )
    print(json.dumps(result, ensure_ascii=False, indent=2))

13.3.15 大规模上下文的成本优化

对于处理大量上下文的应用，成本优化至关重要。

策略 1：分层上下文

class HierarchicalContextManager:
    """分层上下文管理，优化成本"""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.model = "claude-sonnet-4-5-20250929"

    def build_hierarchical_context(self, documents: List[str], query: str) -> str:
        """为大型文档构建分层上下文"""

        # 层级 1: 超级总结（100 tokens）
        overview = self._summarize_documents(documents, max_tokens=100)

        # 层级 2: 相关章节的详细摘要（500 tokens）
        relevant_sections = self._extract_relevant_sections(documents, query, max_tokens=500)

        # 层级 3: 完整内容（仅在前两个不足时包含）

        context = f"""
# 文档概览
{overview}

# 相关章节详情
{relevant_sections}

[如需完整文档内容，请提示]
"""

        return context

    def _summarize_documents(self, documents: List[str], max_tokens: int) -> str:
        """为文档生成超级摘要"""
        # 实现文档总结逻辑
        pass

    def _extract_relevant_sections(self, documents: List[str], query: str, max_tokens: int) -> str:
        """提取与查询相关的章节"""
        # 实现相关性提取逻辑
        pass

策略 2：使用 Batch API 降低成本

对于不需要实时响应的任务，使用 Batch API 可以降低 50% 的成本：

def submit_batch_context_jobs(queries: List[str], knowledge_base: List[Dict]) -> str:
    """提交批量任务到 Batch API"""
    requests = []

    for i, query in enumerate(queries):
        request = {
            "custom_id": f"request-{i}",
            "params": {
                "model": "claude-sonnet-4-5-20250929",
                "max_tokens": 2000,
                "system": f"使用以下知识库回答问题:\n{json.dumps(knowledge_base)}",
                "messages": [
                    {
                        "role": "user",
                        "content": query
                    }
                ]
            }
        }
        requests.append(request)

    # 提交批量请求（伪代码）
    # batch_id = client.beta.batch.submit(requests)
    # 返回 batch_id，稍后可以查询结果

    return "batch_submitted"

第五节实践案例：MCP 驱动的动态上下文工程

13.3.16 场景：实时 GitHub 知识库与 AI 助手集成

背景：使用 MCP 连接 GitHub 知识库，动态获取最新文档和代码作为上下文。

import anthropic
import json
from typing import Optional, List
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class MCPContextEngineer:
    """使用 MCP 的动态上下文工程"""

    def __init__(self, model: str = "claude-opus-4-6-20251101"):
        self.client = anthropic.Anthropic()
        self.model = model
        self.context_cache = {}
        self.mcp_tools = self._initialize_mcp_tools()

    def _initialize_mcp_tools(self) -> List[dict]:
        """初始化 MCP 工具定义"""
        return [
            {
                "name": "github_fetch_docs",
                "description": "从 GitHub 仓库获取最新的文档内容",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string", "description": "Repository path (owner/repo)"},
                        "path": {"type": "string", "description": "File or directory path"},
                        "branch": {"type": "string", "description": "Branch name (default: main)"}
                    },
                    "required": ["repo", "path"]
                }
            },
            {
                "name": "github_search_issues",
                "description": "搜索 GitHub issues 和讨论",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string"},
                        "query": {"type": "string", "description": "Search query"},
                        "limit": {"type": "integer", "description": "Max results (default: 5)"}
                    },
                    "required": ["repo", "query"]
                }
            },
            {
                "name": "fetch_recent_code",
                "description": "获取最近提交的代码片段",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "repo": {"type": "string"},
                        "file_type": {"type": "string", "description": "File extension (e.g., .py, .js)"},
                        "limit": {"type": "integer", "description": "Number of files"}
                    },
                    "required": ["repo"]
                }
            }
        ]

    def _execute_mcp_tool(self, tool_name: str, tool_input: dict) -> str:
        """执行 MCP 工具（模拟实现）"""
        logger.info(f"Executing MCP tool: {tool_name} with input: {tool_input}")

        # 在实际应用中，这会调用真实的 MCP 服务
        # 这里是示例实现
        if tool_name == "github_fetch_docs":
            return f"# Documentation for {tool_input['repo']}\n\n[Document content would be fetched from GitHub via MCP]"
        elif tool_name == "github_search_issues":
            return f"Found issues for '{tool_input['query']}':\n- Issue 1: Bug in authentication\n- Issue 2: Performance issue"
        elif tool_name == "fetch_recent_code":
            return "Recent code samples:\n```python\ndef example():\n    pass\n```"
        else:
            return "Tool not recognized"

    def build_dynamic_context(self, user_query: str, repo: str) -> tuple[str, dict]:
        """基于用户查询动态构建上下文"""
        context_parts = []
        mcp_calls = {}

        # 步骤 1：分析查询以确定需要什么上下文
        analysis_response = self.client.messages.create(
            model=self.model,
            max_tokens=500,
            system="""You are a context analyzer. Analyze the user's query and determine what information
would be most helpful. Respond with a JSON object indicating which types of information to fetch:
{
    "needs_docs": boolean,
    "needs_issues": boolean,
    "needs_code": boolean,
    "doc_path": string (optional),
    "search_query": string (optional),
    "code_file_type": string (optional)
}""",
            messages=[
                {
                    "role": "user",
                    "content": f"User query: {user_query}\nRepository: {repo}"
                }
            ]
        )

        try:
            analysis = json.loads(analysis_response.content[0].text)
        except (json.JSONDecodeError, IndexError):
            logger.warning("Failed to parse analysis, using default context")
            analysis = {"needs_docs": True, "needs_code": True}

        # 步骤 2：根据分析获取上下文
        if analysis.get("needs_docs"):
            doc_path = analysis.get("doc_path", "README.md")
            docs = self._execute_mcp_tool("github_fetch_docs", {
                "repo": repo,
                "path": doc_path
            })
            context_parts.append(f"## Repository Documentation\n\n{docs}")
            mcp_calls["github_fetch_docs"] = {"repo": repo, "path": doc_path}

        if analysis.get("needs_issues"):
            search_query = analysis.get("search_query", user_query)
            issues = self._execute_mcp_tool("github_search_issues", {
                "repo": repo,
                "query": search_query,
                "limit": 3
            })
            context_parts.append(f"## Related Issues\n\n{issues}")
            mcp_calls["github_search_issues"] = {"repo": repo, "query": search_query}

        if analysis.get("needs_code"):
            code = self._execute_mcp_tool("fetch_recent_code", {
                "repo": repo,
                "file_type": analysis.get("code_file_type", ".py"),
                "limit": 3
            })
            context_parts.append(f"## Code Examples\n\n{code}")
            mcp_calls["fetch_recent_code"] = {"repo": repo}

        context = "\n\n".join(context_parts)
        return context, mcp_calls

    def answer_with_dynamic_context(self, user_query: str, repo: str) -> str:
        """使用动态上下文回答用户查询"""
        logger.info(f"Building dynamic context for query: {user_query}")

        # 构建动态上下文
        context, mcp_calls = self.build_dynamic_context(user_query, repo)

        # 使用上下文调用 Claude
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2000,
            system=f"""You are a helpful assistant for the {repo} project.
Use the provided context (documentation, issues, and code) to answer questions accurately.
When referencing information from the context, cite the source.""",
            messages=[
                {
                    "role": "user",
                    "content": f"""Context from repository:

{context}

---

User question: {user_query}"""
                }
            ]
        )

        answer = response.content[0].text

        logger.info(f"MCP calls made: {json.dumps(mcp_calls, indent=2)}")
        return answer

# 使用示例
if __name__ == "__main__":
    engineer = MCPContextEngineer()

    # 示例查询
    query = "How do I authenticate users in this project?"
    repo = "anthropics/anthropic-sdk-python"

    answer = engineer.answer_with_dynamic_context(query, repo)
    print(f"Answer:\n{answer}")

13.3.17 MCP 上下文工程的优势

动态性：

上下文始终是最新的（从实时源获取）
能够在对话过程中调整上下文

相关性：

基于查询的智能选择（分析器确定需要什么）
避免不相关的信息堵塞

可扩展性：

易于添加新的 MCP 源（数据库、API、文件系统）
支持复杂的跨源查询

成本效益：

只获取必需的信息
减少冗余和低质量的上下文

13.3.18 与传统 RAG 的对比

特性

传统 RAG

MCP 驱动

数据来源

静态知识库

动态，多个 MCP 源

更新频率

定期重新索引

实时

上下文选择

向量相似度

智能分析 + 向量搜索

集成成本

需要 embedding 模型

标准 MCP 协议

适用场景

静态文档

快速变化的数据

第六节总结与最佳实践

13.3.19 上下文工程的黄金法则

相关性优先：不是信息越多越好，而是信息越相关越好
一致性维护：确保不同部分的上下文在逻辑上一致
清晰结构化：使用明确的标签和分隔符组织上下文
定期更新：保持上下文的时效性和准确性
可测量优化：建立指标来衡量上下文质量的改进

13.3.20 迁移检查清单

从提示词工程迁移到上下文工程：

评估当前任务的数据需求
建立高质量的上下文来源
实现上下文检索和选择机制
测试上下文压缩的有效性
建立上下文的版本控制
建立性能监测指标
定期进行上下文质量审计

13.3.21 工具和资源

推荐工具：

向量数据库：Pinecone、Weaviate、Milvus
文档处理：LangChain、LlamaIndex
MCP 平台：Anthropic 官方 MCP 服务器

深入学习：

Anthropic 官方文档：https://docs.anthropic.com
MCP 规范：https://modelcontextprotocol.io
关于上下文工程的研究论文

上下文工程代表了与 LLM 交互的一个范式转变。通过关注提供什么信息、而非如何指示，我们能够更有效地利用 Claude 的能力，构建更强大、更可靠的 AI 应用。

上一页13.2 Infinite Chats 实战指南下一页本章小结

最后更新于2小时前

hashtagContext Engineering 概览：从提示词工程到上下文工程

hashtag序言

hashtag第一节 核心概念：为什么上下文工程正在取代提示词工程

hashtag13.3.1 问题的演进

hashtag13.3.2 上下文工程的定义

hashtag13.3.3 两种范式的对比

hashtag13.3.4 为什么现在转向上下文工程

hashtag第二节 上下文工程的四大策略

hashtag13.3.5 策略一：写入（Writing）- 制造高质量的上下文

hashtag13.3.6 策略二：选择（Selection）- 检索相关的上下文

hashtag13.3.7 策略三：压缩（Compression）- 去除冗余，保留精华

hashtag13.3.8 策略四：隔离（Isolation）- 分离不同类型的上下文

hashtag第三节 Claude 中的上下文工程实践

hashtag13.3.9 系统提示词的精心设计

hashtag13.3.10 使用 XML 结构化上下文

hashtag13.3.11 RAG（检索增强生成）与上下文工程的结合

hashtag13.3.12 MCP 与上下文工程的集成

hashtag第四节 实际案例分析

hashtag案例 1：技术文档的实时答疑系统

hashtag案例 2：代码审查系统

hashtag第四节（扩展）：实践应用 - RAG 与 MCP 的上下文管理

hashtag13.3.13 RAG（检索增强生成）的上下文工程

hashtag13.3.14 MCP 作为上下文工程的桥梁

hashtag13.3.15 大规模上下文的成本优化

hashtag第五节 实践案例：MCP 驱动的动态上下文工程

hashtag13.3.16 场景：实时 GitHub 知识库与 AI 助手集成

hashtag13.3.17 MCP 上下文工程的优势

hashtag13.3.18 与传统 RAG 的对比

hashtag第六节 总结与最佳实践

hashtag13.3.19 上下文工程的黄金法则

hashtag13.3.20 迁移检查清单

hashtag13.3.21 工具和资源