9.3 RAG 系统的提示词优化

在 RAG 系统中，Retrieval 负责“找对资料”，而 Prompt Engineering 负责“用好资料”。一个设计低劣的提示词会让高质量的检索结果付诸东流。本节将深入探讨 RAG 专用的提示词设计模式。

1. 上下文注入格式：Context Injection

最直观的问题是：如何将检索到的文档片段喂给模型？

❌ 错误示范：直接拼接，不加分隔符。

参考资料：
这是文档 1 的内容...这是文档 2 的内容...
用户问题：...

模型很难区分文档的边界，容易造成信息混淆。

✅ 最佳实践：使用明确的分隔符或 XML 标签（Claude 和 GPT-4 对 XML 标签支持极佳）。

Here are the retrieved documents relevant to the user's question:

<documents>
    <document index="1">
        <source>employee_handbook.pdf</source>
        <content>报销必须在费用发生后的 30 天内提交...</content>
    </document>
    <document index="2">
        <source>travel_policy.docx</source>
        <content>所有超过 5000 元的差旅费需要 VP 审批...</content>
    </document>
</documents>

这种结构化的注入方式有三个好处：

边界清晰：模型能明确区分不同文档。
便于引用：每个文档都有 index，方便要求模型引用。
元数据感知：可以包含 source、date 等元数据，帮助模型判断时效性。

2. 引用与溯源

RAG 的核心价值在于“有据可查”。我们需要强制模型在回答中标注来源。

提示词策略：

"Answer the user's question using only the information in the <documents>. You must cite the document index for every statement you make, using the format [index]. For example: 'Expenses must be submitted within 30 days [1].'"

验证技巧：为了防止模型胡乱引用（例如引用了文档 [3] 但其实只给了 2 个文档），可以在后处理阶段编写简单的正则表达式代码，校验模型输出的 [index] 是否在提供的文档范围内。

3. 拒绝幻觉

当检索结果不包含答案时，模型倾向于利用自己的预训练知识去“脑补”一个答案，这对 RAG 系统通常是不可接受的。

防御性提示词：

"If the provided documents do not contain the necessary information to answer the question, you must say 'The provided documents do not contain this information.' Do NOT try to answer from your own knowledge. Do NOT make up an answer."

思维链强化：要求模型在回答前先检查文档覆盖率：

"Thought: First, check if the documents contain the answer to the user's specific question. If yes, extract the info. If no, output the refusal message."

4. 答案优化与合成

综合冲突信息

当不同文档在同一事实上有出入时（如旧文档说 A，新文档说 B）：

"If there are conflicting statements in the documents, prioritize the document with the more recent date metadata. If dates are not available, mention the conflict in your answer."

风格统一

RAG 的检索结果往往风格迥异（有的来自法条，有的来自聊天记录）。提示词需要统一下一步生成的语调：

"Synthesize the information into a coherent, professional response suitable for a customer service email. Avoid copying the text verbatim; rephrase it to be more natural."

9.3.1 完整模板示例


# Role

You are an expert support assistant for [Company Name].

# Task

Answer the user's question based strictly on the provided context.

# Context

<documents>
{context_str}
</documents>

# Rules

1. **Evidence-Based**: Every sentence in your answer must be supported by the context.
2. **Citation**: Cite sources using [index] format at the end of sentences.
3. **No Hallucination**: If the answer is not in the context, say "I cannot find the answer in the knowledge base."
4. **Tone**: Helpful, concise, and professional.

# User Question

{query_str}

# Answer

完整 API 调用示例

以下示例展示如何将上述 RAG 提示词模板与 API 调用结合，实现一个端到端的 RAG 问答流程：

“””
rag_api_example.py
使用 Anthropic Claude API 的 RAG 问答完整示例。
依赖: pip install anthropic
“””
import anthropic

# ── 1. 模拟检索结果（实际场景中来自向量数据库）──

retrieved_docs = [
    {“source”: “employee_handbook.pdf”, “content”: “报销必须在费用发生后的 30 天内提交。超过 30 天的报销申请需要部门经理特批。”},
    {“source”: “travel_policy.docx”, “content”: “国内差旅住宿标准：一线城市不超过 800 元/晚，其他城市不超过 500 元/晚。”},
]

# ── 2. 构造 RAG 提示词 ──

def build_rag_prompt(docs: list, query: str) -> str:
    doc_xml = “\n”.join(
        f'    <document index=”{i+1}”>\n'
        f'        <source>{d[“source”]}</source>\n'
        f'        <content>{d[“content”]}</content>\n'
        f'    </document>'
        for i, d in enumerate(docs)
    )
    return f”””Here are the retrieved documents relevant to the user's question:

<documents>
{doc_xml}
</documents>

# Rules

1. **Evidence-Based**: Every sentence must be supported by the documents above.
2. **Citation**: Cite sources using [index] format.
3. **No Hallucination**: If the answer is not in the documents, say “根据现有知识库，我无法找到相关答案。”
4. **Tone**: Helpful, concise, professional.

# User Question

{query}

# Answer”””

# ── 3. 调用 API ──

client = anthropic.Anthropic()  # 使用 ANTHROPIC_API_KEY 环境变量

query = “报销的截止时间是多久？”
prompt = build_rag_prompt(retrieved_docs, query)

response = client.messages.create(
    model=”claude-sonnet-4-6”,
    max_tokens=1024,
    messages=[{“role”: “user”, “content”: prompt}]
)

print(response.content[0].text)

# 预期输出：报销必须在费用发生后的 30 天内提交 [1]。如果超过 30 天，需要获得部门经理的特批 [1]。

[!TIP] 在生产环境中，建议将 System Prompt（角色定义和规则）放入 system 参数，将检索文档和用户问题放入 messages，以便利用提示词缓存降低重复调用的成本。

动手试试

试着写一条 RAG 系统提示词，要求模型”仅根据以下文档回答，如果文档中找不到答案则明确说明”——然后用一个文档未覆盖的问题测试它。
在你的 RAG 系统中，模型是否会忽略检索结果而依赖自身知识回答？如果会，你如何在提示词中加强”忠实度”？

上一页9.2 高级检索策略与上下文组装下一页9.4 高级 RAG 架构与前沿趋势

最后更新于 10天前

hashtag1. 上下文注入格式：Context Injection

hashtag2. 引用与溯源

hashtag3. 拒绝幻觉

hashtag4. 答案优化与合成

hashtag综合冲突信息

hashtag风格统一

hashtag9.3.1 完整模板示例

hashtag完整 API 调用示例

hashtag动手试试