# 9.2 贪心搜索与束搜索：确定性与近似搜索

## 9.2.1 贪心搜索

**贪心搜索**（Greedy Search）是最简单的策略：每一步选择概率最高的词元。

$$x\_t = \arg\max\_{x} P(x | x\_1, \dots, x\_{t-1})$$

贪心搜索速度快且确定性强，但有一个根本缺陷：**局部最优不等于全局最优。** 在某一步选择概率最高的词元，可能导致后续步骤进入较差的生成路径。

## 9.2.2 束搜索如何扩大搜索范围

**束搜索**（Beam Search）通过同时维护 $$B$$ 条（$$B$$ 称为束宽）最优候选序列来缓解贪心搜索的局限。每一步，为每条候选序列扩展所有可能的下一个词元，保留总分数最高的 $$B$$ 条序列。束搜索维护的是部分序列 $$x\_{1:t}$$ 的累积对数概率：

$$s(x\_{1:t}) = \sum\_{i=1}^{t} \log P(x\_i | x\_{1:i-1})$$

在第 $$t$$ 步，保留分数最高的 $$B$$ 条序列，每条序列生成 $$|V|$$ 个候选（$$V$$ 为词汇表），再从 $$B \times |V|$$ 个候选中选出分数最高的 $$B$$ 条。

束搜索在机器翻译等有明确“正确答案”的任务中效果显著——通过扩大搜索范围，更可能找到高概率候选。但固定束宽搜索仍是启发式近似：早期被剪掉的序列不会再回来，长度惩罚、停止条件和重排序方式都会影响最终结果，它不保证找到真正的全局最优输出。对于创造性文本生成，束搜索往往产生**重复、无趣、过于保守**的输出——因为最高概率的序列并不总是最“好”的回答。

束搜索的计算成本和延迟是贪心搜索的 $$B$$ 倍，对于实时对话场景通常不实用。


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://yeasy.gitbook.io/llm_internals/di-san-bu-fen-tui-li-yu-bu-shu-pian/09_decoding/9.2_greedy_beam.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.