# 9.5 Inference-Time Scaling on the Decoding Side: Generation, Search, and Verification

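This section places inference-time compute scaling in context from the decoding perspective only, to avoid repeating the fuller treatment in [Section 14.6](/llm_internals/di-si-bu-fen-mo-xing-yu-qian-yan-pian/14_future_trends/14.6_test_time_scaling.md). Greedy decoding, beam search, sampling, and constrained decoding, covered in the preceding sections, mainly answer the question "which token should be chosen next?" Inference-time scaling answers a different one: "should more generation, verification, or search compute be spent on the same problem?"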

From the point of view of the decoding system, it takes three main forms:

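* **Longer single-path generation**: let the model write out a more complete chain of thought or draft before giving the answer
* **Multi-path sampling and selection**: generate several candidates in parallel, then choose among them with majority voting, a reward model, or a verifier
* **Structured search**: organize candidate reasoning into a tree or graph, and prune, backtrack, and merge at intermediate steps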
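The second form, multi-path sampling and selection, can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: `toy_generate` is a hypothetical stand-in for a real model call, and `sample_answers`, `majority_vote`, and `best_of_n` are names introduced here for illustration only.

```python
import random
from collections import Counter
from typing import Callable, List


def sample_answers(generate: Callable[[str, random.Random], str],
                   prompt: str, n: int, seed: int = 0) -> List[str]:
    """Draw n independent candidate answers for the same prompt."""
    rng = random.Random(seed)
    return [generate(prompt, rng) for _ in range(n)]


def majority_vote(answers: List[str]) -> str:
    """Pick the most frequent final answer among the candidates."""
    return Counter(answers).most_common(1)[0][0]


def best_of_n(answers: List[str], score: Callable[[str], float]) -> str:
    """Pick the candidate that a scorer (reward model or verifier) rates highest."""
    return max(answers, key=score)


def toy_generate(prompt: str, rng: random.Random) -> str:
    # Hypothetical model stub (assumption): returns "42" about 60% of the
    # time and a nearby wrong answer otherwise.
    return "42" if rng.random() < 0.6 else rng.choice(["41", "43"])


if __name__ == "__main__":
    candidates = sample_answers(toy_generate, "What is 6 * 7?", n=15)
    print("vote:", majority_vote(candidates))
    # Hypothetical verifier (assumption): checks the answer exactly.
    print("verified:", best_of_n(candidates, lambda a: 1.0 if a == "42" else 0.0))
```

Majority voting needs no extra model but only works when final answers can be compared for equality; selection by a scorer handles free-form outputs at the cost of an additional scoring pass per candidate.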

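These methods raise the cost and latency of each request, but on hard problems such as math, code, and planning they may substantially improve answer quality. They complement sampling strategies: sampling decides how each path unfolds, while inference-time scaling decides how many paths to expand, whether to verify them, and how to allocate the compute budget.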
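How structured search interacts with a compute budget can be illustrated with a small level-by-level search that prunes to a fixed width and stops once a cap on expansions is reached. This is a sketch under stated assumptions: `budgeted_search` and its parameters are hypothetical names, and `expand`, `score`, and `is_final` stand in for a model proposing next steps, a verifier or reward model scoring them, and a completion check.

```python
import heapq
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]


def budgeted_search(root: State,
                    expand: Callable[[State], List[State]],
                    score: Callable[[State], float],
                    is_final: Callable[[State], bool],
                    budget: int,
                    beam_width: int) -> Tuple[Optional[State], int]:
    """Expand a tree level by level, keeping the top beam_width partial
    states per level and spending at most `budget` expansion calls."""
    frontier = [root]
    best, best_score = None, float("-inf")
    calls = 0
    while frontier and calls < budget:
        children = []
        for state in frontier:
            if calls >= budget:
                break  # budget exhausted: stop expanding
            calls += 1
            for child in expand(state):
                if is_final(child):
                    s = score(child)
                    if s > best_score:
                        best, best_score = child, s
                else:
                    children.append(child)
        # Prune: keep only the highest-scoring partial states.
        frontier = heapq.nlargest(beam_width, children, key=score)
    return best, calls


if __name__ == "__main__":
    # Toy problem (assumption): build a 3-digit tuple from {0, 1, 2};
    # the score is the digit sum, so (2, 2, 2) is optimal.
    result = budgeted_search(
        root=(),
        expand=lambda s: [s + (d,) for d in range(3)],
        score=lambda s: float(sum(s)),
        is_final=lambda s: len(s) == 3,
        budget=10,
        beam_width=2,
    )
    print(result)
```

A larger `budget` or `beam_width` explores more of the tree at higher cost; with too small a budget the search may return no complete answer at all, which is the trade-off the paragraph above describes.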

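**For details and frontier directions, see** [Section 14.6](/llm_internals/di-si-bu-fen-mo-xing-yu-qian-yan-pian/14_future_trends/14.6_test_time_scaling.md), which systematically covers chain-of-thought, in-context learning (ICL), ToT/GoT, GRPO, verification strategies, latent-space reasoning, and parallel reasoning.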

