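# Chapter Summary

This chapter examined the technical principles behind large language models: their historical evolution, their core mechanisms, how they are trained, and the mainstream models available today.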

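## Key Takeaways

**Technical evolution**

* Early era: ELIZA (1966) relied on If/Else rules and canned templates
* Statistical era: Siri (2011) relied on statistical "table lookup"
* Large-model era: GPT (2018 onward) achieved a qualitative leap through large-scale pretraining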

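**How it works**

* The core task of an LLM is to predict the next token (Next Token Prediction)
* A token is a subword fragment produced by the tokenizer; it is not the same as one character or one word
* The temperature parameter controls output randomness: low temperature gives more rigorous, predictable output, while high temperature gives more creative output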
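To make the temperature point concrete, here is a minimal sketch of temperature-scaled sampling over next-token candidates. The vocabulary, the raw scores (`logits`), and the function names are hypothetical values chosen for illustration; real models apply the same idea over vocabularies of tens of thousands of tokens.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0):
    """Pick one candidate token at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return random.choices(vocab, weights=probs, k=1)[0]

# Hypothetical candidates and scores (assumptions, not real model output).
vocab = ["cat", "dog", "car"]
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, temperature=0.2)   # sharp distribution
high = softmax_with_temperature(logits, temperature=2.0)  # flatter distribution
```

With these numbers, the low-temperature distribution puts almost all probability on the top candidate, while the high-temperature one spreads probability more evenly, so repeated calls to `sample_next_token` vary more.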

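**Training paradigm**

* Pretraining ("going to college"): learning language and knowledge from massive amounts of data; extremely expensive
* Supervised fine-tuning, SFT ("onboarding training"): using question-answer pairs to teach the model to follow instructions; comparatively cheap
* RLHF ("internship"): aligning the model with human values through human preference rankings
* Shortcuts to production: RAG (retrieval-augmented generation), tool calling, and workflow orchestration are usually faster to ship than fine-tuning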

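**Mainstream models**

* The closed-source big three: OpenAI (GPT), Anthropic (Claude), Google (Gemini)
* Open-source ecosystem: LLaMA, Qwen, Mistral, Gemma, and several other lines
* Fierce competition in China: Tongyi Qianwen (通义千问), ERNIE (文心), DeepSeek, and Kimi each have their own strengths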

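**Engineering in practice (inference mechanics)**

* **Inference**: the process of actually running a large model in the cloud; it depends heavily on GPU memory bandwidth and distributed networking.
* **Compute vs. GPU memory**: the stage that reads a long input is bound by compute, while the token-by-token generation stage is bound by GPU memory and bandwidth.
* **KV Cache**: the model's "notepad"; it consumes a large amount of GPU memory, which in turn limits how many requests can be served concurrently.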
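The link between the KV Cache and concurrency can be shown with a back-of-the-envelope estimate. The sketch below assumes standard multi-head attention with 16-bit values and no grouped-query attention, quantization, or paging. The model shape (32 layers, 32 KV heads, head dimension 128) and the 40 GiB cache budget are hypothetical numbers for illustration, not figures from this chapter.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate KV Cache size: one key and one value vector per layer, head, and token."""
    keys_and_values = 2
    return (keys_and_values * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)

GIB = 1024 ** 3

# Hypothetical model shape (assumption for illustration).
per_token = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=1)
per_request = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)

# Hypothetical GPU memory left for the cache after loading the weights.
cache_budget = 40 * GIB
max_concurrent = cache_budget // per_request

print(f"per token:   {per_token / 1024:.0f} KiB")
print(f"per request: {per_request / GIB:.1f} GiB at 4096 tokens")
print(f"concurrent requests that fit: {max_concurrent}")
```

Under these assumptions each token costs 512 KiB of cache, so a single 4096-token request needs 2 GiB, and the budget fits only 20 such requests at once. Longer contexts shrink that number proportionally.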

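## Next Chapter Preview

The next chapter introduces reasoning models and inference-time compute, exploring how models solve complex problems through explicit thinking, search, tool calling, and larger compute budgets.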

***

> 📝 **Found an error or have a suggestion?** Feel free to open an [Issue](https://github.com/yeasy/ai_beginner_guide/issues) or a [PR](https://github.com/yeasy/ai_beginner_guide/pulls).

