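# Chapter 10: Multimodal Prompt Engineering

Multimodal AI models can understand and process several forms of input, including text, images, audio, and video. This chapter shows how to design effective multimodal prompts that make full use of these models' capabilities.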

***

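## Chapter Objectives

* Understand the chapter's core concepts and the scenarios where they apply
* Master reusable prompt and workflow patterns
* Be able to transfer these methods to your own tasks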

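## Prerequisites

* Reading the previous chapter, or equivalent foundational material, is recommended
* For the code examples, a basic familiarity with programming and API calls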

## Chapter Contents

* [10.1 Overview of Multimodal Models](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.1_multimodal_overview.md)
* [10.2 Image Understanding and Visual Prompting](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.2_image_prompting.md)
* [10.3 Audio and Video Processing](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.3_audio_video.md)
* [10.4 Cross-Modal Reasoning and Fusion](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.4_cross_modal_reasoning.md)
* [10.5 Advanced Multimodal Prompt Engineering: Combining Text, Images, Audio, and Video](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.5_multimodal_prompting_advanced.md)
* [10.6 Chapter Exercises](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/10.6_practice.md)
* [Chapter Summary](/prompt_engineering_guide/di-san-bu-fen-gao-ji-ying-yong-pian/10_multimodal/summary.md)

