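# 4.1 Overview and Classification of Tool Use

Tools are the primary means by which an agent extends the boundary of what it can do. Understanding how tools are classified and designed is the foundation for building capable agent systems.

## 4.1.1 What Is a Tool

In an agent system, a **tool** is a function or service that the model can call: it accepts structured input, performs a specific operation, and returns a result. The basic structure of a tool can be defined as follows: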

```python
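# A simple tool definition
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str                                       # tool name
    description: str = ""                           # tool description
    parameters: dict = field(default_factory=dict)  # parameter definition

    def run(self, **kwargs):                        # execution entry point
        raise NotImplementedError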
```

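## 4.1.2 Why Tool Use Matters

Large language models are powerful, but they have inherent limitations:

| Limitation | Example | Tool-based solution |
| -------- | -------- | -------- |
| Knowledge cutoff | Unaware of the latest news | Web search tool |
| Unreliable at calculation | Error-prone on complex math | Calculator / code execution |
| No access to private data | Does not know the user's files | File system tool |
| Cannot take actions | Cannot send email | API call tool |
| Cannot perceive the environment | Does not know the current time | System information tool |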

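## 4.1.3 The Core Flow of Tool Use

An agent that uses tools typically follows a basic perceive-think-act loop:

### The Basic Loop

At its core, tool use is a loop: analyze the need, select a tool, generate parameters, execute the tool, and process the result. The following pseudocode sketches this basic flow: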

```python
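def run_agent(llm, context, goal, available_tools):
    """Basic tool-use loop.

    needs_tool, select_tool, generate_params and execute_tool are
    placeholders for the model-driven steps; they are not defined here.
    """
    while True:
        # 1. Understand the current state and the goal
        analysis = llm.analyze(context, goal)

        # 2. Decide whether a tool is needed
        if not needs_tool(analysis):
            # No tool needed: generate the answer directly and stop
            return llm.generate(context)

        # 3. Select a suitable tool
        tool = select_tool(analysis, available_tools)

        # 4. Generate the tool parameters
        params = generate_params(analysis, tool)

        # 5. Execute the tool
        result = execute_tool(tool, params)

        # 6. Feed the result back to the model
        context.add(f"Tool returned: {result}")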
```
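The loop can be exercised end to end with a scripted stand-in for the model. This is only an illustrative sketch: `ScriptedModel`, this version of `run_agent`, and the dictionary-based decision format are assumptions made for the example, not a real LLM API.

```python
from typing import Callable


class ScriptedModel:
    """Stand-in for an LLM: replays a fixed list of decisions."""

    def __init__(self, decisions: list[dict]):
        self._decisions = list(decisions)

    def decide(self, context: list[str]) -> dict:
        # A real model would read the context; this stub pops the next step.
        return self._decisions.pop(0)


def run_agent(model, tools: dict[str, Callable], goal: str, max_steps: int = 5) -> str:
    context = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = model.decide(context)
        if decision["type"] == "answer":
            return decision["text"]
        # Tool call: look up the tool, run it, and feed the result back.
        tool = tools[decision["tool"]]
        result = tool(**decision["params"])
        context.append(f"Tool {decision['tool']} returned: {result}")
    return "Stopped: step limit reached"


def add(a: float, b: float) -> float:
    return a + b


model = ScriptedModel([
    {"type": "tool", "tool": "add", "params": {"a": 2, "b": 3}},
    {"type": "answer", "text": "2 + 3 = 5"},
])
answer = run_agent(model, {"add": add}, goal="What is 2 + 3?")
print(answer)
```

The `max_steps` bound is a deliberate choice in this sketch: it keeps the loop from running forever when the model never produces a final answer.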

### The Tool Selection Decision Process

```mermaid
graph TD
    %% Agentic Design System
    classDef user fill:#fff7e6,stroke:#fa8c16,stroke-width:2px;
    classDef agent fill:#e6f7ff,stroke:#1890ff,stroke-width:2px;
    classDef tool fill:#f6ffed,stroke:#52c41a,stroke-width:2px;
    classDef error fill:#fff1f0,stroke:#ff4d4f,stroke-width:2px,stroke-dasharray: 5 5;

    UserRequest(["User request: 'What is the weather in Beijing right now?'"]) --> Analysis
    Analysis[/"Analyze the request:<br>1. Needs real-time weather data<br>2. Training data cannot answer this<br>3. An external tool is required"/]
    Analysis --> Tools{"Choose a tool"}

    subgraph AvailableTools [Available tools]
        T1["search_web: general search (not precise enough)"]
        T2["get_weather: weather lookup (best fit)"]
        T3["calculator: calculator (irrelevant)"]
    end

    Tools --> T2
    T2 --> Call["Tool call:<br>get_weather(location='Beijing')"]
    Call --> Result(["Process the result → generate the answer"])

    class UserRequest user;
    class Analysis,Tools agent;
    class T1,T3 error;
    class T2 tool;
    class Call agent;
    class Result user;
```

Figure 4-1: Tool selection decision flow
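Once the model has chosen `get_weather`, the runtime still has to map that choice onto real code. Below is a minimal sketch of the dispatch step. It assumes the model emits a call as a dictionary with `name` and `arguments` keys (an assumed format), and both tools are stubs that return canned data.

```python
def get_weather(location: str) -> dict:
    # Stub: a real implementation would query a weather service.
    return {"location": location, "condition": "sunny", "temp_c": 25}


def search_web(query: str) -> list:
    return []  # stub


# Registry mapping tool names to callables.
TOOLS = {"get_weather": get_weather, "search_web": search_web}


def dispatch(call: dict):
    """Run the tool named in the call, or report which tools exist."""
    name = call["name"]
    if name not in TOOLS:
        return {"error": f"unknown tool '{name}'", "available": sorted(TOOLS)}
    return TOOLS[name](**call["arguments"])


result = dispatch({"name": "get_weather", "arguments": {"location": "Beijing"}})
print(result)
```

Returning the list of available tools on an unknown name gives the model enough information to correct its next call.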

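## 4.1.4 How Tools Differ from Prompts

| Dimension | **Prompt** | **Tool** |
| ----- | ------- | ------- |
| Executor | The model itself | An external system |
| Capability boundary | The model's built-in knowledge | Extensible by adding tools |
| Result certainty | May hallucinate | Results come from actual execution |
| Timeliness | Limited by the training cutoff | Can fetch real-time information |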

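## 4.1.5 Tool Taxonomy

By function and purpose, agent tools fall into five main categories:

### Information Retrieval

Tools that fetch information from external sources, including web search, database queries, API calls, and document retrieval.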

```python
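from __future__ import annotations  # SearchResult and QueryResult are placeholder types


def search_web(query: str, num_results: int = 5) -> list[SearchResult]:
    """Search the web for information."""
    ...


def query_database(sql: str) -> QueryResult:
    """Execute a SQL query."""
    ...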
```

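Other tools: `call_api` (call an external API), `retrieve_documents` (retrieve documents from a knowledge base).

### Computation and Execution

Tools that perform computation and data processing, including mathematical calculation, code execution, and data analysis. Typical tools: `calculate`, `run_python`, `analyze_data`.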
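As an illustration of a computation tool, here is one possible `calculate` sketch. It walks the expression's syntax tree rather than calling `eval()`, so only arithmetic is accepted. The supported operators and the single `sqrt` function are choices made for this example.

```python
import ast
import math
import operator

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_FUNCS = {"sqrt": math.sqrt}


def calculate(expression: str) -> float:
    """Evaluate an arithmetic expression without calling eval()."""

    def ev(node):
        if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[ev(arg) for arg in node.args])
        # Anything else (names, attribute access, imports) is rejected.
        raise ValueError(f"unsupported expression: {expression!r}")

    return ev(ast.parse(expression, mode="eval").body)


print(calculate("2 + 3 * 4"))
print(calculate("sqrt(16)"))
```

Exponentiation is left out on purpose, since an input such as `9**9**9` could consume unbounded time and memory.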

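### Content Generation

Tools that produce content in a specific format, including image generation, chart generation, and document generation. Typical tools: `generate_image`, `create_chart`, `generate_pdf`.

### Environment Interaction

Tools that interact with the runtime environment, including file operations, command-line operations, and browser control.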

```python
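from __future__ import annotations  # ShellResult is a placeholder type


def read_file(path: str) -> str:
    """Read the contents of a file."""
    ...


def run_shell(command: str) -> ShellResult:
    """Execute a shell command."""
    ...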
```

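Other tools: `write_file` (write a file), `browser_navigate` (open a web page).

### Communication

Tools that communicate with external parties, including sending messages, calling other agents, and human-in-the-loop interaction. Typical tools: `send_email`, `send_slack`, `call_agent`, `ask_human`.

## 4.1.6 Classification by Interface Type

Tools can also be distinguished by where they execute and how they are called. The two main kinds are client-side tools and server-side tools:

| Dimension | Client-side tools | Server-side tools |
| -------- | ------------------------------------------ | ------------------------------------ |
| **Execution location** | User's device | Remote server |
| **Resources accessed** | Local files, system commands, local databases | Code interpreter, web search, third-party APIs |
| **Advantages** | Low latency, full access to local resources | Secure and controllable, centrally managed |
| **Disadvantages** | Security risks, dependence on local environment setup | No access to local resources, network dependence |
| **Examples** | `read_file`, `run_shell`, `browser_navigate` | `call_api`, `search_web`, `run_python` |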

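## 4.1.7 Tool Design Principles

To make sure an agent can use tools accurately and efficiently, tool design should follow the core principles below. They aim to reduce the cognitive load on the model and to raise both the success rate of tool calls and the overall stability of the system.

### Single Responsibility

Each tool does one thing:

```python
# ✅ Good design

tools = [
    Tool(name="search_web", description="Search the web"),
    Tool(name="read_file", description="Read a file"),
    Tool(name="write_file", description="Write a file"),
]

# ❌ Poor design

tools = [
    Tool(name="file_operation", description="File operations, including read, write, delete, and more"),
]
```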

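### Clear Descriptions

Help the model understand exactly when to use the tool:

```python
# ✅ Good description

Tool(
    name="calculate",
    description="""Perform a mathematical calculation.

    Use for:
    - Basic arithmetic
    - Complex mathematical expressions
    - Statistics (mean, standard deviation, etc.)

    Do not use for:
    - Logic puzzles that require reasoning
    - Calculations that need external data

    Parameter format: a standard math expression, such as "2 + 3 * 4" or "sqrt(16)"
    """
)

# ❌ Vague description

Tool(
    name="calculate",
    description="Calculate"
)
```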

### Explicit Parameter Definitions

Use JSON Schema to specify the parameters:

```python
Tool(
    name="send_email",
    parameters={
        "type": "object",
        "properties": {
            "to": {
                "type": "string",
                "description": "Recipient email address",
                "pattern": r"^[\w\.-]+@[\w\.-]+\.\w+$"
            },
            "subject": {
                "type": "string",
                "description": "Email subject",
                "maxLength": 200
            },
            "body": {
                "type": "string",
                "description": "Email body; Markdown is supported"
            },
            "cc": {
                "type": "array",
                "items": {"type": "string"},
                "description": "CC list (optional)"
            }
        },
        "required": ["to", "subject", "body"]
    }
)
```
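A schema only helps if arguments are checked against it before the tool runs. The sketch below hand-rolls a check for just the keywords used in the `send_email` schema (`required`, `type`, `maxLength`, `pattern`). It is not a full JSON Schema implementation. For example, it does not validate array items, and a real project would more likely rely on a dedicated validation library. The `validate` helper is a name introduced here for illustration.

```python
import re

SCHEMA = {
    "type": "object",
    "properties": {
        "to": {"type": "string", "pattern": r"^[\w\.-]+@[\w\.-]+\.\w+$"},
        "subject": {"type": "string", "maxLength": 200},
        "body": {"type": "string"},
        "cc": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["to", "subject", "body"],
}

_TYPES = {"string": str, "array": list}


def validate(args: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = [f"missing required parameter '{key}'"
              for key in schema.get("required", []) if key not in args]
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # JSON Schema allows additional properties by default
        if not isinstance(value, _TYPES[spec["type"]]):
            errors.append(f"'{key}' must be of type {spec['type']}")
            continue
        if "maxLength" in spec and len(value) > spec["maxLength"]:
            errors.append(f"'{key}' exceeds {spec['maxLength']} characters")
        if "pattern" in spec and not re.search(spec["pattern"], value):
            errors.append(f"'{key}' does not match the expected format")
    return errors


print(validate({"to": "not-an-email", "subject": "hi"}, SCHEMA))
```

Returning every problem at once, rather than stopping at the first, lets the model fix all of its arguments in a single retry.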

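### Helpful Error Messages

Return errors the caller can act on:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    success: bool
    data: Any = None
    error: Optional[str] = None
    error_code: Optional[str] = None
    suggestion: Optional[str] = None


# Example return value
ToolResult(
    success=False,
    error="API call failed: authentication error",
    error_code="AUTH_ERROR",
    suggestion="Check that the API key is configured correctly",
)
```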
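One way to produce such results consistently is to wrap every tool call so that exceptions never reach the model as raw stack traces. The following is a sketch under assumptions: `safe_call`, the `NOT_FOUND` and `UNKNOWN` codes, and the suggestion texts are invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    success: bool
    data: Any = None
    error: Optional[str] = None
    error_code: Optional[str] = None
    suggestion: Optional[str] = None


def safe_call(func: Callable[..., Any], **kwargs) -> ToolResult:
    """Run a tool and convert exceptions into structured, actionable results."""
    try:
        return ToolResult(success=True, data=func(**kwargs))
    except FileNotFoundError as exc:
        return ToolResult(
            success=False,
            error=f"File not found: {exc.filename}",
            error_code="NOT_FOUND",
            suggestion="List the directory first to confirm the path",
        )
    except Exception as exc:  # last resort: still return something structured
        return ToolResult(
            success=False,
            error=str(exc),
            error_code="UNKNOWN",
            suggestion="Check the arguments and retry",
        )


def read_file(path: str) -> str:
    with open(path, encoding="utf-8") as handle:
        return handle.read()


result = safe_call(read_file, path="/no/such/file.txt")
print(result.error_code, result.suggestion)
```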

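### Idempotency

Repeating a call with the same input should produce the same result and no additional side effects:

```python
from __future__ import annotations

import hashlib

# weather_api, message_api and WeatherInfo are placeholders for real clients and types.

# ✅ Idempotent operation
def get_weather(city: str) -> WeatherInfo:
    """Get the weather (read-only, so safe to repeat)."""
    return weather_api.get(city)

# ⚠️ Non-idempotent operation (needs special handling)
def send_message(to: str, content: str) -> bool:
    """Send a message (not idempotent on its own)."""
    # Attach an idempotency key derived from the request itself. Including a
    # timestamp would change the key on every retry and defeat deduplication.
    raw = f"{to}:{content}".encode("utf-8")
    idempotency_key = hashlib.sha256(raw).hexdigest()
    return message_api.send(to, content, idempotency_key)
```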
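An idempotency key only works if the receiving side remembers which keys it has already processed. The sketch below shows that half of the mechanism with an in-memory stand-in. `MessageService`, `make_key`, and the caller-supplied `request_id` are assumptions for illustration, not part of any real messaging API.

```python
import hashlib


class MessageService:
    """In-memory stand-in for a messaging API that honors idempotency keys."""

    def __init__(self):
        self.sent = []    # messages actually delivered
        self._seen = {}   # idempotency key -> stored result

    def send(self, to: str, content: str, idempotency_key: str) -> bool:
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]  # replay: no second delivery
        self.sent.append((to, content))
        self._seen[idempotency_key] = True
        return True


def make_key(request_id: str, to: str, content: str) -> str:
    """Stable key: the same logical request always hashes to the same value."""
    raw = f"{request_id}:{to}:{content}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


service = MessageService()
key = make_key("req-1", "alice@example.com", "hello")
service.send("alice@example.com", "hello", key)
service.send("alice@example.com", "hello", key)  # retry of the same request
print(len(service.sent))
```

Tying the key to a request identifier lets a retry be deduplicated while still allowing the same text to be sent again later as a separate, intentional request.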

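## 4.1.8 Tool Composition Patterns

Complex real-world tasks can rarely be solved with a single tool. Combining several tools according to a specific logic produces more capable workflows for harder problems. Common composition patterns are the chain, branch, and parallel patterns.

### Chain Pattern

Several tools are called in sequence:

```mermaid
graph LR
    %% Agentic Design System
    classDef tool fill:#f6ffed,stroke:#52c41a,stroke-width:2px;

    A[Search news] --> B[Extract key information]
    B --> C[Generate summary]
    C --> D[Send email]

    class A,B,C,D tool;
```

Figure 4-2: Chain composition pattern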
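In code, a chain amounts to function composition: each step receives the previous step's output. The sketch below uses stub functions in place of the four tools in the figure. The `chain` helper and the stub return values are made up for illustration.

```python
from functools import reduce
from typing import Any, Callable


def chain(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose steps so each one receives the previous step's output."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Stub steps standing in for the four tools in the figure.
def search_news(topic: str) -> list[str]:
    return [f"{topic}: headline A", f"{topic}: headline B"]


def extract_key_info(articles: list[str]) -> list[str]:
    return [article.split(": ", 1)[1] for article in articles]


def summarize(points: list[str]) -> str:
    return "; ".join(points)


def send_email(summary: str) -> dict:
    return {"sent": True, "body": summary}


pipeline = chain(search_news, extract_key_info, summarize, send_email)
outcome = pipeline("AI")
print(outcome)
```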

### Branch Pattern

A different tool is chosen depending on a condition:

```mermaid
graph LR
    %% Agentic Design System
    classDef agent fill:#e6f7ff,stroke:#1890ff,stroke-width:2px;
    classDef tool fill:#f6ffed,stroke:#52c41a,stroke-width:2px;

    Query[Query data] --> Local[Local file]
    Query --> Remote[Remote API]
    Query --> DB[Database]

    Local --> Read[read_local_file]
    Remote --> Call[call_api]
    DB --> QueryDB[query_database]

    class Query agent;
    class Local,Remote,DB agent;
    class Read,Call,QueryDB tool;
```

Figure 4-3: Branch composition pattern
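A branch can be sketched as a routing function that inspects the request and returns the matching tool. The routing rules below (URL prefix, SQL keyword, otherwise a local path) are simplifying assumptions for illustration, and the three tools are stubs. In practice the model itself often makes this choice.

```python
def read_local_file(target: str) -> str:
    return f"local:{target}"  # stub


def call_api(target: str) -> str:
    return f"api:{target}"  # stub


def query_database(target: str) -> str:
    return f"db:{target}"  # stub


def route(source: str):
    """Pick a tool from the form of the data source."""
    if source.startswith(("http://", "https://")):
        return call_api
    if source.lower().lstrip().startswith("select "):
        return query_database
    return read_local_file


def fetch(source: str) -> str:
    return route(source)(source)


print(fetch("https://example.com/data"))
print(fetch("./data.csv"))
```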

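### Parallel Pattern

Several independent tools are called at the same time:

```python
import asyncio

# search_web, search_docs, search_code and merge_results are placeholders;
# the three searches must not depend on each other's results.
async def parallel_search(query: str):
    results = await asyncio.gather(
        search_web(query),
        search_docs(query),
        search_code(query),
    )
    return merge_results(results)
```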
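When tools run in parallel, one failure should not discard the results of the others. A runnable sketch of that idea uses `asyncio.gather` with `return_exceptions=True`. The two search coroutines are stubs, and one of them fails on purpose.

```python
import asyncio


async def search_web(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # simulate network latency
    return [f"web:{query}"]


async def search_docs(query: str) -> list[str]:
    await asyncio.sleep(0.01)
    raise TimeoutError("docs index unavailable")  # simulated failure


async def parallel_search(query: str) -> dict:
    # return_exceptions=True delivers exceptions as values instead of raising.
    outcomes = await asyncio.gather(
        search_web(query), search_docs(query), return_exceptions=True
    )
    hits, failures = [], []
    for outcome in outcomes:
        if isinstance(outcome, Exception):
            failures.append(str(outcome))
        else:
            hits.extend(outcome)
    return {"hits": hits, "failures": failures}


report = asyncio.run(parallel_search("agents"))
print(report)
```

Reporting the failures alongside the hits lets the agent decide whether the partial result is good enough or whether a retry is warranted.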

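## 4.1.9 Information Architecture Designed for AI

Existing CLI tools were designed for people: their output is human-readable and much of their context is implicit. AI callers need a more explicit information architecture. Tools meant to be called by AI should provide structured output, explicit context, and progress awareness.

### Structured Output

* **Human**: `ls -l` (table view)
* **AI**: `ls --json` (an illustrative flag, not one that standard `ls` provides; JSON output is easy to parse and unambiguous)

### Explicit Context

When a tool reports an error, it must include enough context for the agent to fix the problem on its own:

* ❌ `Error: file not found`
* ✅ `Error: file 'main.py' not found in current directory '/src/utils'. Did you mean '/src/app/main.py'?`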
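A message like the second one can be assembled from information the tool already has. The sketch below uses `difflib.get_close_matches` from the standard library to suggest a near match. The `not_found_message` helper and the list of known paths are assumptions for illustration.

```python
import difflib
import os


def not_found_message(name: str, directory: str, known_paths: list[str]) -> str:
    """Build an error that names the directory searched and suggests a near match."""
    message = f"Error: file '{name}' not found in directory '{directory}'."
    by_basename = {os.path.basename(path): path for path in known_paths}
    close = difflib.get_close_matches(name, list(by_basename), n=1, cutoff=0.6)
    if close:
        message += f" Did you mean '{by_basename[close[0]]}'?"
    return message


msg = not_found_message(
    "main.py", "/src/utils", ["/src/app/main.py", "/src/utils/io.py"]
)
print(msg)
```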

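### Progress Awareness

For long output such as logs, do not leave the agent guessing.

* Provide `head/tail` views and also report the total number of lines.
* Let the agent save the output to a file instead of filling up the context.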
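The first point can be sketched as a small helper that returns the first and last lines together with the total, so the agent knows how much was left out. The `preview` function and its field names are illustrative choices.

```python
def preview(text: str, head: int = 3, tail: int = 3) -> dict:
    """Return the first and last lines plus the total line count."""
    lines = text.splitlines()
    total = len(lines)
    if total <= head + tail:
        return {"total_lines": total, "truncated": False, "lines": lines}
    return {
        "total_lines": total,
        "truncated": True,
        "head": lines[:head],
        "tail": lines[total - tail:],  # safe even when tail is 0
        "omitted": total - head - tail,
    }


log = "\n".join(f"line {i}" for i in range(1, 101))
summary = preview(log, head=2, tail=2)
print(summary)
```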

***

**Next section**: [4.2 Tool Use Mechanism](/agentic_ai_guide/di-yi-bu-fen-dan-ti-zhi-neng-jia-gou/04_tools/4.2_tool_use.md)

