# Chapter Summary: From Chatbot to Agent


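Computer Use marks a watershed in the history of AI. Large models are no longer content to be mere chatbots; they are starting to grow hands and feet and to act as operators (agents) in the real world.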

#### Key Takeaways

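**Paradigm shift**

* **API-first vs. GUI-first**: APIs are built for machines; GUIs are built for people. Computer Use breaks down that wall by letting the AI work through the GUI as a person would, which addresses the integration problem for the overwhelming majority of software that ships without an API.
* **Antifragility**: Vision-based targeting ("click the blue login button") generally adapts to UI changes better than a code-based selector such as `#login-btn-v2`, which breaks as soon as the identifier is renamed.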

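**How it works**

* It is an **OODA loop**: observe (screenshot) -> orient (vision model) -> decide (ReAct) -> act (mouse/keyboard).
* **Three tools**: `computer` (mouse and keyboard control), `bash` (command line), and `editor` (text editing) make up Claude's tool belt.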
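The loop above can be sketched in a few lines. This is a toy simulation, not the real Computer Use API: `FakeDesktop`, `orient`, `decide`, and `run_agent` are hypothetical names, and a text description stands in for the screenshot that a vision model would normally interpret.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for a real screen: a login page with one button."""
    logged_in: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent captures pixels; here a text description plays that role.
        return "dashboard" if self.logged_in else "login page with a blue Login button"

    def click(self, target: str) -> None:
        self.log.append(f"click:{target}")
        if target == "Login":
            self.logged_in = True


def orient(screenshot: str) -> dict:
    """Orient: turn the raw observation into structured facts (the vision model's job)."""
    return {"on_login_page": "login page" in screenshot}


def decide(facts: dict) -> Optional[str]:
    """Decide: choose the next click target, or None once the goal is reached."""
    return "Login" if facts["on_login_page"] else None


def run_agent(desktop: FakeDesktop, max_steps: int = 5) -> int:
    """Run observe -> orient -> decide -> act; return how many actions were taken."""
    for step in range(max_steps):
        shot = desktop.screenshot()   # Observe
        facts = orient(shot)          # Orient
        action = decide(facts)        # Decide
        if action is None:
            return step
        desktop.click(action)         # Act
    return max_steps


desktop = FakeDesktop()
steps = run_agent(desktop)
print(steps, desktop.log)
```

The `max_steps` cap matters in practice: every iteration costs a screenshot, so an agent that never reaches its goal should stop rather than loop indefinitely.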

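**Safety first**

* Computer Use is extremely powerful and, for the same reason, extremely dangerous.
* **Docker isolation** is the baseline: never run it directly on an unprotected host.
* **HITL (human-in-the-loop)** confirmation is the fuse for critical operations.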
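One way to picture the HITL "fuse" is a gate that sits between the model's proposed command and its execution. The sketch below is only an illustration under stated assumptions: `RISKY_PROGRAMS`, `is_risky`, and `gated_execute` are hypothetical names, and the risk check is a deliberately crude keyword heuristic, not a real security boundary. Container isolation remains the actual safeguard.

```python
import shlex
from typing import Callable

# Hypothetical deny-list; a real deployment would need a far more careful policy.
RISKY_PROGRAMS = {"rm", "sudo", "curl", "ssh", "dd"}


def is_risky(command: str) -> bool:
    """Flag a command if any token names a program on the deny-list."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes etc.: fail closed and ask the human.
        return True
    return any(token in RISKY_PROGRAMS for token in tokens)


def gated_execute(
    command: str,
    execute: Callable[[str], str],
    approve: Callable[[str], bool],
) -> str:
    """Run `command` via `execute`, but ask `approve` first when it looks risky."""
    if is_risky(command) and not approve(command):
        return "blocked"
    return execute(command)


# Usage with stand-in callbacks: nothing is actually executed here.
result = gated_execute(
    "rm -rf /tmp/scratch",
    execute=lambda cmd: f"ran: {cmd}",
    approve=lambda cmd: False,  # the human says no
)
print(result)
```

In an interactive setup, `approve` could wrap `input()` or a chat confirmation; passing it in as a callback keeps the gate testable without a human present.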

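**Best practices**

* **Hybrid architecture**: Don't use it for its own sake. If a SQL query can fetch the data, don't have the AI click through a database admin GUI. API + Computer Use gives the best ROI.
* **Prompt engineering**: In this mode, prompts need new elements such as the screen resolution and descriptions of visual features (visual grounding).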
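Both practices can be combined in a small router: use a structured interface when one exists, and only otherwise build a GUI prompt that carries the screen resolution and visual cues. This is a minimal sketch; `Task`, `build_gui_prompt`, `run_task`, and the default 1280x800 resolution are assumptions made for illustration, not part of any Anthropic API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Task:
    name: str
    # Set when a structured interface (API call, SQL query) is available.
    api_handler: Optional[Callable[[], str]] = None
    # Visual grounding hints, used only on the GUI path.
    visual_hints: List[str] = field(default_factory=list)


def build_gui_prompt(task: Task, width: int, height: int) -> str:
    """Compose a prompt that tells the model the screen size and what to look for."""
    hints = "; ".join(task.visual_hints) or "none provided"
    return (
        f"Screen resolution: {width}x{height}. "
        f"Task: {task.name}. "
        f"Visual cues: {hints}."
    )


def run_task(
    task: Task,
    gui_agent: Callable[[str], str],
    width: int = 1280,
    height: int = 800,
) -> Tuple[str, str]:
    """Prefer the cheap structured path; fall back to the GUI agent otherwise."""
    if task.api_handler is not None:
        return "api", task.api_handler()
    return "gui", gui_agent(build_gui_prompt(task, width, height))


# Usage with stand-ins: the "GUI agent" simply echoes the prompt it receives.
api_task = Task(name="count users", api_handler=lambda: "42 rows")
gui_task = Task(name="export report", visual_hints=["blue Export button, top right"])
print(run_task(api_task, gui_agent=lambda p: p))
print(run_task(gui_task, gui_agent=lambda p: p))
```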

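#### Developer Self-Checklist

* [ ] **Sandbox**: Am I running the demo in Docker rather than on my personal MacBook host?
* [ ] **Access control**: Have I restricted the container's network access to prevent data exfiltration?
* [ ] **Cost awareness**: Do I realize that every step carries a substantial screenshot token cost?
* [ ] **Fault tolerance**: Does my code include `try-retry` logic to handle slow UI loads or clicks that fail to register?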
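For the fault-tolerance item, a retry helper with exponential backoff is one common shape. The sketch below assumes each UI action is wrapped in a callable that returns `True` on success; `retry_action` and its parameters are hypothetical names chosen for this example.

```python
import time
from typing import Callable


def retry_action(
    action: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `action` until it returns True, waiting longer after each failure."""
    for attempt in range(attempts):
        try:
            if action():
                return True
        except Exception:
            # Treat an exception like a failed click. Real code should catch
            # narrower exception types and log them.
            pass
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return False


# Usage: a stand-in action that succeeds only on the third try.
state = {"calls": 0}


def click_when_loaded() -> bool:
    state["calls"] += 1
    return state["calls"] >= 3


print(retry_action(click_when_loaded, attempts=4, base_delay=0.1))
```

Since every retry usually means another screenshot, keeping `attempts` small also keeps token costs bounded.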

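#### Next Stop: Leveling Up at Work

With Computer Use under its belt, Claude is like a newly hired intern: it can read the screen and click the mouse. To qualify for senior roles, it still needs domain-specific "professional skills", such as writing high-quality Python code or producing SEO-compliant articles.

The next chapter equips Claude with **Skills**.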

### ➡️ [Chapter 6: The Skills System](/claude_guide/di-san-bu-fen-jin-jie-pian/06_skills.md)


