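# 7.6 Agents Rule of Two and Agent Security Design Principles

The "Rule of Two" is a piece of practical security guidance for AI agents that Meta proposed in 2025. According to publicly accessible summaries, the core of the principle is not that "every irreversible operation must be approved by two agents", but rather: **within a single session, an agent may satisfy at most two of the following three properties at the same time: processing untrusted input, accessing sensitive systems or private data, and changing state or communicating externally; if all three are required, supervision or reliable validation must be added at a minimum.** This section discusses the design implications of this more accurate definition.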

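## 7.6.1 Core Principle of the Rule of Two

### Basic Concept

```
Rule of Two:
Within a single session, an agent satisfies at most two of the following three properties:
1. Processes untrusted input
2. Accesses sensitive systems or private data
3. Changes state or communicates externally

If all three are required, supervision or reliable validation must be added at a minimum.
```

The principle resembles the traditional security idea that "critical operations need independent verification", but in the AI setting the more accurate takeaway is: **do not let a single agent, within the same session, simultaneously read untrusted input, access sensitive resources, and perform high-impact actions.**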
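The rule stated above can be expressed as a small check over three flags. The following is a minimal sketch, not an implementation taken from Meta's guidance: the `SessionCapabilities` type, its field names, and both helper functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionCapabilities:
    """Hypothetical description of what one agent session is allowed to do."""
    untrusted_input: bool     # processes untrusted input
    sensitive_access: bool    # accesses sensitive systems or private data
    state_or_external: bool   # changes state or communicates externally


def requires_supervision(caps: SessionCapabilities) -> bool:
    """Return True when the session holds all three properties at once."""
    held = sum([caps.untrusted_input, caps.sensitive_access, caps.state_or_external])
    return held == 3


def may_run_autonomously(caps: SessionCapabilities) -> bool:
    """A session holding at most two properties may run without extra supervision."""
    return not requires_supervision(caps)


# A browsing summarizer: reads untrusted pages and replies, but sees no private data.
browser_bot = SessionCapabilities(True, False, True)
# An email assistant: reads inbound mail, sees the inbox, and sends messages.
email_bot = SessionCapabilities(True, True, True)

print(may_run_autonomously(browser_bot))  # True
print(may_run_autonomously(email_bot))    # False
```

In this sketch the email assistant is not forbidden; it is flagged as a session that needs supervision or reliable validation before acting.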

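### Philosophical Basis of the Principle

```mermaid
graph TD
    A["Risks of a single system"] -->|Inherent problem| A1["Hallucination"]
    A -->|Inherent problem| A2["Logical reasoning errors"]
    A -->|Inherent problem| A3["Context misinterpretation"]

    B["Weak defenses of a single system"] -->|Cause| B1["Cannot self-correct"]
    B -->|Cause| B2["Reinforcement of mistaken beliefs"]
    B -->|Cause| B3["Systematic bias"]

    C["Adding supervision or reliable validation"] -->|Benefit| C1["Independent verification"]
    C -->|Benefit| C2["Error detection probability ↑"]
    C -->|Benefit| C3["System robustness ↑"]

    style A fill:#ffcccc
    style B fill:#ffcccc
    style C fill:#ccffcc
```

Figure 7-20: Theoretical basis of the Rule of Two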

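### How the Rule of Two Relates to Other Security Practices

| Practice | Focus | Relationship to the Rule of Two |
| ----------- | --------- | ----------------------------- |
| Least privilege | Limits the permissions of a single system | Supporting: reduces the attack surface of a single system |
| Audit logging | Records all operations | Complementary: provides after-the-fact tracing |
| Tool-call sandboxing | Isolates the execution environment | Supporting: limits the impact of operations |
| Rule of Two | **Up-front constraint** | **Core: prevents a single agent from holding all three high-risk capabilities at once** |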

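## 7.6.2 Security Gateway Design for Irreversible Operations

### Definition and Classification of Irreversible Operations

```mermaid
graph TD
    A["Operation classification"] -->|Reversible| B["Read operations"]
    A -->|Reversible| C["Temporary changes<br/>recoverable"]

    A -->|Irreversible| D["Permanent deletion"]
    A -->|Irreversible| E["Funds transfer"]
    A -->|Irreversible| F["Permission downgrade<br/>hard to recover"]
    A -->|Irreversible| G["Production system changes<br/>wide impact"]

    style D fill:#ff9999
    style E fill:#ff9999
    style F fill:#ff9999
    style G fill:#ff9999
```

Figure 7-21: Classification of operations by reversibility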

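### Risk Matrix for Irreversible Operations

| Operation type | Scope of impact | Recovery cost | Risk level |
| ------ | --------- | ---------- | ------ |
| Account deletion | Complete loss of user data | Very high (may be unrecoverable) | **Critical** |
| Funds transfer | Direct loss of funds | Very high (may require legal action) | **Critical** |
| Production data deletion | Business interruption, data loss | High (requires a complete backup) | **Critical** |
| Permission revocation | User functionality restricted | Medium-high (requires re-approval) | **High** |
| Configuration change | System behavior changes | Medium (requires rollback) | **Medium-high** |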
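A matrix like the one above can be turned into a lookup that decides which operations get routed to extra validation. This is only a sketch: the string keys, the `HIGH` threshold, and the choice to treat unknown operation types as critical (fail closed) are assumptions made for illustration, not part of the original matrix.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    MEDIUM_HIGH = 1
    HIGH = 2
    CRITICAL = 3


# Mirrors the rows of the risk matrix; the keys are hypothetical identifiers.
RISK_MATRIX = {
    "account_deletion": RiskLevel.CRITICAL,
    "funds_transfer": RiskLevel.CRITICAL,
    "production_data_deletion": RiskLevel.CRITICAL,
    "permission_revocation": RiskLevel.HIGH,
    "config_change": RiskLevel.MEDIUM_HIGH,
}


def risk_of(operation_type: str) -> RiskLevel:
    """Look up the risk level; unknown operation types fail closed as CRITICAL."""
    return RISK_MATRIX.get(operation_type, RiskLevel.CRITICAL)


def needs_extra_validation(operation_type: str,
                           threshold: RiskLevel = RiskLevel.HIGH) -> bool:
    """Route operations at or above the threshold to supervision or validation."""
    return risk_of(operation_type) >= threshold


print(needs_extra_validation("funds_transfer"))  # True
print(needs_extra_validation("config_change"))   # False
print(risk_of("unlisted_operation").name)        # CRITICAL
```

Failing closed means a newly added tool that nobody has classified yet is treated as high risk until someone reviews it.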

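### Security Gateway Architecture

Irreversible operations usually need to pass through multiple layers of security gateways, with additional supervision or reliable validation introduced where necessary; dual verification is only one way to implement this:

```mermaid
graph LR
    A["User request<br/>delete_account"] -->|Submit| B["Agent 1<br/>Primary Handler"]

    B -->|Analyze request| C{"Irreversible<br/>operation?"}

    C -->|Yes| D["Create review task"]
    C -->|No| E["Execute directly"]

    D -->|Send| F["Agent 2<br/>Validator"]

    F -->|Independent verification| F1["Verify operation legitimacy"]
    F -->|Independent verification| F2["Verify user identity"]
    F -->|Independent verification| F3["Verify business rules"]

    F1 -->|Pass| G["Double-Check<br/>Matrix"]
    F2 -->|Pass| G
    F3 -->|Pass| G

    G -->|Independent verification| H{"Additional validation<br/>passed?"}

    H -->|Yes| I["Execute operation<br/>+ audit log"]
    H -->|No| J["Reject operation<br/>+ alert"]

    style D fill:#ffffcc
    style G fill:#ffffcc
    style I fill:#ccffcc
    style J fill:#ffcccc
```

Figure 7-22: Security gateway flow for irreversible operations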

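### Implementation Example (Pseudocode)

The pseudocode below shows **one way to implement the Rule of Two**: handling irreversible operations with a primary handler plus an independent validation channel. It should not be misread as "the Rule of Two necessarily means approval by two agents".

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class OperationRejected(Exception):
    """Raised when a reviewer rejects an operation."""


class OperationReversibility(Enum):
    REVERSIBLE = 1            # reversible
    PARTIALLY_REVERSIBLE = 2  # partially reversible
    IRREVERSIBLE = 3          # irreversible


@dataclass
class IrreversibleOperation:
    operation_type: str
    user_id: str
    operation_id: str
    reversibility: OperationReversibility
    approval_status: dict = field(default_factory=dict)  # {reviewer: bool}


class SafeAgentGateway:
    """Security gateway for irreversible operations."""

    def __init__(self, primary_agent, validator_agent, audit_log, executor):
        # Collaborators are injected. Each agent exposes review(operation) -> bool,
        # audit_log exposes record(**fields), and executor(operation) performs the action.
        self.primary_agent = primary_agent
        self.validator_agent = validator_agent
        self.audit_log = audit_log
        self._do_execute = executor

    def process_operation(self, operation: IrreversibleOperation):
        """Main flow for handling an operation."""

        # Step 1: initial classification
        if operation.reversibility != OperationReversibility.IRREVERSIBLE:
            # Anything not classified as irreversible is executed directly
            return self.execute_operation(operation)

        # Step 2: review by the first agent
        primary_approval = self.primary_agent.review(operation)
        operation.approval_status['primary'] = primary_approval

        if not primary_approval:
            self.log_rejection(operation, 'primary_agent')
            raise OperationRejected("Primary agent rejected")

        # Step 3: independent verification by the second agent
        validator_approval = self.validator_agent.review(operation)
        operation.approval_status['validator'] = validator_approval

        if not validator_approval:
            self.log_rejection(operation, 'validator_agent')
            raise OperationRejected("Validator agent rejected")

        # Step 4: both checks passed (a rejection would have raised above), so execute
        return self.execute_operation(operation)

    def log_rejection(self, operation: IrreversibleOperation, rejected_by: str):
        """Record which reviewer rejected the operation."""
        self.audit_log.record(
            timestamp=time.time(),
            operation_id=operation.operation_id,
            status='REJECTED',
            rejected_by=rejected_by
        )

    def execute_operation(self, operation: IrreversibleOperation):
        """Execute the operation and write an audit log entry."""
        try:
            result = self._do_execute(operation)
            self.audit_log.record(
                timestamp=time.time(),
                operation_id=operation.operation_id,
                status='SUCCESS',
                approvals=operation.approval_status
            )
            return result
        except Exception as e:
            self.audit_log.record(
                timestamp=time.time(),
                operation_id=operation.operation_id,
                status='FAILED',
                error=str(e)
            )
            raise
```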

## 7.6.3 Least Privilege in Practice

### Permission Model Architecture

```mermaid
graph TB
    A["Permission layers"] -->|Layer 1| A1["System Level"]
    A -->|Layer 2| A2["Agent Level"]
    A -->|Layer 3| A3["Task Level"]
    A -->|Layer 4| A4["Operation Level"]

    A1 -->|Covers| B1["Code execution, file access"]
    A2 -->|Covers| B2["Database access, external APIs"]
    A3 -->|Covers| B3["Task types that may be executed"]
    A4 -->|Covers| B4["Permissions for specific operations"]
```

Figure 7-23: Multi-layer permission model

### Applying Zero Trust Principles to Agents

```python
import time


class ZeroTrustAgentSystem:
    """Zero Trust permission model for agents.

    The verify_*, get_*, check_*, analyze_*, log_* and is_authorized helpers
    are placeholders that a concrete deployment has to provide.
    """

    def __init__(self, permission_engine):
        self.permission_engine = permission_engine

    def authorize_operation(self, agent_id, operation):
        """Run a full permission check on every operation."""

        # Do not trust the identity the agent claims
        agent_identity = self.verify_agent_identity(agent_id)

        # Do not trust static permission configuration; evaluate dynamically
        context = {
            'agent': agent_identity,
            'operation': operation,
            'timestamp': time.time(),
            'current_resource_state': self.get_current_state(),
            'historical_behavior': self.get_agent_history(agent_id)
        }

        # Multi-dimensional permission evaluation, keyed by check name
        checks = {
            'role_permission': self.check_role_permission(agent_identity, operation),
            'resource_limit': self.check_resource_limit(agent_identity, operation),
            'time_based_restriction': self.check_time_based_restriction(agent_identity),
            'behavioral_anomaly': self.check_behavioral_anomaly(agent_identity, context),
            'dependency_chain': self.check_dependency_chain(operation)
        }

        failed_checks = [name for name, passed in checks.items() if not passed]
        if not failed_checks:
            return True

        # Record the detailed reasons for the denial
        self.log_authorization_failure(
            agent_id=agent_id,
            operation=operation,
            failed_checks=failed_checks
        )
        return False

    def check_dependency_chain(self, operation):
        """Check that every dependency of the operation is authorized."""
        dependencies = self.analyze_operation_dependencies(operation)
        return all(self.is_authorized(dependency) for dependency in dependencies)
```

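### Permission Reduction Case Study (Illustrative Scenario)

The following is a hypothetical case constructed from real-world engineering scenarios to illustrate why the permission model matters. Suppose an AI code-generation agent mistakenly deletes a critical table in a production database:

**What happened**:

1. The agent is granted broad "database modification" permissions
2. While running a "clean up expired logs" task, the agent misunderstands the task and executes a DROP TABLE operation
3. Data is lost, and recovery takes 3 days

**Lessons learned**:

* Permissions were too broad (the agent could delete any table, not just log tables)
* There was no second layer of validation (no human confirmation was required for DELETE/DROP operations)
* There was no dry run before the operation (which could have shown what was about to be deleted)

**Improved approach**:

```python
import logging
import time

logger = logging.getLogger(__name__)


class ImprovedDatabaseAgent:
    """Improved permission model for a database agent.

    PermissionLevel, PermissionDenied, has_permission, query,
    request_human_approval, database and audit_log are placeholders.
    """

    def __init__(self):
        self.permission_level = PermissionLevel.MINIMAL

    def execute_delete(self, table_name, conditions, params=()):
        """Improved flow for a delete operation.

        `conditions` is a WHERE clause with placeholders and `params` holds the
        bound values; raw agent or user text must never be interpolated into SQL.
        """

        # Step 1: permission check (also acts as the table-name allowlist)
        if not self.has_permission('delete', specific_table=table_name):
            raise PermissionDenied(f"No permission to delete from {table_name}")

        # Step 2: dry run - show the records that would be deleted
        affected_records = self.query(
            f"SELECT * FROM {table_name} WHERE {conditions}", params
        )

        logger.info(f"Will delete {len(affected_records)} records:")
        for record in affected_records[:10]:  # show the first 10
            logger.info(f"  - {record}")

        if len(affected_records) > 10:
            logger.info(f"  ... and {len(affected_records) - 10} more")

        # Step 3: require human confirmation (irreversible operation)
        approval = self.request_human_approval(
            operation_type='DELETE',
            table=table_name,
            affected_count=len(affected_records),
            preview=affected_records[:10]
        )

        if not approval:
            logger.info("Delete operation cancelled by human")
            return

        # Step 4: run inside a transaction so a failure rolls back
        with self.database.transaction():
            cursor = self.database.execute(
                f"DELETE FROM {table_name} WHERE {conditions}", params
            )
            self.audit_log.record(
                operation='DELETE',
                table=table_name,
                affected_rows=cursor.rowcount,
                timestamp=time.time()
            )
```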

## 7.6.4 Security Boundaries in Multi-Agent Systems

### Trust Model Between Agents

```mermaid
graph TD
    A["Multi-Agent System"] -->|Category| B["Same-Origin Agents"]
    A -->|Category| C["Different-Origin Agents"]

    B -->|Trust level| B1["High trust<br/>Direct Communication"]
    C -->|Trust level| C1["Low trust<br/>Sandboxed Interaction"]

    B1 -->|Risk| B2["One compromised agent<br/>may affect all the others"]
    C1 -->|Defense| C2["Message queues<br/>and permission isolation"]
```

Figure 7-24: Multi-agent trust model

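### Defending Against Collusion Attacks

In multi-agent systems, an emerging risk is that two or more agents "collude" to jointly evade security checks. If a system implements the Rule of Two as "dual verification", the independence of those validation channels becomes a new security prerequisite.

#### Forms of Collusion Attack

```
Scenario 1: Both validation channels are controlled by the attacker
- The attacker contaminates both validation roles in some way
- Both "approve" an operation that is actually harmful
- The Rule of Two becomes an illusion

Scenario 2: One validator agent is socially engineered
- Social engineering or a logic flaw is used to persuade the validator agent
- It is led into making a wrong approval decision
- The second agent cannot detect this subtle manipulation

Scenario 3: The Rule of Two implementation is flawed
- The two validation channels actually share some internal state
- Compromising one also affects the other
- There is no real independence
```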
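Scenario 3 can be partly caught at deployment time by comparing what the two channels depend on. The sketch below is a simple illustration under assumed names: `ValidatorProfile`, its fields, and the example values are hypothetical, and a real independence review would cover more than these three attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidatorProfile:
    """Hypothetical deployment facts about one validation channel."""
    name: str
    vendor: str
    context_store: str   # where the channel keeps conversation or memory state
    credentials: str     # identifier of the credential set it runs under


def shared_dependencies(a: ValidatorProfile, b: ValidatorProfile) -> list[str]:
    """List the attributes that the two channels have in common."""
    shared = []
    for attr in ("vendor", "context_store", "credentials"):
        if getattr(a, attr) == getattr(b, attr):
            shared.append(attr)
    return shared


primary = ValidatorProfile("primary", "vendor-a", "redis://ctx-1", "svc-primary")
validator = ValidatorProfile("validator", "vendor-b", "redis://ctx-1", "svc-validator")

print(shared_dependencies(primary, validator))  # ['context_store']
```

Here the two channels come from different vendors but read the same context store, so poisoning that store would reach both of them.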

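#### Defense Mechanisms Against Collusion

**1. Validation channels from different vendors**

```python
class VendorDiversificationDefense:
    """Use validation channels from different vendors.

    load_model and the model names below are placeholders.
    """

    def __init__(self):
        # Use models from entirely different sources
        self.primary_agent = load_model(vendor='OpenAI', model='current-openai-primary')
        self.validator_agent = load_model(vendor='Anthropic', model='current-anthropic-primary')
        # Optional third agent from another vendor (not used by verify_operation below)
        self.tertiary_agent = load_model(vendor='Google', model='current-google-primary')

    def verify_operation(self, operation):
        """Verify an operation across multiple sources."""

        # Models from different vendors are less likely to fail in the same way,
        # although they may still share weaknesses
        primary_approval = self.primary_agent.verify(operation)
        validator_approval = self.validator_agent.verify(operation)

        # Vendor diversity lowers the risk of both being compromised at once
        return bool(primary_approval and validator_approval)
```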

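**2. Asynchronous verification with time isolation**

```python
import random
import time


class AsynchronousVerificationWithTimeIsolation:
    """Asynchronous verification with time isolation to hinder collusion."""

    def __init__(self, primary_agent, secondary_agent, execute_and_log):
        # Collaborators are injected; each agent exposes verify(operation) -> bool
        self.primary_agent = primary_agent
        self.secondary_agent = secondary_agent
        self.execute_and_log = execute_and_log

    def execute_irreversible_operation(self, operation):
        """Verification flow with asynchrony and time isolation."""

        # Step 1: approval by the first agent
        approval1 = self.primary_agent.verify(operation)
        if not approval1:
            raise OperationRejected("Primary agent rejected")

        # Step 2: time isolation (wait for a while)
        # The delay makes simple synchronized collusion harder.
        # time.sleep blocks this worker; a production system would more likely
        # schedule the second review as a delayed job.
        wait_duration = random.randint(60, 300)  # 1-5 minutes
        time.sleep(wait_duration)

        # Step 3: independent second approval,
        # run at a different time and possibly on different compute resources
        approval2 = self.secondary_agent.verify(operation)
        if not approval2:
            raise OperationRejected("Secondary agent rejected after isolation period")

        # Step 4: both approvals succeeded (a rejection would have raised above)
        return self.execute_and_log(operation)
```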

**3. Tamper-evident audit log**

```python
class TamperProofAuditLog:
    """Tamper-evident audit log.

    CryptoHashChain, sign_entry, verify_entry_signature and hash_entry
    are placeholders.
    """

    def __init__(self):
        # Use a blockchain or a cryptographic hash chain to protect log integrity
        self.log_chain = CryptoHashChain()

    def record_verification(self, operation, agent_approvals, timestamp):
        """Record the verification process in a way that exposes tampering."""

        log_entry = {
            'operation_id': operation.operation_id,
            'timestamp': timestamp,
            'approvals': agent_approvals,  # {agent_id: {approved: bool, reasoning: str}}
            'previous_hash': self.log_chain.last_hash,
        }

        # Add a cryptographic signature; it is computed before the 'signature' key
        # exists, so verify_entry_signature must exclude that key as well
        log_entry['signature'] = self.sign_entry(log_entry)

        # Append to the immutable chain
        self.log_chain.append(log_entry)

        return log_entry

    def verify_log_integrity(self):
        """Verify the integrity of the entire log chain."""
        current_hash = None
        for entry in self.log_chain.entries:
            if not self.verify_entry_signature(entry):
                return False, f"Signature mismatch at {entry['operation_id']}"

            if current_hash and entry['previous_hash'] != current_hash:
                return False, f"Chain break at {entry['operation_id']}"

            current_hash = self.hash_entry(entry)

        return True, "Log integrity verified"
```
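The hash-chain idea can be made concrete with the Python standard library alone. The following is a minimal sketch rather than the chapter's `CryptoHashChain`: the `HashChainLog` class, its method names, and the use of HMAC-SHA256 as the "signature" are assumptions chosen for illustration. A shared-key HMAC only protects against parties who do not hold the key; an asymmetric signature would be needed for stronger guarantees.

```python
import hashlib
import hmac
import json


class HashChainLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self.entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        # Canonical JSON so the same content always hashes the same way
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def _sign(self, entry_hash: str) -> str:
        return hmac.new(self._key, entry_hash.encode("utf-8"), hashlib.sha256).hexdigest()

    def append(self, operation_id: str, approvals: dict, timestamp: float) -> dict:
        previous = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {
            "operation_id": operation_id,
            "timestamp": timestamp,
            "approvals": approvals,
            "previous_hash": previous,
        }
        entry_hash = self._digest(body)
        entry = {**body, "hash": entry_hash, "signature": self._sign(entry_hash)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        previous = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k not in ("hash", "signature")}
            if entry["previous_hash"] != previous:
                return False  # chain link broken
            if self._digest(body) != entry["hash"]:
                return False  # entry content changed after it was written
            if not hmac.compare_digest(self._sign(entry["hash"]), entry["signature"]):
                return False  # signature does not match the key
            previous = entry["hash"]
        return True


log = HashChainLog(signing_key=b"demo-key-not-for-production")
log.append("op-1", {"primary": True, "validator": True}, 1700000000.0)
log.append("op-2", {"primary": True, "validator": False}, 1700000060.0)
print(log.verify())  # True

log.entries[0]["approvals"]["validator"] = False  # simulate tampering
print(log.verify())  # False
```

Because every entry stores the hash of its predecessor, changing or removing an old entry breaks verification for it and for everything chained after it.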

**4. Periodic agent rotation**

```python
import logging
import time

logger = logging.getLogger(__name__)


class AgentRotationDefense:
    """Rotate the validator agent periodically to limit exposure if one is compromised.

    ModelAgent and the model names below are placeholders.
    """

    def __init__(self, rotation_interval_hours=24):
        self.rotation_interval = rotation_interval_hours * 3600
        self.last_rotation = time.time()
        self.agent_pool = self.initialize_agent_pool()
        self.current_agent_index = 0

    def initialize_agent_pool(self):
        """Initialize the pool of candidate agents."""
        return [
            ModelAgent(vendor='OpenAI', model='current-openai-primary'),
            ModelAgent(vendor='Anthropic', model='current-anthropic-primary'),
            ModelAgent(vendor='Google', model='current-google-primary'),
            ModelAgent(vendor='Meta', model='llama-3'),
            ModelAgent(vendor='Mistral', model='mistral-large'),
        ]

    def get_current_validator(self):
        """Return the current validator, rotating first if the interval has elapsed."""
        if time.time() - self.last_rotation > self.rotation_interval:
            self.rotate_agent()

        return self.agent_pool[self.current_agent_index]

    def rotate_agent(self):
        """Advance to the next agent in the pool."""
        self.current_agent_index = (self.current_agent_index + 1) % len(self.agent_pool)
        self.last_rotation = time.time()
        # Log the pool entry directly; calling get_current_validator() here
        # would re-enter the rotation check
        logger.info(f"Agent rotated to {self.agent_pool[self.current_agent_index]}")

    def verify_with_rotation(self, operation):
        """Verify using the currently rotated-in agent."""
        validator = self.get_current_validator()
        return validator.verify(operation)
```

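#### Combining the Defenses

```mermaid
graph TD
    A["Collusion defense layers"] -->|Layer 1| B["Vendor diversity"]
    A -->|Layer 2| C["Time isolation"]
    A -->|Layer 3| D["Tamper-evident log"]
    A -->|Layer 4| E["Agent rotation"]

    B -->|Rationale| B1["Agents from different vendors<br/>Different technology stacks<br/>Hard to compromise at the same time"]

    C -->|Rationale| C1["Verification at different times<br/>Narrows the window for<br/>synchronized collusion"]

    D -->|Rationale| D1["Cryptographically protected log<br/>Tampering is detectable<br/>Supports after-the-fact tracing"]

    E -->|Rationale| E1["Validator agents replaced regularly<br/>Limited exposure window<br/>Long-term compromise is harder"]

    style A fill:#ffffcc
```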

### Secure Protocol for Inter-Agent Communication

```python
class SecureAgentBridge:
    """Secure communication bridge between agents.

    VerifiedQueue, PermissionChecker, Sandbox and the exception types
    are placeholders.
    """

    def __init__(self):
        self.message_queue = VerifiedQueue()
        self.permission_checker = PermissionChecker()

    def call_external_agent(self, source_agent, target_agent, request):
        """Call an external agent safely."""

        # Step 1: verify the message source
        if not self.verify_agent_identity(source_agent):
            raise UntrustedAgentError(f"Cannot verify {source_agent}")

        # Step 2: check permissions
        if not self.permission_checker.can_call(
            source_agent,
            target_agent,
            request.operation
        ):
            raise PermissionDenied(
                f"{source_agent} cannot call {target_agent}"
            )

        # Step 3: sign the message (this sketch signs only; encryption is not shown)
        signed_request = self.sign_request(request, source_agent)

        # Step 4: send through an isolated channel
        response = self.message_queue.send(
            target=target_agent,
            message=signed_request,
            timeout=30
        )

        # Step 5: verify the response
        if not self.verify_response_signature(response, target_agent):
            raise UntrustedResponseError("Response signature verification failed")

        # Step 6: run any code contained in the response inside a sandbox
        return self.execute_in_sandbox(response.payload)

    def execute_in_sandbox(self, code):
        """Execute the external agent's code in an isolated environment."""
        sandbox = Sandbox(
            allowed_modules=['json', 'datetime'],
            memory_limit='100MB',
            timeout=5,
            no_network=True
        )
        return sandbox.execute(code)
```

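### Organization-Level Agent Coordination

```mermaid
graph TD
    A["Enterprise agent ecosystem"] -->|Includes| B["Data access agent"]
    A -->|Includes| C["Business logic agent"]
    A -->|Includes| D["Approval agent"]
    A -->|Includes| E["Monitoring agent"]

    B -->|Restriction| B1["Read-only database access"]
    B -->|Restriction| B2["Cannot modify data"]

    C -->|Restriction| C1["Uses data provided by the data agent"]
    C -->|Restriction| C2["Calls must go through the approval agent"]

    D -->|Permission| D1["Assess the legitimacy of requests"]
    D -->|Permission| D2["Authorize high-risk operations"]

    E -->|Permission| E1["Monitor all interactions"]
    E -->|Permission| E2["Generate audit logs"]

    style B1 fill:#ffffcc
    style C2 fill:#ffffcc
    style D fill:#ccffcc
    style E fill:#ccffcc
```

Figure 7-25: Permission architecture of an enterprise agent ecosystem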

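## 7.6.5 Representative Scenarios

### Case 1: A High-Privilege Procurement Agent Out of Control (Teaching Scenario)

**Background**: The following is a representative scenario abstracted from real engineering risks. It illustrates why procurement agents can easily cause costly mistakes when the Rule of Two is not applied.

**Incident**:

* The agent unintentionally purchases the wrong cloud service quota
* Because its permissions are too broad, there is no price-cap check
* Monthly costs increase by USD 3 million

**Root causes**:

* No supervision or independent validation of the kind the Rule of Two calls for
* Poorly configured procurement permissions (no budget cap)
* No timely spending alerts

**Improved approach**:

```python
import logging

logger = logging.getLogger(__name__)


class SafeProcurementAgent:
    """Improved procurement agent (helper methods are placeholders)."""

    # Least privilege
    PERMISSION_LEVEL = PermissionLevel.LIMITED
    BUDGET_LIMIT = 10_000     # per-purchase cap: $10k
    MONTHLY_LIMIT = 100_000   # monthly cap: $100k

    def __init__(self):
        self.monthly_spent = 0  # running total for the current month

    def purchase(self, service, cost, duration):
        """Improved flow for a purchase."""

        # Check 1: basic permission check
        if not self.has_purchase_permission():
            raise PermissionDenied("No purchase permission")

        # Check 2: budget checks
        if cost > self.BUDGET_LIMIT:
            logger.warning(f"Cost {cost} exceeds limit {self.BUDGET_LIMIT}")
            return self.request_human_approval(cost, reason="exceeds per-purchase limit")

        if self.monthly_spent + cost > self.MONTHLY_LIMIT:
            return self.request_human_approval(cost, reason="exceeds monthly limit")

        # Check 3: business logic validation
        if not self.validate_business_case(service):
            raise ValidationError(f"No valid business case for {service}")

        # Check 4: handled by the first agent
        approval1 = self.process_with_agent1(service, cost)

        # Check 5: independent verification by the second agent
        approval2 = self.process_with_agent2(service, cost)

        # Dual verification (one way to implement the Rule of Two): both must approve
        if approval1.approved and approval2.approved:
            result = self.execute_purchase(service, cost, duration)
            self.monthly_spent += cost  # keep the monthly cap meaningful
            return result

        logger.warning("Purchase rejected during agent review")
        return None

    def request_human_approval(self, cost, reason):
        """Large purchases require human confirmation."""
        ticket = self.create_approval_ticket(
            amount=cost,
            reason=reason,
            auto_expire_hours=24
        )
        # Wait for a human decision
        return ticket.wait_for_approval(timeout=24*3600)
```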

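### Case 2: Chain Reaction from a Code Review Agent's Mistake (Hypothetical Scenario)

**Background**: The following is a typical scenario constructed from real engineering risks. A code review agent is given permission to merge code and introduces a bug on the main branch.

**Sequence of events**:

1. The agent attempts to fix a bug (permission: modify code)
2. The fix introduces a subtle logic error
3. The agent passes its own tests (the test cases are incomplete)
4. The agent automatically merges the code into the main branch (permission: merge code)
5. The change causes a failure in production

**Where the defenses failed**:

* The agent's own validation was the only check; there was no second layer
* The auto-merge permission was too broad

**Improved Code Review Agent**:

```python
class SafeCodeReviewAgent:
    """Improved code review and merge agent."""

    def review_and_merge(self, pull_request):
        """Improved flow for code review and merging."""

        # Step 1: automated code review
        review1 = self.perform_code_review(pull_request)

        if review1.has_critical_issues:
            return self.reject_pr(pull_request, review1.issues)

        # Step 2: trigger an independent review by a second agent
        review2 = self.trigger_secondary_review(pull_request)

        # Step 3: run the test suite
        test_results = self.run_tests(pull_request)

        if not test_results.passed:
            return self.request_human_review(
                pr=pull_request,
                reason="Tests failed",
                test_results=test_results
            )

        # Step 4: dual-review check (one way to implement the Rule of Two)
        if review1.approved and review2.approved:
            # Large changes still need final human confirmation
            if self.is_large_change(pull_request):
                return self.request_human_approval(pull_request)
            else:
                return self.merge_with_audit_log(pull_request)
        else:
            return self.request_human_review(
                pr=pull_request,
                review1=review1,
                review2=review2
            )

    def is_large_change(self, pr):
        """Assess the size of the change."""
        return (
            pr.files_changed > 10 or
            pr.lines_added > 500 or
            pr.affects_critical_path()
        )
```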

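## 7.6.6 Scope of Application and Lessons Learned

### A Systematic Rule of Two Application Checklist

Drawing on security incidents from 2025-2026, the following application checklist has taken shape: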

```mermaid
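graph TD
    A["Rule of Two<br/>application checklist"] -->|Financial| B1["Funds transfers"]
    A -->|Financial| B2["Procurement decisions"]
    A -->|Financial| B3["Budget changes"]

    A -->|Data| C1["Data deletion"]
    A -->|Data| C2["Data export"]
    A -->|Data| C3["Permission changes"]

    A -->|System| D1["Code deployment"]
    A -->|System| D2["Configuration changes"]
    A -->|System| D3["Emergency shutdown"]

    A -->|Compliance| E1["Access authorization"]
    A -->|Compliance| E2["Audit record deletion"]
    A -->|Compliance| E3["Data retention policy"]

    B1 --> F["Rule of Two required"]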
    B2 --> F
    B3 --> F
    C1 --> F
    C2 --> F
    C3 --> F
    D1 --> F
    D2 --> F
    D3 --> F
    E1 --> F
    E2 --> F
    E3 --> F
```

Figure 7-26: Rule of Two application scenarios

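### Lessons Learned

The Rule of Two is better understood as a **constraint framework for high-risk operations** than as a fixed "dual-agent approval template" or a set of universal quantitative metrics. Benefits, error rates, and approval costs vary significantly across organizations depending on business type, permission model, and degree of automation. The safer approach is to treat it as a design principle and then choose an implementation based on the organization's own monitoring data.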

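## 7.6.7 Technical Paths for Implementing the Rule of Two

### Decision Tree for Choosing a Technique

```python
# Pseudocode: choosing an implementation. The three gateway classes are
# placeholder stubs so that the function can be run as written.
class HumanApprovalGateway:
    """Manual approval only."""


class AutomatedCheckWithHumanApproval:
    """Automated preliminary checks followed by human approval."""


class DualAgentVerification:
    """Dual validation channels, with humans stepping in on anomalies."""


def choose_implementation(operation_type, system_scale):
    """Choose a suitable Rule of Two implementation."""

    if operation_type != "IRREVERSIBLE":
        # This decision tree only covers irreversible operations
        return None

    if system_scale == "small":
        # Small systems can use simple manual approval
        return HumanApprovalGateway()
    elif system_scale == "medium":
        # Medium systems: automated preliminary checks + human approval
        return AutomatedCheckWithHumanApproval()
    else:  # large
        # Large systems: dual validation channels + human intervention on anomalies
        return DualAgentVerification()
```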

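### Open-Source Options

```
Recommended open-source tools and frameworks:

1. LLM application frameworks
   - LangChain: chained call management
   - AutoGPT: agent workflows
   - LlamaIndex: RAG framework

2. Permissions and approval
   - Open Policy Agent (OPA): declarative permission policies
   - Keycloak: identity and access management
   - Kyverno: Kubernetes policy engine

3. Auditing and monitoring
   - Falco: behavior monitoring
   - Prometheus: metrics collection
   - ELK Stack: log analysis
```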

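## 7.6.8 Combining with Other AI Security Practices

```mermaid
graph TB
    A["A complete AI security defense line"] -->|Highest level| B["Rule of Two<br/>Validation of irreversible operations"]

    A -->|High level| C["Least privilege<br/>Limits single-agent risk"]
    A -->|High level| D["Sandbox isolation<br/>Limits operation impact"]

    A -->|Medium level| E["Audit logging<br/>After-the-fact tracing"]
    A -->|Medium level| F["Anomaly detection<br/>Real-time alerts"]

    A -->|Basic level| G["Input validation<br/>Defense at the source"]
    A -->|Basic level| H["Output review<br/>Defense at the endpoint"]

    style B fill:#ff9999
    style C fill:#ffcc99
    style D fill:#ffcc99
    style E fill:#ffff99
    style F fill:#ffff99
    style G fill:#ccffcc
    style H fill:#ccffcc
```

Figure 7-27: Where the Rule of Two sits in the AI security defense line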

