# The Authoritative Guide to Large Model Security | AI Security Guide

## Docs

- [The Authoritative Guide to Large Model Security](https://yeasy.gitbook.io/ai_security_guide/readme.md)
- [Chapter 1: Introduction to Large Language Model Security](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro.md)
- [1.1 Overview of Large Language Models](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro/1.1_llm_overview.md)
- [1.2 Why Large Language Model Security Matters](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro/1.2_why_security_matters.md)
- [1.3 Similarities and Differences Between Large Language Model Security and Traditional Security](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro/1.3_llm_vs_traditional.md)
- [1.4 The Large Language Model Security Threat Landscape](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro/1.4_threat_landscape.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/01_intro/summary.md)
- [Chapter 2: Fundamentals of Large Language Model Security](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals.md)
- [2.1 Large Language Model Architecture and Security Boundaries](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/2.1_architecture_boundary.md)
- [2.2 Security Considerations During Training](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/2.2_training_security.md)
- [2.3 Security Challenges at Inference Time](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/2.3_inference_security.md)
- [2.4 Introduction to Safety Alignment Techniques](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/2.4_alignment_intro.md)
- [2.5 In-Depth Analysis of Reasoning Model Security](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/2.5_reasoning_model_security.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/02_fundamentals/summary.md)
- [Chapter 3: Security Frameworks and Standards](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks.md)
- [3.1 The OWASP Top 10 Risks for Large Language Models Explained](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks/3.1_owasp_top10.md)
- [3.2 The NIST AI Risk Management Framework](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks/3.2_nist_framework.md)
- [3.3 Industry Security Standards and Best Practices](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks/3.3_industry_standards.md)
- [3.4 MITRE ATLAS: The Adversarial Tactics and Techniques Matrix for AI Systems](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks/3.4_mitre_atlas.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-yi-bu-fen-ji-chu-pian/03_frameworks/summary.md)
- [Chapter 4: Prompt Injection Attacks and Defenses](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection.md)
- [4.1 Principles and Classification of Prompt Injection](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.1_principles.md)
- [4.2 Direct Prompt Injection Techniques](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.2_direct_injection.md)
- [4.3 Indirect Prompt Injection Techniques](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.3_indirect_injection.md)
- [4.4 Analysis of Public Cases and Research Demonstrations](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.4_case_studies.md)
- [4.5 Layered Defense: Building a Reproducible Security Gating Architecture](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.5_injection_defense.md)
- [4.6 Security Risks and Defenses Specific to Long Contexts](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/4.6_long_context_risks.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/04_prompt_injection/summary.md)
- [Chapter 5: Jailbreak Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak.md)
- [5.1 Overview of Jailbreak Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.1_jailbreak_overview.md)
- [5.2 Anatomy of Classic Jailbreak Techniques](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.2_classic_techniques.md)
- [5.3 Multimodal Jailbreak Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.3_multimodal_attacks.md)
- [5.4 Jailbreak Detection and Defense in Practice](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.4_jailbreak_defense.md)
- [5.5 A Multimodal Security Defense System](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.5_multimodal_defense.md)
- [5.6 A Complete Comparison of Automated Jailbreak Methodologies](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/5.6_automated_jailbreak_methods.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/05_jailbreak/summary.md)
- [Chapter 6: Data and Model Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks.md)
- [6.1 Training Data Poisoning](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.1_data_poisoning.md)
- [6.2 Backdoor Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.2_backdoor_attacks.md)
- [6.3 Model Theft and Reverse Engineering](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.3_model_extraction.md)
- [6.4 Membership Inference and Privacy Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.4_privacy_attacks.md)
- [6.5 Discrete Adversarial Attacks and Model Robustness](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.5_adversarial_robustness.md)
- [6.6 Security Risks of Fine-Tuning and PEFT](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.6_finetuning_peft_security.md)
- [6.7 Malicious Model Artifacts and Deserialization Attacks](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/6.7_malicious_model_artifacts.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/06_data_model_attacks/summary.md)
- [Chapter 7: Agent and RAG Security](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security.md)
- [7.1 Security Risks of Agent Systems](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.1_agent_risks.md)
- [7.2 Attack Surface Analysis of RAG Architectures](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.2_rag_attacks.md)
- [7.3 Tool Calling Security](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.3_tool_security.md)
- [7.4 Agent Skills and Ecosystem Security](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.4_agent_skills.md)
- [7.5 Security Architecture for Multi-Agent Collaboration Systems](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.5_multi_agent_security.md)
- [7.6 Agents Rule of Two and Agent Security Design Principles](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/7.6_agents_rule_of_two.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-er-bu-fen-gong-ji-pian/07_agent_rag_security/summary.md)
- [Chapter 8: Security Architecture Design](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture.md)
- [8.1 The Defense-in-Depth Principle](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.1_defense_depth.md)
- [8.2 Security Architecture Patterns for Large Language Models](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.2_architecture_patterns.md)
- [8.3 Permissions and Access Control](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.3_access_control.md)
- [8.4 The Secure Development Lifecycle](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.4_security_sdlc.md)
- [8.5 Privacy-Enhancing Technologies and Data Protection](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.5_privacy_enhancing.md)
- [8.6 Supply Chain and Infrastructure Environment Security](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/8.6_supply_chain.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/08_architecture/summary.md)
- [Chapter 9: Input and Output Security Protection](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection.md)
- [9.1 Input Validation and Filtering](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/9.1_input_validation.md)
- [9.2 Output Content Moderation](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/9.2_output_moderation.md)
- [9.3 Sensitive Information Protection](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/9.3_sensitive_data.md)
- [9.4 AI-Generated Content Detection and Watermarking](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/9.4_watermarking_detection.md)
- [9.5 Next-Generation Constitutional Classifiers: Cascade Architecture and Activation Pattern Detection](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/9.5_constitutional_classifiers.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/09_io_protection/summary.md)
- [Chapter 10: Security Operations and Monitoring](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations.md)
- [10.1 The Security Monitoring System](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.1_monitoring.md)
- [10.2 Anomaly Detection and Alerting](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.2_anomaly_detection.md)
- [10.3 Runtime Security and Incident Response](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.3_incident_response.md)
- [10.4 Red Team Exercises and Automated Evaluation](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.4_red_teaming.md)
- [10.5 Service Degradation and Fallback Strategies](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.5_fallback_strategy.md)
- [10.6 DeepTeam and the Modern Red Team Toolchain](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.6_modern_redteam_tools.md)
- [10.7 Covert Sabotage Detection: The SHADE-Arena Benchmark and Agent Monitoring](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/10.7_sabotage_monitoring.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-san-bu-fen-fang-yu-pian/10_operations/summary.md)
- [Chapter 11: Security Governance and Future Outlook](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance.md)
- [11.1 AI Regulations and Compliance Requirements](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.1_regulations.md)
- [11.2 Responsible AI Practices](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.2_responsible_ai.md)
- [11.3 Emerging Threat Trends](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.3_emerging_threats.md)
- [11.4 Agentic Misalignment Threats: From Stress Testing to Protection Frameworks](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.4_agentic_misalignment.md)
- [11.5 A Large Language Model Security Maturity Model](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.5_maturity_model.md)
- [11.6 Future Directions in Security Technology](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.6_future.md)
- [11.7 An Actionable Guide to AI Security Compliance](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.7_compliance_operational.md)
- [11.8 A Trustworthy Agent Framework: Five Core Principles and Ecosystem Standardization](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/11.8_trustworthy_agents.md)
- [Chapter Summary](https://yeasy.gitbook.io/ai_security_guide/di-si-bu-fen-zhi-li-yu-zhan-wang/11_governance/summary.md)
- [Appendix](https://yeasy.gitbook.io/ai_security_guide/fu-lu/12_appendix.md)
- [Appendix A: Glossary](https://yeasy.gitbook.io/ai_security_guide/fu-lu/12_appendix/a_glossary.md)
- [Appendix B: Security Tools and Resources](https://yeasy.gitbook.io/ai_security_guide/fu-lu/12_appendix/b_tools.md)
- [Appendix C: References](https://yeasy.gitbook.io/ai_security_guide/fu-lu/12_appendix/c_references.md)


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information, you can query the documentation dynamically by asking a question.
Perform an HTTP GET request on a page URL with the `ask` query parameter:
```
GET https://yeasy.gitbook.io/ai_security_guide/readme.md?ask=<question>
```
The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.
Use this mechanism when the answer is not explicitly present in the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
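
The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_ask_url` and `ask` are hypothetical, the page URL is assumed to carry no existing query string, and the response is assumed to be UTF-8 text, since the response format is not specified here.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Page URL taken from the example request above.
README_URL = "https://yeasy.gitbook.io/ai_security_guide/readme.md"


def build_ask_url(page_url: str, question: str) -> str:
    """Attach the question to page_url as the `ask` query parameter.

    Hypothetical helper; assumes page_url has no query string of its own.
    """
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return f"{page_url}?{urlencode({'ask': question}, quote_via=quote)}"


def ask(page_url: str, question: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the response body as text.

    Hypothetical helper; decoding as UTF-8 is an assumption.
    """
    with urlopen(build_ask_url(page_url, question), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Prints the URL that would be requested; call ask(...) to actually send it.
    print(build_ask_url(README_URL, "What is indirect prompt injection?"))
```

Percent-encoding the question matters because natural-language questions usually contain spaces and punctuation such as `?`, which would otherwise be misread as part of the URL syntax.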
