Mechanisms of AI Harm: Lessons Learned from AI Incidents

Information Technology · 2025-10-30 · Center for Security and Emerging Technology (CSET) · SoftGreen

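Core view: The report analyzes the multiple mechanisms through which artificial intelligence (AI) causes harm and highlights the shortcomings of current prevention measures. It argues that no single prevention approach can address AI risk effectively; different harm mechanisms call for different sociotechnical approaches.

Key data: Drawing on more than 200 real-world cases from the AI Incident Database (AIID), the report identifies six main mechanisms of AI harm, split into intentional harm (harm by design, AI misuse, attacks on AI systems) and unintentional harm (AI failures, failures of human oversight, integration harm).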

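Research conclusions:

Diverse prevention strategies: Because AI harm arises through many pathways, prevention strategies must be equally diverse. Purely technical solutions cannot cover every case; a broad set of sociotechnical measures is needed, ranging from testing and evaluation protocols to participatory AI adoption processes.
Model capability is not the decisive factor: To date, the risk of AI harm correlates only weakly with model capabilities. Even single-purpose AI systems can cause harm. Attention should therefore go to the context, manner, and governance of AI design, testing, and deployment, not only to model capabilities.
Importance of comprehensive incident tracking: Comprehensive incident tracking is essential for understanding AI risks and developing effective mitigation strategies. As AI technology evolves, new risks and forms of harm will keep emerging, so effective learning mechanisms are needed to adapt and respond quickly.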

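Six mechanisms of AI harm:

Intentional harm:
- Harm by design: AI systems designed for the purpose of causing harm, for example AI systems used in military or law enforcement settings, and deepfake applications.
- AI misuse: users apply an AI system for purposes outside the developer's intent, for example generating malware or conducting phishing attacks.
- Attacks on AI systems: attackers target an AI system to bypass its safeguards, for example using prompt injection to generate harmful content.
Unintentional harm:
- AI failures: an AI system produces errors, malfunctions, degraded performance, or biased outputs, for example the racial bias in the COMPAS risk assessment algorithm.
- Failures of human oversight: human overseers fail to detect an AI system's anomalous behavior, bias, or performance problems, for example the UK government canceling a large number of immigration visas based on a flawed speech recognition system.
- Integration harm: harm caused by deploying an AI system in an unsuitable context; even when the system itself works as intended, it can have unintended consequences, for example search engine algorithms amplifying the spread of vaccine misinformation.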
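
For readers who want to tag incidents with this taxonomy, the six mechanisms and their two intent categories can be written down as a small data structure. The following is only a minimal sketch: the names `Intent`, `Mechanism`, and `count_by_intent` are illustrative assumptions rather than anything defined by the report or by the AI Incident Database, and the example labels simply reuse the cases mentioned in the list above.

```python
from collections import Counter
from enum import Enum


class Intent(Enum):
    INTENTIONAL = "intentional"
    UNINTENTIONAL = "unintentional"


class Mechanism(Enum):
    # Each member stores (readable label, intent category).
    HARM_BY_DESIGN = ("harm by design", Intent.INTENTIONAL)
    AI_MISUSE = ("AI misuse", Intent.INTENTIONAL)
    ATTACK_ON_AI_SYSTEM = ("attack on AI system", Intent.INTENTIONAL)
    AI_FAILURE = ("AI failure", Intent.UNINTENTIONAL)
    HUMAN_OVERSIGHT_FAILURE = ("failure of human oversight", Intent.UNINTENTIONAL)
    INTEGRATION_HARM = ("integration harm", Intent.UNINTENTIONAL)

    @property
    def label(self) -> str:
        return self.value[0]

    @property
    def intent(self) -> Intent:
        return self.value[1]


def count_by_intent(mechanisms):
    """Count how many tagged incidents fall under each intent category."""
    return Counter(m.intent for m in mechanisms)


# Hypothetical tagging of the examples named in the summary (not real AIID records).
examples = {
    "COMPAS risk assessment bias": Mechanism.AI_FAILURE,
    "UK visa cancellations": Mechanism.HUMAN_OVERSIGHT_FAILURE,
    "vaccine misinformation in search": Mechanism.INTEGRATION_HARM,
    "prompt injection": Mechanism.ATTACK_ON_AI_SYSTEM,
}
counts = count_by_intent(examples.values())
```

Keeping the intent category on each mechanism, instead of in a separate lookup table, means a tag cannot be assigned to the wrong category by accident.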

Executive Summary

With recent advancements in artificial intelligence, particularly powerful generative models, private and public sector actors have heralded the benefits of incorporating AI more prominently into our daily lives. Frequently cited benefits include increased productivity, efficiency, and personalization. However, the harm caused by AI remains to be more fully understood. As a result of wider AI deployment and use, the number of AI harm incidents has surged in recent years, suggesting that current approaches to harm prevention may be falling short. This report argues that this is due to a limited understanding of how AI risks materialize in practice. Leveraging AI incident reports from the AI Incident Database, it analyzes how AI deployment results in harm and identifies six key mechanisms that describe this process (Table 1).

A review of AI incidents associated with these mechanisms leads to several key takeaways that should inform AI governance approaches in the future.

1. A one-size-fits-all approach to harm prevention will fall short. This report illustrates the diverse pathways to AI harm and the wide range of actors involved. Effective mitigation requires an equally diverse response strategy that includes sociotechnical approaches. Adopting model-based approaches alone could especially neglect integration harms and failures of human oversight.
2. To date, risk of harm correlates only weakly with model capabilities. This report illustrates many instances of harm that implicate single-purpose AI systems. Yet many policy approaches use broad model capabilities, often proxied by computing power, as a predictor for the propensity to do harm. This fails to mitigate the significant risk associated with the irresponsible design, development, and deployment of less powerful AI systems.
3. Tracking AI incidents offers invaluable insights into real AI risks and helps build response capacity. Technical innovation, experimentation with new use cases, and novel attack strategies will result in new AI harm incidents in the future. Keeping pace with these developments requires rapid adaptation and agile responses. Comprehensive AI incident reporting allows for learning and adaptation at an accelerated pace, enabling improved mitigation strategies and identification of novel AI risks as they emerge. Incident reporting must be recognized as a critical policy tool to address AI risks.

Table of Contents

- Executive Summary
- Introduction
- Methodology
  - Limitations
- AI Harm Mechanisms
  - Intentional Harm
    - Harm by Design
    - AI Misuse
    - Attacks on AI Systems
  - Unintentional Harm
    - AI Failures
    - Failures of Human Oversight
    - Integration Harm
- Discussion
- Conclusion
- Appendix
- Authors
- Acknowledgments
