REPORT

人工智能风险相关术语研究(一)
Glossary Research on Artificial Intelligence Risks (Part I)

CISS 人工智能与国际安全项目术语工作组
By CISS Working Group on Artificial Intelligence Glossary

编者按:2025 年 2 月 13-14 日,清华大学战略与安全研究中心与美国布鲁金斯学会在慕尼黑举行第 12 轮“中美人工智能与国际安全二轨对话”。在术语讨论环节,双方共同提出 20 个人工智能风险相关的概念术语进行探讨交流,并就其中 9 个术语给出了各自的具体解释,以此增进彼此对人工智能潜在安全风险的认知和理解。

On February 13-14, 2025, the Center for International Security and Strategy (CISS) of Tsinghua University and the Brookings Institution held the 12th round of the “China-U.S. Track 2 Dialogue on Artificial Intelligence and International Security” in Munich. During the Glossary Session, both sides proposed 20 AI risk-related terms for exchange, and then provided their respective definitions for nine of these terms to enhance mutual understanding of the potential security risks posed by artificial intelligence.

●灾难性风险

灾难性风险特指破坏力极强、远超常规应对能力的极端事件,其核心特征可归纳为三点:破坏烈度极高、影响范围广泛、持续时间漫长。这类风险既可能源自自然力量(如超级火山爆发),也可能源于人为因素(如核战争),其破坏性不仅会摧毁社会基础架构,更将引发跨地域、跨代际的深远影响。

在人工智能领域,该类风险可能体现为两种形态:其一,系统因复杂架构产生不可预测的突发行为;其二,技术遭恶意滥用导致大规模危害。其形成机制通常源于两大矛盾,即技术迭代速度超越人类监管能力,或现有治理体系无法有效约束技术发展使其符合社会整体利益。

●Catastrophic risk

Catastrophic risk refers to events with exceptional destructive potential that exceeds ordinary response capacities, characterized by significant magnitude, geographic scope, and temporal persistence. Such risks, whether natural or anthropogenic in origin, disrupt fundamental societal structures and produce multi-generational consequences that transcend jurisdictional boundaries.

In the artificial intelligence domain, catastrophic risks emerge when systems potentially inflict widespread harm through either complexity-induced emergent behaviors or deliberate misuse. These scenarios typically develop through two primary mechanisms: technical systems evolving beyond effective human oversight, or governance frameworks proving insufficient to ensure responsible development aligned with broader societal interests.
●生存风险

作为风险谱系中最严峻的类别,生存风险直接威胁人类物种存续,可能彻底阻断文明发展轨迹。相较于灾难性风险,这类风险(如小行星撞击、超级病原体或失控 AI)具有不可逆特性,能够彻底消弭文明复苏的可能,根本上颠覆人类赖以生存的基础条件。

在人工智能领域,该类风险可能出现的情景包括:系统具备超越人类认知边界的智能水平、文明演进方向出现根本性偏移、人类在关键领域丧失控制权,或有效治理框架之外运行的自主能力所导致的战略不稳定。

●Existential risk

Existential risk refers to threats capable of causing human extinction or permanently curtailing humanity’s developmental trajectory, representing the terminal category within risk taxonomies. Such threats, whether cosmic, ecological, biological, or technological, are distinguished from catastrophic risks by their irreversible elimination of recovery pathways, potential to extinguish humanity entirely, and capacity to fundamentally alter parameters necessary for continued human existence.

In artificial intelligence contexts, existential risks arise where systems potentially undermine humanity’s continued existence or drastically constrain its future potential. These scenarios typically emerge through mechanisms including intellectual capabilities surpassing human comprehension, fundamental transformation of civilizational trajectory, permanent displacement of human agency in critical domains, or strategic instabilities resulting from autonomous capabilities operating beyond effective governance frameworks.
●人工智能可控性

人工智能可控性是指人工智能系统在设定的条件(场景)和规定时间内,通过一系列策略、机制和技术保障,持续、稳定、可靠且安全地完成预定义的任务 / 功能的能力。特别是在发生故障时,它应能够:(1)即时隔离,切断故障模块,防止故障级联扩散;(2)维持安全状态,可即时切换到备用模式并保留基本功能;(3)具有可恢复性,在快速恢复后可重新投入任务。在当前条件下,鉴于前沿人工智能系统的“黑箱”技术特征,以及人类价值观的多元,实现对前沿人工智能系统实施完全控制面临较大挑战。因此,需要更全面的风险缓解策略和敏捷治理方法。

●AI Controllability

AI controllability refers to the ability of an artificial intelligence system to continuously, stably, reliably, and safely complete predefined tasks/functions under specified conditions (scenarios) and within a specified time frame, through a series of strategies, mechanisms, and technical safeguards. Particularly in the event of a failure, the system should be able to: (1) isolate instantly, cutting off the faulty module to prevent cascading failures; (2) maintain a safe state, switching immediately to a backup mode while preserving basic functions; (3) recover rapidly, re-entering tasks after quick repairs. Under current conditions, given the “black box” technical characteristics of frontier AI systems and the plurality of human values, achieving full control over frontier AI systems faces considerable challenges. Therefore, more comprehensive risk mitigation strategies and agile governance approaches are needed.
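The three failure-handling properties named in the definition — instant isolation, safe-state fallback, and recoverability — can be sketched as a minimal supervisory loop. This is an illustrative sketch only: the `Supervisor` class, its module names, and its state machine are assumptions introduced here for clarity, not a mechanism described in the source.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    SAFE_MODE = auto()   # degraded backup mode: basic functions only
    # (recovery returns the system to NORMAL once no module is isolated)


class Supervisor:
    """Hypothetical controller illustrating the three properties:
    (1) instant isolation, (2) safe-state fallback, (3) recoverability."""

    def __init__(self, modules):
        self.modules = dict(modules)   # module name -> healthy flag
        self.isolated = set()
        self.state = State.NORMAL

    def report_fault(self, name):
        # (1) Immediately isolate the faulty module so the fault
        #     cannot cascade into other modules.
        self.isolated.add(name)
        self.modules[name] = False
        # (2) Fall back to a backup mode that preserves basic functions.
        self.state = State.SAFE_MODE

    def repair(self, name):
        # (3) After a quick repair, re-admit the module; resume normal
        #     operation once no modules remain isolated.
        self.modules[name] = True
        self.isolated.discard(name)
        if not self.isolated:
            self.state = State.NORMAL


# Example: a planning fault triggers safe mode; repair restores operation.
sup = Supervisor({"perception": True, "planning": True})
sup.report_fault("planning")
assert sup.state is State.SAFE_MODE and "planning" in sup.isolated
sup.repair("planning")
assert sup.state is State.NORMAL
```

Real systems would replace these flags with watchdogs, redundant channels, and health checks, but the control flow — isolate, degrade safely, then recover — follows the three-step structure in the definition above.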