REPORT

人工智能风险相关术语研究(一)
Glossary Research on Artificial Intelligence Risks (Part I)

CISS 人工智能与国际安全项目术语工作组
By CISS Working Group on Artificial Intelligence Glossary

编者按:2025 年 2 月 13-14 日,清华大学战略与安全研究中心与美国布鲁金斯学会在慕尼黑举行第 12 轮“中美人工智能与国际安全二轨对话”。在术语讨论环节,双方共同提出 20 个人工智能风险相关的概念术语进行探讨交流,并就其中 9 个术语给出了各自的具体解释,以此增进彼此对人工智能潜在安全风险的认知和理解。

On February 13-14, 2025, the Center for International Security and Strategy (CISS) of Tsinghua University and the Brookings Institution held the 12th round of the “China-U.S. Track 2 Dialogue on Artificial Intelligence and International Security” in Munich. During the Glossary Session, both sides proposed 20 AI risk-related terms for exchange, and then provided their respective definitions for nine of these terms to enhance mutual understanding of the potential security risks posed by artificial intelligence.

●灾难性风险

灾难性风险特指破坏力极强、远超常规应对能力的极端事件,其核心特征可归纳为三点:破坏烈度极高、影响范围广泛、持续时间漫长。这类风险既可能源自自然力量(如超级火山爆发),也可能源于人为因素(如核战争),其破坏性不仅会摧毁社会基础架构,更将引发跨地域、跨代际的深远影响。

在人工智能领域,该类风险可能体现为两种形态:其一,系统因复杂架构产生不可预测的突发行为;其二,技术遭恶意滥用导致大规模危害。其形成机制通常源于两大矛盾,即技术迭代速度超越人类监管能力,或现有治理体系无法有效约束技术发展使其符合社会整体利益。

●Catastrophic risk

Catastrophic risk refers to events with exceptional destructive potential that exceeds ordinary response capacities, characterized by significant magnitude, geographic scope, and temporal persistence. Such risks, whether natural or anthropogenic in origin, disrupt fundamental societal structures and produce multi-generational consequences that transcend jurisdictional boundaries.

In the artificial intelligence domain, catastrophic risks emerge when systems potentially inflict widespread harm through either complexity-induced emergent behaviors or deliberate misuse. These scenarios typically develop through two primary mechanisms: technical systems evolving beyond effective human oversight, or governance frameworks proving insufficient to ensure responsible development aligned with broader societal interests.
●生存风险

作为风险谱系中最严峻的类别,生存风险直接威胁人类物种存续,可能彻底阻断文明发展轨迹。相较于灾难性风险,这类风险(如小行星撞击、超级病原体或失控 AI)具有不可逆特性,能够彻底消弭文明复苏的可能,根本上颠覆人类赖以生存的基础条件。

在人工智能领域,该类风险可能出现的情景包括:系统具备超越人类认知边界的智能水平、文明演进方向出现根本性偏移、人类在关键领域丧失控制权,或有效治理框架之外运行的自主能力所导致的战略不稳定。

●Existential risk

Existential risk refers to threats capable of causing human extinction or permanently curtailing humanity’s developmental trajectory, representing the terminal category within risk taxonomies. Such threats, whether cosmic, ecological, biological, or technological, are distinguished from catastrophic risks by their irreversible elimination of recovery pathways, potential to extinguish humanity entirely, and capacity to fundamentally alter parameters necessary for continued human existence.

In artificial intelligence contexts, existential risks arise where systems potentially undermine humanity’s continued existence or drastically constrain its future potential. These scenarios typically emerge through mechanisms including intellectual capabilities surpassing human comprehension, fundamental transformation of civilizational trajectory, permanent displacement of human agency in critical domains, or strategic instabilities resulting from autonomous capabilities operating beyond effective governance frameworks.
●人工智能可控性

人工智能可控性是指人工智能系统在设定的条件(场景)和规定时间内,通过一系列策略、机制和技术保障,持续、稳定、可靠且安全地完成预定义的任务 / 功能的能力。特别是在发生故障时,它应能够:(1)即时隔离,切断故障模块,防止故障级联扩散;(2)维持安全状态,可即时切换到备用模式并保留基本功能;(3)具有可恢复性,在快速恢复后可重新投入任务。在当前条件下,鉴于前沿人工智能系统的“黑箱”技术特征,以及人类价值观的多元,实现对前沿人工智能系统实施完全控制面临较大挑战。因此,需要更全面的风险缓解策略和敏捷治理方法。

●AI Controllability

AI controllability refers to the ability of an artificial intelligence system to continuously, stably, reliably, and safely complete predefined tasks/functions under specified conditions (scenarios) and within a specified time frame, through a series of strategies, mechanisms, and technical safeguards. Particularly in the event of a failure, the system should be able to: (1) isolate instantly, cutting off the faulty module to prevent cascading failures; (2) maintain a safe state, switching immediately to a backup mode while preserving basic functions; (3) recover rapidly, re-entering tasks after quick repairs. Under current conditions, given the “black box” technical characteristics of frontier AI systems and the plurality of human values, achieving full control over frontier AI systems faces considerable challenges. Therefore, more comprehensive risk mitigation strategies and agile governance approaches are needed.
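The three failure-handling properties named in the definition — instant isolation, safe-state fallback, and recoverability — can be sketched as a minimal supervisory loop. This is an illustrative sketch only: the `Supervisor` class, its module names, and its state machine are assumptions introduced here for clarity, not a mechanism described in the source.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    SAFE_MODE = auto()   # degraded backup mode: basic functions only
    # (recovery returns the system to NORMAL once no module is isolated)


class Supervisor:
    """Hypothetical controller illustrating the three properties:
    (1) instant isolation, (2) safe-state fallback, (3) recoverability."""

    def __init__(self, modules):
        self.modules = dict(modules)   # module name -> healthy flag
        self.isolated = set()
        self.state = State.NORMAL

    def report_fault(self, name):
        # (1) Immediately isolate the faulty module so the fault
        #     cannot cascade into other modules.
        self.isolated.add(name)
        self.modules[name] = False
        # (2) Fall back to a backup mode that preserves basic functions.
        self.state = State.SAFE_MODE

    def repair(self, name):
        # (3) After a quick repair, re-admit the module; resume normal
        #     operation once no modules remain isolated.
        self.modules[name] = True
        self.isolated.discard(name)
        if not self.isolated:
            self.state = State.NORMAL


# Example: a planning fault triggers safe mode; repair restores operation.
sup = Supervisor({"perception": True, "planning": True})
sup.report_fault("planning")
assert sup.state is State.SAFE_MODE and "planning" in sup.isolated
sup.repair("planning")
assert sup.state is State.NORMAL
```

Real systems would replace these flags with watchdogs, redundant channels, and health checks, but the control flow — isolate, degrade safely, then recover — follows the three-step structure in the definition above.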