About the Organization: The Future of Life Institute (FLI) is an independent nonprofit organization with the goal of reducing large-scale risks and steering transformative technologies to benefit humanity, with a particular focus on artificial intelligence (AI). Learn more at futureoflife.org.

Contents

1 Executive Summary ... 2
  1.1 Key Findings ... 2
  1.2 Improvement opportunities by company ... 3
  1.3 Methodology ... 4
  1.4 Independent review panel ... 5
2 Introduction ... 6
3 Methodology ... 7
  3.1 Companies Assessed ... 7
  3.2 Index Design and Structure ... 7
  3.3 Related Work and Incorporated Research ... 10
  3.4 Data Sources and Evidence Collection ... 10
  3.5 Grading Process and Expert Review ... 11
  3.6 Limitations ... 11
4 Results ... 13
  4.1 Key Findings ... 13
  4.2 Improvement opportunities by company ... 14
  4.3 Domain-level findings ... 15
5 Conclusions ... 20
Appendix A: Grading Sheets ... 21
  Risk Assessment ... 22
  Current Harms ... 33
  Safety Frameworks ... 41
  Existential Safety ... 48
  Governance & Accountability ... 59
  Information Sharing ... 71
Appendix B: Company Survey ... 85
  Introduction ... 85
  Whistleblowing Policies (15 Questions) ... 86
  External Pre-Deployment Safety Testing (6 Questions) ... 91
  Internal Deployments (3 Questions) ... 94
  Safety Practices, Frameworks, and Teams (9 Questions) ... 95

1 Executive Summary

The Future of Life Institute's AI Safety Index provides an independent assessment of seven leading AI companies' efforts to manage both immediate harms and catastrophic risks from advanced AI systems. Conducted with an expert review panel of distinguished AI researchers and governance specialists, this second evaluation reveals an industry struggling to keep pace with its own rapid capability advances, with critical gaps in risk management and safety planning that threaten our ability to control increasingly powerful AI systems.

Grading: Grade boundaries follow the US GPA system: letter values A+, A, A-, B+, [...], F correspond to numerical values 4.3, 4.0, 3.7, 3.3, [...], 0.

1.1 Key Findings

• Anthropic gets the best overall grade (C+). The firm led on risk assessments, conducting the only human-participant bio-risk trials, excelled in privacy by not training on user data, conducted world-leading alignment research, delivered strong safety benchmark performance, and demonstrated governance commitment through its Public Benefit Corporation structure and proactive risk communication.

• OpenAI secured second place ahead of Google DeepMind. OpenAI distinguished itself as the only company to publish its whistleblowing policy, outlined a more robust risk management approach in its safety framework, and assessed risks on pre-mitigation models. The company also shared more details on external model evaluations, provided a detailed model specification, regularly disclosed instances of malicious misuse, and engaged comprehensively with the AI Safety Index survey.

• The industry is fundamentally unprepared for its own stated goals. Companies claim they will achieve artificial general intelligence (AGI) within the decade, yet none scored above D in Existential Safety planning. One reviewer called this disconnect “deeply disturbing,” noting that despite racing toward human-level AI, “none of the companies has anything like a coherent, actionable plan” for ensuring such systems remain safe and controllable.
• Only 3 of 7 firms report substantive testing for dangerous capabilities linked to large-scale risks such as bio- or cyber-terrorism (Anthropic, OpenAI, and Google DeepMind). While these leaders marginally improved the quality of their model cards, one reviewer warns that the underlying safety tests still miss basic risk-assessment standards: “The methodology/reasoning explicitly linking a given evaluation or experimental procedure to the risk, with limitations and qualifications, is usually absent. [...] I have very low confidence that dangerous capabilities are being detected in time to prevent significant harm. Minimal overall investment in external 3rd party evaluations decreases my confidence further.”

• Capabilities are accelerating faster than risk management practice, and the gap between firms is widening. With no common regulatory floor, a few motivated companies adopt stronger controls while others neglect basic safeguards, highlighting the inadequacy of voluntary pledges.

• Whistleblowing policy transparency remains a weak spot. Public whistleblowing policies are a common best practice in safety-critical industries because they enable external scrutiny. Yet, among the assessed companies, only OpenAI has published its full policy, and it did so only after media reports revealed the policy’s highly restrictive non-disparagement clauses.

• Chinese AI firms Zhipu.AI and DeepSeek both received failing overall grades. However, the report scores companies on norms such as self-governance and information-sharing, which are far less prominent in Chinese corporate culture.

Summary scorecard (letter grades per domain and GPA-based overall scores):

                             Anthropic  OpenAI  Google DeepMind  x.AI  Meta  Zhipu AI  DeepSeek
Overall Grade                C+         C       C-               D     D     F         F
Overall Score                2.64       2.10    1.76             1.23  1.06  0.62      0.37
Risk Assessment              C+         C       C-               F     D     F         F
Current Harms                B-         B       C+               D+    D+    D         D
Safety Frameworks            C          C       D+               D+    D+    F         F
Existential Safety           D          F       D-               F     F     F         F
Governance & Accountability  A-         C-      D                C-    D-    D+        D+
Information Sharing          A-         B       B                C+    D     D         F
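To illustrate how the numerical scores and letter grades in the scorecard above relate under the GPA values given in the Grading note, the minimal Python sketch below assigns a score the highest letter whose GPA value it meets or exceeds. This boundary rule is an inference that reproduces the published overall grades; it is not a formula quoted from the report.

# Hedged sketch: map a numerical Index score to a letter grade using standard
# US GPA values. Assumption: a score receives the highest letter whose GPA
# value does not exceed it (consistent with the scorecard above, not a rule
# stated explicitly in the report).

GPA_VALUES = [
    ("A+", 4.3), ("A", 4.0), ("A-", 3.7),
    ("B+", 3.3), ("B", 3.0), ("B-", 2.7),
    ("C+", 2.3), ("C", 2.0), ("C-", 1.7),
    ("D+", 1.3), ("D", 1.0), ("D-", 0.7),
    ("F", 0.0),
]

def letter_grade(score: float) -> str:
    """Return the highest letter whose GPA value is <= score."""
    for letter, value in GPA_VALUES:  # ordered from highest to lowest
        if score >= value:
            return letter
    return "F"

if __name__ == "__main__":
    # Overall scores from the scorecard above.
    overall_scores = {
        "Anthropic": 2.64, "OpenAI": 2.10, "Google DeepMind": 1.76,
        "x.AI": 1.23, "Meta": 1.06, "Zhipu AI": 0.62, "DeepSeek": 0.37,
    }
    for company, score in overall_scores.items():
        # Prints C+, C, C-, D, D, F, F, matching the published overall grades.
        print(f"{company:16s} {score:.2f} -> {letter_grade(score)}")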