Developmental Test and Evaluation of Artificial Intelligence-Enabled Systems Guidebook

Developmental Test and Evaluation of Artificial Intelligence-Enabled Systems Guidebook

February 2025

Office of the Director, Developmental Test, Evaluation, and Assessments
Office of the Under Secretary of Defense for Research and Engineering
3030 Defense Pentagon
Washington, DC 20301
osd.r-e.comm@mail.mil
https://www.cto.mil/dtea/

Distribution Statement A. Approved for public release; distribution is unlimited. DOPSR Case #25-T-1195.

Executive Summary

The Department of Defense (DoD) developed this guidebook to support the developmental test and evaluation (DT&E) of artificial intelligence (AI) systems and AI-enabled systems (AIES). Its intent is to provide technically sound, consensus-based guidance designed to address the unique challenges posed by AI technologies. The guidebook aims to support government test teams in planning and executing DT&E for AI-enabled components, applications, and systems, while assisting in delivering critical insights to decision makers and stakeholders during AIES development and deployment. Recognizing that AI is a rapidly evolving field, this guidebook reflects the T&E community's current consensus and will likely require updates as technology and methodologies advance.

Testing AI systems presents a key challenge: traditional comprehensive testing approaches are no longer feasible for many AI components, owing to factors such as the following:

• Inherent unpredictability of model outputs in practice.
• Model sensitivity to small changes in input.
• Complexity and opacity of some AI models.
• High dimensionality of parameter spaces.
• Complex dependence of model output on training datasets.
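As a hypothetical illustration of the second factor (not an example from the guidebook), the minimal sketch below uses a toy two-feature logistic model, with made-up weights, to show how a small input perturbation can flip a model's predicted class. This is why results from a finite set of test points generalize poorly for AI components.

```python
import math

# Toy logistic-regression "model" with invented weights, used only to
# illustrate sensitivity of a learned decision boundary to small input
# changes. Real AIES models are far higher-dimensional and more opaque.
def predict(x, w=(4.0, -3.0), b=0.1):
    """Return (probability, predicted class) for a 2-feature input."""
    z = w[0] * x[0] + w[1] * x[1] + b
    p = 1.0 / (1.0 + math.exp(-z))
    return p, int(p >= 0.5)

p_nominal, c_nominal = predict((0.50, 0.70))    # input near the boundary
p_shifted, c_shifted = predict((0.50, 0.75))    # one feature shifted by 0.05

# The 0.05 perturbation moves the input across the decision boundary,
# so the predicted class flips even though the probability barely moves.
```

A test point that passes at the nominal input says little about behavior at the shifted one, which motivates the guidebook's emphasis on methods beyond exhaustive point testing.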
Furthermore, the normally rapid pace of configuration changes in systems under consideration adds another layer of complexity to the T&E process. These factors undermine the ability of test teams, evaluators, and executives to generalize results from particular tests to support the necessary evaluations of AI components and AIES for engineering or acquisition decisions.

To address these challenges, the guidebook emphasizes several new approaches:

• Early Involvement in Development. Involving T&E teams early in AIES development enables mission-informed technology characterization. This early involvement is essential given the iterative nature of machine learning model development: from the beginning of development, continuous refinements require ongoing evaluation to ensure that the fielded system aligns with operational goals.
• Formal Methods for Augmentation. Formal methods offer mathematically rigorous techniques that complement traditional physical testing, allowing for more precise validation of AI systems. These methods help address the inherent complexities and uncertainties associated with AI technologies.
• Ensuring Testable Requirements. The DT&E community has traditionally collaborated with the requirements community to ensure that system requirements are testable. The complexity of testing AIES expands this role. Efforts focus on ensuring not only testability in principle but also the ability to develop a viable test program to support the necessary evaluations.
• Informing System and Concept of Employment (CONEMP) Development. The iterative nature of AIES development and its close coupling with CONEMPs require DT&E measurement activities that inform system and CONEMP developers. Testing in areas such as human-systems integration, calibrated trust, emergent behavior, and human-machine teaming, along with adherence to responsible AI policies, will be critical to avoid costly rework and ensure alignment between system design and operational needs.
This guidebook ultimately aims to serve as a valuable resource for DoD AI efforts, enhancing the DoD capability to effectively test and evaluate AI technologies and ensure their successful integration in support of national defense.

Mr. Christopher C. Collins
Director, Developmental Test, Evaluation, and Assessments

Contents

1 Introduction
  1.1 Developmental Test and Evaluation as a Continuum
2 DT&E of AI-Enabled Systems Overview
  2.1 AI-Enabled Systems
  2.2 DT&E Activities and Outputs
  2.3 CDAO T&E Strategy Frameworks
  2.4 Implications of AI for DT