Corinne Yorkman and Mark Reith
Air Force Institute of Technology, WPAFB, USA
corinne.yorkman.1@us.af.mil
mark.reith.3@us.af.mil

Abstract: Great power competition has escalated globally, making it increasingly important for the Department of Defense (DoD) to adopt artificial intelligence (AI) technologies that are advanced and secure. Large language models (LLMs), which generate text, code, images, and other digital content based on the data sets used in training, have gained attention for their potential in DoD applications such as data analysis, intelligence processing, and communication. However, due to the complex architecture and extensive data dependency of LLMs, integrating them into defense operations presents unique cybersecurity challenges. These risks, if not properly managed, could pose severe threats to national security and mission integrity. This survey paper categorizes these challenges into vulnerability-centric risks, such as data leakage and misinformation, and threat-centric risks, including prompt manipulation and data poisoning, providing a comprehensive overview of these risks.

Keywords: Large language models, Cybersecurity challenges, Department of Defense

1. Introduction

The integration of artificial intelligence (AI) into critical operations has transformed numerous sectors, and the Department of Defense (DoD) is no exception. Large language models (LLMs) represent significant advancements in natural language processing (NLP). These models have the potential to revolutionize decision-making processes, intelligence analysis, and communication strategies within the DoD (Caballero & Jenkins, 2024). However, with these opportunities come substantial cybersecurity risks, as the DoD operates in environments with high stakes for national security and sensitive data protection. The expansive capabilities of LLMs also introduce novel vulnerabilities and threats that demand rigorous examination and tailored mitigation strategies (Department of Defense, 2023).
While the field of cybersecurity has seen a growing body of research, the specific risks that LLMs introduce in defense contexts remain comparatively underexplored.

1.1 Framework

A key aspect of this survey is the categorization of risks into two primary frameworks: vulnerability-centric risks and threat-centric risks. Vulnerability-centric risks address the systemic weaknesses within LLMs, such as data leakage, unintended biases, and misinformation (Ganguli et al., 2023). In contrast, threat-centric risks pertain to external adversarial threats, including prompt manipulation and data poisoning. Threat-centric and vulnerability-centric approaches are among the most common perspectives used to address risk (Silva & Jacob, 2018). This dual framework is rooted in established practices within cybersecurity and risk management, which emphasize the interplay between mitigating internal weaknesses and addressing external threats. Further, it follows the NIST Cybersecurity Framework, which identifies vulnerabilities and threats as key risk factors. The classification also provides a structured basis for the analysis that follows.

2. Background

LLMs are "a category of foundation models trained on immense amounts of data making them capable of understanding and generating natural language and other types of content to perform a wide range of tasks" (IBM, 2024). These capabilities include text summarization, question answering, language translation, and more. LLMs are built on deep learning architectures and trained on vast datasets, which enable them to perform such tasks at scale.

Recent advancements in LLMs have pushed the boundaries of what these systems can achieve and highly diversified their applications. The introduction of transformers has revolutionized LLM architectures by enabling models to understand context through mechanisms like self-attention, significantly enhancing their performance (Vaswani et al., 2017).
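To make the self-attention mechanism concrete, the sketch below implements single-head scaled dot-product attention in plain Python. The token embeddings are toy values chosen for illustration, and the query/key/value projections are left as identities to keep the sketch minimal; real transformer layers use learned projection matrices and many attention heads.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention (Vaswani et al., 2017).
    Each token's output is a softmax-weighted mixture of all token vectors,
    weighted by query-key similarity, so every position sees its context."""
    d_k = len(tokens[0])
    outputs = []
    for q in tokens:  # each token acts as a query against every key
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in tokens]
        weights = softmax(scores)  # attention distribution over the sequence
        outputs.append([sum(w * v[j] for w, v in zip(weights, tokens))
                        for j in range(d_k)])
    return outputs

# Toy sequence: three 2-dimensional "token embeddings" (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(tokens)
print(len(out), len(out[0]))  # 3 2: three context-mixed vectors of dimension 2
```

Because each output row is a convex combination of the input vectors, every token's representation is pulled toward the tokens it attends to most; this is the context-understanding property the paragraph above refers to.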
Advancements in model fine-tuning and transfer learning also allow LLMs to be adapted to specialized domains and tasks.

2.1 Role of LLMs in DoD Operations

The DoD has recognized the transformative potential of LLMs for enhancing operational efficiency and decision-making processes. As stated by the Deputy Secretary of Defense (2023), "the DoD faces an imperative to explore the use of this technology and the potential of these models' scale, speed, and interactive capabilities to improve the Department's mission effectiveness while simultaneously identifying proper protection measures and mitigating a variety of related risks." In an effort to advance safe and responsible AI technology within the USAF, the Air Force Research Laboratory developed NIPRGPT, an LLM cleared for installation on unclassified USAF systems (Secretary of the Air Force Public Affairs, 2024). Additional LLM use cases include those by the Air Mobility Command, which has leveraged LLMs to generate campaign simulations, and those by the US Air Forces Central, which uses LLMs to expedite routine maintenance of software tools (Caballero & Jenkins, 2024). LLMs have the potential to transform processes such as information analysis, planning, and decision-making, and to aid in military exercises. They can be used to synthesize intelligence.