On October 24, the White House released the first-ever National Security Memorandum (NSM) on Artificial Intelligence (AI) and an accompanying Framework to Advance AI Governance and Risk Management in National Security (AI Framework) (). Here are five specific taskings to build a research agenda and support implementation of NSM AI:
The AI Safety Institute (AISI) within the National Institute of Standards and Technology in its new role as primary point of contact in government for private sector AI companies focuses on safety-related functions and is tasked with AI testing and evaluation activities. Within 180 days of the NSM, AISI is directed to "pursue voluntary preliminary testing of at least two frontier AI models prior to their public deployment or release to evaluate capabilities that might pose a threat to national security." This testing extends to assessing models' capabilities to accelerate development of biological and/or chemical weapons.
Further, the NSM calls for classified evaluations of advanced AI models' capacity to generate or exacerbate deliberate chemical and biological threats. It gives the Department of Energy, Department of Homeland Security and AISI 210 days to develop a roadmap for future classified evaluations of advanced AI models' capacity to do this. The agencies are directed to consider the scope, scale, and priority of classified evaluations; proper safeguards to ensure that evaluations and simulations are not misconstrued as offensive capability development; proper safeguards for testing sensitive and/or classified information; and sustainable implementation of evaluation methodologies.
Why this is important: To understand the risks posed by new AI systems, it is essential to understand what they can and cannot do. Safety evaluations are how this is done, not only measuring whether AI systems behave as intended and refuse malicious instructions, but also differentiating underlying capabilities using benchmarks. This process helps identify best practices for the responsible development and deployment of frontier models; fosters better understand the nature, capabilities, limitations, and impact of the technology; and suggests what policymakers can do to manage emerging risks.
In addition, the AISI has 180 days to issue guidance for AI developers on how to test, evaluate, and manage risks to safety, security, and trustworthiness arising from dual-use foundation models. The guidance will cover topics relevant to the risks posed by AI models used in development of biological and chemical weapons, the development of mitigation measures to prevent intentional misuse of models, testing the efficacy and safety of mitigations, and the application of risk management practices to the development and deployment lifecycle.
Further, leveraging the United States Government Policy for Oversight of Dual Use Research of Concern and Pathogens with Enhanced Pandemic Potential (), the Office of Science and Technology Policy (OSTP), the National Safety Council, and the Office of Pandemic Preparedness and Response Policy are given 540 days to develop guidance to promote the benefits of and mitigate the risks associated with in silico biological and chemical research.
Why this is important: AI models are considered "dual-use" because they can be used for both beneficial and malicious purposes as defined by the Fink Report. Dual use foundation models with widely available weights (open foundation models) play a critical role in fostering growth among less resourced actors, helping to widely democratize access to AI's benefits. Conversely, this is a double-edged sword, requiring balancing excitement about the benefits of scientific discovery with concerns over weaponization and misuse.
Importantly, models might enable threat actors to better reach the competency threshold for an attack and are particularly relevant for actors at the lower end of the capability spectrum, such as terrorist groups and rogue individuals, shedding light on the dangerous consequences of an easier-to-achieve competency threshold as well as uplift. New guidance should address exclusions in the Dual Use Research of Concern policy especially those involving in silico models and computational approaches, and fill gaps from missed opportunities in both the OSTP Framework on Nucleic Acid Synthesis Screening as well as the Health and Human Services Administration for Strategic Preparedness and Response Screening Framework Guidance for Providers and Users of Synthetic Nucleic Acids (). These gaps include guidance on leveraging AI for robust nucleic acid synthesis procurement screening, as well as AI-enabled detection and attribution for deterrence.
The NSM directs that within 180 days of the the Director of National Intelligence, alongside other intelligence elements, identify critical nodes in the AI supply chain, develop a list of the most plausible avenues through which these nodes could be disrupted or compromised by foreign actors, and take steps, as appropriate and consistent with applicable law, to reduce such risks.
Why this is important: The NSM directs actions to improve the security and diversity of chip supply chains, and to ensure that, as the United States supports the development of the next generation of government supercomputers and other emerging technology, this is done with AI in mind. The advanced AI supply chain represents the largest unaddressed attack surface lurking within businesses (frontier and challenger AI labs) building or deploying AI models (large language models). AI security extends beyond physical security, cybersecurity and "securing model weights," to include securing the full development and deployment pipeline and advanced AI supply chain. Current AI security models focus largely on cybersecurity of AI systems creating a visibility gap to the full spectrum of risks facing the advanced AI supply chain.
To reduce the chemical and biological risks that could emerge from AI, the NSM also gives the National Science Foundation 180 days to coordinate with other agencies, academic research institutions and scientific publishers to develop voluntary best practices and standards for publishing computational biological and chemical models, data sets, and approaches, including those that use AI and that could contribute to the production of knowledge, information, technologies, and products that could be misused to cause harm.
Why this is important: Information hazards are risks that arise from the dissemination or the potential dissemination of true information that may cause harm or enable some actors to cause harm. Such hazards are often subtler than direct physical threats, so they are easily neglected and demand an exploration of how knowledge shared openly might be harmful. This line of effort advocates and encourages the responsible release of information that could lead to dangerous or unethical acts, leveraging a set of standardized, living implementable policies to assess and control risks of misuse.
Finally, to enhance biosafety and biosecurity, the NSM directs that within 240 days the Department of Defense, alongside other agencies pursuing the development of AI systems trained on biological and chemical data, support efforts to utilize high-performance computing resources and AI systems to develop screening tools for in silico research and technology, create algorithms for nucleic acid synthesis screening, and order screening, in particular of data streams from cloud labs and biofoundries.
Why this is important: There is a long and terrible history of deliberate use of microorganisms like viruses or bacteria to cause disease or death. Typically, the development, containment, and deployment of bioweapons has required significant resourcing and expertise. Today, researchers worry that AI might increase bioweapon knowhow, with highly capable AI models assisting non-experts in designing, synthesizing, and using these weapons, thus expanding the pool of actors that could access these dangerous capabilities. The most pressing concern for biological risks related to AI stems from tools that may soon be able to accelerate the procurement of biological agents by nonstate actors. Recent studies have suggested that foundation models may soon be able to help accelerate bad actors' ability to acquire weaponizable biological agents, even if the degree to which these AI tools can currently help them remains marginal. The proposals in the NSM hope to head these off.
The NSM covers a diverse set of issues, providing a comprehensive strategy for governing AI use in national security systems, notably in defense and intelligence agencies. The research agenda described above is largely from Section 3 of the NSM and hence non-exhaustive.