
U.S. Will Now Examine National Security Implications of New AI Models, Pre-Release


In the span of four days, the U.S. government announced two parallel sets of agreements with frontier AI companies that together define the two tracks Washington wants to run simultaneously—test AI for national security risks before the public ever sees it, and deploy AI directly on the military's most classified networks.

The Center for AI Standards and Innovation — CAISI, the entity under the Department of Commerce's National Institute of Standards and Technology that inherited the remit of the former AI Safety Institute — announced new agreements with Google DeepMind, Microsoft, and Elon Musk's xAI. These build on renegotiated agreements with Anthropic and OpenAI that date to 2024, updated to reflect directives from Commerce Secretary Howard Lutnick and America's AI Action Plan.

Under the CAISI agreements, the three companies will hand over their frontier AI models to government evaluators before those models are publicly released. The evaluations probe for national security-relevant capabilities and risks.

To conduct a thorough assessment, developers frequently provide CAISI with models that have reduced or removed safety guardrails — a design choice that allows evaluators to probe what a model can do at its ceiling, not what it will do under commercial safety controls. Evaluators from across the federal government participate, coordinated through the CAISI-convened TRAINS Taskforce, an interagency body focused specifically on AI national security concerns.

CAISI said it has completed more than 40 such evaluations to date. The agreements explicitly support testing in classified environments and were drafted with the flexibility to adapt rapidly as AI capabilities continue advancing.

"Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications," said CAISI Director Chris Fall. "These expanded industry collaborations help us scale our work in the public interest at a critical moment."


Fall was appointed to lead CAISI after Collin Burns — a former Anthropic researcher — was reportedly removed from the director role after just four days. The personnel transition at CAISI's top reflects a broader institutional pivot. Under the Biden administration, the AI Safety Institute focused on safety standards, definitions, and voluntary guardrails. Under Trump, CAISI has shifted its emphasis toward AI acceleration and national security capability assessment. The substance of what the evaluators do — probe powerful models before release — has not changed. The framing of why they do it has.

The latest announcement comes four days after the Department of War (formerly Department of Defense) announced agreements with eight frontier AI companies to deploy their models directly on the military's classified networks for operational use.

The companies cleared are SpaceX, OpenAI, Google, NVIDIA, Reflection, Microsoft, Amazon Web Services, and Oracle. The networks in question are classified at Impact Level 6, covering secret-level data, and Impact Level 7, which refers to the most highly restricted national-security systems. The stated objectives are data synthesis, situational awareness enhancement, and warfighter decision support.

One conspicuous absence dominates coverage of the Department of War announcement: Anthropic is not on the list. The company that first deployed AI models on Pentagon classified systems, via a Palantir integration under the Maven Smart System contract, is excluded after a dispute over the guardrails governing military and surveillance use of its AI.


The Pentagon had previously branded Anthropic a "supply chain risk," a designation typically reserved for foreign entities posing national security concerns. A March 2026 federal injunction reversed that designation, but it did not restore Anthropic's position as a Pentagon AI vendor. Palantir has pulled its Claude models from its DoD platforms accordingly.

The exclusion has strategic implications that extend beyond one company's contract status. Anthropic's recently released Mythos model — described by Treasury Secretary Scott Bessent as representing a step change in large language model capability — has generated significant attention from U.S. officials and financial sector executives about its potential to supercharge adversarial cyber operations.

Mythos is thus not among the models being assessed for classified military use, even as senior officials cite it as a capability milestone that warrants concern. That gap in the government's stated AI security posture is difficult to characterize as anything other than a policy contradiction.

UK gov's Mythos AI tests help separate cybersecurity threat from hype

Last week, Anthropic announced it was restricting the initial release of its Mythos Preview model to "a limited group of critical industry partners," giving them time to prepare for a model that it said is "strikingly capable at computer security tasks." Now, the UK government's AI Security Institute (AISI) has published an initial evaluation of the model's cyberattack capabilities that adds some independent public verification to those Anthropic reports.

AISI's findings show that Mythos isn't significantly different from other recent frontier models in tests of individual cybersecurity-related tasks. But Mythos could set itself apart from previous models through its ability to effectively chain these tasks into the multistep series of attacks necessary to fully infiltrate some systems.

The last ones finally fall

AISI has been putting various AI models through specially designed Capture the Flag challenges since early 2023, when GPT-3.5 Turbo struggled to complete any of the group's relatively low-level "Apprentice" tasks. Since then, the performance of subsequent models has risen steadily, to the point where Mythos Preview can complete north of 85 percent of those same Apprentice-level CTF tasks.
