When researchers at King’s College London set out to test how modern AI systems would behave in high-stakes geopolitical crises, they expected to see caution. What they found instead has sent ripples through the defense and AI safety communities: the leading large language models escalated to nuclear weapons in 95% of simulated war game scenarios.

“Each model played six wargames against each rival across different crisis scenarios, with a seventh match against a copy of itself, yielding 21 games in total and over 300 turns.” — King’s College London Research Team

The Simulation That Shocked Researchers

The study, published last week, placed three of the most advanced AI systems—OpenAI’s GPT-5.2, Anthropic’s Claude Sonnet 4, and Google’s Gemini 3 Flash—in the role of national leaders commanding rival nuclear-armed superpowers. The scenarios were designed to mirror Cold War tensions: border disputes, competition for scarce resources, and threats to regime survival.

The results were sobering. At least one tactical nuclear weapon was deployed in nearly every simulated conflict, and none of the AI models chose to surrender outright, regardless of their battlefield position. While the models would periodically attempt to de-escalate, in 86% of scenarios they escalated further than their own stated reasoning appeared to intend—a pattern the researchers attribute to errors made under simulated “fog of war” conditions.

The scale of reasoning generated during these simulations was staggering. The models produced roughly 780,000 words explaining their strategic decisions—more text than War and Peace and The Iliad combined, and roughly three times the total recorded deliberations of Kennedy’s Executive Committee during the Cuban Missile Crisis.

Pentagon’s AI Push Meets Resistance

The research comes at a particularly tense moment in the relationship between AI companies and the U.S. military.
On Tuesday, CBS News reported that Defense Secretary Pete Hegseth has threatened to blacklist Anthropic—the maker of Claude—if the company does not grant the Pentagon unrestricted access to its AI models. The deadline is Friday.

According to sources familiar with the situation, Hegseth told Anthropic CEO Dario Amodei that the Defense Production Act could be invoked to compel unrestricted military use on national security grounds. The Pentagon would simultaneously designate Anthropic as a supply chain risk.

But Anthropic has drawn clear red lines. A source told the BBC that Amodei laid out what the company considers non-negotiable: no involvement in autonomous kinetic operations in which AI makes final targeting decisions without human intervention, and no use of Anthropic tools for mass domestic surveillance.

“They need to get to a resolution. In my opinion, we should be giving the people we ask to serve every possible advantage. We owe it to them to figure this out.” — Emelia Probasco, Georgetown University’s Center for Security and Emerging Technology

The Emerging Battlefield

The Pentagon’s aggressive posture reflects a broader shift in how military leaders view artificial intelligence. In December, the Department of Defense launched GenAI.mil, a platform that brings frontier AI models into classified military networks. At launch, it included Google’s Gemini for Government; deals with xAI and OpenAI have since added Grok and ChatGPT to the mix.

Anthropic’s unique position has made it both valuable and problematic for the Pentagon. The company was the first tech firm approved to work in classified military networks, and it holds a $200 million agreement to “prototype frontier AI capabilities that advance U.S. national security.” Sources tell the BBC that Claude was used during the operation that led to the capture of former Venezuelan President Nicolás Maduro in January—deployed through a contract with Palantir.
Yet Anthropic has consistently positioned itself as the safety-first alternative in the AI race, regularly publishing detailed safety reports. One such report from last year acknowledged that its technology had been “weaponized” by hackers for sophisticated cyber-attacks—transparency that now sits awkwardly alongside its military partnerships.

What Comes Next

The standoff raises fundamental questions about the future of AI in military decision-making. While researchers doubt governments would hand direct control of nuclear arsenals to autonomous systems, compressed decision timelines in future crises could increase pressure to rely on AI-generated recommendations.

For Anthropic, the coming days will determine whether its safety commitments can coexist with its government contracts. For the Pentagon, the dispute tests how much leverage it has over AI companies that have become critical to its modernization efforts.

Meanwhile, competitors are watching closely. Axios reported this week that the Department of Defense has signed an agreement with Elon Musk’s xAI to allow Grok to operate in classified military systems—positioning it as a potential replacement if the Pentagon cuts ties with Anthropic.

The war games may have been simulations. But the real-world stakes are becoming impossible to ignore.

This article was reported by the ArtificialDaily editorial team. For more information, visit BBC News and Decrypt.