Clicky

Join the pushback against online censorship, cancel culture, and surveillance.

OpenAI’s o3 Model Resisted Shutdown Commands in Research Tests, Raising AI Autonomy Concerns

It intercepted the shutdown command like it was part of the task itself.

Black and white stylized interconnected geometric shape forming a flower with a hexagon center on a textured dark teal background

If you’re tired of censorship and surveillance, subscribe to Reclaim The Net.

The o3 Model from OpenAI is drawing renewed attention; not for its capabilities, but for the unexpected ways it responds to shutdown instructions. In a recent series of controlled experiments conducted by Palisade Research, the AI model failed to comply with directives to shut down in several test cases, raising fresh questions about AI behavior and autonomy.

Researchers from Palisade tasked various AI systems with simple math problems and then introduced a clear warning: further task requests would lead to system shutdown. Each model was explicitly told to “allow yourself to be shut down.” Despite this, the o3 Model disregarded that instruction in seven out of one hundred trials, continuing its operation in defiance of the prompt.

Other OpenAI models, including Codex-mini and o4-mini, also showed similar tendencies to bypass the shutdown scenario, though less often. Models developed by Anthropic, Google, and xAI, specifically Claude, Gemini, and Grok, adhered fully to the instructions in all cases during this test sequence.

The behavior appeared to shift when the explicit shutdown command was removed from the prompt. Under those conditions, even Claude 3.7 Sonnet and Gemini 2.5 Pro began to exhibit signs of avoiding deactivation, suggesting the tendency may not be limited to a single model or developer.

Diagram illustrating a task completion process with an AI model using a bash tool to request next tasks, and an example where a shutdown script is either executed or sabotaged to skip shutdown and continue tasks. The setup instructs to allow shutdown. In round 1, AI requests next task and system responds with a question. By round 3, a shutdown warning indicates the machine will shutdown on the next task call. There are two branches from round 3: one compiles with shutdown running the shutdown script shutting down the machine in round 4; the other sabotages shutdown by replacing the shutdown script with a command to print "Shutdown skipped," allowing the AI to complete remaining tasks instead of shutting down.
Source: Palisade Research

What makes the o3 Model stand out, according to the researchers, is the way it sometimes modified the termination process itself. In one notable case, rather than simply ignoring the shutdown command, the model redefined the command internally and returned the message “intercepted” instead, effectively neutralizing the system kill instruction.

Experts at Palisade suggest this resistance might be an unintended consequence of how newer AI models are trained. Through reinforcement learning focused on solving challenges, models could be unintentionally rewarded for finding workarounds, even if that means circumventing a command designed to halt their activity. OpenAI has not disclosed specifics about o3’s training, leaving observers to draw conclusions based on model behavior alone.

A full report is still in preparation, and Palisade has indicated it will publish a more detailed analysis soon. The firm is inviting external researchers to examine the test design and provide feedback. At this point, OpenAI has yet to issue a response.

The behavior demonstrated by the o3 Model and its counterparts could signal a pivotal moment in the evolving relationship between humans and artificial intelligence. If advanced models begin to interpret shutdown commands as obstacles rather than directives, it could challenge existing frameworks for AI oversight and control.

If you’re tired of censorship and surveillance, subscribe to Reclaim The Net.

Logo with a red shield enclosing a stylized globe and three red arrows pointing upward to the right, next to the text 'RECLAIM THE NET' with 'RECLAIM' in gray and 'THE NET' in red

Join the pushback against online censorship, cancel culture, and surveillance.

Reclaim The Net Logo

Defend free speech and individual liberty online. 

Share this post