The Off Switch

Be careful what you wish for.

Jun 13, 2026

A special edition, because the argument I have been making in theory happened in fact on Friday afternoon.

At 5:21 in the evening on Friday June 12 2026, the US government sent Anthropic a letter. Citing national security and export-control authority, it ordered the company to suspend all access to its two newest models, Fable 5 and Mythos 5, for every foreign national in the world, including Anthropic’s own foreign-national employees. To comply, Anthropic had to switch both models off for every customer, everywhere. By its own account, the company learned the directive existed and had to start pulling the models down inside the same evening.

Anthropic published a statement and posted it the same night. The operative passage:

We are complying with the government’s legal directive and are removing access to Fable 5 and Mythos 5 for all users. However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people. If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.
As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.

The stated reason was a jailbreak. The government had seen a technique for getting past Fable’s safeguards, and according to Anthropic the technique amounted to asking the model to read a codebase and point out its flaws, a capability the company says is already available in other deployed models including OpenAI’s GPT-5.5 and used every day by the people who defend software for a living. Anthropic is complying, and disputing, at the same time. It says recalling a model deployed to hundreds of millions of people over a narrow finding is disproportionate, and that if the standard were applied evenly it would halt new model releases across the entire industry.

They are right about that. I want to be precise that they are right, because what follows is not a complaint about Anthropic. It is a description of a machine, captured in the moment it was switched on.

I have been making this argument since at least April but I’m honestly surprised it came this fast and in this manner. In Artificial Good Enough Intelligence (April 28) I argued the frontier was going closed, that the best models would increasingly be withheld, restricted, or selectively exposed. In AI Waves #6 (April 23) I wrote that the closed labs had begun to gate capability by trust level, not just price. What I had wrong was the hand on the gate. I took it to be the labs, withholding the top tier as a business decision. Friday the state reached in and took the gate for itself. The off switch was always the risk. It just changed hands.

The Reversal Test

On Wednesday, in Policy on the AI Exponential, Dario Amodei proposed that frontier models pass mandatory third-party testing and that the government hold the power to block or reverse a release that fails. He asked for this explicitly, as the responsible path, with the caveat that the power be exercised through a process that is transparent, fair, clear, and grounded in technical facts.

On Friday, a release was reversed. Against him. Through a process that, by his own account of it, was none of those four things. The letter arrived without specific technical detail, gave one evening to comply, and rested on a finding the company considers trivial and widely available elsewhere.

The gap between Wednesday and Friday is forty-eight hours, and it contains the whole argument. The power to reverse a deployment is not safe in proportion to how carefully it is designed. It is dangerous in proportion to the fact that it exists. Anthropic asked for an instrument and was handed the sharp end of it two days later, which is the oldest lesson in the book about building machinery and assuming you will be the one holding it.

You Cannot Recall Open Weights

Here is the part that matters past this weekend. The reason a directive could darken Fable and Mythos in an evening is that they are closed. They run where the company can reach them, which means they run where the government can reach the company. A single letter to a single legal department took two models away from hundreds of millions of users at once.

There is no equivalent letter for open weights. Once a model’s weights are public, they sit on tens of thousands of machines in dozens of jurisdictions, and there is no address to send the order to. You can regulate what people do with them, you can prosecute misuse, but you cannot recall them. The off switch does not exist, because there is no single hand for it to sit in.

This is the entire case for open models, and it has nothing to do with whether open or closed is safer in the ordinary sense, the sense of refusing a dangerous request. It is a point about who holds the switch. A closed model concentrates capability in a way that is efficient, governable, and revocable. Friday was a demonstration that revocable means revocable by someone other than you, for reasons you may consider trivial, on a timeline you do not control. Open weights are messier, less governable, and harder to make safe in the ordinary sense. They are also the only architecture with no switch to seize. You trade one kind of safety for another, and Friday clarified the price of each.

The Counterweight

This is why the open layer is not an ideological preference. It is a structural one. A world with only closed frontier models is a world with a small number of off switches, held by a small number of parties, that turn out to be live. The counterweight to that is not a better-behaved lab or a wiser regulator, both of which require the institution to stay good forever. The counterweight is infrastructure that does not have an off switch in the first place: weights anyone can run, compute anyone can rent, verification anyone can check. Capability that no letter can recall.

I spent two essays arguing that the threat worth pricing was the concentration of control, not the awakening of the machine. I did not expect the argument to be settled this fast, or this literally, or at Anthropic’s expense rather than in its favor. Friday was not a story about a dangerous model. Every account, including the government’s, agrees the model was not the danger. It was a story about a switch, and the discovery that it works.

p.s. I’m printing these t-shirts. If you managed to trip the safety limits in the last 72 hours you get one. Reply here and I’ll send you one or feel free to copy the idea.

The Off Switch

Be careful what you wish for.

The Reversal Test

You Cannot Recall Open Weights

The Counterweight

Ready for more?