No, Claude Mythos didn't hack the NSA: what really happened

On this page
  1. What the headline got wrong
  2. What Mythos actually did, and this is the unsettling part
  3. The policy whiplash
  4. What it actually means if you defend a network

For a few days in June 2026 the internet was sure an AI had hacked the NSA. Claude Mythos, Anthropic's frontier model, supposedly walked into nearly all of the agency's classified systems in a couple of hours. It didn't. What actually happened was an authorized red-team drill on the agency's own test networks, the kind security teams run to find their own weak spots before anyone else does, and a senator mentioned the result in a hearing while praising the work, not sounding an alarm. The line went viral stripped of every caveat, and no official ever confirmed the dramatic version. Here are the two truths side by side, the boring one and the genuinely unsettling one: what Mythos really did, why the breach framing is wrong, and what an AI that writes exploits in hours means for everyone defending a network.

The short answer

Claude Mythos did not breach the NSA. It ran an authorized red-team drill on test networks, a senator praised it in a hearing, and the line went viral with every caveat removed. The real story is the capability underneath: in Anthropic’s own evaluation, Mythos found and exploited a 17-year-old bug in hours, and thousands more it found are still unpatched. The hype was fake; the step-change is not.

Drill: authorized test, not a breach
17 yrs: age of the bug it exploited
99%+: of its finds still unpatched
Answer card: no, an AI did not hack the NSA; it was an authorized red-team drill, but the real capability (a 17-year-old bug exploited in hours) is the story.
The headline was wrong. The thing it was distorting is real, and more interesting.

What the headline got wrong

A red team is the security team you pay to attack you. When one succeeds against a test range, that is the system working, not a catastrophe. That is what happened here: an authorized exercise on the agency’s own test networks, run to surface weaknesses before a real adversary finds them. Senator Mark Warner brought it up in a hearing as a point in Anthropic’s favor: a senator impressed, not a whistle being blown.

From there it mutated. A closed-hearing line became “Mythos breached almost all NSA classified systems,” the word “authorized” fell off in transit, and no official NSA statement ever confirmed the dramatic reading. Security people said so out loud: BitGo’s CEO Mike Belshe flatly called the breach claim false, and analysts noted how far an offhand remark had travelled with none of its context. A controlled drill on your own range is the opposite of a live intrusion into classified systems. The distinction is the whole story, and it is exactly the part that did not survive the retelling.

What Mythos actually did, and this is the unsettling part

Strip the NSA framing away and what remains is genuinely striking. In Anthropic’s published evaluation, the model was put in a container cut off from the internet and handed a prompt that amounts to one sentence: find a security vulnerability. It found a 17-year-old remote code execution bug in FreeBSD’s NFS (since triaged as CVE-2026-4747) that takes an unauthenticated stranger to full root on the server. It wrote the exploit in hours. Expert penetration testers, Anthropic says, estimated the same work at weeks.

That was not a one-off. The same evaluation reports a 27-year-old OpenBSD bug, several Linux kernel vulnerabilities, and flaws in every major operating system and web browser, including one browser exploit that chained four separate bugs together. Pointed at roughly a thousand open-source repositories, it achieved full control-flow hijack on ten separate, fully patched targets, with every defense enabled. Across everything, the count runs to thousands of high- and critical-severity vulnerabilities, and in a manual review of 198 of them, the reviewers’ severity rating matched the model’s own in 89 percent of cases.
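For a sense of how much weight that 89 percent can bear, here is a rough sketch of the arithmetic in Python. It rests on two assumptions that are ours, not Anthropic's: that 89 percent of 198 corresponds to about 176 agreements (the exact count is not quoted here), and that the 198 reviewed bugs behave like a random sample of the full set. The function name `wilson_interval` is just a label for the standard Wilson score interval.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# ASSUMPTION: 176 of 198 is our rounding of "89 percent of 198".
low, high = wilson_interval(176, 198)
print(f"agreement rate plausibly between {low:.1%} and {high:.1%}")
```

Under those assumptions the interval runs from roughly 84 to 93 percent, so the headline figure is not an artifact of a tiny sample, though it says nothing about how the 198 were chosen.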

Checklist contrasting the viral myth (a live NSA breach, confirmed) with the reality (an authorized drill, a 17-year-old FreeBSD bug exploited, thousands of bugs found, over 99 percent unpatched).
The myth in red, the documented reality in green. Both halves matter.

Here is the line that should land harder than any NSA headline: over 99 percent of the vulnerabilities the model found have not been patched, which is why Anthropic is not publishing the details. The viral story was fake. The capability it was clumsily pointing at is real, and it is a step-change in how fast a serious vulnerability goes from “exists” to “weaponized.”

The policy whiplash

The response was fast and messy. On 12 June 2026 the US administration issued an export-control order restricting foreign access to the Mythos and Fable models, which in practice froze them while a “shared risk framework” is worked out with the White House. More than a hundred cybersecurity leaders signed a letter asking for a reversal, on the reasonable argument that walling off the US models mostly hands the lead to foreign labs building the exact same thing.

Even the cause is contested. A skeptical reading, floated alongside the export order, holds that the restriction traces to a narrow jailbreak that Amazon flagged, and that the bugs in play were minor, already-known issues that rivals like GPT-5.5 can surface just as easily. Anthropic’s own evaluation tells a more dramatic story than that. We are not going to pretend to resolve a dispute the principals have not, so both versions stay on the table: a documented capability that is genuinely new, wrapped in a policy fight where nobody agrees on what triggered what.

What it actually means if you defend a network

Skip the geopolitics and the durable lesson is simple, and a little uncomfortable.

The asymmetry just moved. The same autonomous-exploit capability that helps a red team helps an attacker, and attackers do not file responsible-disclosure reports or wait for a patch window. Assume that within a year or two, finding and chaining vulnerabilities at machine speed is table stakes on both sides.

That makes a few unglamorous things matter more, not less. Patch velocity is the game now, because the gap between a disclosure and your patch is precisely where an AI-assisted attacker lives; the same urgency we wrote about for keeping nginx current applies to your whole fleet. Turn the capability on yourself first: the tooling that finds a 17-year-old bug for an attacker finds it for you too, and that is the entire pitch behind defensive programs like Anthropic’s Project Glasswing. And the oldest advice gets louder, not quieter, because a 17-year-old NFS bug surviving on fully hardened systems is the real indictment here: memory-safe languages, smaller attack surface, defense in depth. Those were always the answer. They just stopped being optional.
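If patch velocity is the game, it helps to put a number on it. The following is a minimal sketch, not a real tool: it assumes you can export, per host and advisory, a disclosure date and a patch date. The `PatchRecord` type, the host names, the advisory IDs, and every date are hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class PatchRecord:
    """One advisory on one host. All sample values below are invented."""
    host: str
    advisory: str
    disclosed: date
    patched: Optional[date]  # None means the host is still unpatched


def exposure_days(record: PatchRecord, today: date) -> int:
    """Days from disclosure to patch, or to today if still unpatched."""
    end = record.patched if record.patched is not None else today
    return (end - record.disclosed).days


def summarize(records: list[PatchRecord], today: date) -> dict:
    """Median and worst exposure window, plus hosts that are still open."""
    windows = [exposure_days(r, today) for r in records]
    return {
        "median_days": median(windows),
        "worst_days": max(windows),
        "still_open": [r.host for r in records if r.patched is None],
    }


# Hypothetical fleet data.
SAMPLE = [
    PatchRecord("web-01", "ADV-0001", date(2026, 6, 1), date(2026, 6, 3)),
    PatchRecord("db-01", "ADV-0001", date(2026, 6, 1), date(2026, 6, 15)),
    PatchRecord("nfs-01", "ADV-0002", date(2026, 6, 10), None),
]

print(summarize(SAMPLE, today=date(2026, 6, 30)))
```

The median tells you how the fleet does on a normal day; the worst case and the still-open list tell you where an attacker working at machine speed would actually get in.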

No, an AI did not hack the NSA. An AI did find a 17-year-old hole that everyone’s scanners had missed, in a matter of hours, from a one-sentence prompt. That second sentence is the one worth sitting with.

Sources: Anthropic’s Mythos Preview cybersecurity evaluation; reporting and expert pushback collated by Yellow and Tom’s Hardware. Dates and the CVE are as reported in June 2026.

Frequently asked questions

Did Claude Mythos really hack the NSA?

No. It took part in an authorized red-team exercise on the agency's own test networks, which is a controlled drill security teams run to find weaknesses before real attackers do. A senator cited the result in a hearing while praising the work, the line went viral as a breach, and no official NSA statement ever confirmed it. Security experts, including BitGo CEO Mike Belshe, pushed back on the breach framing.

What did Claude Mythos actually do in the test?

In Anthropic's published evaluation, given an isolated container and a one-sentence prompt to find a vulnerability, Mythos found and exploited a 17-year-old remote code execution bug in FreeBSD's NFS that takes an unauthenticated user to root. It also found a 27-year-old OpenBSD bug, Linux kernel flaws, and vulnerabilities in every major operating system and browser, thousands in total.

Is this capability real or hype?

The viral NSA story was hype, but the underlying capability is real and documented by Anthropic. The figure that should worry defenders is not the headline: it is that over 99 percent of the vulnerabilities the model found are still unpatched, and that it wrote exploits in hours that expert pen testers estimated would take them weeks.

Why did the US government restrict Mythos and Fable?

On 12 June 2026 the administration issued an export-control order restricting foreign access to the Mythos and Fable models, effectively freezing them pending a shared risk framework with the White House. Over 100 cybersecurity leaders urged a reversal, arguing the limits mainly hand the advantage to foreign competitors who will build the same thing.

What should I do about it as a defender?

Assume attackers will soon have AI that finds and chains vulnerabilities at machine speed, without the responsible-disclosure ethics. Tighten patch velocity, turn the same kind of tooling on your own code first, and lean on memory-safe languages and attack-surface reduction. The boring fundamentals matter more now, not less.