Anthropic Built Something So Powerful It Had to Warn Everyone Before Releasing It
Anthropic built a model that autonomously finds thousands of zero-days in production systems, then launched a private coalition to patch them before releasing it publicly.
On April 7th, Anthropic announced a new AI model and simultaneously launched a program to protect the world from it. That’s not a simplification — that’s basically what happened. Claude Mythos, their latest frontier model, is capable of autonomously identifying thousands of high-severity vulnerabilities across every major operating system and web browser. (The official launch announcement is here.) Rather than release it publicly, Anthropic spun up Project Glasswing: an invite-only coalition of 40+ organizations — including AWS, Apple, Google, Microsoft, JPMorgan, and government cybersecurity agencies — equipped with $100 million in model credits to stress-test their systems against Mythos before anyone else gets access. The message, delivered with a straight face: here’s what our AI can do, and here’s why you should be scared.
This is a new move. The standard playbook for a frontier AI launch goes something like: build it, benchmark it, release it, iterate on the fallout. Anthropic is running a different play. They’re saying, out loud, that Mythos is genuinely dangerous in ways that require mobilizing the cybersecurity infrastructure of some of the largest organizations in the world before anyone else can touch it. And they’re framing this as responsibility rather than alarm. Whether you buy that framing or not, it’s worth sitting with what it actually means: one of the leading AI labs just admitted that the thing it built is a dual-use weapon, and their plan for handling that is to get a very exclusive group of powerful institutions to patch the holes it can find before the public gets access.
The technical picture is impressive in a genuinely unsettling way. Mythos scored 56.8% on Humanity’s Last Exam — the hardest benchmark in the field — without any tools. That puts it ahead of every other publicly evaluated frontier model except the full, unreleased version of itself. On cybersecurity specifically, it doesn’t just identify vulnerabilities in theory; it finds real ones, in production systems, at scale. The dual-use problem here isn’t hypothetical. If Anthropic can run Mythos in a controlled environment and find thousands of zero-days across major OSes and browsers, so can anyone else with access to a comparable model — or anyone who manages to get access to Mythos itself through less official channels.
One of the leading AI labs just admitted that the thing it built is a dual-use weapon, and their plan for handling that is to get a very exclusive group of powerful institutions to patch the holes it can find before the public gets access.
Project Glasswing’s structure raises its own questions. The 40+ participating organizations are an interesting list: major tech companies, financial institutions, government agencies. Notably absent: AI accountability organizations, civil society groups, independent security researchers, or anyone who might have a perspective that isn’t aligned with the interests of large institutions. When you decide that the appropriate response to a dangerous capability is to quietly share it with the most powerful players in the ecosystem and ask them to fix the vulnerabilities that benefit them, you’re making a choice about who gets to know and who gets to prepare.
“‘Too Dangerous to Release’ Is Becoming AI’s New Normal.”
— Time Magazine, April 2026
Time Magazine ran a piece this April with exactly that framing. That framing is correct — but it’s also worth asking what “too dangerous” actually means when the model is still being handed to 40 organizations including the NSA.
There’s a version of this where Anthropic is doing exactly the right thing: being transparent about dangerous capabilities, building a structured response before releasing the model publicly, and prioritizing security over speed to market. There’s another version where “Project Glasswing” is a sophisticated way to generate goodwill, entrench Anthropic’s relationships with the most powerful institutions in the world, and still technically be first to market on a capability their competitors are racing toward. Both things can be true. The cybersecurity community is already warning about a coming flood of vulnerabilities and patches that security teams won’t have the bandwidth to process — because every major AI lab is building toward this capability, whether they’re being as public about it as Anthropic or not.