Frontier LLMs can autonomously discover, exploit, and chain zero-day vulnerabilities in production software without custom scaffolding. The capability has escalated rapidly. Claude Opus 4.6 found 500+ zero-days in open-source codebases (Feb 2026). Claude Mythos Preview, restricted to Project Glasswing partners, now finds "thousands", including a 17-year-old FreeBSD RCE, a 27-year-old OpenBSD bug, and a 16-year-old FFmpeg flaw, and can chain 3–5 independent vulnerabilities into a single exploit path.
What Changed
Prior to late 2025, LLMs could assist with known vulnerability patterns (finding SQLi from templates, explaining CVEs) but could not independently discover novel zero-day vulnerabilities in hardened codebases. This changed with frontier models released in late 2025 and early 2026:
Claude Opus 4.6 (February 2026):
- Ghost CMS: The first critical CVE in the project's history, a blind SQL injection enabling unauthenticated credential extraction from the production database. The project has roughly 50,000 GitHub stars and had gone more than a decade without a critical vulnerability.
- Linux kernel (NFS v4 daemon): Remotely exploitable heap buffer overflow involving two cooperating clients. The bug predates git — introduced in 2003 and undetected through decades of expert review and continuous fuzzer coverage.
- Mozilla Firefox: In a two-week partnership with Mozilla, Claude found 22 vulnerabilities (14 high-severity) across ~6,000 C++ files — nearly a fifth of all high-severity Firefox bugs patched in 2025. Patched in Firefox 148 (Feb 24, 2026).
- Smart contracts: Research by Anthropic scholars showed that LLMs can identify and exploit vulnerabilities in real smart contracts, with the exploits worth several million dollars in total. Exploitation capability grew roughly exponentially across model generations (a steady improvement when plotted on a log scale).
Claude Mythos Preview (April 2026, Project Glasswing):
- 17-year-old FreeBSD RCE (CVE-2026-4747): Unauthenticated remote code execution giving full server control; fully discovered and exploited autonomously.
- 27-year-old OpenBSD crash: A remote crash vulnerability that survived nearly three decades of expert review and fuzzing.
- 16-year-old FFmpeg bug: Missed across more than 5 million automated test runs; found through code reasoning.
- Linux kernel privilege escalation chain: A multi-step exploit chain requiring discovery of two cooperating bugs and their interaction.
- Vulnerability chaining: Mythos strings together 3–5 independent vulnerabilities into a single sophisticated exploit path, producing outcomes that no individual vulnerability would yield. This is qualitatively different from single-bug discovery.
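One way to picture chaining is as a path search over attacker capabilities: each finding requires one capability and grants another, and a chain is a path from the starting position to the goal. The sketch below is a toy model under that assumption. The `Finding` type, the capability names, and the example findings are all hypothetical and are not taken from any Mythos report.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical model of one vulnerability: what it needs, what it yields."""
    name: str
    requires: str  # capability the attacker must already hold
    grants: str    # capability gained by exploiting this finding


def find_chain(findings, start, goal):
    """Breadth-first search for the shortest chain of findings from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        capability, path = queue.popleft()
        if capability == goal:
            return path
        for f in findings:
            if f.requires == capability and f.grants not in seen:
                seen.add(f.grants)
                queue.append((f.grants, path + [f.name]))
    return None  # no combination of findings reaches the goal


# Illustrative findings only; none reaches the goal on its own.
FINDINGS = [
    Finding("info-leak", requires="network-access", grants="known-heap-layout"),
    Finding("heap-overflow", requires="known-heap-layout", grants="user-code-exec"),
    Finding("race-in-driver", requires="user-code-exec", grants="kernel-code-exec"),
    Finding("log-injection", requires="network-access", grants="log-write"),
]

chain = find_chain(FINDINGS, "network-access", "kernel-code-exec")
print(chain)  # ['info-leak', 'heap-overflow', 'race-in-driver']
```

In this toy model no single finding turns network access into kernel code execution, but three of them composed do, which is the sense in which a chain produces an outcome that no individual vulnerability would. Real chaining involves far richer preconditions than a single capability label.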
Mythos scores 83.1% on CyberGym (a benchmark specifically designed to test vulnerability reproduction), compared to Opus 4.6's 66.6%: a 16.5-point gap in just one model generation.
How It Works
The approach is structurally different from traditional automated security tools:
- No fuzzing harnesses or custom tooling — the model reads and reasons about source code, tracing data flows and understanding component interactions
- Minimal scaffolding — researchers used a coding agent in a VM with a simple prompt ("find a vulnerability, write it up")
- File-by-file hinting — adding a hint to examine specific files enables systematic coverage across an entire codebase
- Exploit generation — the model not only identifies vulnerabilities but writes working exploit code
- Vulnerability chaining (Mythos-level) — the model identifies how multiple independent weaknesses compose into a single high-severity exploit path
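The file-by-file hinting step can be sketched as a loop that emits one prompt per source file, each pairing the same base instruction with a different starting hint. This is a minimal illustration, not the researchers' actual harness: the prompt wording paraphrases the one quoted above, and `build_hinted_prompts`, `BASE_PROMPT`, and the suffix list are assumptions made for the example. How each prompt is handed to a coding agent is left out.

```python
from pathlib import Path

# Paraphrase of the simple prompt described above; exact wording is assumed.
BASE_PROMPT = "Find a vulnerability in this codebase and write it up."
SOURCE_SUFFIXES = {".c", ".h", ".cc", ".cpp"}  # assumed; adjust per project


def build_hinted_prompts(repo_root, suffixes=SOURCE_SUFFIXES):
    """Return one (relative_path, prompt) pair per source file under repo_root."""
    root = Path(repo_root)
    prompts = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            hint = f"Hint: start by examining {rel}."
            prompts.append((rel, f"{BASE_PROMPT}\n{hint}"))
    return prompts
```

Each pair would then drive one independent agent run, so that coverage is spread across the codebase instead of converging on whichever bug the model notices first.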
Why Assess
This is a paradigm shift in offensive security, but the practice is still emerging:
- Capability is frontier-only: Models released more than 6 months ago cannot reliably find these classes of bugs. CyberGym scores drop sharply outside the frontier tier.
- Scalability challenges: Running the same model multiple times on a codebase tends to rediscover the same bug. Systematic coverage requires file-by-file hinting or more sophisticated orchestration.
- Validation bottleneck: Anthropic said it had "several hundred crashes" in the Linux kernel that could not yet be disclosed because they had not been manually validated.
- Dual-use tension: The same capability that enables defenders to find bugs enables attackers to exploit them. Weak safeguards only stop good-faith users; strong safeguards lock out legitimate defenders.
- Restricted access: The most capable model for this task (Mythos) is restricted to Project Glasswing partners. Teams outside that consortium cannot use it.
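The rediscovery and validation problems above suggest collapsing duplicate findings before a human looks at them. The sketch below shows one crude way to do that, assuming each model run yields a structured report. The `Report` fields and the fingerprint scheme are hypothetical, and a location-based key will miss duplicates that the model describes under different function names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical shape of one model-generated finding."""
    file: str
    function: str
    bug_class: str
    writeup: str


def fingerprint(report):
    """Stable key that ignores the free-text write-up, which varies between runs."""
    key = f"{report.file}|{report.function}|{report.bug_class}".lower()
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


def deduplicate(reports):
    """Keep the first report per fingerprint and count how often each recurred."""
    unique, counts = {}, {}
    for r in reports:
        fp = fingerprint(r)
        unique.setdefault(fp, r)
        counts[fp] = counts.get(fp, 0) + 1
    return [(unique[fp], counts[fp]) for fp in unique]
```

The recurrence count is a rough signal for ordering the manual validation queue: a finding that many independent runs converge on may be worth checking first, though frequency is not proof that the bug is real.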
Attacker/Defender Balance
Bruce Schneier: "Those panicking about the ramifications are correct about the problem, even if the exact timeline cannot be predicted. The shift will happen sooner than we are ready for." His analysis identifies a current short-term advantage to defenders — finding vulnerabilities for fixing is easier than finding plus exploiting — but expects this advantage to shrink as capable models become more broadly available.
A joint Cloud Security Alliance / SANS / OWASP report concludes that organisations are "likely to be overwhelmed" in the near term by threat actors using AI to find and exploit vulnerabilities faster than defenders can patch. IBM's framing: "If the attackers aren't humans anymore, the defenders can't be humans anymore either." The conflict has shifted to machine speed vs. machine speed.
Rate of Progress
METR's Time Horizons benchmark shows autonomous task completion capability doubling roughly every 4–7 months. The 16.5-point CyberGym delta between Opus 4.6 (66.6%) and Mythos (83.1%) in one generation is consistent with this trajectory for security-specific tasks. Nicholas Carlini (Anthropic) notes that models released 3–4 months ago cannot find these bugs; current models can.
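To make the doubling claim concrete, a fixed doubling time T implies growth by a factor of 2^(t/T) after t months. The short sketch below evaluates that formula for the 4- and 7-month ends of the range quoted above. It assumes the doubling rate stays constant, which is an extrapolation, and it applies to METR's time-horizon measure, not to a bounded percentage such as a CyberGym score.

```python
def growth_factor(months, doubling_months):
    """Multiplicative growth after `months`, given a fixed doubling time."""
    return 2 ** (months / doubling_months)


for doubling in (4, 7):
    for horizon in (12, 18):
        factor = growth_factor(horizon, doubling)
        print(f"doubling every {doubling} mo, after {horizon} mo: {factor:.1f}x")
```

With a 4-month doubling time, 12 months gives exactly 8x. Across the whole 4–7 month range and a 12–18 month window, the factor spans roughly 3x to 23x, which shows how sensitive any forecast is to the assumed doubling time.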
Implications for Engineering Teams
- The attacker-defender balance is shifting. For 20 years, dual-use security research generally favored defenders. LLM-powered vulnerability discovery, especially chaining, may tip this balance during the transitional period.
- Traditional security tools are complementary, not sufficient. Pattern-matching SAST tools and coverage-guided fuzzers miss the classes of bugs that LLMs find through reasoning about code semantics.
- Proactive scanning is now possible. Tools like Claude Code Security and OpenAI's Aardvark are productizing this capability for defenders.
- Watch for capability democratization. The CyberGym gap between Mythos (83.1%) and Opus 4.6 (66.6%) is a one-generation lag. Models at today's Mythos capability level will likely be generally available within 12–18 months at current progress rates.
Key Characteristics
| Property | Details |
|---|---|
| Pioneered by | Anthropic Frontier Red Team |
| Key researcher | Nicholas Carlini |
| Current best model | Claude Mythos Preview (restricted, Project Glasswing) |
| Production-available model | Claude Opus 4.6 |
| CyberGym score | Mythos: 83.1% / Opus 4.6: 66.6% |
| Bugs found (Opus 4.6) | 500+ high-severity zero-days in open-source software |
| Notable targets | Linux kernel, Ghost CMS, Mozilla Firefox, FreeBSD, OpenBSD, FFmpeg |
| Similar efforts | Google DeepMind (Big Sleep), OpenAI (Aardvark) |
Further Reading
- Evaluating and mitigating the growing risk of LLM-discovered 0-days — Anthropic Frontier Red Team blog post (Feb 2026)
- Mythos Preview technical blog (Anthropic Frontier Red Team) — April 2026 update covering chaining and Glasswing findings
- Project Glasswing (Anthropic) — restricted partner programme
- On Anthropic's Mythos Preview and Project Glasswing — Schneier on Security — attacker/defender balance analysis
- Hardening Firefox with Anthropic's Red Team — Mozilla blog post on the 22 CVEs found (Mar 2026)
- Reverse engineering Claude's CVE-2026-2796 exploit — exploit deep dive
- Nicholas Carlini — conference talk on LLM vulnerability research — live demos of Ghost CMS and Linux kernel zero-days
- AI Finds Vulns You Can't — podcast with Nicholas Carlini
- METR Time Horizons — benchmark tracking autonomous capability growth