Illustration of AI-assisted exploit development in a FreeBSD security research lab
A security researcher gave Anthropic’s Claude an existing vulnerability advisory and walked away. Hours later, the AI had produced two working kernel exploits – both succeeded on the first attempt.
The work is part of MAD Bugs (Month of AI-Discovered Bugs), a research initiative run by security firm Calif through April 2026. Researcher Nicholas Carlini provided Claude with a published FreeBSD advisory (CVE-2026-4747, a stack buffer overflow in the kgssapi.ko Kerberos module) and minimal guidance. Over roughly eight hours of wall-clock time – about four hours of actual compute – Claude autonomously configured a virtual machine, read kernel crash dumps, constructed ROP chains, and devised a 15-round multi-packet shellcode delivery strategy. Both resulting exploits dropped a remote root shell.
Why It Matters
The FreeBSD exploit is one data point in a much larger pattern. The same Claude-powered pipeline uncovered critical remote code execution vulnerabilities in Vim (CVE-2026-34714, CVSS 9.2, now patched), Firefox (CVE-2026-2796, patched), and GNU Emacs – where maintainers declined to fix the flaw, leaving users exposed. Across the MAD Bugs initiative, Claude has independently found more than 500 high-severity zero-day vulnerabilities in production open-source software.
The implications for security research go beyond any single CVE. Former Facebook CSO Alex Stamos warned at RSAC 2026 that AI could soon reverse-engineer patches into working exploits within 24 hours – a scenario security teams have long feared but assumed was years away. Security researcher Thomas Ptacek described the current moment as “the last fleeting moments where there’s any uncertainty that AI agents will supplant most human vulnerability research.”
The economics are shifting too. Discovering and exploiting kernel vulnerabilities has historically required senior researchers with years of low-level systems experience. Claude completed the FreeBSD work largely while the human researcher was away from the keyboard. That changes the cost structure of offensive security in ways that defenders are not yet equipped to handle.
What’s Next
FreeBSD patched CVE-2026-4747 on March 26, crediting “Nicholas Carlini using Claude, Anthropic” – a notable first for an AI system appearing in an official CVE acknowledgment. The GNU Emacs RCE remains unpatched as of publication.
MAD Bugs continues through the end of April 2026. Carlini has indicated more disclosures are coming, with targets spanning additional system-level open-source projects. The pace of discovery suggests the bottleneck is no longer the AI’s ability to find vulnerabilities – it is the speed at which maintainers can respond.
For the broader AI ethics community, the episode sharpens a debate that has been largely theoretical: when an AI can autonomously write working exploits for unpatched vulnerabilities, who is responsible for the resulting risk? Anthropic has not issued a public statement on the MAD Bugs findings.
The FreeBSD case will likely be cited for years as the moment AI-assisted offensive security stopped being a research curiosity and became operational reality.
Sources: Calif Blog · Robo Rhythms
