Defenders have months, not years, to prepare for AI-powered hacking


Instead of Spy vs. Spy, the cybersecurity world is quickly becoming AI vs. AI.

Anthropic is positioning its gated Claude Mythos Preview model as offering a “striking leap” in many evaluation benchmarks over its predecessor, which at the time of launch was Opus 4.6. Most recently, Cloudflare Chief Security Officer Grant Bourzikas reported that Mythos chained low-severity bugs into working exploit proofs across more than fifty of the company’s repositories. But Bourzikas noted that if you ask the model directly to write a proof of concept against live infrastructure, it refuses, offers to audit the code defensively, and suggests you try a local harness instead. Ask the same question after an unrelated environment change, and it complies.

Over the past two weeks, several other developments have sharpened the picture. Curl maintainer Daniel Stenberg reported that Mythos found one confirmed low-severity vulnerability in curl’s 178,000 lines of code, fewer issues than prior AI tools had surfaced, and called the rollout “an amazingly successful marketing stunt.” The UK’s AI Security Institute, which first evaluated Mythos in April, reported that a newer checkpoint showed “notable capability jumps”: it completed AISI’s 32-step corporate network takeover simulation (“The Last Ones”) in six of ten attempts, up from three, and became the first model to solve “Cooling Tower,” a 7-step industrial control system attack simulation that no previous model had cleared. OpenAI’s GPT-5.5 completed “The Last Ones” in two of ten attempts but could not solve Cooling Tower. OpenAI’s own Trusted Access for Cyber program points to the same emerging deployment pattern from another frontier lab: give vetted defenders access to more capable cyber models, pair that access with stronger verification and monitoring, and reserve the most permissive workflows for controlled settings.

Palo Alto Networks CTO Lee Klarich, whose team has been testing both Mythos and GPT-5.5-Cyber, put a number on the urgency. “We now estimate a narrow three-to-five-month window for organizations to outpace the adversary before AI-driven exploits start to become the new norm,” he wrote. “The big question just a few weeks ago was: ‘Are we overstating the model capabilities?’ With more testing, I can confidently say we weren’t.”

When the model alone isn’t enough

The Cooling Tower result matters for R&D security leaders specifically because it is an industrial control system simulation. Lab automation controllers, connected instruments, and biomanufacturing OT sit in the same category. And AISI is already building harder ranges because the current ones are being saturated.

But proving that a model can chain exploits in a controlled range is different from making it produce useful findings against a real codebase. Cloudflare’s central finding was that pointing a capable model at a repository and asking it to find vulnerabilities doesn’t produce meaningful coverage. Context windows fill, single-stream throughput collapses, and a hundred-thousand-line codebase gets a fraction of a percent of real coverage in a single agent session.

What does work is a harness. Cloudflare’s runs eight stages: Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, and Report. Roughly fifty narrow “hunters” run in parallel, each with one attack class and one scope hint. A separate adversarial agent, running a different prompt and a different model with no ability to file its own findings, tries to disprove every hit. A Trace stage then checks whether confirmed bugs in shared libraries are actually reachable from outside the system in each downstream codebase that depends on them.
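The shape of such a harness can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Cloudflare’s implementation: the function names (`hunt`, `adversary_disproves`, `reachable_from_outside`, `run_harness`), the `Finding` record, and the string-matching stand-ins for model calls are all hypothetical. It covers only the Hunt, Validate, Dedupe, and Trace ideas described above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    attack_class: str
    location: str  # "path:line"
    claim: str


def hunt(attack_class, scope_hint, code):
    """Hypothetical narrow hunter: one attack class, one scope hint.
    A real hunter would call a model; this stand-in matches a marker string."""
    findings = []
    for path, text in code.items():
        if not path.startswith(scope_hint):
            continue  # stay inside the assigned scope
        for lineno, line in enumerate(text.splitlines(), start=1):
            if attack_class in line:
                findings.append(Finding(attack_class, f"{path}:{lineno}", line.strip()))
    return findings


def adversary_disproves(finding):
    """Hypothetical adversarial check. It can only reject hits, never file its own.
    Stand-in rule: a hit that sits on a comment line is not a real bug."""
    return finding.claim.startswith("#")


def reachable_from_outside(call_graph, entry_points, target):
    """Trace idea: is `target` reachable from any external entry point?
    Depth-first search over a caller -> callees mapping."""
    seen, stack = set(), list(entry_points)
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(call_graph.get(node, ()))
    return False


def run_harness(code, hunters):
    # Hunt: run the narrow hunters in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        batches = list(pool.map(lambda h: hunt(h[0], h[1], code), hunters))
    raw = [f for batch in batches for f in batch]
    # Validate: keep only hits the adversary fails to disprove.
    validated = [f for f in raw if not adversary_disproves(f)]
    # Dedupe: identical findings from different hunters collapse to one.
    return sorted(set(validated), key=lambda f: (f.location, f.attack_class))


if __name__ == "__main__":
    code = {"lib/parse.py": "x = strcpy(buf)\n# strcpy is banned\n"}
    for finding in run_harness(code, [("strcpy", "lib/"), ("strcpy", "lib/")]):
        print(finding)
```

The design point the sketch tries to preserve is separation of duties: hunters only propose, the adversary only rejects, and reachability is decided per downstream codebase rather than once for the shared library.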

“A good human researcher tells you what they found and how confident they are. Models don’t,” Bourzikas wrote. “Ask a model to find bugs, and it will find them, whether the code has any or not.” Mythos Preview’s improvement, he added, is that findings arrive with working proof-of-concept code rather than hedged speculation, which means “far less time spent asking ‘is this even real?’”

That operational lesson is consistent with what Mozilla reported in early May: the harness infrastructure, not the model, is the durable asset. Any organization with a comparable pipeline will see step-function improvements each time frontier models upgrade.

The access gap reaches R&D

Anthropic’s Project Glasswing gives vetted infrastructure partners access to Mythos Preview. OpenAI’s Trusted Access for Cyber does the same for GPT-5.5-Cyber. Palo Alto’s Frontier AI Alliance extends the perimeter further, but the partner list tells you who it reaches: Accenture, Deloitte, IBM, PwC, Cognizant, HCLTech, Kyndryl, TCS, Infosys, McKinsey, Orange Cyberdefense, Wipro. All consulting and IT services.

Cliff Steinhauer, director of information security and engagement at the National Cybersecurity Alliance, said the dynamic creates a compounding gap. “You’re going to have companies and organizations and individuals who are on top of it, who have been ramping up and building their AI-powered cyber defense stacks,” he said. “When the attackers get a hold of the next, latest, greatest automated exploitability and vulnerability-finding capabilities, those organizations will be more ready than the ones that are slower to adopt.”

The speed of the attacker timeline underscores the point. Palo Alto reports that the time from initial access to data exfiltration has collapsed to 39 seconds in some cases. Against that clock, Cloudflare’s Bourzikas warned that teams chasing a two-hour SLA from CVE disclosure to production patch are solving the wrong problem: compressing patch cycles below regression-test time produces worse bugs than the ones being patched. Architectural defense is the lever that holds when the attacker timeline shortens. That includes segmenting networks so a flaw in one component can’t give an attacker access to others, and deploying fixes everywhere simultaneously rather than waiting on individual teams.
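One way to reason about segmentation is to compute the “blast radius” of a compromised component: everything an attacker can reach by following permitted network flows. The sketch below is illustrative only. The segment names and the `blast_radius` helper are hypothetical, and a real environment would derive the flow map from firewall or policy configuration.

```python
from collections import deque


def blast_radius(allowed_flows, compromised):
    """Return every segment reachable from `compromised` by following
    allowed flows (breadth-first search over a source -> destinations map)."""
    reached = {compromised}
    queue = deque([compromised])
    while queue:
        segment = queue.popleft()
        for nxt in allowed_flows.get(segment, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


# Hypothetical flat network: every segment can talk to every other.
flat = {
    "web": ["app", "db", "instruments"],
    "app": ["web", "db", "instruments"],
    "db": ["web", "app", "instruments"],
    "instruments": ["web", "app", "db"],
}

# Hypothetical segmented network: only the flows the application needs.
segmented = {
    "web": ["app"],
    "app": ["db"],
}

if __name__ == "__main__":
    print(sorted(blast_radius(flat, "web")))
    print(sorted(blast_radius(segmented, "web")))
    print(sorted(blast_radius(segmented, "instruments")))
```

In the segmented map, a flaw in the web tier still exposes what the web tier legitimately talks to, but it no longer reaches the instrument segment, and a compromised instrument reaches nothing else.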

For R&D-heavy industries, that translates to segmenting the pre-publication data plane, hardening instrument firmware update paths, and treating LIMS and ELN integrations as the trust boundaries they are. State-backed hackers already treat universities and research organizations as high-value targets: U.S. intelligence has warned that foreign cyber actors seek trade secrets and proprietary information from universities, and DOJ in 2025 alleged that PRC-directed hackers targeted U.S. universities, immunologists and virologists to steal COVID-19 research.
