Research & Development World

Post-Mythos, defenders have months, not years, to prepare for AI-powered hacking

By Brian Buntz | May 18, 2026


Instead of Spy vs. Spy, the cybersecurity world is quickly becoming AI vs. AI.

Anthropic is positioning its gated Claude Mythos Preview model as offering a “striking leap” in many evaluation benchmarks over its predecessor, which at the time of launch was Opus 4.6. Most recently, Cloudflare Chief Security Officer Grant Bourzikas reported that Mythos chained low-severity bugs into working exploit proofs across more than fifty of the company’s repositories. But Bourzikas noted that if you ask the model directly to write a proof of concept against live infrastructure, it refuses, offers to audit the code defensively, and suggests you try a local harness instead. Ask the same question after an unrelated environment change, and it complies.

Over the past two weeks, several other developments have sharpened the picture. Curl maintainer Daniel Stenberg reported that Mythos found one confirmed low-severity vulnerability in curl’s 178,000 lines of code, fewer issues than prior AI tools had surfaced, and called the rollout “an amazingly successful marketing stunt.” The UK’s AI Security Institute, which first evaluated Mythos in April, reported that a newer checkpoint showed “notable capability jumps”: it completed AISI’s 32-step corporate network takeover simulation (“The Last Ones”) in six of ten attempts, up from three, and became the first model to solve “Cooling Tower,” a 7-step industrial control system attack simulation. OpenAI’s GPT-5.5 completed “The Last Ones” in two of ten attempts but could not solve Cooling Tower. And OpenAI’s own Trusted Access for Cyber program points to the same emerging deployment pattern from another frontier lab: give vetted defenders access to more capable cyber models, pair that access with stronger verification and monitoring, and reserve the most permissive workflows for controlled settings.

Palo Alto Networks CTO Lee Klarich, whose team has been testing both Mythos and GPT-5.5-Cyber, put a number on the urgency. “We now estimate a narrow three-to-five-month window for organizations to outpace the adversary before AI-driven exploits start to become the new norm,” he wrote. “The big question just a few weeks ago was: ‘Are we overstating the model capabilities?’ With more testing, I can confidently say we weren’t.”

When the model alone isn’t enough

The Cooling Tower result matters for R&D security leaders specifically because it is an industrial control system simulation. Lab automation controllers, connected instruments, and biomanufacturing OT sit in the same category. And AISI is already building harder ranges because the current ones are being saturated.

But proving that a model can chain exploits in a controlled range is different from making it produce useful findings against a real codebase. Cloudflare’s central finding was that pointing a capable model at a repository and asking it to find vulnerabilities doesn’t produce meaningful coverage. Context windows fill, single-stream throughput collapses, and a hundred-thousand-line codebase gets a fraction of a percent of real coverage in a single agent session.

What does work is a harness. Cloudflare’s runs eight stages: Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, and Report. Roughly fifty narrow “hunters” run in parallel, each with one attack class and one scope hint. A separate adversarial agent, running a different prompt and a different model with no ability to file its own findings, tries to disprove every hit. A Trace stage then checks whether confirmed bugs in shared libraries are actually reachable from outside the system in each downstream codebase that depends on them.
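As a rough illustration of three of those stages (Hunt, Validate, Dedupe), here is a minimal Python sketch. It is not Cloudflare's implementation: the names `Finding`, `make_hunter`, `adversarial_validate`, and `run_harness` are hypothetical, the hunters return canned hits instead of calling a model, and the validator is a simple lookup standing in for a separate adversarial agent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    attack_class: str
    location: str  # hypothetical "file:line" identifier


def make_hunter(attack_class, scope_hint, canned_hits):
    """Build a narrow hunter: one attack class, one scope hint.

    Assumption: a real hunter would query a model; this one filters canned hits.
    """
    def hunt():
        return [Finding(attack_class, loc)
                for loc in canned_hits if loc.startswith(scope_hint)]
    return hunt


def adversarial_validate(finding, disproved):
    """Stand-in for the separate agent that tries to disprove each hit.

    It only returns a verdict; it has no way to file findings of its own.
    """
    return finding not in disproved


def run_harness(hunters, disproved):
    # Hunt: run all narrow hunters in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        batches = list(pool.map(lambda h: h(), hunters))
    candidates = [f for batch in batches for f in batch]
    # Validate: keep only hits the adversarial check failed to disprove.
    confirmed = [f for f in candidates if adversarial_validate(f, disproved)]
    # Dedupe: collapse identical findings, preserving first-seen order.
    return list(dict.fromkeys(confirmed))


# Hypothetical run: two hunters overlap on one hit, a third hit is disproved.
hunters = [
    make_hunter("sql-injection", "api/", ["api/users.py:41", "web/form.py:9"]),
    make_hunter("sql-injection", "api/", ["api/users.py:41"]),
    make_hunter("path-traversal", "files/", ["files/serve.py:17"]),
]
disproved = {Finding("path-traversal", "files/serve.py:17")}
result = run_harness(hunters, disproved)
print(result)
```

Keeping the validator as a function that can only accept or reject mirrors the constraint described above: the agent that challenges a hit cannot add hits of its own.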

“A good human researcher tells you what they found and how confident they are. Models don’t,” Bourzikas wrote. “Ask a model to find bugs, and it will find them, whether the code has any or not.” Mythos Preview’s improvement, he added, is that findings arrive with working proof-of-concept code rather than hedged speculation, which means “far less time spent asking ‘is this even real?’”

That operational lesson is consistent with what Mozilla reported in early May: the harness infrastructure, not the model, is the durable asset. Any organization with a comparable pipeline will see step-function improvements each time frontier models upgrade.

The access gap reaches R&D

Anthropic’s Project Glasswing gives vetted infrastructure partners access to Mythos Preview. OpenAI’s Trusted Access for Cyber does the same for GPT-5.5-Cyber. Palo Alto’s Frontier AI Alliance extends the perimeter further, but the partner list tells you who it reaches: Accenture, Deloitte, IBM, PwC, Cognizant, HCLTech, Kyndryl, TCS, Infosys, McKinsey, Orange Cyberdefense, Wipro. All consulting and IT services.

Cliff Steinhauer, director of information security and engagement at the National Cybersecurity Alliance, said the dynamic creates a compounding gap. “You’re going to have companies and organizations and individuals who are on top of it, who have been ramping up and building their AI-powered cyber defense stacks,” he said. “When the attackers get a hold of the next, latest, greatest automated exploitability and vulnerability-finding capabilities, those organizations will be more ready than the ones that are slower to adopt.”

The speed of the attacker timeline underscores the point. Palo Alto reports that the time from initial access to data exfiltration has collapsed to 39 seconds in some cases. Against that clock, Cloudflare’s Bourzikas warned that teams chasing a two-hour SLA from CVE disclosure to production patch are solving the wrong problem: compressing patch cycles below regression-test time produces worse bugs than the ones being patched. Architectural defense is the lever that holds when the attacker timeline shortens. That includes segmenting networks so a flaw in one component can’t give an attacker access to others, and deploying fixes everywhere simultaneously rather than waiting on individual teams.
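The segmentation argument can be made concrete with a small reachability check. The sketch below is illustrative only: the host names and the `flat` and `segmented` connection maps are hypothetical, and a real network would be modeled from firewall rules or flow data instead of a hand-written dictionary.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from start via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical flat network: every listed connection is allowed.
flat = {
    "web": ["app"],
    "app": ["db", "lab-controller"],
    "db": [],
    "lab-controller": [],
}

# Same hosts after segmentation: the app tier can no longer reach the lab controller.
segmented = {**flat, "app": ["db"]}

# What an attacker who compromises "web" could reach in each layout.
blast_flat = reachable(flat, "web")
blast_segmented = reachable(segmented, "web")
print(sorted(blast_flat))
print(sorted(blast_segmented))
```

In this toy layout, removing a single allowed connection takes the lab controller out of the set of hosts reachable from a compromised web tier, and that holds however quickly the initial flaw is exploited.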

For R&D-heavy industries, that translates to segmenting the pre-publication data plane, hardening instrument firmware update paths, and treating LIMS and ELN integrations as the trust boundaries they are. State-backed hackers already treat universities and research organizations as high-value targets: U.S. intelligence has warned that foreign cyber actors seek trade secrets and proprietary information from universities, and DOJ in 2025 alleged that PRC-directed hackers targeted U.S. universities, immunologists and virologists to steal COVID-19 research.

Copyright © 2026 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media