Anthropic Mythos: The model, the myth and the mundane

Anthropic Mythos (and AI in general) does pose a cyber risk! But the context and caveats surrounding that statement tend to get lost in mainstream media coverage. Context and caveats don’t make for apocalyptic headlines, so they get dropped… Let’s look at Anthropic Mythos: The model, the myth and the mundane!

In 1995, Dan Farmer and Wietse Venema released a tool called SATAN (the Security Administrator Tool for Analyzing Networks) and the information security world lost its collective mind. SATAN was, at the time, one of the first automated vulnerability scanners: it probed networks for known weaknesses and reported what it found. 

The concern was that attackers would use it to find and exploit vulnerabilities faster than defenders could fix them. The concern was valid. Attackers did use it. But so did defenders, and the net result was that the profession got better at finding and fixing things, and the race continued more or less as before.

Thirty years on, we find ourselves in a familiar spot. Anthropic’s Mythos Preview has arrived, and the sky is, once again, falling.

What has Mythos done in testing?

Let’s give credit where it’s due: Mythos (from what we’ve read; we’re not allowed in the cool-kids treehouse yet!) seems genuinely impressive. The UK’s AI Security Institute (AISI) tested it independently and the results are worth understanding properly rather than through the filter of media coverage.

In capture-the-flag (CTF) challenges at various difficulty levels, Mythos performed well but not dramatically better than its contemporaries. At all four levels, no model exceeded any other by more than ten percentage points. If you only looked at the CTF results, you’d say it was a solid incremental improvement along an existing trend line.

The more significant result is with chained attacks. AISI built a 32-step cyber range called “The Last Ones” (TLO) simulating a full attack chain from reconnaissance to network takeover, estimated at 20 hours of human effort. Mythos is the first model to solve it end-to-end, succeeding in three of ten attempts and averaging 22 steps. Claude Opus 4.6 (the previous best) averaged 16 steps, peaking at 28.

Anthropic says Claude Mythos Preview autonomously identified and exploited CVE-2026-4747, a 17-year-old FreeBSD RPCSEC_GSS/NFS-related remote code execution vulnerability. It also identified a 27-year-old denial-of-service flaw in OpenBSD’s TCP SACK implementation and a 16-year-old FFmpeg H.264 decoder vulnerability.

Mythos could also work fully autonomously, meaning no human involvement after the initial prompt. For context, Anthropic reports that Claude Opus 4.6 had a near-zero success rate at autonomous exploit development, producing working exploits only twice in the earlier comparison.

These are real capabilities and they represent a genuine step change in what AI can do offensively.

Hop on the reality bus… even if the ride is boring!

But there are important caveats that tend to get lost in the mainstream media and infomercial coverage! They don’t make for apocalyptic news headlines, but let’s look at them anyway!

First, Mythos failed AISI’s OT-focused “Cooling Tower” range, getting stuck on the IT sections. Second, the ranges lack active defenders and defensive tooling, and they do not penalise models for actions that would trigger security alerts. Third, and this is from AISI’s own conclusion: they cannot say whether Mythos would succeed against a well-defended system. What they can say is that Mythos is capable of autonomously attacking small, weakly defended and vulnerable enterprise systems where access to a network has already been gained.

Note that last qualifier, “weakly defended and vulnerable,” because it does a lot of heavy lifting. It’s the difference between “AI can break into anything” (which is what the headlines suggest) and “AI can exploit systems that are already broken” (which is what the data shows).

Rob Bowley did some useful cost analysis on the AISI data. Accounting for Mythos’s token pricing (five times Opus 4.6 per token), the high variance across runs, and the higher token usage on the harder later steps, a successful Mythos run comes out at roughly US$880-3,500. A human expert completing the same range: 14 hours, once, reliably. The economics are interesting to say the least!

Independent analysis has been even less flattering. VulnCheck researcher Patrick Garrity estimated the confirmed vulnerability count at around 40, despite Anthropic’s claim of “thousands of additional high- and critical-severity vulnerabilities.” Mozilla’s CTO Bobby Holley, after confirming that Mythos found 271 vulnerabilities in Firefox, noted that none were bugs that an elite human researcher couldn’t have found. And an Aisle replication study tested Mythos’s showcase vulnerabilities using small, cheap, open-weight models and found they produced much of the same analysis.

The 27-year-old denial-of-service flaw in OpenBSD’s TCP SACK implementation and the 16-year-old FFmpeg H.264 decoder vulnerability? Well, don’t forget that Mythos had access to the source code: FreeBSD, OpenBSD and FFmpeg are all free and open source software (FOSS) projects which, as per the stated intention of FOSS, allow anyone to review and improve the code. But, I digress…

And the Firefox JavaScript-engine vulnerabilities? Well, they’re real, but the benchmark did not represent exploitation of a stock end-user Firefox browser: Anthropic says it used a testing harness mimicking a Firefox 147 content process, without the browser process sandbox or other defence-in-depth mitigations. So if you turn off security controls, apparently things get less secure… who’d have guessed!?

Then there is the irony. Anthropic restricted access to Mythos under Project Glasswing, worried it was too dangerous for general release. Within days, unauthorised users accessed it by guessing URL patterns from previous deployments, using information exposed in the Mercor data breach and access associated with a third-party environment. Mercor, an AI staffing firm supplying contractors to Anthropic, was itself compromised via the LiteLLM supply chain attack in March 2026: a cascading compromise that started with a backdoored security-vulnerability scanner!

So while the media is focussed on excitedly interviewing the big end of town about the futuristic and unstoppable AI cyber weapon, it forgets to note that the company associated with building the “weapon” was compromised by the kind of decidedly un-futuristic supply chain attack that proper vendor risk management and credential rotation would have mitigated. As Horizon3.ai CEO Snehal Antani noted: attackers don’t need Mythos. They just need a target organisation that fails to implement effective risk identification and treatment plans, and proper credential management.

The reality bus can be exciting! Keeping it real!

Make no mistake: AI really does accelerate the pace with which attackers can get stuff done, and we use AI tools extensively in our own testing and assessment work. So we have direct experience of how AI changes the game for offensive-security practitioners.

Our Head of Testing and Assessment has spent recent months using LLMs to develop custom command and control (C2) tooling for red team engagements. The objective is straightforward: off-the-shelf C2 frameworks like Cobalt Strike and Sliver have well-known signatures that EDR products are trained to recognise. By contrast, our own custom tooling, built from scratch using non-standard implementations, often sidesteps those signatures quite effectively.

The development process is instructive. Rather than asking an LLM to “build me a C2” (which tends to hit guardrails and bypasses the developer’s understanding of the code), the approach was iterative: build a small component, ask the LLM to extend it, refine, repeat. By including specific constraints in the prompts (“don’t use library X,” “implement the parsing natively rather than importing a standard module”), the LLM produces custom implementations that avoid the indicators of compromise (IOCs) that defenders rely on. The resulting client binary is around 50 kilobytes, is capable of most reconnaissance tasks, and took one person a couple of months of spare time to build. Oh, and it can and does bypass what is likely to be your favourite EDR!

During development, one model introduced a subtle but critical error: it implemented a network protocol using little-endian byte order instead of the big-endian required by the specification. The code worked in local testing but failed against real-world infrastructure. After several unsuccessful debugging attempts, a switch to a more capable (and more expensive) model identified the root cause immediately. The fix was applied, development switched back to the cheaper model, which now had the corrected context and continued without issue. The ability to swap models while maintaining development context is a significant capability accelerator.
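To make the byte-order trap concrete, here is a minimal, hypothetical sketch in plain Python. It is not our actual tooling or any real protocol, just a generic 32-bit length field that a specification requires in network (big-endian) byte order. If both ends of a local test pack and unpack with the same wrong order, everything appears to work, which is exactly why this class of bug only surfaces against real infrastructure that follows the spec.

```python
import struct

# Hypothetical protocol field: a 32-bit message length that the
# specification requires in network (big-endian) byte order.
length = 300  # 0x0000012C

wrong = struct.pack("<I", length)  # little-endian: 2c 01 00 00
right = struct.pack(">I", length)  # big-endian (network order): 00 00 01 2c

# If the *same* wrong order is used to pack and unpack, local tests pass...
assert struct.unpack("<I", wrong)[0] == 300

# ...but a remote peer following the spec reads a wildly different value.
assert struct.unpack(">I", wrong)[0] == 0x2C010000  # 738,263,040, not 300

print(wrong.hex(), right.hex())  # 2c010000 0000012c
```

The asserts pass, and that is the point: the defect is invisible until the code talks to something that actually implements the specification.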

And none of this requires access to Mythos, or even to commercial API services. Capable open-weight models like Kimi can be run locally, with quantisation techniques bringing larger models within reach of consumer GPUs. A motivated attacker running an open-weight model locally has no guardrails, no usage logging, no terms of service, and capabilities already far beyond what is needed to breach an organisation with immature security controls. The tightening of guardrails on commercial models (which our testers have observed first-hand) is a reasonable step, but it is not a meaningful barrier to anyone determined enough to run their own infrastructure.
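For readers unfamiliar with what “running a quantised open-weight model locally” involves in practice, the workflow is mundane. The sketch below is generic and makes assumptions (the llama-cpp-python bindings and a hypothetical GGUF model file); it is not a description of any particular setup, just an illustration of how low the tooling barrier is.

```python
# Generic sketch: load a quantised open-weight model with llama-cpp-python
# and generate text entirely on local hardware. The model path is a
# placeholder; any GGUF-format quantised model works the same way.
from llama_cpp import Llama

llm = Llama(
    model_path="some-open-weight-model.Q4_K_M.gguf",  # hypothetical file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to a local GPU if available
)

response = llm(
    "Summarise the key findings of a vulnerability scan report.",
    max_tokens=256,
)
print(response["choices"][0]["text"])
```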

We’ll go into more detail in a follow-up post, but the point is this: AI eliminates much of the programming knowledge barrier for offensive development, allowing a practitioner who knows what they want to build to get there much faster than they could by hand. What it does not do is replace the expertise required to know what to build in the first place, or to understand why the output is wrong when it is. The experienced tester who uses AI as a development accelerator is genuinely more dangerous than they were two years ago. A novice asking an LLM to “hack this website” is going to have a bad time.

The proof-of-concept development for our recent Application Control bypass research (CVE-2026-25166) was AI-assisted. We use AI-generated content in our social engineering and phishing exercises. Our testers have run the custom C2 tooling described above against leading EDR products, with encouraging results. AI is a force multiplier for competent practitioners, on both sides of the fence. But it is a multiplier, not a replacement, and the fundamental equation hasn’t changed. The sky is not falling. The question is whether you’ve been up to check the roof lately!

The real bogeyman is actually quite sad

So what are Australian organisations actually getting breached by in 2026? Not Mythos.

Despite the breathless, sweaty coverage in news articles like this infomercial from the ABC, the real bogeyman is one that Australian businesses are well able to deal with. And as described in the Verizon 2025 Data Breach Investigations Report (analysing over 22,000 incidents across 139 countries), it’s much more prosaic:

  • Stolen credentials remain the most common initial access vector, used in 22% of breaches.
  • Exploitation of edge device and VPN vulnerabilities surged eightfold year-on-year, yet only 54% of those vulnerabilities were fully remediated, with a median time to patch of 32 days against a median time to exploitation of zero days.
  • Ransomware was present in 44% of all breaches, up from 32%, with 88% of those incidents hitting small and medium-sized businesses.

Not exciting, not front page news, not glamorous… just the evidence-based reality of most days in real infosec!

Even the consequences of a breach are tired and well-trodden! Businesses that are breached pay a ransom, same as it ever was. Of course, in Australia we have gummy reporting laws so we don’t hear about breaches too frequently but they’re happening: In a very interesting article, iTnews reported that at least 75 Australian businesses with turnover above $3 million paid ransomware groups in the first eight months of mandatory disclosure. Between seven and thirteen a month. And those are just the ones bound by our limited reporting requirements (think critical infrastructure) and that are above the reporting threshold.

We’ve been here, or close to here, before

The pattern is familiar. A new offensive capability appears. The media has a field day. Vendors use the fear to sell services. And the actual fix turns out to be the same boring stuff it has always been.

In 2020, when the Australian Prime Minister warned of “sophisticated, state-based cyber actors” targeting Australian organisations, we wrote a post pointing out that the attackers were exploiting known vulnerabilities with patches available for months, and falling back to standard spear-phishing when that didn’t work. The Telerik UI vulnerability they were reportedly exploiting could be demonstrated with Metasploit by anyone with the slightest clue. Despite the resigned doom of the announcement, we took an optimistic view: it was wrong for organisations to accept that they must fall victim to this unstoppable APT force. They could instead take proactive steps to implement risk-based measures with reference to accepted frameworks and, with small consistent improvements, be a master of their own destiny.

The APT (Advanced Persistent Threat) craze made a lot of people a lot of money. A new TLA was made popular, threat intelligence platforms were sold, and entire business models were built on the premise that only expensive, specialised defences could protect against nation-state adversaries. And yet, when you looked at what the “advanced” threats were actually doing, it was the same old, boring, not-leading-edge techniques: shared accounts and single-factor authentication, exploiting unpatched software, phishing credentials, abusing misconfigured access controls, and moving laterally through flat networks.

The same was true when OpenAI announced in 2019 that GPT-2 was “too dangerous to release” because it could generate convincing fake text. That particular sky didn’t fall either. Phishing attacks got somewhat better but had no additional effect on the organisations that were proactive and implemented a risk-based, maturity-improvement program.

By all means get breathless… but then take control!

Here’s the thing that tends to get lost in the noise: AISI’s blog post ends with a recommendation that amounts to “do the basics well”: security updates, access controls, security configuration, and logging. The SANS editors who commented on the AISI findings reached the same conclusion. Lee Dukes pointed to the CIS Controls Implementation Group 1 as an effective defence. William Murray’s entire contribution was seven words: “Think least privilege and defence in depth.”

And to add to the sentiment, TrustedSec’s Justin Elze puts it well when he says: Mythos doesn’t replace the existing reality of how organisations get compromised; it lands on top of it. What changes is the cost of ignoring the fundamentals.

All of which is great, because it’s what we have been saying for years, writing in these blogs, chatting to our clients, and occasionally screaming into the void: Risk management frameworks exist. The CIS Controls exist. The ACSC Essential Eight exists. ISO 27001 exists. The NIST CSF v2 exists. None of these are new, none of them are exciting, and none of them will generate feverish media coverage. But they work, when they are actually implemented, verified and maintained.

And what if that doesn’t happen? Well, for just one example of many: in a recent penetration test we discovered that a national managed service provider (MSP) was using shared, single-factor accounts for their client’s system administration work. And there was no evidence that the password had ever been changed. This is an MSP, an organisation that other businesses trust with their IT environments! You don’t need Mythos to compromise that network: you need to find a business that uses an MSP with password-only authentication, and ten minutes. Boring, unexciting, real.

As Sarah *sighs wistfully* said, “The future is not set. There is no fate but what we make for ourselves”. Despite the hype from the media and big business, organisations are not helpless victims waiting for AI to come and destroy them. They are, with a risk-based approach and reference to control frameworks and standards, already well able to defend themselves against the kinds of attacks that they’re most likely to face, and to manage their defences in response to their frequently-reviewed level of risk.

But, if the organisation chooses inaction and complacency? The choice is there; AI will just accelerate the consequences.

What next?

If reading this has made you wonder whether your controls are actually working, or whether the policies on paper are reflected in your actual environment, give us a call. We can help you understand where you stand against frameworks like the CIS Controls, the ACSC Essential Eight, and ISO 27001, and prioritise the penetration testing, security-maturity and risk-based improvements that will make the most difference.

Getting overtaken by the AI-threat hype is a bit like going out in the middle of the day to stare up at the sky and freak out about a hailstorm that might hit next summer. It’s a real threat to be sure and you should certainly check your gutters and insurance, but right now as you stare up at the blue expanse, the thing that you should really be most worried about is whether or not you’re wearing sunscreen!

The best defence against AI-powered attacks turns out to be the same as the best defence against every other kind of attack: doing the boring stuff well, consistently, and verifying that it actually works.
