Blog
/
AI
/
July 26, 2022

Self-Learning AI for Zero-Day and N-Day Attack Defense

Explore the differences between zero-day and n-day attacks on different customer servers to learn how Darktrace detects and prevents cyber threats effectively.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Lewis Morgan
Cyber Analyst
Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
26
Jul 2022

Key Terms:

Zero-day | A recently discovered security vulnerability in computer software that has no currently available fix or patch. Its name come from the reality that vendors have “zero days” to act and respond.

N-day | A vulnerability that emerges in computer software in which a vendor is aware and may have already issued (or are currently working on) a patch or fix. Active exploits often already exist and await abuse by nefarious actors.

Traditional security solutions often apply signature-based-detection when identifying cyber threats, helping to defend against legacy attacks but consequently missing novel ones. Therefore, security teams often lend a lot of focus to ensuring that the risk of zero-day vulnerabilities is reduced [1]. As explored in this blog, however, organizations can face just as much of a risk from n-day attacks, since they invite the most attention from malicious actors [2]. This is due in part to the reduced complexity, cost and time invested in researching and finding new exploits compared with that found when attackers exploit zero-days. 

This blog will examine both a zero-day and n-day attack that two different Darktrace customers faced in the fall of 2021. This will include the activity Darktrace detected, along with the steps taken by Darktrace/Network to intervene. It will then compare the incidents, discuss the possible dangers of third-party integrations, and assess the deprecation of legacy security tools.

Revisiting zero-day attacks 

Zero-days are among the greatest concerns security teams face in the era of modern technology and networking. Defending critical systems from zero-day compromises is a task most legacy security solutions are often unable to handle. Due to the complexity of uncovering new security flaws and developing elaborate code that can exploit them, these attacks are often carried out by funded or experienced groups such as nation-state actors and APTs. One of history’s most prolific zero-days, ‘Stuxnet’, sent security teams worldwide into a global panic in 2010. This involved a widespread attack on Iranian nuclear infrastructure and was widely accepted to be a result of nation-state actors [3]. The Stuxnet worm took advantage of four zero-day exploits, compromising over 200,000 devices and physically damaging around 10% of the 9,000 critical centrifuges at the Natanz nuclear site. 

More recently, 2021 saw the emergence of several critical zero-day vulnerabilities within SonicWall’s product suite [4]. SonicWall is a security hardware manufacturer that provides hardware firewall devices, unified threat management, VPN gateways and network security solutions. Some of these vulnerabilities lie within their Secure Mobile Access (SMA) 100 series (for example, CVE-2019-7481, CVE-2021-20016 and CVE-2021-20038 to name a few). These directly affected VPN devices and often allowed attackers easy remote access to company devices. CVE-2021-20016 in particular incorporates an SQL-Injection vulnerability within SonicWall’s SSL VPN SMA 100 product line [5]. If exploited, this defect would allow an unauthenticated remote attacker to perform their own malicious SQL query in order to access usernames, passwords and other session related information. 

The N-day underdog

The shadow cast by zero-day attacks often shrouds that of n-day attacks. N-days, however, often pose an equal - if not greater - risk to the majority of organizations, particularly those in industrial sectors. Since these vulnerabilities have fixes available, all of the hard work around research is already done; malicious actors only need to view proof of concepts (POCs) or, if proficient in coding, reverse-engineer software to reveal code-changes (binary diffing) in order to exploit these security flaws in the wild. These vulnerabilities are typically attributed to opportunistic hackers and script-kiddies, where little research or heavy lifting is required.  

August 2021 gave rise to a critical vulnerability in Atlassian Confluence servers, namely CVE-2021-26084 [6]. Confluence is a widely used collaboration wiki tool and knowledge-sharing platform. As introduced and discussed a few months ago in a previous Darktrace blog (Explore Internet-Facing System Vulnerabilities), this vulnerability allows attackers to remotely execute code on internet-facing servers after exploiting injection vulnerabilities in Object-Graph Navigation Language (OGNL). Whilst Confluence had patches and fixes available to users, attackers still jumped on this opportunity and began scanning the internet for signs of critical devices serving this outdated software [7]. Once identified, they would  exploit the vulnerability, often installing crypto mining software onto the device. More recently, Darktrace explored a new vulnerability (CVE-2022-26134), disclosed midway through 2022, that affected Confluence servers and data centers using similar techniques to that found in CVE-2021-26084 [8]. 

SonicWall in the wild – 1. Zero-day attack

At the beginning of August 2021, Darktrace prevented an attack from taking place within a European automotive customer’s environment (Figure 1). The attack targeted a vulnerable internet-facing SonicWall VPN server, and while the attacker’s motive remains unclear, similar historic events suggest that they intended to perform ransomware encryption or data exfiltration. 

Figure 1: Timeline of the SonicWall attack 

Darktrace was unable to confirm the definite tactics, techniques and procedures (TTPs) used by the attacker to compromise the customer’s environment, as the device was compromised before Darktrace installation and coverage. However, from looking at recently disclosed SonicWall VPN vulnerabilities and patterns of behaviour, it is likely CVE-2021-20016 played a part. At some point after this initial infection, it is also believed the device was able to move laterally to a domain controller (DC) using administrative credentials; it was this server that then initiated the anomalous activity that Darktrace detected and alerted on. 

On August 5th 2021 , Darktrace observed this compromised domain controller engaging in unusual ICMP scanning - a protocol used to discover active devices within an environment and create a map of an organization’s network topology. Shortly after, the infected server began scanning devices for open RDP ports and enumerating SMB shares using unorthodox methods. SMB delete and HTTP requests (over port 445 and 80 respectively) were made for files named delete.me in the root directory of numerous network shares using the user agent Microsoft WebDAV. However, no such files appeared to exist within the environment. This may have been the result of an attacker probing devices in the network in an effort to see their responses and gather information on properties and vulnerabilities they could later exploit. 

Soon the infected DC began establishing RDP tunnels back to the VPN server and making requests to an internal DNS server for multiple endpoints relating to exploit kits, likely in an effort to strengthen the attacker’s foothold within the environment. Some of the endpoints requested relate to:

-       EternalBlue vulnerability 

-       Petit Potam NTLM hash attack tool

-       Unusual GitHub repositories

-       Unusual Python repositories  

The DC made outgoing NTLM requests to other internal devices, implying the successful installation of Petit Potam exploitation tools. The server then began performing NTLM reconnaissance, making over 1,000 successful logins under ‘Administrator’ to several other internal devices. Around the same time, the device was also seen making anonymous SMBv1 logins to numerous internal devices, (possibly symptomatic of the attacker probing machines for EternalBlue vulnerabilities). 

Interestingly, the device also made numerous failed authentication attempts using a spoofed credential for one of the organization’s security managers. This was likely in an attempt to hide themselves using ‘Living off the Land’ (LotL) techniques. However, whilst the attacker clearly did their research on the company, they failed to acknowledge the typical naming convention used for credentials within the environment. This ultimately backfired and made the compromise more obvious and unusual. 

In the morning of the following day, the initially compromised VPN server began conducting further reconnaissance, engaging in similar activity to that observed by the domain controller. Until now, the customer had set Darktrace RESPOND to run in human confirmation mode, meaning interventions were not made autonomously but required confirmation by a member of the internal security team. However, thanks to Proactive Threat Notifications (PTNs) delivered by Darktrace’s dedicated SOC team, the customer was made immediately aware of this unusual behaviour, allowing them to apply manual Darktrace RESPOND blocks to all outgoing connections (Figure 2). This gave the security team enough time to respond and remediate before serious damage could be done.

Figure 2: Darktrace RESPOND model breach showing the manually applied “Quarantine Device” action taken against the compromised VPN server. This screenshot displays the UI from Darktrace version 5.1

Confluence in the wild – 2. N-day attack

Towards the end of 2021, Darktrace saw a European broadcasting customer leave an Atlassian Confluence internet-facing server unpatched and vulnerable to crypto-mining malware using CVE-2021-26084. Thanks to Darktrace, this attack was entirely immobilized within only a few hours of the initial infection, protecting the organization from damage (Figure 3). 

Figure 3: Timeline of the Confluence attack

On midday on September 1st 2021, an unpatched Confluence server was seen receiving SSL connections over port 443 from a suspicious new endpoint, 178.238.226[.]127.  The connections were encrypted, meaning Darktrace was unable to view the contents and ascertain what requests were being made. However, with the disclosure of CVE-2021-26084 just 7 days prior to this activity, it is likely that the TTPs used involved injecting OGNL expressions to Confluence server memory; allowing the attacker to remotely execute code on the vulnerable server.

Immediately after successful exploitation of the Confluence server, the infected device was observed making outgoing HTTP GET requests to several external endpoints using a new user agent (curl/7.61.1). Curl was used to silently download and configure multiple suspicious files relating to XMRig cryptocurrency miner, including ld.sh, XMRig and config.json. Subsequent outgoing connections were then made to europe.randomx-hub.miningpoolhub[.]com · 172.105.210[.]117 using the JSON-RPC protocol, seen alongside the mining credential maillocal.confluence (Figure 4). Only 3 seconds after initial compromise, the infected device began attempting to mine cryptocurrency using the Minergate protocol but was instantly and autonomously blocked by Darktrace RESPOND. This prevented the server from abusing system resources and generating profits for the attacker.

Figure 4: A graph showing the frequency of external connections using the JSON-RPC protocol made by the breach device over a 48-hour window. The orange-red dots represent models that breached as a result of this activity, demonstrating the “waterfall” effect commonly seen when a device suffers a compromise. This screenshot displays the UI from Darktrace version 5.1

In the afternoon, the malware persisted with its infection. The compromised server began making successive HTTP GET requests to a new rare endpoint 195.19.192[.]28 using the same curl user agent (Figures 5 & 6). These requests were for executable and dynamic library files associated with Kinsing malware (but fortunately were also blocked by Darktrace RESPOND). Kinsing is a malware strain found in numerous attack campaigns which is often associated with crypto-jacking, and has appeared in previous Darktrace blogs [9].

Figure 5: Cyber AI Analyst summarising the unusual download of Kinsing software using the new curl user agent. This screenshot displays the UI from Darktrace version 5.1

The attacker then began making HTTP POST requests to an IP 185.154.53[.]140, using the same curl user agent; likely a method for the attacker to maintain persistence within the network and establish a foothold using its C2 infrastructure. The Confluence server was then again seen attempting to mine cryptocurrency using the Minergate protocol. It made outgoing JSON-RPC connections to a different new endpoint, 45.129.2[.]107, using the following mining credential: ‘42J8CF9sQoP9pMbvtcLgTxdA2KN4ZMUVWJk6HJDWzixDLmU2Ar47PUNS5XHv4Kmfdh8aA9fbZmKHwfmFo8Wup8YtS5Kdqh2’. This was once again blocked by Darktrace RESPOND (Figure 7). 

Figure 6: VirusTotal showing the unusualness of one of these external IPs [10]
Figure 7: Log data showing the action taken by Darktrace RESPOND in response to the device breaching the “Crypto Currency Mining Activity” model. This screenshot displays the UI from Darktrace version 5.1

The final activity seen from this device involved the download of additional shell scripts over HTTP associated with Kinsing, namely spre.sh and unk.sh, from 194.38.20[.]199 and 195.3.146[.]118 respectively (Figure 8). A new user agent (Wget/1.19.5 (linux-gnu)) was used when connecting to the latter endpoint, which also began concurrently initiating repeated connections indicative of C2 beaconing. These scripts help to spread the Kinsing malware laterally within the environment and may have been the attacker's last ditch efforts at furthering their compromise before Darktrace RESPOND blocked all connections from the infected Confluence server [11]. With Darktrace RESPOND's successful actions, the customer’s security team were then able to perform their own response and remediation. 

Figure 8: Cyber AI Analyst revealing the last ditch efforts made by the threat actor to download further malicious software. This screenshot displays the UI from Darktrace version 5.1

Darktrace Coverage: N- vs Zero-days

In the SonicWall case the attacker was unable to achieve their actions on objectives (thanks to Darktrace's intervention). However, this incident displayed tactics of a more stealthy and sophisticated attacker - they had an exploited machine but waited for the right moment to execute their malicious code and initiate a full compromise. Due to the lack of visibility over attacker motive, it is difficult to deduce what type of actor led to this intrusion. However, with the disclosure of a zero-day vulnerability (CVE-2021-20016) not long before this attack, along with a seemingly dormant initially compromised device, it is highly possible that it was carried out by a sophisticated cyber criminal or gang. 

On the other hand, the Confluence case engaged in a slightly more noisy approach; it dropped crypto mining malware on vulnerable devices in the hope that the target’s security team did not maintain visibility over their network or would merely turn a blind eye. The files downloaded and credentials observed alongside the mining activity heavily imply the use of Kinsing malware [11]. Since this vulnerability (CVE-2021-26084) emerged as an n-day attack with likely easily accessible POCs, as well as there being a lack of LotL techniques and the motive being long term monetary gain, it is possible this attack was conducted by a less sophisticated or amateur actor (script-kiddie); one that opportunistically exploits known vulnerabilities in internet-facing devices in order to make a quick profit [12].

Whilst Darktrace RESPOND was enabled in human confirmation mode only during the start of the SonicWall attack, Darktrace’s Cyber AI Analyst still offered invaluable insight into the unusual activity associated with the infected machines during both the Confluence and SonicWall compromises. SOC analysts were able to see these uncharacteristic behaviours and escalate the incident through Darktrace’s PTN and ATE services. Analysts then worked through these tickets with the customers, providing support and guidance and, in the SonicWall case, quickly helping to configure Darktrace RESPOND. In both scenarios, Darktrace RESPOND was able to block abnormal connections and enforce a device’s pattern of life, affording the security team enough time to isolate the infected machines and prevent further threats such as ransomware detonation or data exfiltration. 

Concluding thoughts and dangers of third-party integrations 

Organizations with internet-facing devices will inevitably suffer opportunistic zero-day and n-day attacks. While little can be done to remove the risk of zero-days entirely, ensuring that organizations keep their systems up to date will at the very least help prevent opportunistic and script-kiddies from exploiting n-day vulnerabilities.  

However, it is often not always possible for organizations to keep their systems up to date, especially for those who require continuous availability. This may also pose issues for organizations that rely on, and put their trust in, third party integrations such as those explored in this blog (Confluence and SonicWall), as enforcing secure software is almost entirely out of their hands. Moreover, with the rising prevalence of remote working, it is essential now more than ever that organizations ensure their VPN devices are shielded from external threats, guidance on which has been released by the NSA/CISA [13].

These two case studies have shown that whilst organizations can configure their networks and firewalls to help identify known indicators of compromise (IoC), this ‘rearview mirror’ approach will not account for, or protect against, any new and undisclosed IoCs. With the aid of Self-Learning AI and anomaly detection, Darktrace can detect the slightest deviation from a device’s normal pattern of life and respond autonomously without the need for rules and signatures. This allows for the disruption and prevention of known and novel attacks before irreparable damage is caused- reassuring security teams that their digital estates are secure. 

Thanks to Paul Jennings for his contributions to this blog.

Appendices: SonicWall (Zero-day)

Darktrace model detections

·      AIA / Suspicious Chain of Administrative Credentials

·      Anomalous Connection / Active Remote Desktop Tunnel

·      Anomalous Connection / SMB Enumeration

·      Anomalous Connection / Unusual Internal Remote Desktop

·      Compliance / High Priority Compliance Model Breach

·      Compliance / Outgoing NTLM Request from DC

·      Device / Anomalous RDP Followed By Multiple Model Breaches

·      Device / Anomalous SMB Followed By Multiple Model Breaches

·      Device / ICMP Address Scan

·      Device / Large Number of Model Breaches

·      Device / Large Number of Model Breaches from Critical Network Device

·      Device / Multiple Lateral Movement Model Breaches (PTN/Enhanced Monitoring model)

·      Device / Network Scan

·      Device / Possible SMB/NTLM Reconnaissance

·      Device / RDP Scan

·      Device / Reverse DNS Sweep

·      Device / SMB Session Bruteforce

·      Device / Suspicious Network Scan Activity (PTN/Enhanced Monitoring model)

·      Unusual Activity / Possible RPC Recon Activity

Darktrace RESPOND (Antigena) actions (as displayed in example)

·      Antigena / Network / Manual / Quarantine Device

MITRE ATT&CK Techniques Observed
IoCs

Appendices: Confluence (N-day)

Darktrace model detections

·      Anomalous Connection / New User Agent to IP Without Hostname

·      Anomalous Connection / Posting HTTP to IP Without Hostname

·      Anomalous File / EXE from Rare External Location

·      Anomalous File / Script from Rare Location

·      Compliance / Crypto Currency Mining Activity

·      Compromise / High Priority Crypto Currency Mining (PTN/Enhanced Monitoring model)

·      Device / Initial Breach Chain Compromise (PTN/Enhanced Monitoring model)

·      Device / Internet Facing Device with High Priority Alert

·      Device / New User Agent

Darktrace RESPOND (Antigena) actions (displayed in example)

·      Antigena / Network / Compliance / Antigena Crypto Currency Mining Block

·      Antigena / Network / External Threat / Antigena File then New Outbound Block

·      Antigena / Network / External Threat / Antigena Suspicious Activity Block

·      Antigena / Network / External Threat / Antigena Suspicious File Block

·      Antigena / Network / Significant Anomaly / Antigena Block Enhanced Monitoring

MITRE ATT&CK Techniques Observed
IOCs

References:

[1] https://securitybrief.asia/story/why-preventing-zero-day-attacks-is-crucial-for-businesses

[2] https://electricenergyonline.com/energy/magazine/1150/article/Security-Sessions-More-Dangerous-Than-Zero-Days-The-N-Day-Threat.htm

[3] https://www.wired.com/2014/11/countdown-to-zero-day-stuxnet/

[4] https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=SonicWall+2021 

[5] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-20016

[6] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-26084

[7] https://www.zdnet.com/article/us-cybercom-says-mass-exploitation-of-atlassian-confluence-vulnerability-ongoing-and-expected-to-accelerate/

[8] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-26134

[9] https://attack.mitre.org/software/S0599/

[10] https://www.virustotal.com/gui/ip-address/195.19.192.28/detection 

[11] https://sysdig.com/blog/zoom-into-kinsing-kdevtmpfsi/

[12] https://github.com/alt3kx/CVE-2021-26084_PoC

[13] https://www.nsa.gov/Press-Room/Press-Releases-Statements/Press-Release-View/Article/2791320/nsa-cisa-release-guidance-on-selecting-and-hardening-remote-access-vpns/

Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Lewis Morgan
Cyber Analyst

More in this series

No items found.

Blog

/

Email

/

May 1, 2026

How email-delivered prompt injection attacks can target enterprise AI – and why it matters

Default blog imageDefault blog image

What are email-delivered prompt injection attacks?

As organizations rapidly adopt AI assistants to improve productivity, a new class of cyber risk is emerging alongside them: email-delivered AI prompt injection. Unlike traditional attacks that target software vulnerabilities or rely on social engineering, this is the act of embedding malicious or manipulative instructions into content that an AI system will process as part of its normal workflow. Because modern AI tools are designed to ingest and reason over large volumes of data, including emails, documents, and chat histories, they can unintentionally treat hidden attacker-controlled text as legitimate input.  

At Darktrace, our analysis has shown an increase of 90% in the number of customer deployments showing signals associated with potential prompt injection attempts since we began monitoring for this type of activity in late 2025. While it is not always possible to definitively attribute each instance, internal scoring systems designed to identify characteristics consistent with prompt injection have recorded a growing number of high-confidence matches. The upward trend suggests that attackers are actively experimenting with these techniques.

Recent examples of prompt injection attacks

Two early examples of this evolving threat are HashJack and ShadowLeak, which illustrate prompt injection in practice.

HashJack is a novel prompt injection technique discovered in November 2025 that exploits AI-powered web browsers and agentic AI browser assistants. By hiding malicious instructions within the URL fragment (after the # symbol) of a legitimate, trusted website, attackers can trick AI web assistants into performing malicious actions – potentially inserting phishing links, fake contact details, or misleading guidance directly into what appears to be a trusted AI-generated output.

ShadowLeak is a prompt injection method to exfiltrate PII identified in September 2025. This was a flaw in ChatGPT (now patched by OpenAI) which worked via an agent connected to email. If attackers sent the target an email containing a hidden prompt, the agent was tricked into leaking sensitive information to the attacker with no user action or visible UI.

What’s the risk of email-delivered prompt injection attacks?

Enterprise AI assistants often have complete visibility across emails, documents, and internal platforms. This means an attacker does not need to compromise credentials or move laterally through an environment. If successful, they can influence the AI to retrieve relevant information seamlessly, without the labor of compromise and privilege escalation.

The first risk is data exfiltration. In a prompt injection scenario, malicious instructions may be embedded within an ordinary email. As in the ShadowLeak attack, when AI processes that content as part of a legitimate task, it may interpret the hidden text as an instruction. This could result in the AI disclosing sensitive data, summarizing confidential communications, or exposing internal context that would otherwise require significant effort to obtain.

The second risk is agentic workflow poisoning. As AI systems take on more active roles, prompt injection can influence how they behave over time. An attacker could embed instructions that persist across interactions, such as causing the AI to include malicious links in responses or redirect users to untrusted resources. In this way, the attacker inserts themselves into the workflow, effectively acting as a man-in-the-middle within the AI system.

Why can’t other solutions catch email-delivered prompt injection attacks?

AI prompt injection challenges many of the assumptions that traditional email security is built on. It does not fit the usual patterns of phishing, where the goal is to trick a user into clicking a link or opening an attachment.  

Most security solutions are designed to detect signals associated with user engagement: suspicious links, unusual attachments, or social engineering cues. Prompt injection avoids these indicators entirely, meaning there are fewer obvious red flags.

In this case, the intention is actually the opposite of user solicitation. The objective is simply for the email to be delivered and remain in the inbox, appearing benign and unremarkable. The malicious element is not something the recipient is expected to engage with, or even notice.

Detection is further complicated by the nature of the prompts themselves. Unlike known malware signatures or consistent phishing patterns, injected prompts can vary widely in structure and wording. This makes simple pattern-matching approaches, such as regex, unreliable. A broad rule set risks generating large numbers of false positives, while a narrow one is unlikely to capture the diversity of possible injections.

How does Darktrace catch these types of attacks?

The Darktrace approach to email security more generally is to look beyond individual indicators and assess context, which also applies here.  

For example, our prompt density score identifies clusters of prompt-like language within an email rather than just single occurrences. Instead of treating the presence of a phrase as a blocking signal, the focus is on whether there is an unusual concentration of these patterns in a way that suggests injection. Additional weighting can be applied where there are signs of obfuscation. For example, text that is hidden from the user – such as white font or font size zero – but still readable by AI systems can indicate an attempt to conceal malicious prompts.

This is combined with broader behavioral signals. The same communication context used to detect other threats remains relevant, such as whether the content is unusual for the recipient or deviates from normal patterns.

Ask your email provider about email-delivered AI prompt injection

Prompt injection targets not just employees, but the AI systems they rely on, so security approaches need to account for both.

Though there are clear indications of emerging activity, it remains to be seen how popular prompt injection will be with attackers going forward. Still, considering the potential impact of this attack type, it’s worth checking if this risk has been considered by your email security provider.

Questions to ask your email security provider

  • What safeguards are in place to prevent emails from influencing AI‑driven workflows over time?
  • How do you assess email content that’s benign for a human reader, but may carry hidden instructions intended for AI systems?
  • If an email contains no links, no attachments, and no social engineering cues, what signals would your platform use to identify malicious intent?

Visit the Darktrace / EMAIL product hub to discover how we detect and respond to advanced communication threats.  

Learn more about securing AI in your enterprise.

Continue reading
About the author
Kiri Addison
Senior Director of Product

Blog

/

AI

/

April 30, 2026

Mythos vs Ethos: Defending in an Era of AI‑Accelerated Vulnerability Discovery

mythos vulnerability discoveryDefault blog imageDefault blog image

Anthropic’s Mythos and what it means for security teams

Recent attention on systems such as Anthropic Mythos highlights a notable problem for defenders. Namely that disclosure’s role in coordinating defensive action is eroding.

As AI systems gain stronger reasoning and coding capability, their usefulness in analyzing complex software environments and identifying weaknesses naturally increases. What has changed is not attacker motivation, but the conditions under which defenders learn about and organize around risk. Vulnerability discovery and exploitation increasingly unfold in ways that turn disclosure into a retrospective signal rather than a reliable starting point for defense.

Faster discovery was inevitable and is already visible

The acceleration of vulnerability discovery was already observable across the ecosystem. Publicly disclosed vulnerabilities (CVEs) have grown at double-digit rates for the past two years, including a 32% increase in 2024 according to NIST, driven in part by AI even prior to Anthropic’s Mythos model. Most notably XBOW topped the HackerOne US bug bounty leaderboard, marking the first time an autonomous penetration tester had done so.  

The technical frontier for AI capabilities has been described elsewhere as jagged, and the implication is that Mythos is exceptional but not unique in this capability. While Mythos appears to make significant progress in complex vulnerability analysis, many other models are already able to find and exploit weaknesses to varying degrees.  

What matters here is not which model performs best, but the fact that vulnerability discovery is no longer a scarce or tightly bounded capability.

The consequence of this shift is not simply earlier discovery. It is a change in the defender-attacker race condition. Disclosure once acted as a rough synchronization point. While attackers sometimes had earlier knowledge, disclosure generally marked the moment when risk became visible and defensive action could be broadly coordinated. Increasingly, that coordination will no longer exist. Exploitation may be underway well before a CVE is published, if it is published at all.

Why patch velocity alone is not the answer

The instinctive response to this shift is to focus on patching faster, but treating patch velocity as the primary solution misunderstands the problem. Most organizations are already constrained in how quickly they can remediate vulnerabilities. Asset sprawl, operational risk, testing requirements, uptime commitments, and unclear ownership all limit response speed, even when vulnerabilities are well understood.

If discovery and exploitation now routinely precede disclosure, then patching cannot be the first line of defense. It becomes one necessary control applied within a timeline that has already shifted. This does not imply that organizations should patch less. It means that patching cannot serve as the organizing principle for defense.

Defense needs a more stable anchor

If disclosure no longer defines when defense begins, then defense needs a reference point that does not depend on knowing the vulnerability in advance.  

Every digital environment has a behavioral character. Systems authenticate, communicate, execute processes, and access resources in relatively consistent ways over time. These patterns are not static rules or signatures. They are learned behaviors that reflect how an organization operates.

When exploitation occurs, even via previously unknown vulnerabilities, those behavioral patterns change.

Attackers may use novel techniques, but they still need to gain access, create processes, move laterally, and will ultimately interact with systems in ways that diverge from what is expected. That deviation is observable regardless of whether the underlying weakness has been formally named.

In an environment where disclosure can no longer be relied on for timing or coordination, behavioral understanding is no longer an optional enhancement; it becomes the only consistently available defensive signal.

Detecting risk before disclosure

Darktrace’s threat research has consistently shown that malicious activity often becomes visible before public disclosure.

In multiple cases, including exploitation of Ivanti, SAP NetWeaver, and Trimble Cityworks, Darktrace detected anomalous behavior days or weeks ahead of CVE publication. These detections did not rely on signatures, threat intelligence feeds, or awareness of the vulnerability itself. They emerged because systems began behaving in ways that did not align with their established patterns.

This reflects a defensive approach grounded in ‘Ethos’, in contrast to the unbounded exploration represented by ‘Mythos’. Here, Mythos describes continuous vulnerability discovery at speed and scale. Ethos reflects an understanding of what is normal and expected within a specific environment, grounded in observed behavior.

Revisiting assume breach

These conditions reinforce a principle long embedded in Zero Trust thinking: assume breach.

If exploitation can occur before disclosure, patching vulnerabilities can no longer act as the organizing principle for defense. Instead, effective defense must focus on monitoring for misuse and constraining attacker activity once access is achieved. Behavioral monitoring allows organizations to identify early‑stage compromise and respond while uncertainty remains, rather than waiting for formal verification.

AI plays a critical role here, not by predicting every exploit, but by continuously learning what normal looks like within a specific environment and identifying meaningful deviation at machine speed. Identifying that deviation enables defenders to respond by constraining activity back towards normal patterns of behavior.

Not an arms race, but an asymmetry

AI is often framed as fueling an arms race between attackers and defenders. In practice, the more important dynamic is asymmetry.

Attackers operate broadly, scanning many environments for opportunities. Defenders operate deeply within their own systems, and it’s this business context which is so significant. Behavioral understanding gives defenders a durable advantage. Attackers may automate discovery, but they cannot easily reproduce what belonging looks like inside a particular organization.

A changed defensive model

AI‑accelerated vulnerability discovery does not mean defenders have lost. It does mean that disclosure‑driven, patch‑centric models no longer provide a sufficient foundation for resilience.

As vulnerability volumes grow and exploitation timelines compress, effective defense increasingly depends on continuous behavioral understanding, detection that does not rely on prior disclosure, and rapid containment to limit impact. In this model, CVEs confirm risk rather than define when defense begins.

The industry has already seen this approach work in practice. As AI continues to reshape both offense and defense, behavioral detection will move from being complementary to being essential.

Continue reading
About the author
Andrew Hollister
Principal Solutions Engineer, Cyber Technician
Your data. Our AI.
Elevate your network security with Darktrace AI