April 16, 2025

Introducing Version 2 of Darktrace’s Embedding Model for Investigation of Security Threats (DEMIST-2)

Learn how Darktrace’s DEMIST-2 embedding model delivers high-accuracy threat classification and detection across any environment, outperforming larger models with efficiency and precision.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.

DEMIST-2 is Darktrace’s latest embedding model, built to interpret and classify security data with precision. It performs highly specialized tasks and can be deployed in any environment. Unlike generative language models, DEMIST-2 focuses on providing reliable, high-accuracy detections for critical security use cases.

DEMIST-2 Core Capabilities:  

  • Enhances Cyber AI Analyst’s ability to triage and reason about security incidents by providing expert representation and classification of security data, as part of our broader multi-layered AI system
  • Classifies and interprets security data, in contrast to language models that generate unpredictable open-ended text responses  
  • Incorporates new innovations in language model development and architecture, optimized specifically for cybersecurity applications
  • Deploys across cloud, on-prem, and edge environments, delivering low-latency, high-accuracy inference wherever it runs

Cybersecurity is constantly evolving, but the need to build precise and reliable detections remains constant in the face of new and emerging threats. Darktrace’s Embedding Model for Investigation of Security Threats (DEMIST-2) addresses these critical needs and is designed to create stable, high-fidelity representations of security data while also serving as a powerful classifier. For security teams, this means faster, more accurate threat detection with reduced manual investigation. DEMIST-2's efficiency also reduces the need to invest in massive computational resources, enabling effective protection at scale without added complexity.  

As an embedding language model, DEMIST-2 classifies and creates meaning out of complex security data. This equips our Self-Learning AI with the insights to compare, correlate, and reason with consistency and precision. Classifications and embeddings power core capabilities across our products where accuracy is not optional, as a part of our multi-layered approach to AI architecture.

Perhaps most importantly, DEMIST-2 features a compact architecture that delivers analyst-level insights while meeting diverse deployment needs across cloud, on-prem, and edge environments. Trained on a mixture of general and domain-specific data and designed to support task specialization, DEMIST-2 provides privacy-preserving inference anywhere, while outperforming larger general-purpose models in key cybersecurity tasks.

This proprietary language model reflects Darktrace's ongoing commitment to continually innovate our AI solutions to meet the unique challenges of the security industry. We approach AI differently, integrating diverse insights to solve complex cybersecurity problems. DEMIST-2 shows that a refined, optimized, domain-specific language model can deliver outsized results in an efficient package. We are redefining possibilities for cybersecurity, but our methods transfer readily to other domains. We are eager to share our findings to accelerate innovation in the field.  

The evolution of DEMIST-2

Key concepts:  

  • Tokens: The smallest units processed by language models. Text is split into fragments based on frequency patterns, allowing models to handle unfamiliar words efficiently.
  • Low-Rank Adaptors (LoRA): Small, trainable components added to a model that allow it to specialize in new tasks without retraining the full system. These components learn task-specific behavior while the original foundation model remains unchanged. This approach enables multiple specializations to coexist and work simultaneously without drastically increasing processing and memory requirements.

Darktrace began using large language models in our products in 2022. DEMIST-2 reflects significant advancements in our continuous experimentation and adoption of innovations in the field to address the unique needs of the security industry.  

It is important to note that Darktrace uses a range of language models throughout its products, each chosen for the task at hand. Many others in the artificial intelligence (AI) industry are focused on broad application of large language models (LLMs) for open-ended text generation tasks. Our research shows that using LLMs for classification and embedding offers better, more reliable results for core security use cases. We’ve found that using LLMs for open-ended outputs can introduce uncertainty through inaccurate and unreliable responses, which is detrimental in environments where precision matters. Generative AI should not be applied to use cases, such as investigation and threat detection, where the results can deeply matter. Thoughtful applications of generative AI, such as drafting decoy phishing emails or crafting non-consequential summaries, are helpful but still require careful oversight.

Data is perhaps the most important factor for building language models. The data used to train DEMIST-2 balanced the need for general language understanding with security expertise. We used both publicly available and proprietary datasets.  Our proprietary dataset included privacy-preserving data such as URIs observed in customer alerts, anonymized at source to remove PII and gathered via the Call Home and aianalyst.darktrace.com services. For additional details, read our Technical Paper.  

DEMIST-2 is our way of addressing the unique challenges posed by security data. It recognizes that security data follows its own patterns that are distinct from natural language. For example, hostnames, HTTP headers, and certificate fields often appear in predictable ways, but not necessarily in a way that mirrors natural language. General-purpose LLMs tend to break down when used in these types of highly specialized domains. They struggle to interpret structure and context, fragmenting important patterns during tokenization in ways that can have a negative impact on performance.  

DEMIST-2 was built to understand the language and structure of security data using a custom tokenizer built around a security-specific vocabulary of over 16,000 words. This tokenizer allows the model to more accurately process inputs such as encoded payloads, file paths, subdomain chains, and command-line arguments, which general-purpose models often misinterpret.

When the tokenizer encounters unfamiliar or irregular input, it breaks the data into smaller pieces so it can still be processed. The ability to fall back to individual bytes is critical in cybersecurity contexts where novel or obfuscated content is common. This approach combines precision with flexibility, supporting specialized understanding with resilience in the face of unpredictable data.  
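To make the byte-fallback idea concrete, here is a minimal sketch. It is not the DEMIST-2 tokenizer: the greedy longest-match strategy, the `tokenize` function, the `TOY_VOCAB` set, and the `<0xNN>` byte-token notation are all assumptions chosen for illustration. It only shows how a fixed vocabulary can be combined with a fallback to individual UTF-8 bytes so that no input is ever unprocessable.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with a fallback to single bytes.

    Illustrative only: a real tokenizer learns its vocabulary from data.
    """
    max_len = max((len(piece) for piece in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in vocab:
                match = candidate
                break
        if match is None:
            # Unknown character: emit one token per UTF-8 byte.
            tokens.extend(f"<0x{byte:02X}>" for byte in text[i].encode("utf-8"))
            i += 1
        else:
            tokens.append(match)
            i += len(match)
    return tokens


# Hypothetical toy vocabulary; the real one has over 16,000 entries.
TOY_VOCAB = {"/etc/", "passwd", "cdn", ".", "example", "com", "cmd", ".exe"}

print(tokenize("/etc/passwd", TOY_VOCAB))
print(tokenize("cdn.x9.com", TOY_VOCAB))
```

Familiar fragments such as `/etc/` stay intact as single tokens, while the unfamiliar `x9` label degrades gracefully into byte tokens instead of being dropped or mapped to a generic "unknown" symbol.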

Along with our custom tokenizer, we made changes to support task specialization without increasing model size. To do this, DEMIST-2 uses LoRA, a technique that integrates lightweight components with the base model to allow it to perform specific tasks while keeping memory requirements low. By using LoRA, our proprietary representation of security knowledge can be shared and reused as a starting point for more highly specialized models. For example, understanding hostnames takes a different type of specialization than understanding sensitive filenames. DEMIST-2 dynamically adapts to each of these needs.

The result is that DEMIST-2 is like having a room of specialists working on difficult problems together, while sharing a basic core set of knowledge that does not need to be repeated or reintroduced to every situation. Sharing a consistent base model also improves its maintainability and allows efficient deployment across diverse environments without compromising speed or accuracy.  
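The "room of specialists" idea can be sketched in a few lines. This is a toy illustration of the general LoRA formulation, in which a frozen weight matrix W is augmented with a scaled low-rank update (alpha / r) * B * A. It is not DEMIST-2's implementation; the matrices, the adapter names, and the `forward` helper are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRAAdapter:
    """Low-rank update: delta(x) = (alpha / r) * B @ (A @ x)."""

    def __init__(self, a, b, alpha):
        self.a = a                   # shape r x d_in
        self.b = b                   # shape d_out x r
        self.scale = alpha / len(a)  # alpha / r

    def delta(self, x):
        return [self.scale * v for v in matvec(self.b, matvec(self.a, x))]


def forward(base_w, x, adapter=None):
    """Apply the shared base weights, plus an optional task adapter."""
    y = matvec(base_w, x)
    if adapter is not None:
        y = [yi + di for yi, di in zip(y, adapter.delta(x))]
    return y


# One frozen base matrix shared by every task (toy 2x2 identity).
BASE_W = [[1.0, 0.0], [0.0, 1.0]]

# Two hypothetical rank-1 adapters, one per specialization.
hostname_adapter = LoRAAdapter(a=[[1.0, 1.0]], b=[[0.5], [0.0]], alpha=1.0)
filename_adapter = LoRAAdapter(a=[[1.0, -1.0]], b=[[0.0], [2.0]], alpha=2.0)

x = [2.0, 3.0]
print(forward(BASE_W, x))                    # base model only
print(forward(BASE_W, x, hostname_adapter))  # hostname specialist
print(forward(BASE_W, x, filename_adapter))  # filename specialist
```

Each adapter stores only r * (d_in + d_out) numbers rather than a full d_out * d_in matrix, which is why several specializations can sit alongside one base model without multiplying memory use.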

Tokenization and task specialization represent only a portion of the updates we have made to our embedding model. In conjunction with the changes described above, DEMIST-2 integrates several updated modeling techniques that reduce latency and improve detections. For these details, our training data and methods, and a full write-up of our results, please read our scientific whitepaper.

DEMIST-2 in action

In this section, we highlight DEMIST-2's embeddings and performance. First, we show a visualization of how DEMIST-2 classifies and interprets hostnames, and second, we present its performance in a hostname classification task in comparison to other language models.  

Embeddings can often feel abstract, so let’s make them real. Figure 1 below is a 2D visualization of how DEMIST-2 classifies and understands hostnames. In reality, these hostnames exist across many more dimensions, capturing details like their relationships with other hostnames, usage patterns, and contextual data. The colors and positions in the diagram represent a simplified view of how DEMIST-2 organizes and interprets these hostnames, providing insights into their meaning and connections. Just like an experienced human analyst can quickly identify and group hostnames based on patterns and context, DEMIST-2 does the same at scale.  

Figure 1: DEMIST-2 visualization of hostname relationships from a large web dataset.
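For readers who want a feel for what "close together" means in an embedding space, the sketch below compares toy vectors with cosine similarity, a common way to measure how alike two embeddings are. The three-dimensional vectors, the hostnames, and the `nearest` helper are invented for illustration; real embeddings have many more dimensions, and the source does not state which similarity measure DEMIST-2 uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings; the values are made up for this example.
TOY_EMBEDDINGS = {
    "shop.example": [0.9, 0.1, 0.0],
    "store.example": [0.8, 0.2, 0.1],
    "x7f3k9.example": [0.0, 0.1, 0.95],
}


def nearest(name, embeddings):
    """Return the other hostname whose embedding is most similar to `name`."""
    target = embeddings[name]
    others = (other for other in embeddings if other != name)
    return max(others, key=lambda other: cosine_similarity(target, embeddings[other]))


print(nearest("shop.example", TOY_EMBEDDINGS))
```

In this toy space the two retail-style hostnames point in nearly the same direction, so they would land in the same cluster, while the random-looking hostname sits far away.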

Next, let’s zoom in on two distinct clusters that DEMIST-2 recognizes. One cluster represents small businesses (Figure 2) and the other, Russian and Polish sites with similar numerical formats (Figure 3). These clusters demonstrate how DEMIST-2 can identify specific groupings based on real-world attributes such as regional patterns in website structures, common formats used by small businesses, and other properties such as its understanding of how websites relate to each other on the internet.

Figure 2: Cluster of small businesses
Figure 3: Cluster of Russian and Polish sites with a similar numerical format

The previous figures provided a view of how DEMIST-2 works. Figure 4 highlights DEMIST-2’s performance in a security-related classification task. The chart shows that DEMIST-2, with just 95 million parameters, achieves nearly 94% accuracy, making it the highest-performing model in the chart despite being the smallest. In comparison, the larger model with 278 million parameters achieves only about 89% accuracy, showing that size doesn’t always mean better performance. For many security-related tasks, DEMIST-2 outperforms much larger models.

Figure 4: Hostname classification task performance comparison against comparable open source foundation models

With these examples of DEMIST-2 in action, we’ve shown how it excels in embedding and classifying security data while delivering high performance on specialized security tasks.  

The DEMIST-2 advantage

DEMIST-2 was built for precision and reliability. Our primary goal was to create a high-performance model capable of tackling complex cybersecurity tasks. Optimizing for efficiency and scalability came second, but it is a natural outcome of our commitment to building a strong, effective solution that is available to security teams working across diverse environments. It is an enormous benefit that DEMIST-2 is orders of magnitude smaller than many general-purpose models. More importantly, it significantly outperforms those models in capability and accuracy on security tasks.

Finding a product that fits into an environment’s unique constraints used to mean that some teams had to settle for less powerful or less performant products. With DEMIST-2, data can remain local to the environment and entirely separate from the data of other customers, and the model can even operate in environments without network connectivity. The size of our model allows for flexible deployment options while at the same time providing measurable performance advantages for security-related tasks.

As security threats continue to evolve, we believe that purpose-built AI systems like DEMIST-2 will be essential tools for defenders, combining the power of modern language modeling with the specificity and reliability that builds trust and partnership between security practitioners and AI systems.

Conclusion

DEMIST-2 has additional architectural and deployment updates that improve performance and stability. These innovations contribute to our ability to minimize model size and memory constraints and reflect our dedication to meeting the data handling and privacy needs of security environments. In addition, these choices reflect our dedication to responsible AI practices.

DEMIST-2 is available in Darktrace 6.3, along with a new DIGEST model that uses GNNs and RNNs to score and prioritize threats with expert-level precision.


Want more details?

Read the full research paper to explore how DEMIST-2 was built, trained, and optimized to meet the unique challenges of cybersecurity.


July 17, 2025

Introducing the AI Maturity Model for Cybersecurity


AI adoption in cybersecurity: Beyond the hype

Security operations today face a paradox. On one hand, artificial intelligence (AI) promises sweeping transformation from automating routine tasks to augmenting threat detection and response. On the other hand, security leaders are under immense pressure to separate meaningful innovation from vendor hype.

To help CISOs and security teams navigate this landscape, we’ve developed the most in-depth and actionable AI Maturity Model in the industry. Built in collaboration with AI and cybersecurity experts, this framework provides a structured path to understanding, measuring, and advancing AI adoption across the security lifecycle.

Overview of AI maturity levels in cybersecurity

Why a maturity model? And why now?

In our conversations and research with security leaders, a recurring theme has emerged:

There’s no shortage of AI solutions, but there is a shortage of clarity and understanding of AI use cases.

In fact, Gartner estimates that “by 2027, over 40% of Agentic AI projects will be canceled due to escalating costs, unclear business value, or inadequate risk controls.” Teams are experimenting, but many aren’t seeing meaningful outcomes. The need for a standardized way to evaluate progress and make informed investments has never been greater.

That’s why we created the AI Security Maturity Model, a strategic framework that:

  • Defines five clear levels of AI maturity, from manual processes (L0) to full AI Delegation (L4)
  • Distinguishes the outcomes delivered by Agentic GenAI from those delivered by Specialized AI Agent Systems
  • Applies across core functions such as risk management, threat detection, alert triage, and incident response
  • Links AI maturity to real-world outcomes like reduced risk, improved efficiency, and scalable operations


How is maturity assessed in this model?

The AI Maturity Model for Cybersecurity is grounded in operational insights from nearly 10,000 global deployments of Darktrace's Self-Learning AI and Cyber AI Analyst. Rather than relying on abstract theory or vendor benchmarks, the model reflects what security teams are actually doing, where AI is being adopted, how it's being used, and what outcomes it’s delivering.

This real-world foundation allows the model to offer a practical, experience-based view of AI maturity. It helps teams assess their current state and identify realistic next steps based on how organizations like theirs are evolving.

Why Darktrace?

AI has been central to Darktrace’s mission since its inception in 2013, not just as a feature but as the foundation. With over a decade of experience building and deploying AI in real-world security environments, we’ve learned where it works, where it doesn’t, and how to get the most value from it. This model reflects that insight, helping security leaders find the right path forward for their people, processes, and tools.

Security teams today are asking big, important questions:

  • What should we actually use AI for?
  • How are other teams using it — and what’s working?
  • What are vendors offering, and what’s just hype?
  • Will AI ever replace people in the SOC?

These questions are valid, and they’re not always easy to answer. That’s why we created this model: to help security leaders move past buzzwords and build a clear, realistic plan for applying AI across the SOC.

The structure: From experimentation to autonomy

The model outlines five levels of maturity:

L0 – Manual Operations: Processes are mostly manual with limited automation of some tasks.

L1 – Automation Rules: Manually maintained or externally-sourced automation rules and logic are used wherever possible.

L2 – AI Assistance: AI assists research but is not trusted to make good decisions. This includes GenAI agents requiring manual oversight for errors.

L3 – AI Collaboration: Specialized cybersecurity AI agent systems  with business technology context are trusted with specific tasks and decisions. GenAI has limited uses where errors are acceptable.

L4 – AI Delegation: Specialized AI agent systems with far wider business operations and impact context perform most cybersecurity tasks and decisions independently, with only high-level oversight needed.

Each level reflects a shift, not only in technology, but in people and processes. As AI matures, analysts evolve from executors to strategic overseers.

Strategic benefits for security leaders

The maturity model isn’t just about technology adoption; it’s about aligning AI investments with measurable operational outcomes. Here’s what it enables:

SOC fatigue is real, and AI can help

Most teams still struggle with alert volume, investigation delays, and reactive processes. AI adoption is inconsistent and often siloed. When integrated well, AI can make security teams meaningfully more effective.

GenAI is error prone, requiring strong human oversight

While there is a lot of hype around Agentic GenAI systems, teams will need to account for their inaccuracy and hallucination.

AI’s real value lies in progression

The biggest gains don’t come from isolated use cases, but from integrating AI across the lifecycle, from preparation through detection to containment and recovery.

Trust and oversight are key initially but evolve in later levels

Early-stage adoption keeps humans fully in control. By L3 and L4, AI systems act independently within defined bounds, freeing humans for strategic oversight.

People’s roles shift meaningfully

As AI matures, analyst roles consolidate and elevate from labor-intensive task execution to high-value decision-making, focusing on critical, high-business-impact activities, process improvement, and AI governance.

Outcome, not hype, defines maturity

AI maturity isn’t about tech presence; it’s about measurable impact on risk reduction, response time, and operational resilience.


Outcomes across the AI Security Maturity Model

The Security Organization experiences an evolution of cybersecurity outcomes as teams progress from manual operations to AI delegation. Each level represents a step-change in efficiency, accuracy, and strategic value.

L0 – Manual Operations

At this stage, analysts handle triage, investigation, patching, and reporting manually using basic, non-automated tools. The result is reactive, labor-intensive operations where most alerts go uninvestigated and risk management remains inconsistent.

L1 – Automation Rules

At this stage, analysts manage rule-based automation tools like SOAR and XDR, which offer some efficiency gains but still require constant tuning. Operations remain constrained by human bandwidth and predefined workflows.

L2 – AI Assistance

At this stage, AI assists with research, summarization, and triage, reducing analyst workload but requiring close oversight due to potential errors. Detection improves, but trust in autonomous decision-making remains limited.

L3 – AI Collaboration

At this stage, AI performs full investigations and recommends actions, while analysts focus on high-risk decisions and refining detection strategies. Purpose-built agentic AI systems with business context are trusted with specific tasks, improving precision and prioritization.

L4 – AI Delegation

At this stage, Specialized AI Agent Systems perform most security tasks independently at machine speed, while human teams provide high-level strategic oversight. This means the human security team’s most time- and effort-intensive work is focused on proactive activities, while AI handles routine cybersecurity tasks.

Specialized AI Agent Systems operate with deep business context including impact context to drive fast, effective decisions.

Join the webinar

Get a look at the minds shaping this model by joining our upcoming webinar using this link. We’ll walk through real use cases, share lessons learned from the field, and show how security teams are navigating the path to operational AI safely, strategically, and successfully.

July 17, 2025

Forensics or Fauxrensics: Five Core Capabilities for Cloud Forensics and Incident Response


The speed and scale at which new cloud resources can be spun up have resulted in uncontrolled deployments, misconfigurations, and security risks. Security teams have been racing to secure their businesses’ rapid migration from traditional on-premises environments to the cloud.

While many organizations have successfully extended their prevention and detection capabilities to the cloud, they are now experiencing another major gap: forensics and incident response.

Once something bad has been identified, understanding its true scope and impact can be nearly impossible. The proliferation of cloud resources across a multitude of cloud providers and the addition of container and serverless capabilities add to the complexity. It’s clear that organizations need a better way to manage cloud incident response.

Security teams are looking to move past their homegrown solutions and open-source tools to incorporate real cloud forensics capabilities. However, with the increased buzz around cloud forensics, it can be challenging to decipher what is real cloud forensics, and what is “fauxrensics.”

This blog covers the five core capabilities that security teams should consider when evaluating a cloud forensics and incident response solution.


1. Depth of data

There have been many conversations among the security community about whether cloud forensics is just log analysis. The reality, however, is that cloud forensics necessitates access to a robust dataset that extends far beyond traditional log data sources.

While logs provide valuable insights, a forensics investigation demands a deeper understanding derived from multiple data sources, including disk, network, and memory, within the cloud infrastructure. Full disk analysis complements log analysis, offering crucial context for identifying the root cause and scope of an incident.

For instance, when investigating an incident involving a Kubernetes cluster running on an EC2 instance, access to bash history can provide insights into the commands executed by attackers on the affected instance, which would not be available through cloud logs alone.

Having all of the evidence in one place can also significantly streamline investigations. Unifying your evidence, be it disk images, memory captures, or cloud logs, into a single timeline allows security teams to reconstruct an attack’s origin, path, and impact far more easily. Multi-cloud environments also require platforms that can aggregate data from many providers and services into one place. Doing this enables more holistic investigations and reduces security blind spots.
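As a rough illustration of the single-timeline idea, the sketch below merges events from several evidence sources into one chronological view. The `Event` type, the `build_timeline` function, and every sample event are hypothetical; a real platform would parse these from disk images, memory captures, and provider logs rather than hard-code them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str   # e.g. "cloud_log", "disk", "memory"
    detail: str


def build_timeline(*sources):
    """Merge per-source event lists into one chronologically sorted list."""
    merged = [event for source in sources for event in source]
    return sorted(merged, key=lambda event: event.timestamp)


def at(hour, minute, second):
    """Helper for building UTC timestamps on a fixed, made-up date."""
    return datetime(2025, 1, 1, hour, minute, second, tzinfo=timezone.utc)


# Invented sample evidence from three different sources.
cloud_logs = [
    Event(at(12, 0, 5), "cloud_log", "role assumed from unfamiliar IP"),
    Event(at(12, 3, 0), "cloud_log", "security group rule modified"),
]
disk_artifacts = [
    Event(at(12, 1, 30), "disk", "bash history shows download of a script"),
]
memory_artifacts = [
    Event(at(12, 2, 10), "memory", "unexpected process found in capture"),
]

for event in build_timeline(cloud_logs, disk_artifacts, memory_artifacts):
    print(event.timestamp.isoformat(), event.source, event.detail)
```

Normalizing every source to timezone-aware UTC timestamps before merging is the design choice that matters here, since evidence from different providers rarely shares a clock format.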

It is also important to collect data from ephemeral resources in modern cloud and containerized environments. Critical evidence can be lost in seconds as resources constantly spin up and down, so the ability to capture this data before it’s gone can be a huge advantage to security teams, rather than having to figure out what happened after the affected service is long gone.


2. Chain of custody

Chain of custody is critical in legal proceedings and is an essential component of forensics and incident response. However, chain of custody in the cloud can be extremely complex given the number of people who have access and the rise of multi-cloud environments.

Maintaining a reliable chain of custody in the cloud means accounting for multiple access points, service providers, and third parties, so automated evidence tracking is a must. It means that all actions are logged, from collection to storage to access. Automation also minimizes the chance of human error, reducing the risk of mistakes or gaps in evidence handling, especially in high-pressure, fast-moving investigations.

The ability to preserve unaltered copies of forensic evidence in a secure manner is required to ensure integrity throughout an investigation. This is not just a technical concern; it’s a legal one. Ensuring that your evidence handling is documented and time-stamped allows it to stand up to court or regulatory review.

Real cloud forensics platforms should autonomously handle chain of custody in the background, recording and safeguarding evidence without human intervention.
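One common way to make a custody record tamper-evident is to hash each log entry together with the hash of the entry before it. The sketch below shows that pattern using Python's standard `hashlib` and `json` modules. The `CustodyLog` class, its field names, and the sample actors are assumptions for illustration, not a description of any particular product.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class CustodyLog:
    """Append-only log where each entry commits to the previous one."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self):
        self.entries = []

    def record(self, actor, action, evidence: bytes, timestamp: str):
        # Timestamp is passed in for clarity; in practice it would come
        # from a trusted clock at the moment of the action.
        body = {
            "actor": actor,
            "action": action,
            "evidence_sha256": sha256_hex(evidence),
            "timestamp": timestamp,
            "prev_hash": self.entries[-1]["entry_hash"] if self.entries else self.GENESIS,
        }
        entry = dict(body, entry_hash=self._hash_body(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that the chain links are intact."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev_hash:
                return False
            if self._hash_body(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True

    @staticmethod
    def _hash_body(body) -> str:
        return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


demo_log = CustodyLog()
demo_log.record("collector-service", "acquire", b"disk image bytes", "2025-01-01T12:00:00Z")
demo_log.record("analyst-1", "access", b"disk image bytes", "2025-01-01T12:05:00Z")
print(demo_log.verify())
```

Because every entry includes the previous entry's hash, altering or removing an earlier record changes the recomputed hashes and causes `verify()` to fail, which is the property a reviewer or court would want demonstrated.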

3. Automated collection and isolation

When malicious activity is detected, the speed at which security teams can determine root cause and scope is essential to reducing Mean Time to Respond (MTTR).

Automated forensic data collection and system isolation ensure that evidence is collected and compromised resources are isolated at the first sign of malicious activity, often before an attacker has had the chance to move laterally or cover their tracks. This enables security teams to prevent potential damage and spread while a deeper-dive forensics investigation takes place. This method also ensures critical incident evidence residing in ephemeral environments is preserved in the event it is needed for an investigation. This evidence may only exist for minutes, leaving no time for a human analyst to capture it.

Cloud forensics and incident response platforms should offer the ability to natively integrate with incident detection and alerting systems and/or built-in product automation rules to trigger evidence capture and resource isolation.

4. Ease of use

Security teams shouldn’t require deep cloud or incident response knowledge to perform forensic investigations of cloud resources. They already have enough on their plates.

While traditional forensics tools and approaches have made investigation and response extremely tedious and complex, modern forensics platforms prioritize usability at their core, and leverage automation to drastically simplify the end-to-end incident response process, even when an incident spans multiple Cloud Service Providers (CSPs).

Usability is a core requirement for any modern forensics platform. Security teams should not need in-depth knowledge of every system and resource in a given estate. Workflows, automation, and guidance should make it possible for an analyst to investigate whatever resource they need to.

Unifying the workflow across multiple clouds can also save security teams a huge amount of time and resources. Investigations can often span multiple CSPs. A good security platform should provide a single place to search, correlate, and analyze evidence across all environments.

Features such as cross-cloud support, data enrichment, a single timeline view, saved search, and faceted search can help advanced analysts achieve greater efficiency and enable novice analysts to participate in more complex investigations.

5. Incident preparedness

Incident response shouldn't just be reactive. Modern security teams need to regularly test their ability to acquire new evidence, triage assets, and respond to threats across both new and existing resources, ensuring readiness even in the rapidly changing environments of the cloud. Continuously assessing your incident response and forensics workflows enables you to rapidly improve your processes and to identify and mitigate any gaps that could prevent the organization from responding effectively to potential threats.

Real forensics platforms deliver features that enable security teams to prepare extensively and understand their shortcomings before they are in the heat of an incident. For example, cloud forensics platforms can provide the ability to:

  • Run readiness checks and see readiness trends over time
  • Identify and mitigate issues that could prevent rapid investigation and response
  • Ensure the correct logging, management agents, and other cloud-native tools are appropriately configured and operational
  • Ensure that data gathered during an investigation can be decrypted
  • Verify that permissions are aligned with best practices and are capable of supporting incident response efforts

Cloud forensics with Darktrace

Darktrace delivers a proactive approach to cyber resilience in a single cybersecurity platform, including cloud coverage. Darktrace / CLOUD is a real time Cloud Detection and Response (CDR) solution built with advanced AI to make cloud security accessible to all security teams and SOCs. By using multiple machine learning techniques, Darktrace brings unprecedented visibility, threat detection, investigation, and incident response to hybrid and multi-cloud environments.

Darktrace’s cloud offerings have been bolstered with the acquisition of Cado Security Ltd., which enables security teams to gain immediate access to forensic-level data in multi-cloud, container, serverless, SaaS, and on-premises environments.

