April 17, 2025

Introducing Version 2 of Darktrace’s Embedding Model for Investigation of Security Threats (DEMIST-2)

Learn how Darktrace’s DEMIST-2 embedding model delivers high-accuracy threat classification and detection across any environment, outperforming larger models with efficiency and precision.

Written by

Margaret Cunningham, PhD

VP, Security & AI Strategy, Field CISO


DEMIST-2 is Darktrace’s latest embedding model, built to interpret and classify security data with precision. It performs highly specialized tasks and can be deployed in any environment. Unlike generative language models, DEMIST-2 focuses on providing reliable, high-accuracy detections for critical security use cases.

DEMIST-2 Core Capabilities:

Enhances Cyber AI Analyst’s ability to triage and reason about security incidents by providing expert representation and classification of security data as part of our broader multi-layered AI system
Classifies and interprets security data, in contrast to language models that generate unpredictable open-ended text responses
Incorporates new innovations in language model development and architecture, optimized specifically for cybersecurity applications
Deploys across cloud, on-prem, and edge environments, delivering low-latency, high-accuracy inference wherever it runs

Cybersecurity is constantly evolving, but the need to build precise and reliable detections remains constant in the face of new and emerging threats. Darktrace’s Embedding Model for Investigation of Security Threats (DEMIST-2) addresses these critical needs and is designed to create stable, high-fidelity representations of security data while also serving as a powerful classifier. For security teams, this means faster, more accurate threat detection with reduced manual investigation. DEMIST-2's efficiency also reduces the need to invest in massive computational resources, enabling effective protection at scale without added complexity.

As an embedding language model, DEMIST-2 classifies and creates meaning out of complex security data. This equips our Self-Learning AI with the insights to compare, correlate, and reason with consistency and precision. Classifications and embeddings power core capabilities across our products where accuracy is not optional, as a part of our multi-layered approach to AI architecture.

Perhaps most importantly, DEMIST-2 features a compact architecture that delivers analyst-level insights while meeting diverse deployment needs across cloud, on-prem, and edge environments. Trained on a mixture of general and domain-specific data and designed to support task specialization, DEMIST-2 provides privacy-preserving inference anywhere, while outperforming larger general-purpose models in key cybersecurity tasks.

This proprietary language model reflects Darktrace's ongoing commitment to continually innovate our AI solutions to meet the unique challenges of the security industry. We approach AI differently, integrating diverse insights to solve complex cybersecurity problems. DEMIST-2 shows that a refined, optimized, domain-specific language model can deliver outsized results in an efficient package. We are redefining possibilities for cybersecurity, but our methods transfer readily to other domains. We are eager to share our findings to accelerate innovation in the field.

The evolution of DEMIST-2

Key concepts:

Tokens: The smallest units processed by language models. Text is split into fragments based on frequency patterns, allowing models to handle unfamiliar words efficiently.

Low-Rank Adaptors (LoRA): Small, trainable components added to a model that allow it to specialize in new tasks without retraining the full system. These components learn task-specific behavior while the original foundation model remains unchanged. This approach enables multiple specializations to coexist and work simultaneously without drastically increasing processing and memory requirements.

Darktrace began using large language models in our products in 2022. DEMIST-2 reflects significant advancements in our continuous experimentation and adoption of innovations in the field to address the unique needs of the security industry.

It is important to note that Darktrace uses a range of language models throughout its products, each chosen for the task at hand. Many others in the artificial intelligence (AI) industry are focused on broad application of large language models (LLMs) for open-ended text generation tasks. Our research shows that using LLMs for classification and embedding offers better, more reliable results for core security use cases. We’ve found that using LLMs for open-ended outputs can introduce uncertainty through inaccurate and unreliable responses, which is detrimental in environments where precision matters. Generative AI should not be applied to use cases, such as investigation and threat detection, where the results can deeply matter. Thoughtful applications of generative AI capabilities, such as drafting decoy phishing emails or crafting non-consequential summaries, are helpful but still require careful oversight.

Data is perhaps the most important factor for building language models. The data used to train DEMIST-2 balanced the need for general language understanding with security expertise. We used both publicly available and proprietary datasets. Our proprietary dataset included privacy-preserving data such as URIs observed in customer alerts, anonymized at source to remove PII and gathered via the Call Home and aianalyst.darktrace.com services. For additional details, read our Technical Paper.

DEMIST-2 is our way of addressing the unique challenges posed by security data. It recognizes that security data follows its own patterns that are distinct from natural language. For example, hostnames, HTTP headers, and certificate fields often appear in predictable ways, but not necessarily in a way that mirrors natural language. General-purpose LLMs tend to break down when used in these types of highly specialized domains. They struggle to interpret structure and context, fragmenting important patterns during tokenization in ways that can have a negative impact on performance.

DEMIST-2 was built to understand the language and structure of security data using a custom tokenizer built around a security-specific vocabulary of over 16,000 words. This tokenizer allows the model to more accurately process inputs such as encoded payloads, file paths, subdomain chains, and command-line arguments, which general-purpose models often misinterpret.

When the tokenizer encounters unfamiliar or irregular input, it breaks the data into smaller pieces so it can still be processed. The ability to fall back to individual bytes is critical in cybersecurity contexts where novel or obfuscated content is common. This approach combines precision with flexibility, supporting specialized understanding with resilience in the face of unpredictable data.
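The fallback behavior described above can be sketched in a few lines. This is a toy illustration only, not DEMIST-2's tokenizer: the vocabulary is hypothetical, and greedy longest-match stands in for whatever segmentation algorithm the real tokenizer uses. The point it shows is that known fragments stay whole, while anything unfamiliar is still representable as individual bytes.

```python
# Hypothetical toy vocabulary, invented for this example.
VOCAB = ["powershell", ".exe", "-enc", "cmd", "/c", " ", "http", "://", ".com"]


def tokenize(text: str, vocab=VOCAB) -> list[str]:
    """Greedy longest-match tokenization with a per-byte fallback."""
    by_length = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        match = next((v for v in by_length if text.startswith(v, i)), None)
        if match is not None:
            tokens.append(match)
            i += len(match)
        else:
            # Nothing in the vocabulary fits: emit the character's UTF-8 bytes
            # so the input is never dropped or mapped to an "unknown" token.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


print(tokenize("cmd /c powershell.exe -enc QQ"))
```

In this sketch the command keywords survive as single tokens, and the unfamiliar trailing payload characters each become a byte token such as `<0x51>`.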

Along with our custom tokenizer, we made changes to support task specialization without increasing model size. To do this, DEMIST-2 uses LoRA, a technique that integrates lightweight components with the base model to allow it to perform specific tasks while keeping memory requirements low. By using LoRA, our proprietary representation of security knowledge can be shared and reused as a starting point for more highly specialized models. For example, understanding hostnames takes a different type of specialization than understanding sensitive filenames. DEMIST-2 adapts dynamically to each of these tasks.

The result is that DEMIST-2 is like having a room of specialists working on difficult problems together, while sharing a basic core set of knowledge that does not need to be repeated or reintroduced to every situation. Sharing a consistent base model also improves its maintainability and allows efficient deployment across diverse environments without compromising speed or accuracy.
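To make the "shared core, many specialists" idea concrete, here is a minimal sketch of the general LoRA pattern on a single linear layer, written in plain Python. It is not DEMIST-2's implementation: the class name, task names, and all weights are invented for illustration. The base weight matrix is shared and never modified, and each task adds only a small low-rank pair of matrices whose product is added to the base output.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SharedBaseLayer:
    """One frozen weight matrix plus optional per-task low-rank adapters."""

    def __init__(self, weight):
        self.weight = weight   # d_out x d_in, shared by every task
        self.adapters = {}     # task name -> (down, up, scale)

    def add_adapter(self, task, down, up, scale=1.0):
        # down is r x d_in and up is d_out x r, with r much smaller than d_in.
        self.adapters[task] = (down, up, scale)

    def forward(self, x, task=None):
        out = matvec(self.weight, x)
        if task is None:
            return out
        down, up, scale = self.adapters[task]
        delta = matvec(up, matvec(down, x))  # low-rank update: up @ (down @ x)
        return [o + scale * d for o, d in zip(out, delta)]


# Toy 4x4 identity base layer and two hypothetical rank-1 adapters.
layer = SharedBaseLayer([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
layer.add_adapter("hostnames", down=[[1, 0, 0, 0]], up=[[0], [1], [0], [0]])
layer.add_adapter("filenames", down=[[0, 0, 0, 1]], up=[[1], [0], [0], [0]], scale=0.5)

x = [1, 2, 3, 4]
print(layer.forward(x))               # base behavior
print(layer.forward(x, "hostnames"))  # same base, hostname specialization
print(layer.forward(x, "filenames"))  # same base, filename specialization
```

Here the base layer holds 16 weights while each adapter adds only 8, which is the mechanism that lets several specializations coexist without multiplying the size of the model.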

Tokenization and task specialization represent only a portion of the updates we have made to our embedding model. In conjunction with the changes described above, DEMIST-2 integrates several updated modeling techniques that reduce latency and improve detections. To learn more about these details, our training data and methods, and a full write-up of our results, please read our scientific whitepaper.

DEMIST-2 in action

In this section, we highlight DEMIST-2's embeddings and performance. First, we show a visualization of how DEMIST-2 classifies and interprets hostnames, and second, we present its performance in a hostname classification task in comparison to other language models.

Embeddings can often feel abstract, so let’s make them real. Figure 1 below is a 2D visualization of how DEMIST-2 classifies and understands hostnames. In reality, these hostnames exist across many more dimensions, capturing details like their relationships with other hostnames, usage patterns, and contextual data. The colors and positions in the diagram represent a simplified view of how DEMIST-2 organizes and interprets these hostnames, providing insights into their meaning and connections. Just like an experienced human analyst can quickly identify and group hostnames based on patterns and context, DEMIST-2 does the same at scale.

Figure 1: DEMIST-2 visualization of hostname relationships from a large web dataset.
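For readers who want a feel for what "close together" means in an embedding space, the following sketch compares a handful of made-up vectors using cosine similarity. The hostnames and the three-dimensional vectors are invented for illustration; they are not DEMIST-2 outputs, and real embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings keyed by hypothetical hostnames.
EMBEDDINGS = {
    "joes-bakery.example": [0.9, 0.1, 0.0],
    "annas-florist.example": [0.8, 0.2, 0.1],
    "shop-12345.example": [0.1, 0.9, 0.2],
    "store-67890.example": [0.0, 0.8, 0.3],
}


def nearest(hostname, embeddings=EMBEDDINGS):
    """Return the other hostname whose embedding is most similar."""
    query = embeddings[hostname]
    others = (h for h in embeddings if h != hostname)
    return max(others, key=lambda h: cosine_similarity(query, embeddings[h]))


print(nearest("joes-bakery.example"))
print(nearest("shop-12345.example"))
```

Hostnames whose vectors point in similar directions end up as each other's nearest neighbors, which is the property a 2D projection like Figure 1 renders as visible clusters.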

Next, let’s zoom in on two distinct clusters that DEMIST-2 recognizes. One cluster represents small businesses (Figure 2) and the other represents Russian and Polish sites with similar numerical formats (Figure 3). These clusters demonstrate how DEMIST-2 can identify specific groupings based on real-world attributes such as regional patterns in website structures and common formats used by small businesses, as well as on its understanding of how websites relate to each other on the internet.

Figure 3: Cluster of Russian and Polish sites with a similar numerical format

The previous figures provided a view of how DEMIST-2 works. Figure 4 highlights DEMIST-2’s performance in a security-related classification task. The chart shows that DEMIST-2, with just 95 million parameters, achieves nearly 94% accuracy, making it the highest-performing model in the chart despite being the smallest. In comparison, the larger model with 278 million parameters achieves only about 89% accuracy, showing that a larger model does not guarantee better performance. For many security-related tasks, DEMIST-2 outperforms much larger models.

Figure 4: Hostname classification task performance comparison against comparable open source foundation models
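A little back-of-the-envelope arithmetic puts the numbers quoted above in perspective. The sketch below assumes 2 bytes per parameter (16-bit weights), which is an assumption made for illustration and not a published detail of either model. It also treats "nearly 94%" and "about 89%" as exactly 94 and 89, so the results are approximate.

```python
def weight_footprint_mb(parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return parameters * bytes_per_parameter / 1_000_000


def relative_error_reduction(accuracy_pct: float, baseline_accuracy_pct: float) -> float:
    """Fraction of the baseline's errors that the more accurate model avoids."""
    baseline_errors = 100 - baseline_accuracy_pct
    errors = 100 - accuracy_pct
    return (baseline_errors - errors) / baseline_errors


DEMIST2_PARAMS = 95_000_000
LARGER_PARAMS = 278_000_000

print(weight_footprint_mb(DEMIST2_PARAMS))        # 190.0
print(weight_footprint_mb(LARGER_PARAMS))         # 556.0
print(round(LARGER_PARAMS / DEMIST2_PARAMS, 2))   # 2.93
print(round(relative_error_reduction(94, 89), 2)) # 0.45
```

Under those assumptions, the larger model carries roughly three times the weights, while the smaller model makes a little under half as many classification errors on this task.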

With these examples of DEMIST-2 in action, we’ve shown how it excels in embedding and classifying security data while delivering high performance on specialized security tasks.

The DEMIST-2 advantage

DEMIST-2 was built for precision and reliability. Our primary goal was to create a high-performance model capable of tackling complex cybersecurity tasks. Optimizing for efficiency and scalability came second, but it is a natural outcome of our commitment to building a strong, effective solution that is available to security teams working across diverse environments. It is an enormous benefit that DEMIST-2 is orders of magnitude smaller than many general-purpose models. More importantly, it significantly outperforms those models in capability and accuracy on security tasks.

Finding a product that fits into an environment’s unique constraints used to mean that some teams had to settle for less powerful or less performant products. With DEMIST-2, data can remain local to the environment and entirely separate from the data of other customers, and the model can even operate in environments without network connectivity. The size of our model allows for flexible deployment options while at the same time providing measurable performance advantages for security-related tasks.

As security threats continue to evolve, we believe that purpose-built AI systems like DEMIST-2 will be essential tools for defenders, combining the power of modern language modeling with the specificity and reliability that builds trust and partnership between security practitioners and AI systems.

Conclusion

DEMIST-2 has additional architectural and deployment updates that improve performance and stability. These innovations help us minimize model size and memory requirements, and they reflect our dedication both to meeting the data handling and privacy needs of security environments and to responsible AI practices.

DEMIST-2 is available in Darktrace 6.3, along with a new DIGEST model that uses GNNs and RNNs to score and prioritize threats with expert-level precision.


Want more details?

Read the full research paper to explore how DEMIST-2 was built, trained, and optimized to meet the unique challenges of cybersecurity.



Blog

Email

July 24, 2026

Darktrace / EMAIL Expands Behavioral Defense Across Email and Collaboration Workflows

Email and collaboration tools do more than carry messages. They are where organizations approve payments, share sensitive data, reset credentials, and make thousands of everyday decisions. Increasingly, they are interfaces through which humans direct AI agents in their daily activity. Email, Slack and Teams are high volume, rich with sensitive data, and an easy place to hide malicious activity.

The opportunity isn’t lost on bad actors. Darktrace / EMAIL detected more than 32 million high-confidence phishing emails globally in 2025, and 70% of those messages passed DMARC authentication. Phishing is increasingly difficult to detect and familiar trust signals alone are not enough. People and security teams need to understand how a message fits the normal behavior of the sender, recipient, and organization. They also need to correlate activity across platforms to spot threats that span multiple channels.

To effectively secure against today’s evolved threats, security teams need to act at two levels: they need to help each employee make a safer decision ‘in the moment’, and they need to understand the wider patterns that may expose the business to risk.

Darktrace is introducing four new capabilities in Darktrace / EMAIL to address both challenges. The new features explain suspicious content more clearly to end users, strengthen the capabilities of Darktrace / Adaptive Human Defense with richer guidance, let organizations define their own patterns for detecting sensitive data in messages, and give security teams a process-level view of risk across email and collaboration workflows.

Darktrace / EMAIL Inbox Analysis highlights risky content within your emails

A warning is more useful when it explains what the user should look at. To help do that, we’ve expanded Darktrace / EMAIL’s Inbox Analysis Add-In to highlight potentially dangerous content within the body of emails that Darktrace / EMAIL flags as potentially suspicious or high risk.

The add-in can highlight language designed to create urgency, financial references, requests for payment, suspicious links, and content that is unusual for the sender. Each highlighted element includes a pop-up that explains why it may be suspicious. Instead of asking an employee to accept a verdict without context, the analysis helps them examine the message and make a more informed decision.

Enhanced Just-In-Time Training Banners in Darktrace / Adaptive Human Defense

Enhanced Just-In-Time Training Banners build on the same principle. The banners now include a contextual header, actionable advice, and specific detection context. This gives employees more useful guidance at the point of risk without adding unnecessary information or cognitive load.

Together, the capabilities help turn a warning into a short learning moment. Employees can see what looks unusual, understand what action to take, and build their judgment.

Custom Sensitive Data Detection in Darktrace / EMAIL - Data Loss Prevention

Sensitive data is different for every business. Standard categories such as payment card details or government identifiers matter, but organizations also have their own customer codes, project names, research formats, account structures, and internal identifiers.

Custom Sensitive Data Detection in Darktrace / EMAIL - Data Loss Prevention allows administrators to write custom expressions for the data their organization needs to protect. Matched content can trigger existing model actions and data loss prevention (DLP) workflows, extending Darktrace's DLP capabilities.

This extends data loss detection beyond a fixed library of common data types. Security teams can apply controls to information that is sensitive in the context of their own organization and adapt those controls as the business changes.
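As an illustration of what organization-specific detection can look like, the sketch below uses Python regular expressions. This is an assumption made for the example: the text above says only "custom expressions" and does not specify the syntax Darktrace / EMAIL accepts. The pattern names, the `CUST-` code format, and the project names are all hypothetical.

```python
import re

# Hypothetical organization-specific patterns, invented for this example.
CUSTOM_PATTERNS = {
    # A customer code such as CUST-004219: fixed prefix plus exactly six digits.
    "customer_code": re.compile(r"\bCUST-\d{6}\b"),
    # Internal project names that should not leave the organization.
    "project_name": re.compile(r"\bProject (?:Falcon|Osprey)\b", re.IGNORECASE),
}


def find_sensitive_data(message: str, patterns=CUSTOM_PATTERNS) -> dict[str, list[str]]:
    """Return every match in the message, grouped by pattern label."""
    hits = {}
    for label, pattern in patterns.items():
        matches = pattern.findall(message)
        if matches:
            hits[label] = matches
    return hits


print(find_sensitive_data("Invoice for CUST-004219 under project falcon"))
print(find_sensitive_data("Lunch on Friday?"))
```

The word-boundary anchors keep the customer-code pattern from firing on longer digit runs, which is the kind of tuning that matters when a match can trigger a DLP workflow.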

Introducing Email and Collaboration Workflow Risk Posture Dashboards

Some of the most important risks are not isolated events. They are repeated ways of working that create an opening for error, misuse, or attack. For example, a payment request may be one suspicious message, but a recurring approval workflow that relies on weak verification is a business process risk.

The new Email and Collaboration Workflow Risk Posture Dashboard analyzes data across email, Microsoft Teams, Slack and Zoom to provide a process-level view of risk in the organization. The patterns it surfaces may include financial authorization workflows, sensitive data sharing patterns, and activity that could expose credentials.

The dashboard brings these patterns into a view and provides actionable recommendations. This helps security teams determine where to investigate or strengthen controls, where ownership needs to be clarified, and where the business may need to change a risky process. It gives CISOs a clearer view of how human and communication risk is embedded in everyday operations, not only where individual alerts occur.

Behavior connects the individual decision to the wider risk

These capabilities build on Darktrace’s unique behavioral approach to security. We use Adaptive AI to learn how people and AI normally behave within an organization, creating the context needed to recognize when activity changes.

Within the Darktrace Behavioral Defense Platform, Darktrace / EMAIL helps protect people against phishing, account takeover, data exfiltration, and human risk across email and collaboration tools. The new capabilities extend that protection in both directions. They give employees clearer context for the decision in front of them, while giving security leaders a broader view of the workflows and behavior that create risk across the organization.

The result is not simply more alerts. It is a better understanding of why something is risky, what action to take, and where the organization can reduce risk before a familiar process becomes an easy route for an attacker.



About the author

Carlos Gray

Senior Product Marketing Manager, Email

Blog

July 24, 2026

When Guardrails Break: Why Securing AI Requires Behavioral Detection and Autonomous Containment

Bottom line up front: Governance, guardrails, identity controls, and secure development are necessary to secure AI, but they are not sufficient. AI systems are probabilistic, adaptive, and non-deterministic. Therefore, organizations need two critical layers of security:

Behavioral-based detection that can identify when AI begins to act outside its intended purpose; and
Surgical, explainable autonomous containment that can stop risky activity before it causes material damage.

That capability depends on multiple specialized AI models working together, not one LLM making every decision.

Organizations are embedding AI into development, business operations, and security workflows faster than most security programs can adapt. The risk is no longer limited to the model. It extends across prompts, data, identities, agents, memory, APIs, tools, permissions, and the trust relationships connecting them.

In my recent blog, Securing AI: Analysis of the Complete Security Stack with Governance and Controls, I outlined a defense-in-depth strategy spanning governance, identity, data security, secure development, runtime detection, autonomous containment, and recovery. The most urgent requirement across that architecture is the ability to understand how AI behaves in practice and contain it when that behavior becomes risky.

Why non-deterministic systems require behavioral-based detection

Traditional controls remain foundational. Organizations need least privilege, strong identity controls, secure-by-design architecture, data governance, AI inventories, guardrails, testing, and clear boundaries on autonomy.

But deterministic controls, which assume predictable and repeatable behavior, cannot fully secure non-deterministic systems, where the same input may not always produce the same outcome.

AI agents can interpret the same instruction differently, chain individually authorized actions into an unsafe outcome, or pursue a legitimate goal through a method the organization did not anticipate. One of the most recent examples of this is the incident that OpenAI and Hugging Face jointly disclosed, where an autonomous agent escaped its intended testing boundaries and compromised Hugging Face infrastructure.

An agent may have permission to access data and invoke a tool, but that does not mean every use of that access is appropriate. It is not enough to know whether an action is allowed. Organizations need to know whether it makes sense.

Is this normal for this agent?
Is it acting within its intended purpose?
Is it accessing unusual data, invoking an unexpected tool, or beginning to drift?
Do a series of ordinary-looking actions become risky when viewed together?

Behavioral-based detection that is specific to an environment or organization, and that understands context and risk, provides the detection engineering that AI systems need. It learns normal activity across people, systems, data, devices, and AI agents, then identifies deviations and evaluates their risk, intent, and context. This enables detection of misuse, abuse, compromise, manipulation, and unintended behavior even when no known attack signature or explicit policy violation exists.
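The core loop of learning what is normal and then scoring deviations can be sketched very simply. The example below is a deliberately minimal frequency baseline, not Darktrace's method: the class name, agent name, and action names are hypothetical, and a production system would also weigh risk, context, and sequences of actions.

```python
from collections import Counter, defaultdict


class AgentBaseline:
    """Tracks how often each agent performs each action."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, agent: str, action: str) -> None:
        """Record one instance of normal, observed activity."""
        self.counts[agent][action] += 1

    def rarity(self, agent: str, action: str) -> float:
        """Return 1.0 for never-seen behavior and lower values for routine behavior."""
        history = self.counts[agent]
        total = sum(history.values())
        if total == 0:
            return 1.0  # no history at all: treat everything as unusual
        return 1.0 - history[action] / total


baseline = AgentBaseline()
for action in ["read_repo", "read_repo", "read_repo", "run_tests"]:
    baseline.observe("build-bot", action)

print(baseline.rarity("build-bot", "read_repo"))             # 0.25
print(baseline.rarity("build-bot", "run_tests"))             # 0.75
print(baseline.rarity("build-bot", "export_customer_data"))  # 1.0
```

Note that `export_customer_data` scores as maximally unusual even though no rule or signature mentions it. The agent may well have permission to do it; the baseline simply shows that it never has.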

Why accuracy is the foundation for SOC optimization

AI will only improve the SOC if it produces accurate, explainable, and actionable outcomes.

If analysts must manually validate every AI-generated finding because they cannot understand the evidence or confidence behind it, automation has not reduced workload. It has moved the workload. False positives increase fatigue. False negatives cause the most risk and damage to organizations. Inaccurate autonomous actions can disrupt critical operations.

Accuracy is therefore more than a model-performance metric. It is the prerequisite for analyst trust, SOC optimization, and safe autonomous response.

That accuracy is unlikely to come from one model.

Generative AI is valuable for natural-language analysis, summarization, and human interaction. But an LLM should not be the sole analytical engine for behavioral-based detection, investigation, risk assessment, and containment. Interpretability and consistency are required for high-consequence security decisions.

A stronger architecture uses multiple specialized AI systems collaboratively:

Behavioral models can establish normal activity.
Unsupervised learning can identify novel anomalies.
Graph analysis can evaluate relationships among agents, identities, systems, and tools.
Other models can correlate events, investigate competing hypotheses, and assess risk.
Semantic models can analyze language where behavior-based language analysis is needed, working in tandem with vector embeddings, graph neural networks, and a variety of other AI systems.

Each model contributes a different analytical perspective. Their outputs can corroborate one another, improving accuracy and creating a more reliable basis for response. The objective is not one model operating as an oracle. It is layered, adaptive intelligence designed to produce decisions the SOC can understand and trust.
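One simple way to picture corroboration is to require agreement between independent models before treating a finding as high confidence. The sketch below is an invented illustration of that idea, not a description of any product's logic: the model names, scores, threshold, and agreement count are all hypothetical.

```python
def corroborate(scores: dict[str, float], threshold: float = 0.7, min_agreeing: int = 2):
    """Return (is_corroborated, names of the models that flagged the activity)."""
    agreeing = sorted(name for name, score in scores.items() if score >= threshold)
    return len(agreeing) >= min_agreeing, agreeing


# Two independent perspectives agree, so the finding is corroborated.
print(corroborate({"behavioral": 0.9, "graph": 0.8, "semantic": 0.2}))

# Only one model is alarmed, so the finding needs more evidence.
print(corroborate({"behavioral": 0.95, "graph": 0.1, "semantic": 0.3}))
```

Returning the names of the agreeing models alongside the verdict keeps the decision explainable, since an analyst can see which perspectives supported it.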

Autonomous containment is required to secure autonomous systems

Many SOCs remain hesitant to trust LLM-based agents with autonomous containment. That concern is reasonable. A poorly selected response can isolate the wrong asset, stop a critical workflow, block a legitimate identity, or create more operational damage than the original incident.

But relying exclusively on human response is also not viable.

AI systems can operate at machine speed. They can expose sensitive data, execute workflows, modify records, call tools, or propagate actions across connected systems before an analyst can investigate and intervene. The behavior may be unintentional, the result of an agent optimizing toward a goal, or caused by misuse, compromise, prompt injection, or offensive AI.

Intent affects the investigation. It does not change the need to stop the damage.

Organizations need autonomous response, but it must be surgical and explainable. The objective is not to shut down an entire agent, user, application, or business process whenever an anomaly occurs. It is to interrupt the specific risky behavior: block an unusual connection, constrain a tool call, stop an abnormal data transfer, or temporarily limit an agent when it is performing anomalous, risky activity.

That buys humans time. It stops the spread, limits damage, and allows the SOC to investigate without unnecessarily disrupting the business.

Layered, Adaptive AI provides a path forward

Darktrace has spent more than a decade researching and operationalizing layered, behavioral, Adaptive AI that learns a specific organization rather than relying only on historic attacks or predefined signatures.

The approach is designed to understand normal behavior, identify anomalous activity, assess its risk, correlate related events, autonomously investigate, and, when necessary, apply targeted containment while normal operations continue.

That sequence matters. Autonomous response cannot simply be added to the end of an LLM workflow. Trusted containment depends on broad visibility, continuous behavioral understanding, multiple analytical techniques, risk and context evaluation, autonomous investigation, explainability, and precise response actions.

This represents a more responsible model for security autonomy: not automation for its own sake, but controlled autonomy built to improve security outcomes and protect business operations.

Security must enable AI adoption

The answer for security teams is not to block AI. Organizations are adopting it to improve productivity, accelerate development, and create new business value.

But innovation without behavioral detection and autonomous containment is not sustainable.

Organizations should continue investing in governance, identity, least privilege, data security, secure MLOps, guardrails, testing, evaluation, validation, verification, kill switches, rollback, and forensic readiness. At the same time, they cannot wait for every governance program to mature before addressing runtime risk.

Behavioral-based detection and autonomous containment provide an immediate layer of resilience. They allow organizations to detect exploitation and risky AI behavior they did not anticipate, contain it at machine speed, and preserve human control over broader remediation.

The future of AI security will not be defined by a single model making every decision. It will be defined by multiple specialized AI systems working collaboratively, with sufficient accuracy, transparency, and context to support trusted autonomous action.

Surgical, explainable autonomous containment is no longer a future capability. It is a requirement for scaling AI securely today.

Learn how to build a defense-in-depth strategy for securing AI at scale in our talk at Black Hat on August 5 at 3:15 PM.
