Blog
October 30, 2023

Exploring AI Threats: Package Hallucination Attacks

Learn how malicious actors exploit errors in generative AI tools to launch package hallucination attacks, and how Darktrace products detect and prevent these threats.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Charlotte Thompson
Cyber Analyst
Written by
Tiana Kelly
Deputy Team Lead, London & Cyber Analyst

AI tools open doors for threat actors

On November 30, 2022, OpenAI, an artificial intelligence (AI) research and development company, launched ChatGPT, a free conversational language generation model. The launch was the culmination of development ongoing since 2018; as the latest innovation in the generative AI boom, it made generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments had employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the increased linguistic sophistication of social engineering and phishing attacks written with generative AI tools: Darktrace observed a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Far less consideration, however, has been given to the impacts stemming from errors intrinsic to generative AI tools. One such error is the AI hallucination.

What is an AI hallucination?

AI “hallucination” is a term describing instances in which the predictive elements of a generative AI tool or large language model (LLM) produce an unexpected or factually incorrect response that does not align with the model’s machine learning training data [4]. This differs from the regular and intended behavior of an AI model, which is to provide a response grounded in the data it was trained upon.

Why are AI hallucinations a problem?

Despite the term suggesting a rare phenomenon, hallucinations are common: the models underlying LLMs are merely predictive and optimize for the most probable text, rather than for factual accuracy.

Given the widespread use of generative AI tools in the workplace, employees are significantly more likely to encounter an AI hallucination. If these fabricated responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. When a developer asks an LLM about a specific piece of code or how to solve a specific problem, the response will often point to a third-party package. It is possible that a package recommended by a generative AI tool is itself a hallucination: the package may never have been published, or, more accurately, may not have been published prior to the cut-off date of the model’s training data. If the same non-existent package is suggested repeatedly and developers copy the generated code snippets wholesale, those developers are left vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, the researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. At least one unpublished package appeared in 20% of ChatGPT’s responses pertaining to Node.js, and in around 35% of those pertaining to Python [4].

Hallucinations can be unpredictable, but would-be attackers are able to find packages to create by asking generative AI tools generic questions and checking whether the suggested packages exist already. As such, attacks using this vector are unlikely to target specific organizations, instead posing more of a widespread threat to users of generative AI tools.
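As a concrete illustration of that checking step, here is a minimal sketch, assuming Python packages and PyPI's public JSON API (which returns a 404 for names that have never been published); the package names are purely illustrative.

```python
import requests

def package_exists_on_pypi(name: str) -> bool:
    """Return True if a package with this name has been published on PyPI."""
    # PyPI's JSON API responds 404 for names that have never been published.
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    return resp.status_code == 200

# Hypothetical package names lifted from a generated code snippet.
suggested = ["requests", "totally-made-up-helper-lib"]
for name in suggested:
    if package_exists_on_pypi(name):
        print(f"{name}: published")
    else:
        print(f"{name}: NOT on PyPI - a possible hallucination (or squatting target)")
```

The same check works in reverse for an attacker: any suggested name that returns a 404 is a candidate for registering a malicious package, as described below.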

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and responses given by generative AI tools may not always be consistent, malicious actors can discover hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors: once a malicious actor has discovered the hallucination of an unpublished package, they can create a package with the same name, include a malicious payload, and publish it. This is known as a malicious package.

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that had previously been confirmed to contain malicious content, or a package that is no longer maintained and, therefore, is more vulnerable to hijack by malicious actors.

In such scenarios it is not necessary to manipulate the training data (data poisoning) to achieve the desired outcome for the malicious actor, thus a complex and time-consuming attack phase can easily be bypassed.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments and to autonomously perform inhibitive actions in response to such detections. These models trigger on connections to endpoints associated with generative AI tools; as such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with a breach of one of the following DETECT models:

  • Compliance / Anomalous Upload to Generative AI
  • Compliance / Beaconing to Rare Generative AI and Generative AI
  • Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

  • Device / Anomalous GitHub Download
  • Device / Anomalous Script Download Followed By Additional Packages

The goal the malicious package was designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a large variety of malware types.

As GitHub is a commonly used service by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such GitHub downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operation Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As the use of generative AI becomes more accessible and an increasingly prevalent tool in attackers’ toolboxes, organizations will benefit from implementing company-wide policies that define expectations around the use of such tools. It is simple, yet critical, for employees to fact-check responses provided by generative AI tools. Every package recommended by generative AI should also be verified against non-generated data from external third-party or internal sources. It is likewise good practice to treat packages with very few downloads with caution, as this could indicate the package is untrustworthy or malicious.
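As one example of what such a review might look like in practice, the sketch below uses PyPI's JSON metadata to flag a recommended package that is unpublished or only very recently published; the package name is hypothetical, and the 90-day threshold is an arbitrary illustrative choice rather than formal guidance.

```python
from datetime import datetime, timedelta, timezone

import requests

def first_published(name: str):
    """Return the earliest upload time across all releases of a PyPI package,
    or None if the package has never been published."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code != 200:
        return None  # never published: a possible hallucination
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in resp.json()["releases"].values()
        for f in files
    ]
    return min(uploads) if uploads else None

name = "some-recommended-package"  # hypothetical name from an AI response
published = first_published(name)
if published is None:
    print(f"{name}: not on PyPI - do not install")
elif datetime.now(timezone.utc) - published < timedelta(days=90):
    print(f"{name}: first published {published:%Y-%m-%d} - very new, review before use")
else:
    print(f"{name}: first published {published:%Y-%m-%d}")
```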

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that AI tools will be embraced more widely in the workplace in the coming years, as the technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, and Tiana Kelly, Deputy Team Lead, London & Cyber Analyst.

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk?nab=1&utm_referrer=https%3A%2F%2Fwww.google.com%2F

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/


Blog
Network
September 9, 2025

The benefits of bringing together network and email security

In many organizations, network and email security operate in isolation. Each solution is tasked with defending its respective environment, even though both are facing the same advanced, multi-domain threats.  

This siloed approach overlooks a critical reality: email remains the most common vector for initiating cyber-attacks, while the network is the primary stage on which those attacks progress. Without direct integration between these two domains, organizations risk leaving blind spots that adversaries can exploit.  

A modern security strategy needs to unify email and network defenses, not just in name, but in how they share intelligence, conduct investigations, and coordinate response actions. Let’s take a look at how this joined-up approach delivers measurable technical, operational, and commercial benefits.

Technical advantages

Pre-alert intelligence: Gathering data before the threat strikes

Most security tools start working when something goes wrong – an unusual login, a flagged attachment, a confirmed compromise. But by then, attackers may already be a step ahead.

By unifying network and email security under a single AI platform (like the Darktrace Active AI Security Platform), you can analyze patterns across both environments in real time, even when there are no alerts. This ongoing monitoring builds a behavioral understanding of every user, device, and domain in your ecosystem.

That means when an email arrives from a suspicious domain, the system already knows whether that domain has appeared on your network before – and whether its behavior has been unusual. Likewise, when new network activity involves a domain first spotted in an email, it’s instantly placed in the right context.
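As a simplified illustration of this kind of cross-domain context (a toy sketch, not a representation of Darktrace's internal implementation), the snippet below checks each inbound email's sender domain against the set of domains previously observed in network traffic; all event records are invented for the example.

```python
# Hypothetical event records; a real platform would stream these from
# email and network sensors rather than hold them in static lists.
network_events = [
    {"domain": "partner-site.com", "device": "laptop-42"},
    {"domain": "cdn.example.net", "device": "laptop-17"},
]
email_events = [
    {"sender_domain": "partner-site.com", "recipient": "alice@example.com"},
    {"sender_domain": "unseen-domain.xyz", "recipient": "bob@example.com"},
]

# Baseline: domains already observed in the organization's network traffic.
seen_on_network = {event["domain"] for event in network_events}

# Context for each inbound email: a sender domain never seen on the
# network is not proof of malice, but it warrants closer inspection.
for email in email_events:
    if email["sender_domain"] not in seen_on_network:
        print(f"{email['sender_domain']} -> {email['recipient']}: never observed on the network")
```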

This intelligence isn’t built on signatures or after-the-fact compromise indicators – it’s built on live behavioral baselines, giving your defenses the ability to flag threats before damage is done.

Alert-related intelligence: Connecting the dots in real time

Once an alert does fire, speed and context matter. The Darktrace Cyber AI Analyst can automatically investigate across both environments, piecing together network and email evidence into a single, cohesive incident.

Instead of leaving analysts to sift through fragmented logs, the AI links events like a phishing email to suspicious lateral movement on the recipient’s device, keeping the full attack chain intact. Investigations that might take hours – or even days – can be completed in minutes, with far fewer false positives to wade through.

This is more than a time-saver. It ensures defenders maintain visibility after the first sign of compromise, following the attacker as they pivot into network infrastructure, cloud services, or other targets. That cross-environment continuity is impossible to achieve with disconnected point solutions or siloed workflows.

Operational advantages

Streamlining SecOps across teams

In many organizations, email security is managed by IT, while network defense belongs to the SOC. The result? Critical information is scattered between tools and teams, creating blind spots just when you need clarity.

When email and network data flow into a single platform, everyone is working from the same source of truth. SOC analysts gain immediate visibility into email threats without opening another console or sending a request to another department. The IT team benefits from the SOC’s deeper investigative context.

The outcome is more than convenience: it’s faster, more informed decision-making across the board.

Reducing time-to-meaning and enabling faster response

A unified platform removes the need to manually correlate alerts between tools, reducing time-to-meaning for every incident. Built-in AI correlation instantly ties together related events, guiding analysts toward coordinated responses with higher confidence.

Instead of relying on manual SIEM rules or pre-built SOAR playbooks, the platform connects the dots in real time, and can even trigger autonomous response actions across both environments simultaneously. This ensures attacks are stopped before they can escalate, regardless of where they begin.

Commercial advantages

While purchasing "best-of-breed" for all your different tools might sound appealing, it often leads to a patchwork of solutions with overlapping costs and gaps in coverage. However good a "best-of-breed" email security solution might be in the email realm, it won't be truly effective without visibility across domains and an AI analyst piecing intelligence together. That's why we think "best-in-suite" is the only "best-of-breed" approach that works: choosing a high-quality platform ensures that every new capability strengthens the whole system.

On top of that, security budgets are under constant pressure. Managing separate vendors for email and network defense means juggling multiple contracts, negotiating different SLAs, and stitching together different support models.

With a single provider for both, procurement and vendor management become far simpler. You deal with one account team, one support channel, and one unified strategy for both environments. If you choose to layer on managed services, you get consistent expertise across your whole security footprint.

Even more importantly, an integrated AI platform sets the stage for growth. Once email and network are under the same roof, adding coverage for other attack surfaces – like cloud or identity – is straightforward. You’re building on the same architecture, not bolting on new point solutions that create more complexity.

Check out the white paper, The Modern Security Stack: Why Your NDR and Email Security Solutions Need to Work Together, to explore these benefits in more depth, with real-world examples and practical steps for unifying your defenses.


About the author
Mikey Anderson
Product Marketing Manager, Network Detection & Response

Blog
September 9, 2025

Unpacking the Salesloft Incident: Insights from Darktrace Observations

Introduction

On August 26, 2025, Google Threat Intelligence Group released a report detailing a widespread data theft campaign targeting the sales automation platform Salesloft via compromised OAuth tokens used by the third-party Drift AI chat agent [1][2]. The attack has been attributed to the threat actor UNC6395 by Google Threat Intelligence and Mandiant [1].

The attack is believed to have begun in early August 2025 and continued until mid-August 2025, with the threat actor exporting significant volumes of data from multiple Salesforce instances [1], then sifting through this data for anything that could be used to compromise victims’ environments, such as access keys, tokens, or passwords. This led Google Threat Intelligence Group to assess that the threat actor’s primary intent is credential harvesting; it later reported being aware of more than 700 potentially impacted organizations [3].

Salesloft previously stated that, based on currently available data, customers that do not integrate with Salesforce are unaffected by this campaign [2]. However, on August 28, Google Threat Intelligence Group announced that “Based on new information identified by GTIG, the scope of this compromise is not exclusive to the Salesforce integration with Salesloft Drift and impacts other integrations” [2]. Google Threat Intelligence has since advised that any and all authentication tokens stored in or connected to the Drift platform be treated as potentially compromised [1].

This campaign demonstrates how attackers are increasingly exploiting trusted Software-as-a-Service (SaaS) integrations as a pathway into enterprise environments.

By abusing these integrations, threat actors were able to exfiltrate sensitive business data at scale, bypassing traditional security controls. Rather than relying on malware or obvious intrusion techniques, the adversaries leveraged legitimate credentials and API traffic that resembled legitimate Salesforce activity to achieve their goals. This type of activity is far harder to detect with conventional security tools, since it blends in with the daily noise of business operations.

The incident underscores the escalating significance of autonomous coverage within SaaS and third-party ecosystems. As businesses increasingly depend on interconnected platforms, visibility gaps emerge that cannot be closed by conventional perimeter and endpoint defenses.

By developing a behavioral comprehension of each organization’s distinct use of cloud services, anomalies can be detected, such as logins from unexpected locations, unusually high volumes of API requests, or unusual document activity. These indications serve as an early alert system even when intruders use legitimate tokens or accounts, enabling security teams to step in before extensive data exfiltration takes place.
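As a toy illustration of one such signal (greatly simplified relative to any production system), the sketch below scores a SaaS login by how rarely its source IP appears in that account's login history; the history and threshold are invented for the example.

```python
from collections import Counter

# Hypothetical login history for one SaaS account (source IP per login).
history = ["203.0.113.10"] * 40 + ["198.51.100.7"] * 8

def rarity(ip: str, past_logins: list[str]) -> float:
    """Fraction of past logins NOT from this IP; 1.0 means never seen before."""
    counts = Counter(past_logins)
    return 1.0 - counts[ip] / max(len(past_logins), 1)

# The rare source IP reported in this campaign (defanged in the text below).
score = rarity("208.68.36.90", history)
if score > 0.95:  # arbitrary illustrative threshold
    print(f"Login rarity {score:.2f} - flag for review")
```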

What happened?

The campaign is believed to have started on August 8, 2025, with malicious activity continuing until at least August 18. The threat actor, tracked as UNC6395, gained access via compromised OAuth tokens associated with Salesloft Drift integrations into Salesforce [1]. Once tokens were obtained, the attackers were able to issue large volumes of Salesforce API requests, exfiltrating sensitive customer and business data.

Initial Intrusion

The attackers first established access by abusing OAuth and refresh tokens from the Drift integration. These tokens gave them persistent access into Salesforce environments without requiring further authentication [1]. To expand their foothold, the threat actor also made use of TruffleHog [4], an open-source secrets scanner, to hunt for additional exposed credentials. Logs later revealed anomalous IAM updates, including unusual UpdateAccessKey activity, which suggested attempts to ensure long-term persistence and control within compromised accounts.

Internal Reconnaissance & Data Exfiltration

Once inside, the adversaries began exploring the Salesforce environments. They ran queries designed to pull sensitive data fields, focusing on objects such as Cases, Accounts, Users, and Opportunities [1]. At the same time, the attackers sifted through this information to identify secrets that could enable access to other systems, including AWS keys and Snowflake credentials [4]. This phase demonstrated the opportunistic nature of the campaign, with the actors looking for any data that could be repurposed for further compromise.
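In the same spirit as the TruffleHog tooling the attackers used, though real scanners ship hundreds of detectors plus entropy analysis, a minimal regex-based secrets sweep over exported records might look like the sketch below; the patterns and sample text are illustrative only.

```python
import re

# Two well-known credential patterns; production scanners cover far more.
PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Password assignment": re.compile(r"(?i)\bpassword\s*[=:]\s*\S+"),
}

def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (pattern name, matched string) pairs found in the text."""
    hits = []
    for name, pattern in PATTERNS.items():
        hits.extend((name, m.group(0)) for m in pattern.finditer(text))
    return hits

# Hypothetical exported support-case text with an embedded secret.
record = "Customer attached config: AKIAABCDEFGHIJKLMNOP password=hunter2"
for name, match in scan_text(record):
    print(f"{name}: {match}")
```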

Lateral Movement

Salesloft and Mandiant investigations revealed that the threat actor also created at least one new user account in early September. Although follow-up activity linked to this account was limited, the creation itself suggested a persistence mechanism designed to survive remediation efforts. By maintaining a separate identity, the attackers ensured they could regain access even if their stolen OAuth tokens were revoked.

Accomplishing the mission

The data taken from Salesforce environments included valuable business records, which attackers used to harvest credentials and identify high-value targets. According to Mandiant, once the data was exfiltrated, the actors actively sifted through it to locate sensitive information that could be leveraged in future intrusions [1]. In response, Salesforce and Salesloft revoked OAuth tokens associated with Drift integrations on August 20 [1], a containment measure aimed at cutting off the attackers’ primary access channel and preventing further abuse.

How did the attack bypass the rest of the security stack?

The campaign effectively bypassed security measures by using legitimate credentials and OAuth tokens through the Salesloft Drift integration. This rendered traditional security defenses like endpoint protection and firewalls ineffective, as the activity appeared non-malicious [1]. The attackers blended into normal operations by using common user agents and making queries through the Salesforce API, which made their activity resemble legitimate integrations and scripts. This allowed them to operate undetected in the SaaS environment, exploiting the trust in third-party connections and highlighting the limitations of traditional detection controls.

Darktrace Coverage

Darktrace identified anomalous activity that appears to be associated with this campaign across multiple deployments. This included two cases involving United States-based customers with Salesforce integrations, in which the pattern of activity was notably similar.

On August 17, Darktrace observed an account belonging to one of these customers logging in from the rare endpoint 208.68.36[.]90, while the user was seen active from another location. This IP is a known indicator of compromise (IoC) reported by open-source intelligence (OSINT) for the campaign [2].

Figure 1: Cyber AI Analyst Incident summarizing the suspicious login seen for the account.

The login event was associated with the application Drift, further connecting the events to this campaign.

Figure 2: Advanced Search logs showing the Application used to login.

Following the login, the actor initiated a high volume of Salesforce API requests using methods such as GET, POST, and DELETE. The GET requests targeted endpoints like /services/data/v57.0/query and /services/data/v57.0/sobjects/Case/describe, where the former is used to retrieve records based on a specific criterion, while the latter provides metadata for the Case object, including field names and data types [5,6].

Subsequently, a POST request to /services/data/v57.0/jobs/query was observed, likely to initiate a Bulk API query job for extracting large volumes of data from the Ingest Job endpoint [7,8].

Finally, a DELETE request was made to remove an ingestion job batch, possibly in an attempt to obscure traces of prior data access or manipulation.
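Reconstructed from Salesforce's public API documentation (references 5-8 below) rather than from captured attacker tooling, the observed request sequence would resemble the following sketch; the instance URL, token, and SOQL queries are placeholders, not observed attacker values.

```python
import requests

# Placeholders: a real client would hold a valid OAuth token and instance URL.
INSTANCE = "https://example.my.salesforce.com"
HEADERS = {"Authorization": "Bearer <oauth-token>", "Content-Type": "application/json"}

# 1. Targeted record retrieval via the REST query endpoint.
requests.get(
    f"{INSTANCE}/services/data/v57.0/query",
    params={"q": "SELECT Id, Subject FROM Case"},
    headers=HEADERS,
)

# 2. Bulk API 2.0 query job to export large volumes of records.
job = requests.post(
    f"{INSTANCE}/services/data/v57.0/jobs/query",
    json={"operation": "query", "query": "SELECT Id, Subject, Description FROM Case"},
    headers=HEADERS,
).json()

# 3. Deleting the finished job removes it from the instance's job list.
requests.delete(f"{INSTANCE}/services/data/v57.0/jobs/query/{job['id']}", headers=HEADERS)
```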

A case involving another US-based customer took place a day later, on August 18. This again began with an account logging in from the rare IP 208.68.36[.]90 via the Drift application. This was followed by Salesforce GET requests targeting the same endpoints as in the previous case, then a POST to the Ingest Job endpoint, and finally a DELETE request, all occurring within one minute of the initial suspicious login.

The chain of anomalous behaviors, including a suspicious login and delete request, resulted in Darktrace’s Autonomous Response capability suggesting a ‘Disable user’ action. However, the customer’s deployment configuration required manual confirmation for the action to take effect.

Figure 3: An example model alert for the user, triggered due to an anomalous API DELETE request.
Figure 4: Model Alert Event Log showing various model alerts for the account that ultimately led to an Autonomous Response model being triggered.

Conclusion

In conclusion, this incident underscores the escalating risks of SaaS supply chain attacks, where third-party integrations can become avenues for attacks. It demonstrates how adversaries can exploit legitimate OAuth tokens and API traffic to circumvent traditional defenses. This emphasizes the necessity for constant monitoring of SaaS and cloud activity, beyond just endpoints and networks, while also reinforcing the significance of applying least privilege access and routinely reviewing OAuth permissions in cloud environments. Furthermore, it provides a wider perspective into the evolution of the threat landscape, shifting towards credential and token abuse as opposed to malware-driven compromise.

Credit to Emma Foulger (Global Threat Research Operations Lead), Calum Hall (Technical Content Researcher), Signe Zaharka (Principal Cyber Analyst), Min Kim (Senior Cyber Analyst), Nahisha Nobregas (Senior Cyber Analyst), Priya Thapa (Cyber Analyst)

Appendices

Darktrace Model Detections

  • SaaS / Access / Unusual External Source for SaaS Credential Use
  • SaaS / Compromise / Login From Rare Endpoint While User Is Active
  • SaaS / Compliance / Anomalous Salesforce API Event
  • SaaS / Unusual Activity / Multiple Unusual SaaS Activities
  • Antigena / SaaS / Antigena Unusual Activity Block
  • Antigena / SaaS / Antigena Suspicious Source Activity Block

Customers should consider integrating Salesforce with Darktrace where possible. These integrations allow better visibility and correlation to spot unusual behavior and possible threats.

IoC List

(IoC – Type)

  • 208.68.36[.]90 – IP Address

References

1. https://cloud.google.com/blog/topics/threat-intelligence/data-theft-salesforce-instances-via-salesloft-drift

2. https://trust.salesloft.com/?uid=Drift+Security+Update%3ASalesforce+Integrations+%283%3A30PM+ET%29

3. https://thehackernews.com/2025/08/salesloft-oauth-breach-via-drift-ai.html

4. https://unit42.paloaltonetworks.com/threat-brief-compromised-salesforce-instances/

5. https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_query.htm

6. https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_sobject_describe.htm

7. https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/get_job_info.htm

8. https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/query_create_job.htm

About the author
Emma Foulger
Global Threat Research Operations Lead