October 30, 2023

Exploring AI Threats: Package Hallucination Attacks

Learn how malicious actors exploit errors in generative AI tools to launch package hallucination attacks. Read how Darktrace products detect and prevent these threats!
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Charlotte Thompson
Cyber Analyst
Written by
Tiana Kelly
Senior Cyber Analyst & Team Lead

AI tools open doors for threat actors

On November 30, 2022, the free conversational language generation model ChatGPT was launched by OpenAI, an artificial intelligence (AI) research and development company. The launch of ChatGPT was the culmination of development ongoing since 2018; it represented the latest innovation in the ongoing generative AI boom and made generative AI tools accessible to the general population for the first time.

ChatGPT is estimated to have at least 100 million users, and in August 2023 the site reached 1.43 billion visits [1]. Darktrace data indicated that, as of March 2023, 74% of active customer environments had employees using generative AI tools in the workplace [2].

However, with new tools come new opportunities for threat actors to exploit and use them maliciously, expanding their arsenal.

Much consideration has been given to mitigating the impacts of the increased linguistic complexity in social engineering and phishing attacks resulting from generative AI tool use, with Darktrace observing a 135% increase in ‘novel social engineering attacks’ across thousands of active Darktrace/Email™ customers from January to February 2023, corresponding with the widespread adoption of ChatGPT and its peers [3].

Less overall consideration, however, has been given to the impacts stemming from errors intrinsic to generative AI tools. One such error is the AI hallucination.

What is an AI hallucination?

AI “hallucination” is a term that refers to instances in which the predictive element of a generative AI or large language model (LLM) produces an unexpected or factually incorrect response that does not align with its machine learning training data [4]. This differs from the regular, intended behavior of an AI model, which should provide a response grounded in the data it was trained upon.

Why are AI hallucinations a problem?

Despite the term suggesting a rare phenomenon, hallucinations are far from uncommon, because the AI models used in LLMs are merely predictive and focus on generating the most probable text or outcome rather than on factual accuracy.

Given the widespread use of generative AI tools in the workplace, employees are significantly more likely to encounter an AI hallucination. Furthermore, if these fabricated responses are taken at face value, they could cause significant issues for an organization.

Use of generative AI in software development

Software developers may use generative AI for recommendations on how to optimize their scripts or code, or to find packages to import into their code for various uses. When developers ask an LLM for recommendations on a specific piece of code or how to solve a specific problem, the response will often point to a third-party package. It is possible that packages recommended by generative AI tools are themselves hallucinations: the packages may never have been published, or at least not prior to the cut-off date of the model’s training data. If a non-existent package is suggested frequently, and developers copy the code snippet wholesale, this could leave organizations vulnerable to attack.

Research conducted by Vulcan revealed the prevalence of AI hallucinations when ChatGPT is asked questions related to coding. After sourcing a sample of commonly asked coding questions from Stack Overflow, a question-and-answer website for programmers, researchers queried ChatGPT (in the context of Node.js and Python) and reviewed its responses. In 20% of the responses pertaining to Node.js, at least one unpublished package was included, whilst the figure sat at around 35% for Python [4].

Hallucinations can be unpredictable, but would-be attackers can discover candidate package names to register by asking generative AI tools generic coding questions and checking whether the suggested packages already exist. As such, attacks using this vector are unlikely to target specific organizations, instead posing a more widespread threat to users of generative AI tools.

Malicious packages as attack vectors

Although AI hallucinations can be unpredictable, and the responses given by generative AI tools may not always be consistent, malicious actors are able to discover AI hallucinations by adopting the approach used by Vulcan. This allows hallucinated packages to be used as attack vectors. Once a malicious actor has discovered a hallucinated, unpublished package name, they can create a package with the same name, include a malicious payload, and publish it. This is known as a malicious package.
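
To illustrate the reconnaissance step described above, here is a minimal sketch, assuming Python and PyPI’s public JSON API (which returns a 404 for names that have never been published), of how suggested package names could be checked for existence. The package names in the example are purely illustrative, not real LLM output.

```python
# Minimal sketch: check whether package names suggested by an LLM are
# actually published on PyPI. https://pypi.org/pypi/<name>/json returns
# HTTP 404 for names that do not exist, which is exactly the gap an
# attacker could later fill with a malicious package.
import urllib.error
import urllib.request


def exists_on_pypi(package_name: str) -> bool:
    """Return True if the package name is already published on PyPI."""
    url = f"https://pypi.org/pypi/{package_name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


# Hypothetical LLM suggestions, not real output.
suggested_packages = ["requests", "fast-json-schema-helper"]
for name in suggested_packages:
    if exists_on_pypi(name):
        print(f"{name}: published")
    else:
        print(f"{name}: NOT published - a candidate for package hallucination abuse")
```

The same check can, of course, be run defensively, for example in a CI pipeline, to flag newly introduced dependencies that do not correspond to an established published package.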

Malicious packages could also be recommended by generative AI tools in the form of pre-existing packages. A user may be recommended a package that has previously been confirmed to contain malicious content, or a package that is no longer maintained and is therefore more vulnerable to hijacking by malicious actors.

In such scenarios it is not necessary for the malicious actor to manipulate the training data (data poisoning) to achieve the desired outcome, meaning a complex and time-consuming attack phase can be bypassed entirely.

An unsuspecting software developer may incorporate a malicious package into their code, rendering it harmful. Deployment of this code could then result in compromise and escalation into a full-blown cyber-attack.

Figure 1: Flow diagram depicting the initial stages of an AI Package Hallucination Attack.

For providers of Software-as-a-Service (SaaS) products, this attack vector may represent an even greater risk. Such organizations may have a higher proportion of employed software developers than other organizations of comparable size. A threat actor, therefore, could utilize this attack vector as part of a supply chain attack, whereby a malicious payload becomes incorporated into trusted software and is then distributed to multiple customers. This type of attack could have severe consequences including data loss, the downtime of critical systems, and reputational damage.

How could Darktrace detect an AI Package Hallucination Attack?

In June 2023, Darktrace introduced a range of DETECT™ and RESPOND™ models designed to identify the use of generative AI tools within customer environments and to autonomously perform inhibitive actions in response to such detections. These models trigger based on connections to endpoints associated with generative AI tools; as such, Darktrace’s detection of an AI Package Hallucination Attack would likely begin with the breaching of one of the following DETECT models:

  • Compliance / Anomalous Upload to Generative AI
  • Compliance / Beaconing to Rare Generative AI and Generative AI
  • Compliance / Generative AI

Should generative AI tool use not be permitted by an organization, the Darktrace RESPOND model ‘Antigena / Network / Compliance / Antigena Generative AI Block’ can be activated to autonomously block connections to endpoints associated with generative AI, thus preventing an AI Package Hallucination attack before it can take hold.

Once a malicious package has been recommended, it may be downloaded from GitHub, a platform and cloud-based service used to store and manage code. Darktrace DETECT is able to identify when a device has performed a download from an open-source repository such as GitHub using the following models:

  • Device / Anomalous GitHub Download
  • Device / Anomalous Script Download Followed By Additional Packages

The goal the malicious package was designed to fulfil will determine the next stages of the attack. Due to their highly flexible nature, AI package hallucinations could be used as an attack vector to deliver a wide variety of malware types.

As GitHub is a service commonly used by software developers and IT professionals alike, traditional security tools may not alert customer security teams to such downloads, meaning malicious downloads may go undetected. Darktrace’s anomaly-based approach to threat detection, however, enables it to recognize subtle deviations in a device’s pre-established pattern of life which may be indicative of an emerging attack.
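
As a purely illustrative sketch, and not Darktrace’s actual models or scoring, anomaly-based detection of this kind can be thought of as comparing each new connection against a device’s historical “pattern of life”. The hosts, counts, and threshold below are hypothetical.

```python
# Hypothetical sketch of pattern-of-life anomaly scoring: flag a download
# when the destination host is rare for this device relative to its own
# connection history. Hosts, counts and the threshold are illustrative.
from collections import Counter

device_history = Counter({
    "internal-git.example.com": 300,  # hypothetical per-device history
    "pypi.org": 120,
    "github.com": 4,
})


def rarity_score(host: str, history: Counter) -> float:
    """Score in [0, 1]; higher means the host is rarer for this device."""
    total = sum(history.values()) + 1  # +1 avoids division by zero
    return 1.0 - history.get(host, 0) / total


new_download_host = "objects.githubusercontent.com"  # hypothetical new source
score = rarity_score(new_download_host, device_history)
if score > 0.95:
    print(f"Anomalous download source for this device: {new_download_host} (score={score:.2f})")
```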

Subsequent anomalous activity representing the possible progression of the kill chain as part of an AI Package Hallucination Attack could then trigger an Enhanced Monitoring model. Enhanced Monitoring models are high-fidelity indicators of potential malicious activity that are investigated by the Darktrace analyst team as part of the Proactive Threat Notification (PTN) service offered by the Darktrace Security Operation Center (SOC).

Conclusion

Employees are often considered the first line of defense in cyber security; this is particularly true in the face of an AI Package Hallucination Attack.

As the use of generative AI becomes more accessible and an increasingly prevalent tool in an attacker’s toolbox, organizations will benefit from implementing company-wide policies that define expectations surrounding the use of such tools. It is simple, yet critical, for example, for employees to fact-check responses provided to them by generative AI tools. All packages recommended by generative AI should also be verified against non-generated data from external third-party or internal sources. It is also good practice to exercise caution when downloading packages with very few downloads, as this could indicate the package is untrustworthy or malicious.
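
As one hedged example of such a check, assuming PyPI’s public JSON API and its “releases” layout (the thresholds are arbitrary illustrations, not recommendations), a developer or CI job could flag packages that are very new or have very few releases before they are adopted:

```python
# Minimal sketch: surface simple red flags (very few releases, very recent
# first publication) for a package before adding it as a dependency.
# Assumes PyPI's JSON API layout; thresholds are illustrative only.
import json
import urllib.request
from datetime import datetime, timezone


def release_red_flags(package_name: str, min_releases: int = 3, min_age_days: int = 90) -> list[str]:
    url = f"https://pypi.org/pypi/{package_name}/json"
    with urllib.request.urlopen(url, timeout=10) as response:
        data = json.load(response)

    releases = data["releases"]
    upload_times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in releases.values()
        for f in files
    ]

    flags = []
    if len(releases) < min_releases:
        flags.append(f"only {len(releases)} release(s) published")
    if upload_times:
        age_days = (datetime.now(timezone.utc) - min(upload_times)).days
        if age_days < min_age_days:
            flags.append(f"first published only {age_days} days ago")
    return flags


print(release_red_flags("requests"))  # a mature package should return []
```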

As of September 2023, ChatGPT Plus and Enterprise users were able to use the tool to browse the internet, expanding the data ChatGPT can access beyond the previous training data cut-off of September 2021 [5]. This feature will be expanded to all users soon [6]. ChatGPT providing up-to-date responses could prompt the evolution of this attack vector, allowing attackers to publish malicious packages which could subsequently be recommended by ChatGPT.

It is inevitable that AI tools will be embraced more widely in the workplace in the coming years as the technology advances and existing tools become less novel and more familiar. By fighting fire with fire, using AI technology to identify AI usage, Darktrace is uniquely placed to detect and take preventative action against malicious actors capitalizing on the AI boom.

Credit to Charlotte Thompson, Cyber Analyst, and Tiana Kelly, Analyst Team Lead, London.

References

[1] https://seo.ai/blog/chatgpt-user-statistics-facts

[2] https://darktrace.com/news/darktrace-addresses-generative-ai-concerns

[3] https://darktrace.com/news/darktrace-email-defends-organizations-against-evolving-cyber-threat-landscape

[4] https://vulcan.io/blog/ai-hallucinations-package-risk

[5] https://twitter.com/OpenAI/status/1707077710047216095

[6] https://www.reuters.com/technology/openai-says-chatgpt-can-now-browse-internet-2023-09-27/


April 28, 2026

State of AI Cybersecurity 2026: 87% of security professionals are seeing more AI-driven threats, but few feel ready to stop them


The findings in this blog are taken from Darktrace’s annual State of AI Cybersecurity Report 2026.

In part 1 of this blog series, we explored how AI is remaking the attack surface, with new tools, models, agents — and vulnerabilities — popping up just about everywhere. Now embedded in workflows across the enterprise, and often with far-reaching access to sensitive data, AI systems are quickly becoming a favorite target of cyber threat actors.

Among bad actors, though, AI is more often used as a tool than a target. Nearly 62% of organizations experienced a social engineering attack involving a deepfake, or an incident in which bad actors used AI-generated video or audio to try to trick a biometric authentication system, compared to 32% that reported an AI prompt injection attack.

In the hands of attackers, AI can do many things. It’s being used across the entire kill chain: to supercharge reconnaissance, personalize phishing, accelerate lateral movement, and automate data exfiltration. Evidence from Anthropic demonstrates that threat actors have harnessed AI to orchestrate an entire cyber espionage campaign from end to end, allegedly running it with minimal human involvement.

CISOs inhabit a world where these increasingly sophisticated attacks are ubiquitous. Naturally, combatting AI-powered threats is top of mind among security professionals, but many worry about whether their capabilities are up to the challenge.

AI-powered threats at scale: no longer hypothetical

AI-driven threats share signature characteristics. They operate at speed and scale. Automated tools can probe multiple attack paths, search for multiple vulnerabilities and send out a barrage of phishing emails, all within seconds. The ability to attack everywhere at once, at a pace that no human operator could sustain, is the hallmark of an AI-powered threat. AI-powered threats are also dynamic. They can adapt their behavior to spread across a network more efficiently or rewrite their own code to evade detection.

Security teams are seeing the signs that they’re fighting AI-powered threats at every stage of the kill chain, and the sophistication of these threats is testing their resolve and their resources.

  • 73% say that AI-powered cyber threats are having a significant impact on their organization
  • 92% agree that these threats are forcing them to upgrade their defenses
  • 87% agree that AI is significantly increasing the sophistication and success rate of malware
  • 87% say AI is significantly increasing the workload of their security operations team

These teams now confront a challenge unlike anything they’ve seen before in their careers, and the risks are compounding across workflows, tools, data, and identities. It’s no surprise that 66% of security professionals say their role is more stressful today than it was five years ago, or that 47% report feeling overwhelmed at work.

Up all night: Security professionals’ worry list is long

Traditional security methods were never built to handle the complexity and subtlety of AI-driven behavior. Working in the trenches, defenders have deep firsthand experience of how difficult it can be to detect and stop AI-assisted threats.

Increasingly effective social engineering attacks are among their top concerns. 50% of security leaders mentioned hyper-personalized phishing campaigns as one of their biggest worries, while 40% voiced apprehension about deepfake voice fraud. These concerns are legitimate: AI-generated phishing emails are increasingly tailored to individual organizations, business activities, or individuals. Gone are the telltale signs – like grammar or spelling mistakes – that once distinguished malicious communications. Notably, 33% of the malicious emails Darktrace observed in 2025 contained over 1,000 characters, indicating probable LLM usage.

Security leaders also worry about how bad actors can leverage AI to make attacks even faster and more dynamic. 45% listed automated vulnerability scanning and exploit chaining among their biggest concerns, while 40% mentioned adaptive malware.

Confidence is lacking

Protecting against AI-powered threats demands capabilities that many organizations have not yet built. It requires interpreting new indicators, uncovering the subtle intent within interactions, and recognizing when behavior, whether human or machine, could be suspicious. Leaders know that their current tools aren’t prepared for this. Nearly half don’t feel confident in their ability to defend against AI-powered attacks.

We’ve asked participants in our survey about their confidence for the last three years. In 2024, 60% said their organizations were not adequately prepared to defend against AI-driven threats. Last year, that percentage shrank to 45%, a possible indicator that security programs were making progress. Since then, however, progress has apparently stalled: 46% of security leaders now feel inadequately prepared to protect their organizations amidst the current threat landscape.

These differences are accentuated across regions. Respondents in Japan are far less confident (77% say they are not adequately prepared) than respondents in Brazil (where only 21% do not feel prepared).

Where security programs are falling short

It’s no longer the case that cybersecurity is overlooked or underfunded by executive leadership. Across industries, management recognizes that AI-powered threats are a growing problem, and insufficient budget is near the bottom of most CISOs’ lists of reasons why they struggle to defend against AI-powered threats.

It’s the things that money can’t buy – experience, knowledge, and confidence – that are holding programs back. Near the top of the list of inhibitors that survey participants mention is “insufficient knowledge or use of AI-driven countermeasures.” As bad actors embrace AI technologies en masse, this challenge is coming into clearer focus: attack-centric security tools, which rely on static rules, signatures, and historical attack patterns, were never designed to handle the complexity and subtlety of AI-driven attacks. These challenges feel new to security teams, but they are the core problems Darktrace was built to solve.  

Our Self-Learning AI develops a deep understanding of what “normal” looks like for your organization – including unique traffic patterns, end-user habits, and application and device profiles – so that it can detect and stop novel, dynamic threats at the first encounter. By focusing on learning the business, rather than the attack, our AI can keep pace with AI-powered threats as they evolve.

Explore the full State of AI Cybersecurity 2026 report for deeper insights into how security leaders are responding to AI-driven risks.

Learn more about securing AI in your enterprise.


About the author
The Darktrace Community

April 24, 2026

Email-Borne Cyber Risk: A Core Challenge for the CISO in the Age of Volume and Sophistication


The challenge for CISOs

Despite continuous advances in security technologies, humans continue to be exploited by attackers. Credential abuse and social actions like phishing are major factors, accounting for around 60% of all breaches. These attacks rely less on technical vulnerabilities and more on exploiting human behavior and organizational processes. 

From my perspective as a former CISO, protecting humans concentrates three of today’s most pressing challenges: the sheer volume of email-based threats, their increasing sophistication, and the limitations of traditional employee awareness programs in moving the needle on risk. 

My personal experience of security awareness training as a CISO

With over 20 years’ experience as an ICT and Cybersecurity leader across various international organizations, I’ve seen security awareness training (SAT) in many guises. And while the cyber landscape is evolving in every direction, the effectiveness of SAT is reaching a plateau.  

Most programs I’ve seen follow a familiar pattern. Training is delivered through a combination of eLearning modules and internal sessions designed to reinforce IT policies. Employees are typically required to complete a slide deck or video, followed by a multiple-choice quiz. Occasional phishing simulations are distributed throughout the year.

The content is often static and unpersonalized, based on known threats that may already be outdated. Every employee, regardless of role or risk exposure, receives the same training and the same simulated phishing templates, from front-desk staff to the CEO.

The problem with traditional SAT programs

The issue with the approach to SAT outlined above is that the distribution of power is imbalanced. Humans will always be fallible, particularly when faced with increasingly sophisticated attacks. Providing generic, low-context training risks creating false confidence rather than genuine resilience. Let’s look at some of the problems in detail.

Timing and delivery

Employees today operate under constant cognitive load, making many rapid decisions every day simply to keep on top of their email volumes. Yet if employees complete training annually, or on an ad hoc basis, it becomes a standalone occurrence rather than a continuous habit.

As a result, retention is low. Employees often forget the lessons within weeks, a phenomenon known as the ‘Ebbinghaus Forgetting Curve.’

The curve illustrates that newly learned information decays at an exponential rate unless it is reinforced. In fact, according to the curve, you forget 50% of all new information within a day, and 90% of all new information within a week.
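
For reference, the exponential form commonly used to model the Ebbinghaus curve is shown below; this is an illustrative model, and the widely quoted percentages above are rough approximations rather than outputs of a single parameter choice.

\[
R(t) = e^{-t/S}
\]

Here \(R(t)\) is the proportion of information retained after time \(t\), and \(S\) is the relative strength of the memory, which reinforcement (such as spaced, in-the-moment training) increases. For example, choosing \(S\) so that retention is 50% after one day leaves retention in the low single digits after a week if nothing is reinforced.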

Simultaneously, most training is conducted within a separate interface. Because it takes place away from the actual moment of decision-making, the "teachable moment" is lost. There is a cognitive disconnect between the action (clicking a link in Outlook) and the education (watching a video in a browser). 

People

In the context of professional risk management, different users face different risks. Static learning, such as everyone receiving the same ‘Password Reset’ simulation email, doesn’t help users prepare for the specific threats they are likely to face. It also contributes to user fatigue, driven by repetitive training. And if users receive tests at the same time, news spreads among colleagues, hurting the efficacy of the test.

Staff turnover introduces further risk. In many organizations, new employees gain access to systems before receiving meaningful training, reducing onboarding to little more than policy acknowledgment.

Measuring success

In my experience, SAT solutions are often standalone, with no correlation to other tools in the security stack. In some cases, the programs are delivered by HR rather than the security team, creating a complete silo.

As a result, SAT is often perceived as a compliance exercise rather than a capability-building function, and poor-quality training does little to reduce the likelihood of compromise, regardless of completion rates or quiz performance.

What a modern SAT solution should look like

For today’s CISO, email represents the convergence point of high-volume, high-impact, and human-centric threats. Despite significant security investments, it remains one of the most difficult channels to secure effectively. Given these constraints, CISOs must evolve their approach to SAT.

Success lies in a balanced strategy: one that combines advanced technology, attack surface reduction, and pragmatic user enablement, without over-relying on human vigilance as the final line of defense.

This means moving beyond traditional SAT toward continuous, contextual awareness, realistic simulations, and tight integration with security outcomes.

Three requirements for a modern SAT solution

  • Invisible protection: The optimum security solution is one that assists users without impeding their experience. The objective is to enhance human capabilities, rather than simply delivering a lecture. 
  • Real-time feedback: Rather than a monthly quiz, the ideal system would provide a prompt or warning when a user is about to engage with something suspicious. 
  • Positive culture: Shift the focus away from a "gotcha" culture, which is a contributing factor to resentment, and instead empower employees to serve as "sensors" for the company.

Discover how personalized security coaching can strengthen your human layer and make your email defenses more resilient. Explore Darktrace / Adaptive Human Defense.

About the author
Karim Benslimane
VP, Field CISO