Blog
/
/
June 21, 2018

Unsupervised Machine Learning and JA3 for Enhanced Security

Unlock the true power of Darktrace's algorithms. Learn how JA3 enhances cybersecurity defenses with unique TLS/SSL fingerprints & unsupervised machine learning.
Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Max Heinemeyer
Global Field CISO
Default blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog imageDefault blog image
21
Jun 2018

Introducing JA3

JA3 is a methodology for fingerprinting Transport Layer Security applications. It was first posted on GitHub in June 2017 and is the work of Salesforce researchers John Althouse, Jeff Atkinson, and Josh Atkins. The JA3 TLS/SSL fingerprints created can overlap between applications but are still a great Indicator of Compromise (IoC). Fingerprinting is achieved by creating a hash of 5 decimal fields of the Client Hello message that is sent in the initial stages of an TLS/SSL session.

JA3 is an interesting approach to the increasing usage of encryption in networks. There is also a clear uptick in cyber-attacks using encrypted command and control (C2) channels – such as HTTPS – for malware communication.

The benefits of JA3 for enhancing rules-and-signatures security

These near-unique fingerprints can be used to enhance traditional cyber security approaches such as whitelisting, deny-listing, and searching for IoCs.

Let’s take the following JA3 hash for example: 3e860202fc555b939e83e7a7ab518c38. According to one of the public lists that maps JA3s to applications, this JA3 hash is associated with the ‘hola_svc’ application. This is the infamous Hola VPN solution that is non-compliant in most enterprise networks. On the other hand, the following hash is associated with the popular messenger software Slack: a5aa6e939e4770e3b8ac38ce414fd0d5. Traditional cyber security tools can use these hashes like traditional signatures to search for instances of them in data sets or trying to deny-list malicious ones.

While there is some merit to this approach, it comes with all the known limitations of rules-and-signatures defenses, such as the overlaps in signatures, the inability to detect unknown threats, as well as the added complexity of having to maintain a database of known signatures.

JA3 in Darktrace

Darktrace creates JA3 hashes for every TLS/SSL connection it encounters. This is incredibly powerful in a number of ways. First, the JA3 can add invaluable context to a threat hunt. Second, Darktrace can also be queried to see if a particular JA3 was encountered in the network, thus providing actionable intelligence during incident response if JA3 IoCs are known to the incident responders.

Things become much more interesting once we apply our unsupervised machine learning to JA3: Darktrace’s AI algorithms autonomously detect which JA3s are anomalous for the network as a whole and which JA3s are unusual for specific devices.

It basically tells a cyber security expert: This JA3 (3e860202fc555b939e83e7a7ab518c38) has never been seen in the network before and it is only used by one device. It indicates that an application, which is used by nobody else on the network, is initiating TLS/SSL connections. In our experience, this is most often the case for malware or non-compliant software. At this stage, we are observing anomalous behavior.

Darktrace’s AI combines these IoCs (Unusual Network JA3, Unusual Device JA3, …) with many other weak indicators to detect the earliest signs of an emerging threat, including previously unknown threats, without using rules or hard-coded thresholds.

Catching Red-Teams and domain fronting with JA3

The following is an example where Darktrace detected a Red-Team’s C2 communication by observing anomalous JA3 behavior.

The unsupervised machine learning algorithms identified a desktop device using a JA3 that was 100% unusual for the network connecting to an external domain using a Let’s Encrypt certificate, which, along with self-signed certificates, is often abused by malicious actors. As well as the JA3, the domain was also 100% rare for the network – nobody else visited it:

It turned out that a Red-Team had registered a domain that was very similar to the victim’s legitimate domain: www.companyname[.]com (legitimate domain) vs. www.companyname[.]online (malicious domain). This was intentionally done to avoid suspicion and human analysis. Over a 7-day period in a 2,000-device environment, this was the only time that Darktrace flagged unusual behavior of this kind.

As the C2 traffic was encrypted (therefore no intrusion detection was possible on the payload) and the domain was non-suspicious (no reputation-based deny-listing worked), this C2 had remained undetected by the rest of the security stack.

Combining unsupervised machine learning with JA3 is incredibly powerful for the detection of domain fronting. Domain fronting is a popular technique to circumvent censorship and to hide C2 traffic. While some infrastructure providers take action to prevent domain fronting on their end, it is still prevalent and actively used by attackers.

The only agreed-upon method within wide parts of the cyber-security community to detect domain fronting appears to be TLS/SSL inspection. This usually involved breaking up encrypted communication to inspect the clear-text payloads. While this works, it commonly involves additional infrastructure, network restructuring and comes with privacy issues – especially in the context of GDPR.

Unsupervised machine learning makes the detection of domain fronting without having to break up encrypted traffic possible by combining unusual JA3 detection with other anomalies such as beaconing. A good start for a domain fronting threat hunt? A device beaconing to an anomalous CDN with an unusual JA3 hash.

Conclusion

JA3 is not a silver bullet to pre-empt malware compromise. As a signature-based solution, it shares the same limitations of all other defenses that rely on pre-identified threats or deny-lists: having to play a constant game of catch-up with innovative attackers. However, as a novel means of identifying TLS/SSL applications, JA3 hashing can be leveraged as a powerful network behavioral indicator, an additional metric that can flag the use of unauthorized or risky software, or as a means of identifying emerging malware compromises in the initial stages of C2 communication. This is made possible through the power of unsupervised machine learning.

Inside the SOC
Darktrace cyber analysts are world-class experts in threat intelligence, threat hunting and incident response, and provide 24/7 SOC support to thousands of Darktrace customers around the globe. Inside the SOC is exclusively authored by these experts, providing analysis of cyber incidents and threat trends, based on real-world experience in the field.
Written by
Max Heinemeyer
Global Field CISO

More in this series

No items found.

Blog

/

/

November 19, 2025

Securing Generative AI: Managing Risk in Amazon Bedrock with Darktrace / CLOUD

securing generative aiDefault blog imageDefault blog image

Security risks and challenges of generative AI in the enterprise

Generative AI and managed foundation model platforms like Amazon Bedrock are transforming how organizations build and deploy intelligent applications. From chatbots to summarization tools, Bedrock enables rapid agent development by connecting foundation models to enterprise data and services. But with this flexibility comes a new set of security challenges, especially around visibility, access control, and unintended data exposure.

As organizations move quickly to operationalize generative AI, traditional security controls are struggling to keep up. Bedrock’s multi-layered architecture, spanning agents, models, guardrails, and underlying AWS services, creates new blind spots that standard posture management tools weren’t designed to handle. Visibility gaps make it difficult to know which datasets agents can access, or how model outputs might expose sensitive information. Meanwhile, developers often move faster than security teams can review IAM permissions or validate guardrails, leading to misconfigurations that expand risk. In shared-responsibility environments like AWS, this complexity can blur the lines of ownership, making it critical for security teams to have continuous, automated insight into how AI systems interact with enterprise data.

Darktrace / CLOUD provides comprehensive visibility and posture management for Bedrock environments, automatically detecting and proactively scanning agents and knowledge bases, helping teams secure their AI infrastructure without slowing down expansion and innovation.

A real-world scenario: When access goes too far

Consider a scenario where an organization deploys a Bedrock agent to help internal staff quickly answer business questions using company knowledge. The agent was connected to a knowledge base pointing at documents stored in Amazon S3 and given access to internal services via APIs.

To get the system running quickly, developers assigned the agent a broad execution role. This role granted access to multiple S3 buckets, including one containing sensitive customer records. The over-permissioning wasn’t malicious; it stemmed from the complexity of IAM policy creation and the difficulty of identifying which buckets held sensitive data.

The team assumed the agent would only use the intended documents. However, they did not fully consider how employees might interact with the agent or how it might act on the data it processed.  

When an employee asked a routine question about quarterly customer activity, the agent surfaced insights that included regulated data, revealing it to someone without the appropriate access.

This wasn’t a case of prompt injection or model manipulation. The agent simply followed instructions and used the resources it was allowed to access. The exposure was valid under IAM policy, but entirely unintended.

How Darktrace / CLOUD prevents these risks

Darktrace / CLOUD helps organizations avoid scenarios like unintended data exposure by providing layered visibility and intelligent analysis across Bedrock and SageMaker environments. Here’s how each capability works in practice:

Configuration-level visibility

Bedrock deployments often involve multiple components: agents, guardrails, and foundation models, each with its own configuration. Darktrace / CLOUD indexes these configurations so teams can:

  1. Inspect deployed agents and confirm they are connected only to approved data sources.
  2. Track evaluation job setups and their links to Amazon S3 datasets, uncovering hidden data flows that could expose sensitive information.
  3. Maintain full awareness of all AI components, reducing the chance of overlooked assets introducing risk.

By unifying configuration data across Bedrock, SageMaker, and other AWS services, Darktrace / CLOUD provides a single source of truth for AI asset visibility. Teams can instantly see how each component is configured and whether it aligns with corporate security policies. This eliminates guesswork, accelerates audits, and helps prevent misaligned settings from creating data exposure risks.

 Agents for bedrock relationship views.
Figure 1: Agents for bedrock relationship views

Architectural awareness

Complex AI environments can make it difficult to understand how components interact. Darktrace / CLOUD generates real-time architectural diagrams that:

  1. Visualize relationships between agents, models, and datasets.
  1. Highlight unintended data access paths or risk propagation across interconnected services.

This clarity helps security teams spot vulnerabilities before they lead to exposure. By surfacing these relationships dynamically, Darktrace / CLOUD enables proactive risk management, helping teams identify architectural drift, redundant data connections, or unmonitored agents before attackers or accidental misuse can exploit them. This reduces investigation time and strengthens compliance confidence across AI workloads.

Figure 2: Full Bedrock agent architecture including lambda and IAM permission mapping
Figure 2: Full Bedrock agent architecture including lambda and IAM permission mapping

Access & privilege analysis

IAM permissions apply to every AWS service, including Bedrock. When Bedrock agents assume IAM roles that were broadly defined for other workloads, they often inherit excessive privileges. Without strict least-privilege controls, the agent may have access to far more data and services than required, creating avoidable security exposure. Darktrace / CLOUD:

  1. Reviews execution roles and user permissions to identify excessive privileges.
  2. Flags anomalies that could enable privilege escalation or unauthorized API actions.

This ensures agents operate within the principle of least privilege, reducing attack surface. Beyond flagging risky roles, Darktrace / CLOUD continuously learns normal patterns of access to identify when permissions are abused or expanded in real time. Security teams gain context into why an action is anomalous and how it could affect connected assets, allowing them to take targeted remediation steps that preserve productivity while minimizing exposure.

Misconfiguration detection

Misconfigurations are a leading cause of cloud security incidents. Darktrace / CLOUD automatically detects:

  1. Publicly accessible S3 buckets that may contain sensitive training data.
  2. Missing guardrails in Bedrock deployments, which can allow inappropriate or sensitive outputs.
  3. Other issues such as lack of encryption, direct internet access, and root access to models.  

By surfacing these risks early, teams can remediate before they become exploitable. Darktrace / CLOUD turns what would otherwise be manual reviews into automated, continuous checks, reducing time to discovery and preventing small oversights from escalating into full-scale incidents. This automated assurance allows organizations to innovate confidently while keeping their AI systems compliant and secure by design.

Configuration data for Anthropic foundation model
Figure 3: Configuration data for Anthropic foundation model

Behavioral anomaly detection

Even with correct configurations, behavior can signal emerging threats. Using AWS CloudTrail, Darktrace / CLOUD:

  1. Monitors for unusual data access patterns, such as agents querying unexpected datasets.
  2. Detects anomalous training job invocations that could indicate attempts to pollute models.

This real-time behavioral insight helps organizations respond quickly to suspicious activity. Because it learns the “normal” behavior of each Bedrock component over time, Darktrace / CLOUD can detect subtle shifts that indicate emerging risks, before formal indicators of compromise appear. The result is faster detection, reduced investigation effort, and continuous assurance that AI-driven workloads behave as intended.

Conclusion

Generative AI introduces transformative capabilities but also complex risks that evolve alongside innovation. The flexibility of services like Amazon Bedrock enables new efficiencies and insights, yet even legitimate use can inadvertently expose sensitive data or bypass security controls. As organizations embrace AI at scale, the ability to monitor and secure these environments holistically, without slowing development, is becoming essential.

By combining deep configuration visibility, architectural insight, privilege and behavior analysis, and real-time threat detection, Darktrace gives security teams continuous assurance across AI tools like Bedrock and SageMaker. Organizations can innovate with confidence, knowing their AI systems are governed by adaptive, intelligent protection.

[related-resource]

Continue reading
About the author
Adam Stevens
Senior Director of Product, Cloud | Darktrace

Blog

/

Network

/

November 19, 2025

Unmasking Vo1d: Inside Darktrace’s Botnet Detection

Unmasking Vo1d: Inside Darktrace’s Botnet DetectionDefault blog imageDefault blog image

What is Vo1d APK malware?

Vo1d malware first appeared in the wild in September 2024 and has since evolved into one of the most widespread Android botnets ever observed. This large-scale Android malware primarily targets smart TVs and low-cost Android TV boxes. Initially, Vo1d was identified as a malicious backdoor capable of installing additional third-party software [1]. Its functionality soon expanded beyond the initial infection to include deploying further malicious payloads, running proxy services, and conducting ad fraud operations. By early 2025, it was estimated that Vo1d had infected 1.3 to 1.6 million devices worldwide [2].

From a technical perspective, Vo1d embeds components into system storage to enable itself to download and execute new modules at any time. External researchers further discovered that Vo1d uses Domain Generation Algorithms (DGAs) to create new command-and-control (C2) domains, ensuring that regardless of existing servers being taken down, the malware can quickly reconnect to new ones. Previous published analysis identified dozens of C2 domains and hundreds of DGA seeds, along with new downloader families. Over time, Vo1d has grown increasingly sophisticated with clear signs of stronger obfuscation and encryption methods designed to evade detection [2].

Darktrace’s coverage

Earlier this year, Darktrace observed a surge in Vo1d-related activity across customer environments, with the majority of affected customers based in South Africa. Devices that had been quietly operating as expected began exhibiting unusual network behavior, including excessive DNS lookups. Open-source intelligence (OSINT) has long highlighted South Africa as one of the countries most impacted by Vo1d infections [2].

What makes the recent activity particularly interesting is that the surge observed by Darktrace appears to be concentrated specifically in South African environments. This localized spike suggests that a significant number of devices may have been compromised, potentially due to vulnerable software, outdated firmware, or even preloaded malware. Regions with high prevalence of low-cost, often unpatched devices are especially susceptible, as these everyday consumer electronics can be quietly recruited into the botnet’s network. This specifically appears to be the case with South Africa, where public reporting has documented widespread use of low-cost boxes, such as non-Google-certified Android TV sticks, that frequently ship with outdated firmware [3].

The initial triage highlighted the core mechanism Vo1d uses to remain resilient: its use of DGA. A DGA deterministically creates a large list of pseudo-random domain names on a predictable schedule. This enables the malware to compute hundreds of candidate domains using the same algorithm, instead of using a hard-coded single C2 hostname that defenders could easily block or take down. To ensure reproducible from the infected device’s perspective, Vo1d utilizes DGA seeds. These seeds might be a static string, a numeric value, or a combination of underlying techniques that enable infected devices to generate the same list of candidate domains for a time window, provided the same DGA code, seed, and date are used.

Interestingly, Vo1d’s DGA seeds do not appear to be entirely unpredictable, and the generated domains lack fully random-looking endings. As observed in Figure 1, there is a clear pattern in the names generated. In this case, researchers identified that while the first five characters would change to create the desired list of domain names, the trailing portion remained consistent as part of the seed: 60b33d7929a, which OSINT sources have linked to the Vo1d botnet. [2]. Darktrace’s Threat Research team also identified a potential second DGA seed, with devices in some cases also engaging in activity involving hostnames matching the regular expression /[a-z]{5}fc975904fc9\.(com|top|net). This second seed has not been reported by any OSINT vendors at the time of writing.

Another recurring characteristic observed across multiple cases was the choice of top-level domains (TLDs), which included .com, .net, and .top.

Figure 1: Advanced Search results showing DNS lookups, providing a glimpse on the DGA seed utilized.

The activity was detected by multiple models in Darktrace / NETWORK™, which triggered on devices making an unusually large volume of DNS requests for domains uncommon across the network.

During the network investigation, Darktrace analysts traced Vo1d’s infrastructure and uncovered an interesting pattern related to responder ASNs. A significant number of connections pointed to AS16509 (AMAZON-02). By hosting redirectors or C2 nodes inside major cloud environments, Vo1d is able to gain access to highly available and geographically diverse infrastructure. When one node is taken down or reported, operators can quickly enable a new node under a different IP within the same ASN. Another feature of cloud infrastructure that hardens Vo1d’s resilience is the fact that many organizations allow outbound connections to cloud IP ranges by default, assuming they are legitimate. Despite this, Darktrace was able to identify the rarity of these endpoints, identifying the unusualness of the activity.

Analysts further observed that once a generated domain successfully resolved, infected devices consistently began establishing outbound connections to ephemeral port ranges like TCP ports 55520 and 55521. These destination ports are atypical for standard web or DNS traffic. Even though the choice of high-numbered ports appears random, it is likely far from not accidental. Commonly used ports such as port 80 (HTTP) or 443 (HTTPS) are often subject to more scrutiny and deeper inspection or content filtering, making them riskier for attackers. On the other hand, unregistered ports like 55520 and 55521 are less likely to be blocked, providing a more covert channel that blends with outbound TCP traffic. This tactic helps evade firewall rules that focus on common service ports. Regardless, Darktrace was able to identify external connections on uncommon ports to locations that the network does not normally visit.

The continuation of the described activity was identified by Darktrace’s Cyber AI Analyst, which correlated individual events into a broader interconnected incident. It began with the multiple DNS requests for the algorithmically generated domains, followed by repeated connections to rare endpoints later confirmed as attacker-controlled infrastructure. Cyber AI Analyst’s investigation further enabled it to categorize the events as part of the “established foothold” phase of the attack.

Figure 2: Cyber AI Analyst incident illustrating the transition from DNS requests for DGA domains to connections with resolved attacker-controlled infrastructure.

Conclusion

The observations highlighted in this blog highlight the precision and scale of Vo1d’s operations, ranging from its DGA-generated domains to its covert use of high-numbered ports. The surge in affected South African environments illustrate how regions with many low-cost, often unpatched devices can become major hubs for botnet activity. This serves as a reminder that even everyday consumer electronics can play a role in cybercrime, emphasizing the need for vigilance and proactive security measures.

Credit to Christina Kreza (Cyber Analyst & Team Lead) and Eugene Chua (Principal Cyber Analyst & Team Lead)

Edited by Ryan Traill (Analyst Content Lead)

Appendices

Darktrace Model Detections

  • Anomalous Connection / Devices Beaconing to New Rare IP
  • Anomalous Connection / Multiple Connections to New External TCP Port
  • Anomalous Connection / Multiple Failed Connections to Rare Endpoint
  • Compromise / DGA Beacon
  • Compromise / Domain Fluxing
  • Compromise / Fast Beaconing to DGA
  • Unusual Activity / Unusual External Activity

List of Indicators of Compromise (IoCs)

  • 3.132.75[.]97 – IP address – Likely Vo1d C2 infrastructure
  • g[.]sxim[.]me – Hostname – Likely Vo1d C2 infrastructure
  • snakeers[.]com – Hostname – Likely Vo1d C2 infrastructure

Selected DGA IoCs

  • semhz60b33d7929a[.]com – Hostname – Possible Vo1d C2 DGA endpoint
  • ggqrb60b33d7929a[.]com – Hostname – Possible Vo1d C2 DGA endpoint
  • eusji60b33d7929a[.]com – Hostname – Possible Vo1d C2 DGA endpoint
  • uacfc60b33d7929a[.]com – Hostname – Possible Vo1d C2 DGA endpoint
  • qilqxfc975904fc9[.]top – Hostname – Possible Vo1d C2 DGA endpoint

MITRE ATT&CK Mapping

  • T1071.004 – Command and Control – DNS
  • T1568.002 – Command and Control – Domain Generation Algorithms
  • T1568.001 – Command and Control – Fast Flux DNS
  • T1571 – Command and Control – Non-Standard Port

[1] https://news.drweb.com/show/?lng=en&i=14900

[2] https://blog.xlab.qianxin.com/long-live-the-vo1d_botnet/

[3] https://mybroadband.co.za/news/broadcasting/596007-warning-for-south-africans-using-specific-types-of-tv-sticks.html

The content provided in this blog is published by Darktrace for general informational purposes only and reflects our understanding of cybersecurity topics, trends, incidents, and developments at the time of publication. While we strive to ensure accuracy and relevance, the information is provided “as is” without any representations or warranties, express or implied. Darktrace makes no guarantees regarding the completeness, accuracy, reliability, or timeliness of any information presented and expressly disclaims all warranties.

Nothing in this blog constitutes legal, technical, or professional advice, and readers should consult qualified professionals before acting on any information contained herein. Any references to third-party organizations, technologies, threat actors, or incidents are for informational purposes only and do not imply affiliation, endorsement, or recommendation.

Darktrace, its affiliates, employees, or agents shall not be held liable for any loss, damage, or harm arising from the use of or reliance on the information in this blog.

The cybersecurity landscape evolves rapidly, and blog content may become outdated or superseded. We reserve the right to update, modify, or remove any content.

Continue reading
About the author
Christina Kreza
Cyber Analyst
Your data. Our AI.
Elevate your network security with Darktrace AI