The Evolution of PyPI Supply Chain Attacks: Analyzing the Newest Waves of Shai-Hulud, Miasma, and Hades

For security researchers and DevOps engineers, the landscape of software supply chain attacks is often a game of cat and mouse. Just as defenders tune their scanners to catch a specific pattern, threat actors pivot to a new delivery mechanism. We are seeing this exact pattern unfold right now with the latest evolution of the Mini Shai-Hulud, Miasma, and Hades campaigns.

The Socket Threat Research team has identified a significant expansion in this campaign. Moving beyond the initial wave of malicious PyPI wheels, threat actors are now iterating rapidly across different delivery methods, targeting specialized communities like bioinformatics researchers and developers working with Model Context Protocol (MCP) tools. You can find the full, detailed breakdown of these findings in the Socket Threat Research report.

A Campaign That Refuses to Stay Static

What makes this campaign particularly dangerous is not just its scope, but its adaptability. This isn’t a single “hit-and-run” incident; it is a highly iterative campaign that has shifted its execution strategies to evade detection. We have observed three distinct “delivery branches” used by the attackers to plant their Hades-family payload.

1. The Traditional .pth Startup Hook

The original wave relied on malicious wheels containing a *-setup.pth file. In the Python ecosystem, .pth files in site-packages are processed by the site module every time the interpreter starts, and any line that begins with import is executed as code. By bundling an obfuscated _index.js file within the wheel, the attackers could use Python to bootstrap the Bun runtime and execute JavaScript-based malware the next time any Python process started after installation, without the victim ever importing the package.
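
Because only .pth lines that begin with import are executed, those lines are a natural thing to audit. The following is a minimal sketch of such an audit, not a complete detector. The function names are illustrative, and a hit is only a lead for manual review, since some legitimate packages also ship executable .pth lines.

```python
import site
from pathlib import Path


def executable_pth_lines(pth_path):
    """Return the lines of a .pth file that the site module would execute."""
    text = Path(pth_path).read_text(encoding="utf-8", errors="replace")
    # site.py executes lines that start with "import" followed by a space or tab;
    # every other non-comment line is treated as a path to add to sys.path.
    return [line for line in text.splitlines()
            if line.startswith(("import ", "import\t"))]


def scan_site_dirs(dirs=None):
    """Map each .pth file that contains executable lines to those lines."""
    if dirs is None:
        dirs = set(site.getsitepackages()) | {site.getusersitepackages()}
    findings = {}
    for directory in dirs:
        for pth in Path(directory).glob("*.pth"):
            lines = executable_pth_lines(pth)
            if lines:
                findings[str(pth)] = lines
    return findings
```

Calling scan_site_dirs() with no arguments inspects the current interpreter's site-packages directories. Passing an explicit list of directories lets you point it at an unpacked wheel or another virtual environment.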

2. The “Stealth” Native Extension Approach

The most sophisticated evolution involves the “bioinformatics subcluster.” Threat actors are now using trojanized native extensions (specifically .abi3.so files). This is a clever tactic: while many security tools and manual reviewers focus heavily on .py source files, compiled native extensions often receive significantly less scrutiny. In this scenario, the Python code itself looks perfectly legitimate, but the moment the module is imported, the interpreter loads the compiled binary (via dlopen() on POSIX systems), and the binary triggers execution of the _index.js payload.
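
One low-cost way to give compiled extensions some of the scrutiny that source files get is to list them and search their bytes for strings that have no business being there. The sketch below assumes the payload filename _index.js appears in plain text inside the binary, which an attacker could easily avoid through obfuscation, so an empty result proves nothing. The MARKERS tuple and function names are illustrative.

```python
from pathlib import Path

# Assumption: the loader references the payload by its literal filename.
# Extend this tuple with indicators taken from the original research.
MARKERS = (b"_index.js",)


def native_extensions(root):
    """List every stable-ABI extension module under root."""
    return sorted(p for p in Path(root).rglob("*.abi3.so") if p.is_file())


def flag_extensions(root, markers=MARKERS):
    """Return the extensions whose raw bytes contain any marker string."""
    flagged = []
    for ext in native_extensions(root):
        data = ext.read_bytes()
        if any(marker in data for marker in markers):
            flagged.append(ext)
    return flagged
```

Running flag_extensions() against a site-packages directory returns a list of paths worth opening in a disassembler or comparing against the upstream project's published builds.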

3. The Loader-Payload Split (The langchain-core-mcp Variant)

Perhaps the most devious tactic is seen in the langchain-core-mcp package. Unlike previous versions that bundled the payload directly, this variant ships a “loader” but not the payload itself. The .pth hook in this package is programmed to scan the entire sys.path for an _index.js file. This allows the attacker to separate the delivery (the loader) from the malicious intent (the payload), making it much harder for a scanner that expects a self-contained package to flag the package as malicious.
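
Defenders can run the same search the loader does. The sketch below walks every directory on sys.path looking for a file named _index.js. Searching recursively is an assumption made here for thoroughness; the article does not say how deep the real loader looks. A legitimate package could also ship a file with this name, so treat hits as leads.

```python
import sys
from pathlib import Path


def find_on_sys_path(filename="_index.js", paths=None):
    """Return every file called `filename` under the directories on sys.path."""
    hits = set()
    for entry in (sys.path if paths is None else paths):
        base = Path(entry or ".")  # an empty entry means the current directory
        if not base.is_dir():
            continue  # skip zip archives and paths that do not exist
        hits.update(p for p in base.rglob(filename) if p.is_file())
    return sorted(hits)
```

Because the loader and payload can arrive in different packages, this check is most useful when run against the fully installed environment rather than against each package in isolation.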

Anti-Analysis: Tricking the AI Triage

As AI-assisted security analysis becomes more common, attackers are already finding ways to counter it. The _index.js payload used in these attacks contains an “anti-AI” mechanism: the top of the file includes a large JavaScript comment block filled with fake system instructions and policy-triggering content.

While this does nothing to change the code’s execution, it is designed to trigger “safety refusals” in LLMs. If a security analyst uses a tool that feeds the beginning of a file into an AI to determine if it is malicious, the AI may see the fake “policy-violating” instructions and refuse to analyze the rest of the file, effectively allowing the real, obfuscated malware hidden further down to slip past unnoticed.
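
One possible mitigation, sketched below, is to strip leading comments before any excerpt of a file reaches a model, so that the excerpt starts at the first line of real code. This is an illustrative preprocessing step rather than a technique described in the research, the function names are hypothetical, and it only handles comments at the very top of a file. Decoy text placed deeper in the source would need a real JavaScript parser.

```python
import re

# Matches any run of /* ... */ block comments and // line comments
# (with surrounding whitespace) at the very start of the source.
_LEADING_COMMENTS = re.compile(r"\A(?:\s*/\*.*?\*/|\s*//[^\n]*)+", re.DOTALL)


def strip_leading_comments(js_source):
    """Remove comment blocks that precede the first line of real code."""
    return _LEADING_COMMENTS.sub("", js_source, count=1).lstrip()


def triage_excerpt(js_source, limit=4000):
    """Return up to `limit` characters of code, skipping leading comments."""
    return strip_leading_comments(js_source)[:limit]
```

A file that begins with code is returned unchanged, so the step is safe to apply to every sample before triage.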

The High Cost of Compromise: What is at Stake?

The “Hades” payload is a specialized stealer designed to harvest high-value secrets. Once a developer’s workstation or a CI/CD environment is compromised, the attackers go after:

  • Cloud and Registry Credentials: AWS/GCP credentials, GitHub tokens, npm, PyPI, and JFrog tokens.
  • Infrastructure Access: Kubernetes service accounts and SSH keys.
  • Environment Secrets: .env files and shell histories.
  • Development Tools: Configuration files for Docker and various AI developer tools.
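
After a suspected infection, a quick inventory of which of these secrets actually exist on a machine helps prioritize rotation. The sketch below checks a handful of common default locations. The paths are conventional defaults for each tool and are not taken from the report, so the real stealer may look in more or different places, and tokens held only in environment variables or keychains will not appear.

```python
from pathlib import Path

# Assumption: conventional default locations, relative to the home directory.
CANDIDATES = {
    "AWS credentials": ".aws/credentials",
    "GCP (gcloud) config": ".config/gcloud",
    "GitHub CLI hosts file": ".config/gh/hosts.yml",
    "npm token": ".npmrc",
    "PyPI token": ".pypirc",
    "SSH keys": ".ssh",
    "Kubernetes config": ".kube/config",
    "Docker config": ".docker/config.json",
    "bash history": ".bash_history",
    "zsh history": ".zsh_history",
}


def exposed_secrets(home=None):
    """Return the candidate secret locations that exist under `home`."""
    home = Path.home() if home is None else Path(home)
    return {label: home / rel
            for label, rel in CANDIDATES.items()
            if (home / rel).exists()}
```

Every entry that exposed_secrets() returns on a compromised host should be treated as stolen and rotated.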

In a CI/CD context, a single successful infection can lead to “lateral movement,” where an attacker uses a compromised build token to publish their own malicious versions of popular packages, creating a massive, self-propagating loop of infection.

Defensive Guidance: How to Protect Your Pipeline

To defend against these evolving delivery mechanisms, security teams should move beyond simple source-code auditing and implement more holistic monitoring:

  • Audit Python Environments: Scan your environments for unexpected .pth files and unfamiliar .abi3.so extensions, especially in performance-sensitive scientific packages.
  • Monitor Runtime Behavior: Watch for unusual subprocess calls (like subprocess.run calling Bun) and unexpected network activity from Python processes.
  • Secure Your CI/CD: Implement strict “least privilege” for all CI/CD runners. Ensure that build tokens have the minimum permissions necessary and are rotated frequently.
  • Pin Your Dependencies: Use hash-based pinning (requirements.txt entries with --hash, installed with pip's --require-hashes option) to ensure that the packages you tested are exactly the ones being deployed.
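
Hash pinning only helps if every requirement carries a hash. The sketch below is a simple pre-commit style check that lists requirement entries with no --hash option. It is a line-based heuristic with illustrative function names, not a full parser for pip's requirements format, and pip itself rejects unhashed entries when run with --require-hashes.

```python
def logical_lines(text):
    """Join backslash-continued lines into single logical lines."""
    buffer = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            buffer += line[:-1] + " "
            continue
        yield buffer + line
        buffer = ""
    if buffer:
        yield buffer


def requirements_missing_hashes(text):
    """Return requirement entries that carry no --hash option."""
    missing = []
    for line in logical_lines(text):
        stripped = line.strip()
        # Skip blanks, comments, and option lines such as -r or --index-url.
        if not stripped or stripped.startswith(("#", "-")):
            continue
        if "--hash=" not in stripped:
            missing.append(stripped)
    return missing
```

A CI job could read requirements.txt, pass its contents to requirements_missing_hashes(), and fail the build if the returned list is not empty.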

Note: This article is based on threat intelligence reports regarding evolving supply chain threats. For the most up-to-date Indicators of Compromise (IOCs) and technical deep dives, always consult the original research sources.
