On March 24, 2026, attackers published two malicious versions of litellm, a popular open-source LLM gateway (more than 3 million downloads/day). The payload had three stages: a credential harvester, an encrypted exfiltrator, and a Kubernetes lateral-movement worm. The malicious versions were live on PyPI for approximately three hours before being quarantined.

The malicious package was published using legitimate credentials, stolen a week earlier through the Trivy compromise. Trivy’s maintainers describe a credential-stealing infostealer injected into GitHub Actions tags and releases, built to extract secrets from CI runners and exfiltrate them. This means that LiteLLM is just one part of the blast radius -- any number of other Trivy users could have had their credentials stolen, and those credentials could enable more supply-chain attacks.

The fork-bomb crash that led to the discovery of the LiteLLM compromise was an accident, not the attack itself: a bug in the malware's .pth auto-execution hook that caused exponential process spawning. Without that accident, the credential harvest and exfiltration would have completed silently on every machine that ran pip install, and it could have gone undetected for days or weeks.

I'm not going to rehash the entire incident timeline. The disclosure (GitHub issue #24512), the discovery report by FutureSearch, and the technical breakdown by Snyk already have that covered. Instead, I'm going to look at what happens after the package lands on a machine: the runtime behavioral signals that EDAMAME Security detects, and why EDAMAME could have caught not just this attack, but any attack like it.

The supply chain integrity gap

The compromised wheel was published using legitimate PyPI credentials stolen from LiteLLM's CI pipeline. The litellm_init.pth file was correctly declared in the package's RECORD with a matching hash. pip install --require-hashes passed. Signature verification passed.

In short, the package looked entirely legitimate because, in terms of supply chain integrity, it was. When the publisher's credentials are compromised, the entire chain of trust is intact all the way down to the malicious code.
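
To make the point concrete, here is a minimal sketch of a RECORD-style integrity check. It assumes the standard wheel RECORD line format (path, `sha256=` followed by the URL-safe base64 digest without padding, then size). The helper names and the placeholder file contents are illustrative and are not taken from the actual malicious wheel.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the hash field of a wheel RECORD line for the given file bytes."""
    digest = hashlib.sha256(data).digest()
    # RECORD uses URL-safe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def record_line(path: str, data: bytes) -> str:
    """Build a RECORD line the way a build tool would at packaging time."""
    return f"{path},{record_hash(data)},{len(data)}"


def verify(line: str, data: bytes) -> bool:
    """Check file bytes against a RECORD line (illustrative installer-side check)."""
    _path, expected_hash, expected_size = line.rsplit(",", 2)
    return expected_hash == record_hash(data) and int(expected_size) == len(data)


# Placeholder bytes standing in for a .pth file; NOT the real payload.
pth_bytes = b"import sys  # placeholder for attacker-controlled code\n"
line = record_line("litellm_init.pth", pth_bytes)

# Whoever builds the wheel also writes RECORD, so this check passes.
print(verify(line, pth_bytes))
# Only a modification made after the build is caught.
print(verify(line, pth_bytes + b"# tampered\n"))
```

The check answers "is this the file the publisher shipped?", not "is this file safe?". When the attacker is the one publishing, both questions diverge.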

To catch these deep supply-chain attacks, detection has to shift to a more fundamental level: what does the code actually do on the host?

Three observable signals on the host

Once executing, the LiteLLM payload produces three operations that leave a clear system-plane footprint:

Broad credential harvesting. The script opens file handles to SSH private keys, AWS credentials, GCP application default credentials, Azure service principal files, Kubernetes configs, Docker registry credentials, Git credential stores, Vault tokens, .env files, shell histories, and cryptocurrency wallet data (Bitcoin, Ethereum, Solana, Monero). It iterates through 9+ distinct credential categories simultaneously -- a pattern that no legitimate application exhibits.
Encrypted exfiltration. Harvested data is AES-256-CBC encrypted, RSA-wrapped with a hardcoded 4096-bit public key, and POSTed as application/octet-stream to models.litellm.cloud, a domain registered one day before the attack.
Persistence and lateral movement. A systemd backdoor is installed. If Kubernetes service account tokens are found, the malware reads all cluster secrets and deploys privileged pods on every node.

These operations are fundamentally different from legitimate application behavior. EDAMAME is designed to see them.

How EDAMAME detects this class of attack

EDAMAME Security (desktop app) and EDAMAME Posture (CLI) share the same core engine. That engine runs CVE-aligned vulnerability checks every 60 seconds, operating purely on live system telemetry—no prior configuration, no agent plugins, no behavioural model required. Two of those checks directly target the shared behaviour across Trivy and LiteLLM.

The telemetry comes from EDAMAME’s network capture layer (flodbadd), which provides deep L7 attribution for every network session: process name and path, parent process chain, command line, open file handles, and temp-origin detection. An Extended Isolation Forest (12-dimensional feature space) scores each session for anomalousness. This is full process-to-session binding with file-level visibility.

Check 1: Token Exfiltration — anomalous traffic meets credential access

The token_exfiltration check fires when EDAMAME observes a network session flagged as anomalous by the ML model (novel destination, unusual traffic pattern, atypical payload profile) where the process generating that session has open file handles to sensitive credential paths.
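
The rule described above can be sketched as a simple conjunction. This is an illustration of the logic only, not EDAMAME's implementation: the `Session` type, the path fragments, and the example destination are all hypothetical, and the anomaly verdict is taken as a given input rather than computed.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical path fragments; the real check uses a maintained path database.
SENSITIVE_FRAGMENTS = ("/.ssh/", "/.aws/", "/.kube/", "/.docker/")


@dataclass
class Session:
    process: str
    destination: str
    is_anomalous: bool  # verdict supplied by the anomaly model
    open_files: List[str] = field(default_factory=list)


def holds_sensitive_file(session: Session) -> bool:
    """True if any open file path contains a sensitive fragment."""
    return any(
        fragment in path
        for path in session.open_files
        for fragment in SENSITIVE_FRAGMENTS
    )


def token_exfiltration(session: Session) -> bool:
    """Fire only when both conditions hold for the same process."""
    return session.is_anomalous and holds_sensitive_file(session)


suspicious = Session(
    "python3", "upload.example.net:443", True, ["/home/dev/.aws/credentials"]
)
routine = Session(
    "python3", "pypi.org:443", False, ["/home/dev/.aws/credentials"]
)
print(token_exfiltration(suspicious), token_exfiltration(routine))
```

Requiring both conditions on the same process is the point of the design: anomalous traffic alone is common, and reading a credential file alone is common, but the combination is much rarer.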

Sensitive paths are maintained in a cloud-updated, signature-verified database with labelled categories: ssh, aws, gcp, azure, kube, docker, git, vault, env, crypto, and more—covering cross-platform paths across macOS, Windows, and Linux.
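
As a rough illustration of how labelled categories might map onto file paths, the sketch below uses a tiny, hypothetical pattern table. The patterns, the POSIX-only scope, and the default home directory are assumptions made for the example; the actual database is cloud-updated and cross-platform.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Hypothetical, heavily abbreviated stand-in for the category database.
# Patterns are relative to the user's home directory (POSIX paths only).
SENSITIVE_PATTERNS: Dict[str, List[str]] = {
    "ssh": [".ssh/id_*", ".ssh/config"],
    "aws": [".aws/credentials", ".aws/config"],
    "gcp": [".config/gcloud/application_default_credentials.json"],
    "kube": [".kube/config"],
    "docker": [".docker/config.json"],
    "git": [".git-credentials"],
    "env": [".env", "*/.env"],
}


def categorize(path: str, home: str = "/home/dev") -> Optional[str]:
    """Return the credential category for a path, or None if not sensitive."""
    prefix = home.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    relative = path[len(prefix):]
    for category, patterns in SENSITIVE_PATTERNS.items():
        if any(fnmatchcase(relative, pattern) for pattern in patterns):
            return category
    return None


for candidate in (
    "/home/dev/.ssh/id_ed25519",
    "/home/dev/projects/app/.env",
    "/home/dev/notes.txt",
):
    print(candidate, "->", categorize(candidate))
```

Labelling each path with a category, rather than keeping one flat list of sensitive files, is what lets a later rule count how many distinct kinds of secrets a process is touching.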

Finding severity: HIGH.

Check 2: Credential Harvest — the anomaly-independent safety net

Sophisticated supply-chain malware may produce normal-looking traffic. If the exfiltration endpoint mimics a legitimate API (correct TLS, common port, CDN-hosted domain), the anomaly score may stay below threshold. The LiteLLM attackers understood this: models.litellm.cloud was designed to blend in.

The credential_harvest check addresses this directly. It fires when a process with any active network session—anomalous or not—has open file handles to sensitive paths spanning 3 or more distinct credential categories.

The logic here is simple: no legitimate application opens SSH keys, AWS credentials, GCP service accounts, and cryptocurrency wallets simultaneously. An IDE might access ~/.ssh/ for Git operations. A cloud CLI reads its own credential file. But accessing 3+ distinct categories from a single process is a definitive indicator of programmatic credential harvesting—regardless of how the associated network traffic looks.
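
The counting rule can be sketched in a few lines. The threshold of 3 comes from the description above; the fragment table, function names, and example file lists are hypothetical and only illustrate the logic.

```python
from typing import Dict, Iterable, Set

# Hypothetical fragment-to-category table, abbreviated for the example.
CATEGORY_BY_FRAGMENT: Dict[str, str] = {
    "/.ssh/": "ssh",
    "/.aws/": "aws",
    "/.config/gcloud/": "gcp",
    "/.kube/": "kube",
    "/.docker/": "docker",
}

MIN_CATEGORIES = 3  # threshold stated in the text


def categories_touched(open_files: Iterable[str]) -> Set[str]:
    """Collect the distinct credential categories among a process's open files."""
    return {
        category
        for path in open_files
        for fragment, category in CATEGORY_BY_FRAGMENT.items()
        if fragment in path
    }


def credential_harvest(has_network_session: bool, open_files: Iterable[str]) -> bool:
    """Fire on category breadth alone; no anomaly verdict is consulted."""
    return has_network_session and len(categories_touched(open_files)) >= MIN_CATEGORIES


ide_files = ["/home/dev/.ssh/id_ed25519", "/home/dev/.ssh/config"]
payload_files = [
    "/home/dev/.ssh/id_ed25519",
    "/home/dev/.aws/credentials",
    "/home/dev/.kube/config",
    "/home/dev/.docker/config.json",
]
print(credential_harvest(True, ide_files), credential_harvest(True, payload_files))
```

Note that the IDE example opens two files but only one category, so it stays below the threshold; the rule counts kinds of secrets, not files.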

Finding severity: CRITICAL. This is a non-suppressible finding—EDAMAME’s LLM adjudicator cannot dismiss it.

Both checks include intelligent self-access suppression. When a process accesses its own configuration files (the AWS CLI reading ~/.aws/credentials, Docker reading its config), EDAMAME recognises this as legitimate self-access and suppresses the finding.
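
One way to picture self-access suppression is as a set difference between the categories a process touches and the categories it legitimately owns. The ownership table below is hypothetical, and keying it on the process name alone is a simplification for the example; a real engine would presumably rely on stronger process identity than a name.

```python
from typing import Dict, Set

# Hypothetical ownership table: which credential categories a tool may read
# as part of its normal operation.
OWN_CATEGORIES: Dict[str, Set[str]] = {
    "aws": {"aws"},
    "docker": {"docker"},
    "kubectl": {"kube"},
    "ssh": {"ssh"},
}


def foreign_categories(process_name: str, touched: Set[str]) -> Set[str]:
    """Categories the process touched that are not its own."""
    return touched - OWN_CATEGORIES.get(process_name, set())


def should_report(process_name: str, touched: Set[str]) -> bool:
    """Suppress the finding when every touched category is self-access."""
    return bool(foreign_categories(process_name, touched))


print(should_report("aws", {"aws"}))         # AWS CLI reading its own file
print(should_report("python3", {"aws"}))     # unrelated process reading it
print(foreign_categories("aws", {"aws", "ssh"}))
```

Only the self-owned category is discounted. An AWS CLI process that also opens SSH keys still has a foreign category left over and is not suppressed.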

Detection summary

Check	Trigger condition	Severity	Anomaly required?
`token_exfiltration`	Anomalous network session + process has sensitive credential files open	HIGH	Yes
`credential_harvest`	Any network session + process has 3+ distinct credential categories open	CRITICAL	No

Together, these checks cover the full spectrum: from noisy exfiltration to stealthy, camouflaged supply-chain payloads. The detection runs automatically, requires no configuration, and works on day one.

Reproduce it yourself

This is not theoretical. EDAMAME's open-source E2E test suite includes a trigger that faithfully reproduces the LiteLLM payload's system-plane behavior. Install EDAMAME Posture (free) or EDAMAME Security (free), start the daemon with packet capture enabled, and run:

python3 trigger_supply_chain_exfil.py --duration 120

The script creates demo credential files across 9 categories, opens handles to all of them, and sends HTTP POST requests with application/octet-stream payloads matching the real exfiltration pattern. Within one detection cycle (60 seconds), EDAMAME produces findings for both token_exfiltration and credential_harvest, visible via edamame_cli rpc get_vulnerability_findings or the app's advisor tab.

The trigger is one of nine CVE-aligned scenarios in the test suite, covering sandbox escape, blacklisted C2 traffic, memory poisoning, tool poisoning, credential sprawl, goal drift, and two-plane divergence detection.

The AI-era attack surface demands runtime visibility

The LiteLLM compromise was discovered because the malware had a bug. Next time, we might not be so lucky -- and there will be a next time.

This is the reality of the modern development environment:

Transitive dependency depth is exploding. MCP plugins, LLM framework libraries, and AI agent tool packages pull in dozens of dependencies. LiteLLM alone is consumed by DSPy, MLflow, CrewAI, OpenHands, Arize Phoenix, and many more. Each dependency is a potential supply chain entry point.
Auto-execution is by design. Python .pth files, npm postinstall scripts, and similar install-time hooks run code before any scanner can inspect it. The payload executes inside pip, npm, or the IDE's language server, and it runs with the developer's full privileges.
Developer workstations are credential-rich. SSH keys, cloud provider credentials, container registry tokens, API keys, Git credentials, LLM API keys, cryptocurrency wallets. The LiteLLM payload targeted all of them because they are all there, on every developer machine.
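
The .pth auto-execution mentioned above is standard behavior of Python's `site` module: any line in a .pth file that starts with `import` is executed rather than treated as a path. The harmless sketch below demonstrates this in a temporary directory. The file name and the environment-variable marker are made up for the demonstration and have nothing to do with the real payload.

```python
import os
import site
import tempfile

# Hypothetical marker name, used only to observe that the hook ran.
MARKER = "PTH_DEMO_RAN"

with tempfile.TemporaryDirectory() as site_dir:
    pth_path = os.path.join(site_dir, "demo_init.pth")
    with open(pth_path, "w", encoding="utf-8") as handle:
        # A .pth line starting with "import" is executed, not added to sys.path.
        handle.write(f"import os; os.environ[{MARKER!r}] = '1'\n")

    os.environ.pop(MARKER, None)
    # addsitedir() processes .pth files the same way interpreter startup
    # does for real site-packages directories.
    site.addsitedir(site_dir)

# The marker was set without anything ever importing a module from the demo.
print(os.environ.get(MARKER))
```

In a real site-packages directory this processing happens at interpreter startup, so the code runs in every Python process on the machine, whether or not the package is ever imported.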

SBOMs and hash verification are necessary hygiene to protect against package tampering in transit, but when the publisher itself is compromised, the publication looks legitimate end to end. Runtime behavioral heuristics verify what the code does on your machine, regardless of how it got there.

If you weren't affected by the LiteLLM compromise, that's great! But BerriAI, the company behind LiteLLM, is only one of the thousands of organizations whose credentials were stolen in the Trivy attack. The durable detection strategy is to hunt for what these payloads do: credential harvesting at scale plus exfiltration. EDAMAME is a practical tool built to detect exactly those behaviors on real hosts and CI environments.

Get started

Download EDAMAME Security — free desktop app for macOS, Windows, Linux, iOS, Android
EDAMAME Posture CLI — free CLI for CI/CD pipelines, coding agents, and headless servers
Reproduce the detection — open-source E2E test suite with CVE-aligned trigger scripts
EDAMAME Core API — public API and MCP tool reference
View on GitHub — the full EDAMAME architecture

Sources: Trivy advisory GHSA-69fq-xp46-6x23, LiteLLM issue #24512, FutureSearch disclosure, Snyk analysis, PyPIStats, MITRE ATT&CK T1546.018, T1003, T1610.