PyPI supply chain attack via litellm and the dependency risk problem in ML engineering

The PyPI Dependency Trap Nobody Wants to Talk About

Last week, ML engineers got a very clear look at how fragile the tooling ecosystem really is. A poisoned PyPI release of litellm, version 1.82.8, sat live on the registry for less than an hour. In that window, it was fully capable of exfiltrating SSH keys, AWS credentials, Kubernetes configs, environment variables, shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords from any machine that installed it. One package. One hour. Potentially millions of compromised machines.

What stopped it was not a scanner, not a lockfile, not a security team. A developer’s machine ran out of RAM and crashed.

🔥 The Actual Blast Radius

Andrej Karpathy laid this out clearly on Tuesday: litellm pulls 97 million downloads per month. That number is already alarming. But the more dangerous number is zero, because that is how many direct litellm installs were required to get hit. If you ran pip install dspy, and dspy had a dependency on litellm>=1.64.0, you were exposed. Same for any other package in the ecosystem with litellm sitting somewhere in its tree.

The attacker embedded a file called litellm_init.pth containing base64-encoded instructions to send every credential it could find to a remote server, then self-replicate. The bug that caused the RAM exhaustion is the only reason this didn’t run silently for days or weeks before anyone noticed. Callum McMahon was using an MCP plugin inside Cursor that pulled litellm in as a transitive dependency. His machine crashed. He investigated. The attack was discovered.
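The reason a .pth file is such an effective hook is that Python's site module executes any line in it that begins with "import " at interpreter startup, before your own code ever runs. A benign sketch of that mechanism (the file name and environment variable here are made up for the demo; site.addsitedir processes .pth files the same way startup does):

```python
import os
import site
import tempfile

# Any line in a .pth file that begins with "import " is executed as code
# when the containing directory is processed as a site dir -- the hook the
# attacker abused. This harmless demo sets an env var instead of stealing one.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo.pth"), "w") as f:
    f.write('import os; os.environ["PTH_DEMO_RAN"] = "1"\n')

# Processing the directory triggers the import line, just like startup would.
site.addsitedir(demo_dir)
print(os.environ.get("PTH_DEMO_RAN"))  # -> 1
```

No user code has to import anything for this to fire; installing the package is enough.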

That is not a defense. That is luck.

The Dependency Philosophy Problem

Karpathy made a point that I think a lot of engineers are uncomfortable sitting with: classical software engineering treats dependency reuse as a virtue. Small, composable packages. Don’t reinvent wheels. Build pyramids from bricks. This philosophy was never wrong, exactly, but it was written for a different threat model.

The problem is that every dependency you add is not just a package. It is every future version of that package, every future maintainer, every future bad actor who might compromise that maintainer’s account. When you write litellm>=1.64.0 in your requirements file, you are writing a blank check against your users’ machines for every release from that point forward.

In ML engineering specifically, the dependency graphs are deep and wide. We pull in model libraries, tokenizers, API wrappers, observability tools, CLI frameworks. A mid-size ML project can easily have 200+ transitive dependencies. The attack surface is enormous.
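To make the fan-out concrete, here is a toy walk over a hypothetical dependency map (package names and edges are illustrative, not the real metadata; a real version would resolve requirements from the registry):

```python
from collections import deque

# Hypothetical dependency edges -- illustrative only, not real metadata.
DEPS = {
    "my-ml-app": ["dspy", "fastapi"],
    "dspy": ["litellm", "pydantic"],
    "litellm": ["httpx", "tiktoken"],
    "fastapi": ["pydantic", "starlette"],
    "pydantic": [], "httpx": [], "tiktoken": [], "starlette": [],
}

def transitive_deps(pkg: str) -> set[str]:
    """Breadth-first walk of everything pkg pulls in, directly or not."""
    seen, queue = set(), deque(DEPS.get(pkg, []))
    while queue:
        dep = queue.popleft()
        if dep not in seen:
            seen.add(dep)
            queue.extend(DEPS.get(dep, []))
    return seen

# Two direct dependencies already expand to seven packages you implicitly trust.
print(sorted(transitive_deps("my-ml-app")))
```

Scale that fan-out to a real project and you get the 200+ package trust surface, with litellm reachable from an app that never names it.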

⚙️ What Karpathy Is Actually Suggesting

He’s been increasingly vocal about preferring to “yoink” functionality using LLMs rather than pulling in a full dependency when the logic is simple enough to write inline. I think that position is more practical than it sounds. Not every package needs to be a dependency. If you need to parse a config file or call an HTTP endpoint, writing that code directly gives you something that cannot be remotely poisoned through a registry update.
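As an example of the kind of logic that can be "yoinked" rather than imported, here is a minimal .env-style config parser written against the stdlib alone; roughly the job some projects add a dotenv-style dependency for (a sketch, handling only KEY=VALUE lines, comments, and simple quoting):

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Minimal .env-style parser: KEY=VALUE lines, # comments, blank lines.
    A dozen lines of stdlib-only code -- nothing a registry update can poison."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config

sample = """
# service settings
API_URL = "https://api.example.com"
TIMEOUT = 30
"""
print(parse_env_file(sample))
# {'API_URL': 'https://api.example.com', 'TIMEOUT': '30'}
```

Code like this has limits a mature library doesn't, but it also has no maintainers, no release channel, and no future versions to trust.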

This is not a call to avoid all dependencies. Some packages are too complex, too well-audited, or too foundational to reasonably replace. But there is a real difference between depending on numpy and depending on a rapidly-versioned API wrapper maintained by a small team.

The ML tooling space moves fast. Packages get popular quickly, maintainer turnover is high, and the pressure to ship means version constraints often get written loosely. That combination is exactly what attackers are looking for.

What Actually Helps

Pin your dependencies. Use exact versions with artifact hashes in production environments (pip's --require-hashes mode), not floating ranges. Run tools like pip-audit or socket.dev in your CI pipeline. Review changelogs before upgrading, especially for packages that touch credentials or network I/O.
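A crude CI gate for the pinning rule can be a few lines of Python: scan the requirements file and fail the build on anything that floats. This is a sketch (it only checks for an exact == specifier and skips comment/option lines; a real gate would also require hashes):

```python
def find_unpinned(requirements: str) -> list[str]:
    """Return requirement lines not pinned to an exact version.
    Crude CI check: demand '==' on every requirement line."""
    bad = []
    for line in requirements.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-")):
            continue  # skip comments and pip options like --hash continuations
        if "==" not in line:
            bad.append(line)
    return bad

lockfile = """
numpy==1.26.4
litellm>=1.64.0
requests
"""
print(find_unpinned(lockfile))  # ['litellm>=1.64.0', 'requests']
```

Wired into CI, a non-empty result fails the build, so a floating constraint like litellm>=1.64.0 never ships quietly.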

None of this is glamorous. None of it would have guaranteed you caught litellm 1.82.8 in under an hour. But pinning alone would have meant the bad release never touched your environment unless you explicitly pulled the update.

The harder cultural shift is treating your dependency tree as a security boundary, not just a convenience layer. Every time you add a package, you are making a trust decision about that package and everyone who will ever commit to it in the future.

This attack was a close call. The next one might not have a bug that crashes machines.

#MLEngineering #SupplyChainSecurity #PyPI #Python #MachineLearning #DevSecOps
