A poisoned version of LiteLLM – an open-source Python library with 97 million monthly downloads – gave hackers access to Mercor’s systems for roughly 40 minutes in late March, long enough to exfiltrate an estimated 4 terabytes of data. The $10 billion AI training startup confirmed the breach on April 2, acknowledging it was among thousands of companies hit by the same supply chain attack.
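Attacks like this succeed because a plain `pip install litellm` trusts whatever artifact the registry serves at that moment. One common mitigation, offered here purely as a defensive illustration and not as anything the reporting says Mercor did or did not use, is to pin dependencies to known-good digests and verify them before installation. Below is a minimal Python sketch of that check; the pinned digest is a placeholder, not a real LiteLLM release hash.

```python
import hashlib
import sys
from pathlib import Path

# Placeholder digest: substitute the SHA-256 published for the release
# you actually audited. Real workflows keep these pins in a lockfile
# (pip's --require-hashes mode) rather than in source code.
PINNED_SHA256 = "0" * 64

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def verify(artifact: Path) -> None:
    actual = sha256_of(artifact)
    if actual != PINNED_SHA256:
        # A mismatch means this is not the artifact that was pinned,
        # which is exactly the situation a poisoned upload creates.
        sys.exit(f"hash mismatch for {artifact}: got {actual}")
    print(f"{artifact}: digest matches pin")

if __name__ == "__main__":
    verify(Path(sys.argv[1]))
```

pip supports this workflow natively: with `pip install --require-hashes -r requirements.txt`, the install aborts if any artifact's digest is missing from, or mismatched against, the requirements file.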
The stolen data includes 939 GB of platform source code, a user database containing names and Social Security numbers of more than 40,000 contractors, approximately 3 TB of video interview recordings, and proprietary AI training methodologies belonging to Mercor’s clients – which include OpenAI, Anthropic, and Meta. The extortion group Lapsus$ has claimed responsibility and is demanding payment.
Why It Matters
Meta has indefinitely suspended all work with Mercor. Contractors assigned to Meta AI projects can no longer log hours, and Meta has declined to comment on specifics. OpenAI confirmed it is investigating but has not paused its Mercor projects. Anthropic has not publicly responded.
The exposure of training methodologies – how frontier labs select, label, and weight training data – makes this breach different from a standard corporate hack. That information represents closely guarded competitive intelligence. If it reaches rival AI programs, including those backed by nation-state actors, it could accelerate the development of competing models in ways that are difficult to detect or quantify.
The wider blast radius of the LiteLLM attack is still being measured. Cybersecurity firm Mandiant has identified over 1,000 compromised SaaS environments so far and expects that number to rise significantly. The research group Vx-underground estimates data was pulled from more than 500,000 machines globally.
What’s Next
At least four class-action lawsuits have been filed against Mercor, with the lead case – Gill v. Mercor.io Corporation – lodged in the Northern District of California on April 1. The suits allege Mercor failed to implement multi-factor authentication, encrypt sensitive data, or monitor its systems for unusual activity.
Expect AI labs to respond by tightening vendor security requirements across the board. The model of large AI companies outsourcing training-data operations to lightly audited startups is now under direct scrutiny. Mandatory software bill of materials (SBOM) disclosures and code-signing requirements for widely downloaded open-source packages are likely to gain momentum in policy discussions, as sketched below.
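For a concrete sense of what an SBOM disclosure asks of a vendor, the sketch below inventories every package installed in the current Python environment using only the standard library. It is deliberately minimal: a real SBOM in a standard format such as CycloneDX or SPDX would also record artifact hashes, licenses, and suppliers, which is what lets a poisoned release be traced after the fact.

```python
import json
from importlib import metadata

def inventory() -> list[dict[str, str]]:
    """List the name and version of every installed distribution."""
    entries = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or "unknown"
        entries.append({"name": name, "version": dist.version})
    return sorted(entries, key=lambda e: e["name"].lower())

if __name__ == "__main__":
    # Emit the inventory as JSON, the rough shape an SBOM pipeline
    # would feed into a standard format such as CycloneDX.
    print(json.dumps(inventory(), indent=2))
```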
Mercor’s future is uncertain. With work for its largest client frozen, four lawsuits pending, and its reputation damaged, the company faces serious pressure despite its headline valuation. The broader industry is watching to see whether this breach triggers a lasting structural change in how AI research supply chains are managed.
Sources: Fortune · The Next Web · The Register
