Anthropic Accuses DeepSeek and Other Chinese AI Firms of Model Distillation Attempts

Anthropic on Tuesday accused Chinese artificial intelligence (AI) companies, including DeepSeek, of trying to extract information from its AI systems using a method called distillation. The US-based AI firm said it detected activity consistent with large-scale model distillation attempts targeting its systems. According to Anthropic, the effort was specifically designed to use outputs from its models to train competing AI systems, and the activity resisted its attempts to block and stop it.

What Are Distillation Attacks?

Distillation is a machine learning technique in which a smaller “student” model is trained to replicate the outputs of a larger “teacher” model. In a blog post, the company explained that it is often used to create lightweight, more efficient versions of larger systems.
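In its legitimate form, the technique looks roughly like the short Python (PyTorch) sketch below, in which a frozen teacher's softened output distribution supervises a smaller student. The model sizes, temperature and loss weighting here are illustrative assumptions, not details from Anthropic's post.

```python
# Minimal knowledge-distillation sketch (PyTorch). All sizes and
# hyperparameters are illustrative, not taken from any vendor's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()  # the teacher is frozen; only the student is trained

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature used to soften the teacher's output distribution

def distill_step(x: torch.Tensor) -> float:
    """One training step: match the student's distribution to the teacher's."""
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # KL divergence between the softened teacher and student distributions
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

for _ in range(100):  # toy loop on random inputs
    distill_step(torch.randn(32, 128))
```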

When done without explicit permission, however, distillation can amount to a form of intellectual property (IP) extraction. In a distillation attack, an actor repeatedly queries a proprietary AI model through its public interface or API, collects large numbers of responses, and then uses that data to train a new model that mimics the behaviour of the original system, according to Anthropic.
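In practical terms, the pattern described reduces to harvesting prompt-response pairs in bulk and saving them as training data. The sketch below illustrates that flow only schematically; query_model is a hypothetical stand-in for any public chat API, not a real client library.

```python
# Schematic of the data-harvesting half of a distillation attack.
# `query_model` is a hypothetical placeholder for a public chat API client.
import json
from typing import Callable

def collect_distillation_data(
    prompts: list[str],
    query_model: Callable[[str], str],
    out_path: str = "teacher_outputs.jsonl",
) -> None:
    """Query a proprietary model repeatedly and store (prompt, response)
    pairs in a format a 'student' model could later be fine-tuned on."""
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            response = query_model(prompt)  # one API call per prompt
            f.write(json.dumps({"prompt": prompt, "response": response}) + "\n")
```

The resulting dataset would then be fed into an ordinary fine-tuning pipeline, which is what turns bulk API use into distillation of the original system.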

The AI firm explained that “such activity can allow competitors to benefit from the performance, alignment work and safety guardrails of frontier models without requiring the same research and training costs.”

What Anthropic Alleged About DeepSeek and Others

Anthropic said it found industrial-scale attempts by three AI labs – DeepSeek, Moonshot AI and MiniMax – to “steal” Claude’s capabilities. The AI firm also detailed three separate operations it says it identified.

DeepSeek is accused of extracting more than 150,000 exchanges of Claude’s reasoning across many tasks, including rubric-based grading that effectively turned Claude into a reward model for reinforcement learning. Anthropic also alleged that DeepSeek generated “censorship-safe alternatives to politically sensitive questions”, most likely to train its own systems to avoid restricted topics.
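Using a frontier model as a reward model in this way typically means asking it to grade candidate answers against a rubric and treating the score as the reinforcement-learning reward. The sketch below illustrates that pattern; the rubric text, score format and query_model placeholder are assumptions made for the example, not details disclosed by Anthropic.

```python
# Illustrative sketch of turning a graded rubric score into an RL reward.
# `query_model` is a hypothetical placeholder for a frontier-model API call;
# the rubric wording and score format are invented for this example.
import re
from typing import Callable

RUBRIC = (
    "Score the following answer from 1 to 10 for correctness, clarity and "
    "reasoning quality. Reply with only the number.\n\n"
    "Question: {question}\nAnswer: {answer}\nScore:"
)

def rubric_reward(question: str, answer: str,
                  query_model: Callable[[str], str]) -> float:
    """Ask the grader model for a 1-10 score and map it to a [0, 1] reward."""
    reply = query_model(RUBRIC.format(question=question, answer=answer))
    match = re.search(r"\d+(?:\.\d+)?", reply)
    score = float(match.group()) if match else 0.0
    return max(0.0, min(score, 10.0)) / 10.0
```

In a reinforcement-learning fine-tuning loop, this reward would stand in for a separately trained reward model, which is why Anthropic characterises the activity as distillation of its alignment work rather than ordinary API use.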

Anthropic also said DeepSeek synchronised traffic across multiple accounts, with matching request patterns, shared payment methods and coordinated timing suggesting deliberate load balancing to increase throughput and evade its detection systems. Request metadata, however, allowed Anthropic to link these activities to specific researchers at the lab.
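The kind of linkage described, shared payment methods combined with coordinated timing, can be illustrated with a simple grouping over account metadata. The field names and time window in the sketch below are assumptions for the example, not Anthropic's actual criteria.

```python
# Toy illustration of linking accounts that share a payment fingerprint and
# show tightly coordinated request timing. Field names and the 60-second
# window are assumptions for the example only.
from collections import defaultdict

def link_coordinated_accounts(events: list[dict], window_s: float = 60.0) -> list[dict]:
    """Group accounts by payment fingerprint, then keep only groups whose
    members issue requests within a short common time window."""
    by_payment = defaultdict(list)
    for e in events:  # e.g. {"account": "a1", "payment_id": "p9", "ts": 1700000000.0}
        by_payment[e["payment_id"]].append(e)

    clusters = []
    for payment_id, group in by_payment.items():
        accounts = {e["account"] for e in group}
        if len(accounts) < 2:
            continue  # a single account is not evidence of coordination
        times = sorted(e["ts"] for e in group)
        bursty = any(b - a < window_s for a, b in zip(times, times[1:]))
        if bursty:
            clusters.append({"payment_id": payment_id, "accounts": sorted(accounts)})
    return clusters
```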

It also accused Moonshot AI of more than 3.4 million exchanges centred on agentic reasoning, coding, tool use, computer-use agent development and computer vision tasks. Moonshot allegedly used hundreds of fake accounts across multiple access pathways to obscure coordination, according to Anthropic.

Lastly, MiniMax is accused of exchanging over 13 million messages about agentic coding and tool orchestration. Anthropic says attribution was based on request metadata and infrastructure indicators. The AI firm said it detected this campaign while it was still active, before MiniMax’s in-training model was released.

Anthropic’s Response

Anthropic said it is investing heavily in defences designed to make distillation attacks harder to execute and easier to identify when they are attempted. It says it has built multiple detection systems, including classifiers and behavioural fingerprinting tools, to flag patterns in API traffic consistent with distillation.
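Anthropic has not published how these detection systems work. As a rough illustration, a behavioural fingerprint can be as simple as a handful of per-account traffic features scored against thresholds, as in the sketch below; the features and cut-offs are invented for the example.

```python
# Minimal sketch of a behavioural "fingerprint" for distillation-like traffic.
# The features and thresholds are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class AccountStats:
    requests_per_hour: float         # sustained throughput
    distinct_prompt_templates: int   # bulk harvesting reuses few templates
    mean_response_tokens: float      # extraction favours long, complete outputs
    reasoning_requests_ratio: float  # share of chain-of-thought-style prompts

def distillation_score(s: AccountStats) -> float:
    """Combine simple signals into a 0-1 suspicion score."""
    signals = [
        s.requests_per_hour > 500,
        s.distinct_prompt_templates < 10,
        s.mean_response_tokens > 800,
        s.reasoning_requests_ratio > 0.8,
    ]
    return sum(signals) / len(signals)

flagged = distillation_score(
    AccountStats(requests_per_hour=1200, distinct_prompt_templates=3,
                 mean_response_tokens=1500, reasoning_requests_ratio=0.95)
) >= 0.75  # accounts above the threshold would go to human review
```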

The firm is also sharing technical data with other AI labs, cloud providers and relevant authorities to draw attention to the problem of distillation. It has also strengthened access controls, particularly around educational accounts, security research programmes and startup pathways that it says are often abused to create fraudulent accounts.

Finally, Anthropic is developing countermeasures at the product, API and model levels to reduce the usefulness of its outputs for unauthorised distillation without compromising the customer experience. The company said it published these details so that stakeholders are aware of the evidence and of the case for protecting advanced AI systems.
