April 29, 2026

A deep dive on AI model distillation attacks

1 hour 12 minutes

In this solo episode of Risky Business Features, James Wilson explores how distillation is both a legitimate way to train smaller models and a way to steal model capabilities. It’s not just a problem for frontier labs! Any LLM-based product could have its competitive advantage stolen through these attacks.

James covers:

High-level concept of distillation

Why it matters, including an explanation of closed, open-weight, and open-source models

Types of distillation and the prompts used

The distillation pipeline end to end

Distillation at scale and mitigation techniques

Hardware resource constraints for distillation

Show notes

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Alpaca: A Strong, Replicable Instruction-Following Model

Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

Zephyr: Direct Distillation of LM Alignment

Stealing Part of a Production Language Model

Microsoft probes if DeepSeek-linked group improperly obtained OpenAI data, Bloomberg News reports

Detecting and preventing distillation attacks

By Risky Business Media