Intelligence Unbound

Small Fixed Samples Poison Large LLMs



This episode dives deep into an Anthropic report and a related research paper detailing a joint study on the vulnerability of large language models (LLMs) to data poisoning attacks. Surprisingly, the research demonstrates that injecting a near-constant, small number of malicious documents (as few as 250) is sufficient to introduce a backdoor vulnerability, regardless of the LLM's size (up to 13 billion parameters) or the total volume of its clean training data.


Intelligence Unbound, by Fourth Mind