February 05, 2026

Catching AI Sleeper Agent - LLM Backdoors

15 minutes

Could your trusted AI model be a hidden "sleeper agent" just waiting for a secret command to turn malicious? We explore a new methodology that extracts and reconstructs backdoor triggers by exploiting the surprising fact that these models often strongly memorize their own poisoning data. Tune in to discover how this inference-only scanner can unmask hidden threats across various LLMs without needing any prior knowledge of the attacker’s specific trigger or target behavior.

Source: https://arxiv.org/pdf/2602.03085


By Build Wiz AI
