Awesome Agents Podcast

JBDistill Generates Its Own Jailbreaks - 81.8% Attack Rate



Johns Hopkins and Microsoft's JBDistill achieves an 81.8% attack success rate across 13 LLMs by auto-generating fresh adversarial prompts on demand.

Awesome Agents Podcast, by Awesome Agents