June 18, 2026

Viral Prompt Shows ChatGPT's Content Filters Don't Work

10 minutes

Mindgard red-team research shows that ChatGPT's image generator can be manipulated into producing violent and sexually explicit content that users never directly requested, by abusing a fun viral "restore the photo" prompt circulating on social media. This episode walks through how nondescript prompts slip past input and output filters, why repeating the prompt makes things worse, and why OpenAI's response was inadequate. Content warning: discussion of death, sexual violence, and murder. The exact jailbreak prompts are deliberately not included.

By Damian
