
As a continuation of Episode 238, I explain some effective and fun attacks to conduct against LLMs. Such attacks are even more effective on locally served models, which are hardly constrained by human feedback.
Have fun, and learn them responsibly.
References
https://www.jailbreakchat.com/
https://www.reddit.com/r/ChatGPT/comments/10tevu1/new_jailbreak_proudly_unveiling_the_tried_and/
https://arxiv.org/abs/2305.13860
By Francesco Gadaleta