


Your Java AI application is live in production. But have you tested whether it can be jailbroken, manipulated into revealing its system prompt, or tricked into printing content it should never output?
In this episode, Iryna Dohndorf, Software Engineer at Karakun Group and creator of Tiberius, explains how to bring security testing to LLM-powered Java applications. We cover why traditional unit tests break down with non-deterministic systems, how the Scan-Fixture-Validate workflow works, what buff mutation testing is, and why even well-trained models can be cracked with something as simple as the grandmother attack.
Topics include:
Guest
Iryna Dohndorf - Software Engineer at Karakun Group
LinkedIn
Links
Article on Foojay
Tiberius on GitHub
Security Testing Guide
Timestamps
00:00 Introduction of topic and guest
01:05 The problem Tiberius wants to solve
06:39 How "traditional" unit tests don't work for LLM integrations
10:23 Scan-Fixture-Validate principle and sharing artifacts
15:15 Using different skills, for example, the grandmother skill
17:33 Testing for required versus forbidden bias
19:35 The probes across nine attack categories used by Tiberius
20:44 Buff mutation testing
26:55 Using Tiberius in your pipelines and when to fail
29:35 Using multi-trial scans
31:14 Fingerprinting: which model you use should not be detectable
32:55 Combining multiple models, model as a judge
34:41 Sharing JSON models to improve tests
36:05 How to get started with Tiberius in Spring and with LangChain4j
36:41 Quarkus not supported yet, plans for the future
39:07 Conclusions and a call for everyone to become a Foojay author
By Foojay.io | Java and Programming Community
