We sit down with Anna Guo — a Singapore-based lawyer, startup advisor, and founder of LegalBenchmarks.ai — who has quietly built one of the most rigorous practitioner-driven evaluation frameworks for legal AI tools in the industry. Her community now spans close to 900 legal and AI professionals. Her research has produced findings that challenge industry assumptions: legal-specific AI tools don't always outperform general-purpose models, accuracy isn't actually the top driver of lawyer adoption, and in some drafting tasks, AI is already matching or exceeding human reliability.
This is a watch, don't-only-listen episode. Anna shares her screen throughout — running us through a live, double-blind benchmarking exercise where we rank outputs from legal AI, general-purpose AI, and human lawyers without knowing which is which. She also demonstrates how prompt injection attacks can bypass AI guardrails using techniques as simple as low-resource languages (Vietnamese or ASCII code?), surfacing security risks that become particularly acute as we move toward widespread agentic AI adoption.
What You'll Learn:
The Three Dimensions of Tool Evaluation — Why measuring accuracy alone misses the point, and how Anna assesses output reliability, output usefulness, and platform workflow support as distinct layers
What Actually Drives Adoption — Survey data revealing that lawyers prioritise context management and verification over raw accuracy when choosing AI tools
Where Humans Still Win — High-judgment, context-sparse tasks requiring commercial reasoning remain firmly in human territory; routine, context-complete work is where AI excels
Prompt Injection in Practice — Live demonstrations of how attackers can trick AI models into revealing harmful information using low-resource languages and clever framing
---
Connect with Anna: LinkedIn | LegalBenchmarks.ai
---
If you found this episode interesting, please tell us and share it with a friend, colleague or community who might take something from it! For more, head to lawwhatsnext.substack.com for: (i) focused conversations with leading practitioners, technologists, and educators; (ii) deep dives into the intersection of law, technology, and organisational behaviour; and (iii) practical analysis of how AI is augmenting our potential.
By Tom Rice and Alex Herrity