DataTalks.Club

How to Build and Evaluate AI systems in the Age of LLMs - Hugo Bowne-Anderson



In this talk, Hugo Bowne-Anderson, an independent data and AI consultant, educator, and host of the podcasts Vanishing Gradients and High Signal, shares his journey from academic research and curriculum design at DataCamp to advising teams at Netflix, Meta, and the US Air Force. Together, we explore how to build reliable, production-ready AI systems—from prompt evaluation and dataset design to embedding agents into everyday workflows.


You’ll learn about:

  • How to structure teams and incentives for successful AI adoption
  • Practical prompting techniques for accurate timestamp and data generation
  • Building and maintaining evaluation sets to avoid “prompt overfitting”
  • Cost-effective methods for LLM evaluation and monitoring
  • Tools and frameworks for debugging and observing AI behavior (Logfire, Braintrust, Arize Phoenix)
  • The evolution of AI agents—from simple RAG systems to proactive, embedded assistants
  • How to escape “proof of concept purgatory” and prioritize AI projects that drive business value
  • Step-by-step guidance for building reliable, evaluable AI agents


This session is ideal for AI engineers, data scientists, ML product managers, and startup founders looking to move beyond experimentation into robust, scalable AI systems. Whether you’re optimizing RAG pipelines, evaluating prompts, or embedding AI into products, this talk offers actionable frameworks to guide you from concept to production.


LINKS

  • Escaping POC Purgatory: Evaluation-Driven Development for AI Systems - https://www.oreilly.com/radar/escaping-poc-purgatory-evaluation-driven-development-for-ai-systems/
  • Stop Building AI Agents - https://www.decodingai.com/p/stop-building-ai-agents
  • How to Evaluate LLM Apps Before You Launch - https://www.youtube.com/watch?si=90fXJJQThSwGCaYv&v=TTr7zPLoTJI&feature=youtu.be
  • My Vanishing Gradients Substack - https://hugobowne.substack.com/
  • Building LLM Applications for Data Scientists and Software Engineers - https://maven.com/hugo-stefan/building-ai-apps-ds-and-swe-from-first-principles?promoCode=datatalksclub

TIMECODES:

00:00 Introduction and Expertise

04:04 Transition to Freelance Consulting and Advising

08:49 Restructuring Teams and Incentivizing AI Adoption

12:22 Improving Prompting for Timestamp Generation

17:38 Evaluation Sets and Failure Analysis for Reliable Software

23:00 Evaluating Prompts: The Cost and Size of Gold Test Sets

27:38 Software Tools for Evaluation and Monitoring

33:14 Evolution of AI Tools: Proactivity and Embedded Agents

40:12 The Future of AI is Not Just Chat

44:38 Avoiding Proof of Concept Purgatory: Prioritizing RAG for Business Value

50:19 RAG vs. Agents: Complexity and Power Trade-Offs

56:21 Recommended Steps for Building Agents

59:57 Defining Memory in Multi-Turn Conversations


Connect with Hugo

  • Twitter - https://x.com/hugobowne
  • LinkedIn - https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/
  • GitHub - https://github.com/hugobowne
  • Website - https://hugobowne.github.io/


Connect with DataTalks.Club:

  • Join the community - https://datatalks.club/slack.html
  • Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
  • Check other upcoming events - https://lu.ma/dtc-events
  • GitHub - https://github.com/DataTalksClub
  • LinkedIn - https://www.linkedin.com/company/datatalks-club/
  • Twitter - https://twitter.com/DataTalksClub
  • Website - https://datatalks.club/