Engineering Enablement by DX

Running data-driven evaluations of AI engineering tools


AI engineering tools are evolving fast. New coding assistants, debugging agents, and automation platforms emerge every month. Engineering leaders want to take advantage of these innovations while avoiding costly experiments that create more distraction than impact.


In this episode of the Engineering Enablement podcast, host Laura Tacho and Abi Noda outline a practical model for evaluating AI tools with data. They explain how to shortlist tools by use case, run trials that mirror real development work, select representative cohorts, and ensure consistent support and enablement. They also highlight why baselines and frameworks like DX’s Core 4 and the AI Measurement Framework are essential for measuring impact.


Where to find Laura Tacho:

• LinkedIn: https://www.linkedin.com/in/lauratacho/

• X: https://x.com/rhein_wein

• Website: https://lauratacho.com/

• Laura’s course (Measuring Engineering Performance and AI Impact): https://lauratacho.com/developer-productivity-metrics-course

Where to find Abi Noda:

• LinkedIn: https://www.linkedin.com/in/abinoda

• Substack: https://substack.com/@abinoda


In this episode, we cover:

(00:00) Intro: Running a data-driven evaluation of AI tools

(02:36) Challenges in evaluating AI tools

(06:11) How often to reevaluate AI tools

(07:02) Incumbent tools vs challenger tools

(07:40) Why organizations need disciplined evaluations before rolling out tools

(09:28) How to size your tool shortlist based on developer population

(12:44) Why tools must be grouped by use case and interaction mode

(13:30) How to structure trials around a clear research question

(16:45) Best practices for selecting trial participants

(19:22) Why support and enablement are essential for success

(21:10) How to choose the right duration for evaluations

(22:52) How to measure impact using baselines and the AI Measurement Framework

(25:28) Key considerations for an AI tool evaluation

(28:52) Q&A: How reliable is self-reported time savings from AI tools?

(32:22) Q&A: Why not adopt multiple tools instead of choosing just one?

(33:27) Q&A: Tool performance differences and avoiding vendor lock-in


Referenced:

  • Measuring AI code assistants and agents
  • QCon conferences
  • DX Core 4 engineering metrics
  • DORA’s 2025 research on the impact of AI
  • Unpacking METR’s findings: Does AI slow developers down?
  • METR’s study on how AI affects developer productivity
  • Claude Code
  • Cursor
  • Windsurf
  • Do newer AI-native IDEs outperform other AI coding assistants?