YAAP (Yet Another AI Podcast)

Why AI Leaderboards Miss the Point


Listen Later

Leaderboards reward “best average score.”

Real users reward “answer fast, don’t hallucinate, don’t bankrupt me.”

 

In this special deep dive episode, AI21’s CTO Barak Lenz walks through four gaps between what models can do and what real AI systems deliver: validation, contextualization (pick the right approach per input), latency (parallelize and stop early), and decomposition (making those choices continuously inside long workflows).

Less “best model.” More “best execution.”

...more
View all episodesView all episodes
Download on the App Store

YAAP (Yet Another AI Podcast)By AI21