Mechanical Dreams

Evaluation data contamination in LLMs: How do we measure it and (when) does it matter?



By Mechanical Dirk