Beaker Banter

Is AI Benchmarking Broken? The Truth Behind "con@64" Revealed
Brought to you by Avonetics.com



Discover the controversial "con@64" technique, in which an AI model is prompted 64 times and the most common answer is reported as its final response. Is this a legitimate way to reduce variance, or a sneaky trick to inflate benchmark scores? Dive into the heated debate over whether the practice skews real-world performance comparisons and distorts perceptions of model capabilities. Learn why some accuse xAI engineers of overhyping AI, and how comparing scores computed with different "con" values can mislead the industry. For advertising opportunities, visit Avonetics.com.
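For listeners who want to see the mechanics, here is a minimal sketch of a consensus-at-k evaluation as the episode describes it: sample the model repeatedly and take a majority vote. The function name `consensus_at_k` and the `query_model` callable are illustrative assumptions, not any vendor's actual implementation; only the "prompt 64 times, take the consensus" idea comes from the episode.

```python
from collections import Counter

def consensus_at_k(query_model, prompt, k=64):
    """Sample the model k times and return the majority answer.

    query_model is a hypothetical callable (prompt -> answer string);
    any real model API client could stand in for it.
    """
    answers = [query_model(prompt) for _ in range(k)]
    # Majority vote: whichever answer appears most often wins.
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / k  # consensus answer and its vote share
```

A score computed this way answers "how often is the model's majority answer right?", which is a different question from "how often is a single answer right?"; treating the two as the same metric is the crux of the debate the episode explores.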


By Beaker Banter