
This episode is a little different from our usual fare: It’s a conversation with our head of AI training, Alex Duffy, about Good Start Labs, a company he incubated inside Every. Today, Good Start Labs is spinning out of Every as a separate company with $3.6 million in funding from General Catalyst, Inovia, Every, and a group of angel investors from top-tier AI labs like DeepMind. We get into how Alex learned some of his biggest lessons about the real world from games, starting with RuneScape, which taught him how markets work and how not to get scammed. He explains why the static benchmarks we use to evaluate LLMs today are breaking down, and how games like Diplomacy offer a richer, more dynamic way to test and train large language models. Finally, Alex shares where he sees the most promise in AI (software, life sciences, and education) and why he believes games can make the models we use smarter while helping people understand and use AI more effectively.
If you found this episode interesting, please like, subscribe, comment, and share.
Want even more?
Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.
To hear more from Dan Shipper:
Timestamps
00:00:00 - Start
00:01:48 - Introduction
00:04:14 - Why evals and benchmarks are broken
00:07:13 - The sneakiest LLMs in the market
00:13:00 - A competition that turns prompting into a sport
00:15:49 - Building a business around using games to make AI better
00:22:39 - Can language models learn how to be funny?
00:25:31 - Why games are a great way to evaluate and train new models
00:26:58 - What child psychology tells us about games and AI
00:30:10 - Using games to unlock continual learning in AI
00:36:42 - Why Alex cares deeply about games
00:44:37 - Where Alex sees the most promise in AI
00:50:54 - Rethinking how young people start their careers in the age of AI
Links to resources mentioned in the episode: