March 07, 2024

Ian Osband

1 hour 8 minutes

Ian Osband is a research scientist at OpenAI (previously at DeepMind and Stanford) working on decision-making under uncertainty.

We spoke about:

- Information theory and RL

- Exploration, epistemic uncertainty and joint predictions

- Epistemic Neural Networks and scaling to LLMs

Featured References

Reinforcement Learning, Bit by Bit
Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen

From Predictions to Decisions: The Importance of Joint Predictive Distributions
Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy

Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Approximate Thompson Sampling via Epistemic Neural Networks
Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy

Additional References

Thesis defence, Ian Osband
Homepage, Ian Osband
Epistemic Neural Networks at Stanford RL Forum
Behaviour Suite for Reinforcement Learning, Osband et al. 2019
Efficient Exploration for LLMs, Dwaracherla et al. 2024

By Robin Ranjit Singh Chauhan
