Data Science at Home

Leveling Up AI: Reinforcement Learning with Human Feedback (Ep. 222)



In this episode, we dive into the not-so-secret sauce of ChatGPT and what sets it apart from its predecessors in NLP and large language models.

We explore how human feedback can be used to speed up the learning process in reinforcement learning, making it more efficient and effective.

Whether you're a machine learning practitioner, researcher, or simply curious about how machines learn, this episode will give you a fascinating glimpse into the world of reinforcement learning with human feedback.

 

Sponsors

This episode is supported by How to Fix the Internet, a cool podcast from the Electronic Frontier Foundation, and by Bloomberg, a global provider of financial news and information, including real-time and historical price data, financial data, trading news, and analyst coverage.

References

Learning through human feedback

https://www.deepmind.com/blog/learning-through-human-feedback

 

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2204.05862


Data Science at Home, by Francesco Gadaleta

4.2 (72 ratings)

