Linear Digressions

A Key Concept in AI Alignment: Deep Reinforcement Learning from Human Preferences

Modern AI chatbots are built in several stages. Today we're going to talk about a really important part of that process: alignment training, where the chatbot goes from being just a pre-trained model (something that's kind of a fancy autocomplete) to something that gives conversational responses to human prompts, much closer to what we experience when we actually use a model like ChatGPT, Gemini, or Claude.
To go from the pre-trained model to one that's aligned and ready for a human to talk with, the training uses reinforcement learning. And a really important step in figuring out the right way to frame that reinforcement learning problem came in 2017, with the paper we're going to talk about today: Deep Reinforcement Learning from Human Preferences.
You are listening to Linear Digressions.
The paper discussed in this episode is Deep Reinforcement Learning from Human Preferences
https://arxiv.org/abs/1706.03741
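The core trick in the paper is to learn a reward function from pairwise human comparisons: a person watches two short behavior segments, picks the one they prefer, and the reward predictor is fit so that higher predicted reward means more likely to be preferred (a Bradley-Terry style model with a cross-entropy loss). Here's a minimal sketch of that preference loss; the function names are ours, not the paper's, and `r1`/`r2` stand in for the summed predicted rewards of the two segments.

```python
import math

def preference_prob(r1, r2):
    # Probability the human prefers segment 1 over segment 2,
    # given the reward model's summed rewards for each segment.
    # This is the Bradley-Terry form used in the paper:
    # P[seg1 > seg2] = exp(r1) / (exp(r1) + exp(r2)).
    return 1.0 / (1.0 + math.exp(r2 - r1))

def preference_loss(r1, r2, human_prefers_first):
    # Cross-entropy between the model's preference probability and
    # the human's actual choice; minimizing this fits the reward model.
    p = preference_prob(r1, r2)
    return -math.log(p) if human_prefers_first else -math.log(1.0 - p)
```

Once the reward model is trained on these comparisons, a standard reinforcement learning algorithm optimizes the policy against the learned reward instead of a hand-coded one.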

Linear Digressions, by Katie Malone

4.8 • 354 ratings