Interconnects

RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition


Things to be aware of if you work on language model fine-tuning.
This is AI-generated audio, made with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/rlhf-roundup-2024

00:00 RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition
04:32 How big is the impact of RLHF relative to pretraining?
05:54 RewardBench retrospective after 100 models and 90% peak accuracy
09:19 LMSYS's reward modeling competition

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_009.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_012.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_017.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf-roundup/img_026.png



This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

Interconnects by Nathan Lambert
