Argmax

15: InstructGPT



In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We cover the RLHF (reinforcement learning from human feedback) paradigm and how important RL is to tuning GPT.


Argmax, by Vahe Hagopian, Taka Hasegawa, Farrukh Rahman