


If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/
By Ajeya Cotra
