


If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/
By Ajeya Cotra
