
If we can accurately recognize good performance on alignment, we could elicit lots of useful alignment work from our models, even if they're playing the training game: https://www.planned-obsolescence.org/training-ais-to-help-us-align-ais/