


There are a couple of frames I find useful for understanding why different people talk so differently about AI safety: the wall and the bridge.
A wall is incrementally useful. Every additional brick you add is good, and the more bricks you add the better. If you are adding a brick to the wall you are doing something good, regardless of the current state of the wall.
A bridge requires a certain amount of investment. There's not much use for half a bridge. Once the bridge crosses the lake, it can be improved - but until you get a working bridge, you have nothing.
A solid example of wall thinking is the image in this thread by Chris Olah. Any approach around “eating marginal probability” involves a wall frame. Another example is the theory of change of the standards work I've done for Inspect Evals, which I would summarise as “Other fields like aviation and rocketry have solid safety standards and paradigms. We need to build this for evaluations - it's the kind of thing that a mature AI safety field needs to have.” This theory doesn’t have a full story of how it helps [...]
---
Narrated by TYPE III AUDIO.
By LessWrong
