The Nonlinear Library

LW - Two problems with ‘Simulators’ as a frame by ryan greenblatt



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Two problems with ‘Simulators’ as a frame, published by ryan greenblatt on February 17, 2023 on LessWrong.
(Thanks to Lawrence Chan and Buck Shlegeris for comments. Thanks to Nate Thomas for many comments and editing)
Despite appreciating and agreeing with various specific points made in the Simulators post, I broadly think that the term ‘simulator’ and the corresponding frame probably shouldn’t be used. Instead, I think we should just directly reason about predictors and think in terms of questions such as ‘what would the model predict for the next token?’[1]
In this post, I won’t make arguments that I think are strong enough to decisively justify this claim, but I will argue for two points that support it:
The word ‘simulation’ as used in the Simulators post doesn’t correspond to a single simulation of reality, and a ‘simulacrum’ doesn’t correspond to an approximation of a single agent in reality. Instead, a ‘simulation’ corresponds to a distribution over processes that could have generated the text. This distribution in general contains uncertainty over a wide space of different agents involved in those text-generating processes.
Systems can be very good at prediction yet very bad at plausible generation – in other words, very bad at ‘running simulations’.
The rest of the post elaborates on these claims.
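The second claim can be made quantitative with a toy calculation (my own illustration, not from the post): even a predictor with a very small per-token error rate can produce implausible long generations, because errors compound multiplicatively when sampling autoregressively.

```python
# Toy illustration (not from the post): a predictor whose per-token
# predictions are nearly perfect can still be a poor generator,
# because small per-step errors compound over a long sampled sequence.

def prob_fully_plausible(per_token_error: float, num_tokens: int) -> float:
    """Probability that every sampled token stays 'on distribution',
    assuming an independent per-token error rate."""
    return (1.0 - per_token_error) ** num_tokens

# A 1% chance of an implausible token per step looks like a good predictor...
print(prob_fully_plausible(0.01, 1))     # ~0.99 for a single token
# ...but over a 1000-token generation, a fully plausible sample is unlikely.
print(prob_fully_plausible(0.01, 1000))  # ~0.000043
```

The independence assumption is crude (real errors are correlated, and a generator can sometimes recover), but it conveys why per-token prediction quality and whole-sequence generation quality can come apart.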
I think the author of the Simulators post is aware of these objections. I broadly endorse the perspective in ‘simulator’ framing and confusions about LLMs, which also argues against the simulator framing to some extent. For another example of prior work on these two points, see this discussion of models recognizing that they are generating text due to generator-discriminator gaps in the Conditioning Predictive Models sequence[2].
Related work
Simulators, ‘simulator’ framing and confusions about LLMs, Conditioning Predictive Models
Language models are predictors, not simulators
My main issue with the terms ‘simulator’, ‘simulation’, ‘simulacra’, etc. is that a language model ‘simulating a simulacrum’ doesn’t correspond to a single simulation of reality, even in the high-capability limit. Instead, language model generation corresponds to a distribution over all the settings of latent variables which could have produced the previous tokens, aka “a prediction”.
Let’s go through an example: Suppose we prompt the model with “<|endoftext|>NEW YORK—After John McCain was seen bartending at a seedy nightclub”. I’d claim the model's next token prediction will involve uncertainty over the space of all the different authors who could have written this passage, as well as all the possible newspapers, etc. It presumably can’t internally represent the probability of each specific author and newspaper, though I expect bigger models will latently have an estimate for the probability that text like this was written by particularly prolific authors with particularly distinctive styles, as well as a latent estimate for particular sites. In this case, code-davinci-002 is quite confident this prompt comes from The Onion[3].
In practice, I think it’s tempting to think of a model as running a particular simulation of reality, but performing well at the objective of next-token prediction doesn’t result in the output you would get from a single, particular simulation. In the previous example, the model might be certain that the piece is from The Onion after it’s generated many tokens, but it’s presumably not sure which author at The Onion wrote it or what the publication date is.
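This "distribution over latent variables" view can be sketched concretely. Below is a toy model (my own illustration; the hypothesis names and probabilities are made up): an ideal predictor's next-token distribution is a mixture over hypotheses about who is writing, weighted by the posterior probability of each hypothesis given the prompt, rather than the output of one committed "simulation".

```python
# Toy sketch (my own illustration, not the post's code): an ideal
# predictor's next-token distribution marginalizes over latent
# hypotheses about the text's source, instead of picking one author.

from collections import defaultdict

def predictive_distribution(posterior_over_authors, per_author_next_token):
    """p(token | context) = sum_a p(a | context) * p(token | a, context)."""
    mixture = defaultdict(float)
    for author, p_author in posterior_over_authors.items():
        for token, p_token in per_author_next_token[author].items():
            mixture[token] += p_author * p_token
    return dict(mixture)

# Hypothetical posterior after seeing the McCain/Onion-style prompt:
posterior = {"onion_writer": 0.9, "serious_journalist": 0.1}
next_token = {
    "onion_writer": {"patrons": 0.3, "sources": 0.7},
    "serious_journalist": {"patrons": 0.1, "sources": 0.9},
}
print(predictive_distribution(posterior, next_token))
# The result blends both hypotheses; no single "simulated author" is chosen.
```

Sampling from this mixture is not the same as sampling a trajectory from any one of the component "simulations", which is the distinction the post is drawing.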
Pretrained models don’t ‘simulate a character speaking’; they predict what comes next, which implicitly involves making predictions about the distribution of characters and what they would say next.
I’ve seen this mistake made frequently – for example, see this post (note that in this case the mistake doesn’t change the conclusion...