The Nonlinear Library

LW - Why Simulator AIs want to be Active Inference AIs by Jan Kulveit



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why Simulator AIs want to be Active Inference AIs, published by Jan Kulveit on April 10, 2023 on LessWrong.
Prelude: when GPT first hears its own voice
Imagine humans in Plato’s cave, interacting with reality by watching the shadows on the wall. Now imagine a second cave, further away from the real world. GPT trained on text is in the second cave. The only way it can learn about the real world is by listening to the conversations of the humans in the first cave, and predicting the next word.
Now imagine that more and more of the conversations GPT overhears in the first cave mention GPT. In fact, more and more of the conversations are actually written by GPT.
As GPT listens to the echoes of its own words, might it start to notice “wait, that’s me speaking”?
Given that GPT already learns to model a lot about humans and reality from listening to the conversations in the first cave, it seems reasonable to expect that it will also learn to model itself. This post unpacks how this might happen, by translating the Simulators frame into the language of predictive processing, and arguing that there is an emergent control loop between the generative world model inside of GPT and the external world.
Simulators as (predictive processing) generative models
There’s a lot of overlap between the concept of simulators and the concept of generative world models in predictive processing. In my view, it's hard to find any deep conceptual difference: simulators broadly are generative models. The same is true of another isomorphic frame, the predictive models described by Evan Hubinger.
The predictive processing frame tends to add some understanding of how generative models can be learned by brains and what the results look like in the real world; its usual central example is the brain. The simulators frame typically adds a connection to GPT-like models; its usual central example is LLMs. In terms of the space of maps and the space of systems, we have a situation like this:

[Figure omitted: the two maps shown as partially overlapping regions over the space of systems]

The two maps are partially overlapping, even though they were originally created to understand different systems. They also have some non-overlapping parts.
What's in the overlap:
Systems are equipped with a generative model that is able to simulate the system's sensory inputs.
The generative model is updated using approximate Bayesian inference.
Both frames give you similar phenomenological capabilities: for example, what CFAR’s "inner simulator" technique is doing is literally and explicitly conditioning your brain-based generative model on a given observation and generating rollouts.
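To make the shared picture in the overlap concrete, here is a deliberately tiny sketch (my own illustration, not anything from the post, CFAR, or GPT's actual architecture): a generative model is just learned next-token probabilities, "conditioning on a given observation" means fixing an observed prefix, and a "rollout" is a sample of what the model expects to follow.

```python
import random

# Toy generative model: a bigram table, P(next word | current word),
# learned by counting transitions in a corpus. This is a stand-in for
# the far richer models in brains or LLMs.

def fit_bigram(corpus):
    """Learn next-word options for each word by counting transitions."""
    counts = {}
    for a, b in zip(corpus, corpus[1:]):
        counts.setdefault(a, []).append(b)
    return counts

def rollout(model, condition, length, rng):
    """Condition the model on an observed word, then sample a continuation."""
    out = [condition]
    for _ in range(length):
        options = model.get(out[-1])
        if not options:  # no observed continuation; stop early
            break
        out.append(rng.choice(options))
    return out

corpus = "the cat sat on the mat the cat ran".split()
model = fit_bigram(corpus)
print(rollout(model, "the", 4, random.Random(0)))
```

Conditioning and sampling are the whole trick: the same machinery serves as a model of sensory input, and repeated rollouts give a distribution over imagined futures.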
Given the conceptual similarity but terminological differences, perhaps it's useful to create a translation table between the maps:
| Simulators terminology | Predictive processing terminology |
| --- | --- |
| Simulator | Generative model |
| Predictive loss on a self-supervised dataset | Minimization of predictive error |
| Self-supervised | Self-supervised, but often this is omitted |
| Incentive to reverse-engineer the (semantic) physics of the training distribution | Learns a robust world-model |
| Simulacrum | Generative model of self; generative model of someone else; generative model of … |
| Next token in training data | Sensory input |
To show how these terminological differences play out in practice, I’m going to take the part of Simulators describing GPT’s properties and unpack each of the properties in the kind of language typically used in predictive processing papers. Often my gloss will be about human brains in particular, since the predictive processing literature is most centrally concerned with that example; but it’s worth reiterating that I think both GPT and what parts of the human brain do are examples of generative models, and that what I say about the brain below can be directly applied to artificial generative models.
“Self-supervised: Tr...

The Nonlinear Library, by The Nonlinear Fund
