Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Limit of Language Models, published by DragonGod on January 6, 2023 on LessWrong.
Epistemic Status
Highlighting a thesis in Janus' "Simulators" that I think is insufficiently appreciated.
Thesis
In the limit, models optimised for minimising predictive loss on humanity's text corpus converge towards general intelligence.
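To make "minimising predictive loss" concrete, here is a minimal sketch (not from the post; the distribution and numbers are invented for illustration) of per-token cross-entropy, the loss language models are trained to minimise:

```python
import math

def cross_entropy(predicted_probs, target_token):
    """Predictive loss for one step: the negative log-probability
    the model assigned to the token that actually came next."""
    return -math.log(predicted_probs[target_token])

# A made-up next-token distribution a model might assign after "dogs are":
probs = {"mammals": 0.7, "reptiles": 0.05, "loyal": 0.25}

# Lower loss means the model tracked the corpus (and, through it,
# the world) more closely.
loss_true = cross_entropy(probs, "mammals")    # low loss
loss_false = cross_entropy(probs, "reptiles")  # high loss
assert loss_true < loss_false
```

Driving this quantity down across all of humanity's text is the optimisation pressure whose limit the thesis describes.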
Preamble
From Janus' Simulators:
Something which can predict everything all the time is more formidable than any demonstrator it predicts: the upper bound of what can be learned from a dataset is not the most capable trajectory, but the conditional structure of the universe implicated by their sum (though it may not be trivial to extract that knowledge).
Introduction
I affectionately refer to the above quote as the "simulators thesis". Reading and internalising that passage was an "aha!" moment for me. I was already aware (by July 2020 at the latest) that language models were modelling reality. I was persuaded by arguments of the following form:
Premise 1: Modelling is transitive. If X models Y and Y models Z, then X models Z.
Premise 2: Language models reality. "Dogs are mammals" occurs more frequently in text than "dogs are reptiles" because dogs are in actuality mammals and not reptiles. This statistical regularity in text corresponds to a feature of the real world. Language is thus a map (albeit flawed) of the external world.
Premise 3: GPT-3 models language. This is how it predicts text.
Conclusion: GPT-3 models the external world.
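Premise 2's claim, that corpus statistics mirror facts about the world, can be sketched with a toy corpus (invented here purely for illustration; the phrase counts are not real data):

```python
# A toy corpus standing in for humanity's text.
corpus = [
    "dogs are mammals",
    "dogs are mammals and make loyal pets",
    "my dogs are mammals like all canines",
    "someone wrongly wrote that dogs are reptiles",
]

def phrase_count(corpus, phrase):
    """Count how many documents contain the given phrase."""
    return sum(phrase in doc for doc in corpus)

true_claim = phrase_count(corpus, "dogs are mammals")
false_claim = phrase_count(corpus, "dogs are reptiles")

# The statistical regularity (true claims outnumber false ones)
# mirrors a fact about the world, so a model fitting these
# frequencies indirectly fits the world.
assert true_claim > false_claim
```

A model that learns these frequencies has, in a weak sense, learned that dogs are mammals, which is the transitivity the syllogism relies on.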
But I hadn't yet fully internalised the implications of what it means to model language, and hence our underlying reality: the limit towards which optimisation for minimising predictive loss on humanity's text corpus converges. I belatedly make those updates.
Interlude: The Requisite Capabilities for Language Modelling
Janus again:
If loss keeps going down on the test set, in the limit – putting aside whether the current paradigm can approach it – the model must be learning to interpret and predict all patterns represented in language, including common-sense reasoning, goal-directed optimization, and deployment of the sum of recorded human knowledge.
Its outputs would behave as intelligent entities in their own right. You could converse with it by alternately generating and adding your responses to its prompt, and it would pass the Turing test. In fact, you could condition it to generate interactive and autonomous versions of any real or fictional person who has been recorded in the training corpus or even could be recorded (in the sense that the record counterfactually “could be” in the test set).
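The "converse by alternately generating and adding your responses to its prompt" loop can be sketched as follows; `generate` here is a hypothetical stand-in for a real model's completion call, not an actual API:

```python
def generate(prompt: str) -> str:
    """Stand-in for a language model completion. A real model would
    return its most likely continuation of the prompt; this stub
    returns a canned reply for illustration."""
    return "Simulacrum: I hear you."

def converse(user_turns):
    """Alternately append user input and model output to one growing
    prompt -- the conversation itself is the prompt."""
    prompt = ""
    for turn in user_turns:
        prompt += f"User: {turn}\n"
        prompt += generate(prompt) + "\n"
    return prompt

transcript = converse(["Hello!", "Who are you?"])
```

The point of the sketch is structural: nothing outside the prompt carries state, so whatever persona the model sustains must be reconstructed from the text alone on every call.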
Implications
The limit of predicting text is predicting the underlying processes that generated said text. If said underlying processes are agents, then sufficiently capable language models can predict agent (e.g., human) behaviour to arbitrary fidelity. If it turns out that the most efficient way of predicting the behaviour of conscious entities (as discriminated via text records) is to instantiate conscious simulacra, then such models may commit mindcrime.
Furthermore, the underlying processes that generate text aren't just humans, but the world which we inhabit. That is, a significant fraction of humanity's text corpus reports on empirical features of our external environment or the underlying structure of reality:
- Timestamps and other empirical measurements
- Log files
- Database files (including CSVs and similar)
- Experiment records
- Research findings
- Academic journals in quantitative fields
- Other reports
- Etc.
Moreover, such text is often clearly distinguished from other kinds of text (fiction, opinion pieces, etc.) via its structure, formatting, titles, etc. In the limit of minimising predictive loss on such text, language models must learn the underlying processes that generated them: the conditional structure of the universe that those records reflect.