The Nonlinear Library

LW - GPTs are Predictors, not Imitators by Eliezer Yudkowsky


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: GPTs are Predictors, not Imitators, published by Eliezer Yudkowsky on April 8, 2023 on LessWrong.
(Related text posted to Twitter; this version is edited and has a more advanced final section.)
Imagine yourself in a box, trying to predict the next word - assign as much probability mass to the next token as possible - for all the text on the Internet.
Koan: Is this a task whose difficulty caps out as human intelligence, or at the intelligence level of the smartest human who wrote any Internet text? What factors make that task easier, or harder? (If you don't have an answer, maybe take a minute to generate one, or alternatively, try to predict what I'll say next; if you do have an answer, take a moment to review it inside your mind, or maybe say the words out loud.)
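To make "assign as much probability mass to the next token as possible" concrete: training scores a predictor by the log of the probability it placed on the token that actually came next. A minimal sketch, with a toy vocabulary standing in for a real tokenizer:

```python
import math

def log_loss(predicted_probs, actual_next_token):
    """Penalty for one prediction: -log p(actual next token).
    Probability mass spent on tokens that don't occur is mass
    that couldn't be spent on the one that did."""
    p = predicted_probs.get(actual_next_token, 1e-12)  # floor avoids log(0)
    return -math.log(p)

# Two predictors guessing the word after "the cat sat on the":
sharp = {"mat": 0.6, "floor": 0.3, "moon": 0.1}
flat = {"mat": 0.2, "floor": 0.2, "sofa": 0.2, "moon": 0.2, "dog": 0.2}

print(log_loss(sharp, "mat"))  # ~0.51
print(log_loss(flat, "mat"))   # ~1.61 - spreading mass thin is penalized
```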
Consider that somewhere on the internet is probably a list of thruples: <product of two prime numbers, first prime, second prime>.
GPT obviously isn't going to predict that successfully for significantly-sized primes, but it illustrates the basic point:
There is no law saying that a predictor only needs to be as intelligent as the generator, in order to predict the generator's next token.
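A hedged sketch of that asymmetry, at toy sizes, with trial division standing in for serious factoring algorithms - generating such a line takes no intelligence, while predicting its last two entries from the first is integer factorization:

```python
import random

def is_prime(n):
    """Naive primality test - adequate for the toy sizes used here."""
    if n < 2:
        return False
    return all(n % f for f in range(2, int(n ** 0.5) + 1))

def random_prime(lo, hi):
    while True:
        candidate = random.randrange(lo, hi)
        if is_prime(candidate):
            return candidate

def generate_thruple(lo=1000, hi=10000):
    """The generator's direction: pick two primes, multiply them.
    Writing this line of the list is trivial."""
    p, q = random_prime(lo, hi), random_prime(lo, hi)
    return (p * q, p, q)

def predict_factors(product):
    """The predictor's direction: given only the product, put probability
    mass on the primes that come next - i.e., factor it. At cryptographic
    sizes no known algorithm does this efficiently."""
    for f in range(2, int(product ** 0.5) + 1):
        if product % f == 0:
            return (f, product // f)
    return None

n, p, q = generate_thruple()
print(predict_factors(n) == (min(p, q), max(p, q)))  # True at toy sizes only
```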
Indeed, in general, you've got to be more intelligent to predict particular X, than to generate realistic X. GPTs are being trained to a much harder task than GANs.
Same spirit: <hash, plaintext> pairs, which you can't predict without cracking the hash algorithm, but which you could far more easily generate typical instances of if you were trying to pass a GAN's discriminator about it (assuming a discriminator that had learned to compute hash functions).
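A minimal sketch of the same point, using SHA-256 as a stand-in for whichever hash such a list might actually use - the generator's direction is one line, the predictor's direction is a preimage search:

```python
import hashlib
import secrets

def generate_pair():
    """The GAN-ish direction: pick any plaintext, compute its hash.
    The pair is valid by construction - no inversion needed."""
    plaintext = secrets.token_hex(8)  # 16 hex chars, 2**64 possibilities
    digest = hashlib.sha256(plaintext.encode()).hexdigest()
    return digest, plaintext

def predict_plaintext(digest, max_tries=1_000_000):
    """The predictor's direction: given the hash, put probability mass on
    the plaintext that follows. Absent a break of SHA-256, brute force is
    all there is, and a million tries barely scratches 2**64."""
    for i in range(max_tries):
        guess = format(i, "016x")
        if hashlib.sha256(guess.encode()).hexdigest() == digest:
            return guess
    return None

digest, plaintext = generate_pair()
print(predict_plaintext(digest))  # almost certainly None
```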
Consider that some of the text on the Internet isn't humans casually chatting. It's the results section of a science paper. It's news stories that say what happened on a particular day, where maybe no human would be smart enough to predict the next thing that happened in the news story in advance of it happening.
As Ilya Sutskever compactly put it, to learn to predict text, is to learn to predict the causal processes of which the text is a shadow.
Lots of what's shadowed on the Internet has a complicated causal process generating it.
Consider that sometimes human beings, in the course of talking, make errors.
GPTs are not being trained to imitate human error. They're being trained to predict human error.
Consider the asymmetry between you, who makes an error, and an outside mind that knows you well enough and in enough detail to predict which errors you'll make.
If you then ask that predictor to become an actress and play the character of you, the actress will guess which errors you'll make, and play those errors. If the actress guesses correctly, it doesn't mean the actress is just as error-prone as you.
Consider that a lot of the text on the Internet isn't extemporaneous speech. It's text that people crafted over hours or days.
GPT-4 is being asked to predict it in 200 serial steps or however many layers it's got, just like if a human was extemporizing their immediate thoughts.
A human can write a rap battle in an hour. A GPT loss function would like the GPT to be intelligent enough to predict it on the fly.
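Back-of-envelope arithmetic for that serial-depth gap - every number below is an illustrative assumption, not a measured spec:

```python
# All numbers are illustrative assumptions, not published GPT-4 figures.
human_seconds = 60 * 60      # an hour spent drafting and revising
steps_per_second = 10        # rough guess at serial cognitive steps per second
human_serial_steps = human_seconds * steps_per_second  # 36,000

gpt_serial_steps = 200       # the post's "200 serial steps or however many layers"

print(human_serial_steps / gpt_serial_steps)  # ~180x fewer serial steps per token
```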
Or maybe simplest:
Imagine somebody telling you to make up random words, and you say, "Morvelkainen bloombla ringa mongo."
Imagine a mind of a level - where, to be clear, I'm not saying GPTs are at this level yet -
Imagine a Mind of a level where it can hear you say 'morvelkainen bloombla ringa', and maybe also read your entire social media history, and then manage to assign 20% probability that your next utterance is 'mongo'.
The fact that this Mind could double as a really good actor playing your character, does not mean They are only exactly as smart as you.
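To gauge how extreme that 20% is: if made-up words were really drawn uniformly from, say, a 50,000-item vocabulary (an illustrative number), the Mind's guess would beat chance by about 13 bits:

```python
import math

uniform_prob = 1 / 50_000  # assumed size of a 'random word' pool - illustrative
mind_prob = 0.20           # the Mind's probability on 'mongo'
print(math.log2(mind_prob / uniform_prob))  # ~13.3 bits better than chance
```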
When you're trying to be human-equivalent at writing text, you can just make up whatever output, and it's now a human output because you're human and you chose to output that.
GPT-4 is...