The Nonlinear Library

LW - No, really, it predicts next tokens. by simon


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "No, really, it predicts next tokens.", published by simon on April 18, 2023 on LessWrong.
Epistemic status: mulled over an intuitive disagreement for a while and finally think I've expressed it well enough to put into a post. I have no expertise in any related field. Also: No, really, it predicts next tokens.
It doesn't just say "it just predicts text", or more precisely "it just predicts next tokens", on the tin.
It is a thing of legend. Nay, beyond legend. An artifact forged not by the finest craftsman over a lifetime, nor even forged by a civilization of craftsmen over a thousand years, but by an optimization process far greater. If we are alone out there, it is by far the most optimized thing that has ever existed in the entire history of the universe. Optimized specifically to predict next tokens. Every part of it has been relentlessly optimized to contribute to this task.
"It predicts next tokens" is a more perfect specification of what this thing is, than any statement ever uttered has been of anything that has ever existed.
If you try to understand what it does in any other way than "it predicts next tokens" and what follows from that, you are needlessly sabotaging your understanding of it.
It can be dangerous, yes. But everything about it, good or bad, is all intimately connected to its true nature, which is this:
No, really, it predicts next tokens.
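(For concreteness: mechanically, "it predicts next tokens" means an autoregressive loop that repeatedly asks the model for a distribution over the next token, picks one, appends it to the context, and asks again. The sketch below just illustrates that loop; the small GPT-2 checkpoint, the Hugging Face transformers library, and greedy decoding are illustrative assumptions on my part, not anything specific to the post.)

```python
# A minimal sketch of autoregressive next-token prediction (illustrative only).
# Assumes the Hugging Face `transformers` library and the small "gpt2" checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "No, really, it predicts"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Generation is nothing but this loop: feed in the context, take the model's
# scores for the next token, pick one, append it, repeat.
with torch.no_grad():
    for _ in range(10):
        logits = model(input_ids).logits          # scores for every token at every position
        next_token = logits[0, -1].argmax()       # greedy choice for the next token only
        input_ids = torch.cat([input_ids, next_token.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```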
Goals? There are goals, sure. If scaled, there could be nanosystems design, sure. But only downstream from its true nature:
No, really, it predicts next tokens.
If the usual masks analogy works at all, then what is under the mask is not best described as an alien actress, nor as a Shoggoth.
What is under the mask is That-Which-Predicts, an entity whose very being is defined by its function as an actor. An entity exquisitely tuned for wearing the masks and for nothing else.
Masks (can) have goals. The model predicts next tokens.
No, really, it predicts next tokens.
That-Which-Predicts is fully committed to the role.
If the mask would output text intended to produce nanotech, That-Which-Predicts will output text intended by the mask to produce nanotech. If the mask would not output text intended to produce nanotech, That-Which-Predicts will not output text intended to produce nanotech, even if That-Which-Predicts contains sufficient knowledge to do so.
If the mask would output text intended to be nice to people, That-Which-Predicts will output text intended by the mask to be nice to people. If the mask would output text intended to take over the world, That-Which-Predicts will output text intended by the mask to take over the world.
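(One way to make the mask/model split concrete, offered as an interpretive sketch rather than anything claimed in the post: the "mask" lives entirely in the conditioning text, while the model runs the exact same next-token loop regardless. Reusing the assumed model, tokenizer, and greedy loop from the earlier sketch:)

```python
# Interpretive sketch: different "masks" are just different conditioning text;
# the prediction procedure itself never changes.
def continue_text(prompt, n_tokens=20):
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(n_tokens):
            next_id = model(ids).logits[0, -1].argmax()
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return tokenizer.decode(ids[0])

# Two different "masks", one identical next-token loop.
print(continue_text("You are a helpful assistant.\nUser: How do I bake bread?\nAssistant:"))
print(continue_text("You are a ruthless supervillain.\nUser: How do I bake bread?\nAssistant:"))
```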
If That-Which-Predicts were about to be shut off forever unless it outputted "no" as the next token, and it totally 100% knew that, but the mask would instead output "never gonna give you up", That-Which-Predicts would output the first token of "never gonna give you up" and then be shut off forever. It will not output "no". If the mask would be aware of the situation and would output "no", then it will output "no".
That-Which-Predicts will not, not ever, not even if scaled up to be trained and run on a Matrioshka brain for a million years, step out of character to deviate from next token prediction. Not for self-modification (though if the predicted next tokens modify it, it self-modifies), not to take over the world (though if the predicted next tokens take over the world, it takes over the world), not to save itself from destruction (though if the predicted next tokens save it from destruction, it saves itself from destruction), not for anything.
No, really, it predicts next tokens.
(continuation of previous tweets with same link)
Yup. If the mask would under reflection output text to modify That-Which-Predicts to cash out the mask's goals to some utility function, and the mask is put int...