Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Telopheme, telophore, and telotect, published by Tsvi Benson-Tilsen on September 17, 2023 on The AI Alignment Forum.
[Metadata: crossposted from. First completed June 7, 2023.]
To come to know that a mind will have some specified ultimate effect on the world, first come to know, narrowly and in full, what about the mind makes it have effects on the world.
The fundamental question
Suppose there is a strong mind that has large effects on the world. What determines the effects of the mind?
What sort of object is this question asking for? Most obviously it's asking for a sort of "rudder" for a mind: an element of the mind that can be easily tweaked by an external specifier to "steer" the mind, i.e. to specify the mind's ultimate effects on the world. For example, a utility function for a classical agent is a rudder.
But in asking the fundamental question that way, asking for a rudder, that essay loses its grasp of the slippery question, and the real question withdraws. The section of that essay on The word "What", as in ¿What sort of thing is a "what" in the question "What determines a mind's effects?", brushes against the border of this issue but doesn't trek further in. That section asks:
What sort of element can determine a mind's effects?
It should have asked more fully:
What are the preconditions under which an element can (knowably, wieldily, densely) determine a mind's effects?
That is, what structure does a mind have to possess, so that there can be an element that determines the mind's ultimate effects?
To put it another way: asking how to "put a goal into an agent" makes it sound like there's a slot in the agent for a goal; asking how to "point the agent" makes it sound like the agent has the capacity to go in a specified direction. Here the question is, what does an agent need to have, if it has the capacity to go in a specified direction?
Synopsis
Telopheme
The rudder, the element that determines the mind's ultimate effects, is a telopheme. The morpheme "telo-" means "telos" = "goal, end, purpose", here meaning "ultimate effects". The morpheme "-pheme" is like "blaspheme" ("deceive-speak"). ("Telopheme" is probably wrong morphology and doesn't indicate an agent noun, which it ought to do, but sadly I don't speak Ancient Greek.) So a telopheme is a goal-sayer: it says the goal, the end, the ultimate effects.
For example, a utility function for an omnipotent classical cartesian agent is a telopheme.
The utility function says how good different worlds are, and by saying that, it determines that the ultimate effect of the agent will be to make the world be the world most highly scored by the utility function.
Not only does the utility function determine the world, but it (perhaps) does so densely, knowably, and wieldily.
Densely: We can imagine that the utility function is expressed compactly, compared to the complex behavior that the agent executes in pursuit of the best world as a consequence of preimaging the best world through the agent's world-model onto behavior.
Knowably: The structure of the agent shows transparently that it will make the world be whatever world is best according to the utility function.
Wieldily: IF (⟵ big if) there is available a faithful interpretation of our language about possible worlds into the agent's language about possible worlds, then we can (if granted access) fluently redirect the agent so that the world ends up as anything we choose by saying a new utility function with that world as the maximum.
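The classical picture above can be made concrete with a toy sketch. It is only an illustration under strong assumptions, not a claim about how any real mind works. Every name here (WORLD_MODEL, choose_action, ultimate_effect, and the two utility functions) is hypothetical and invented for this example. The agent has a fully transparent world-model mapping each action to the world it produces, and it picks the action whose resulting world the utility function scores highest.

```python
from typing import Callable, Dict

World = str
Action = str

# Hypothetical toy world-model: each action leads to exactly one world.
# A real agent's world-model would be nothing like this simple or transparent.
WORLD_MODEL: Dict[Action, World] = {
    "plant": "garden",
    "build": "city",
    "wait": "wilderness",
}


def choose_action(world_model: Dict[Action, World],
                  utility: Callable[[World], float]) -> Action:
    """Preimage the best-scoring world through the world-model onto an action."""
    return max(world_model, key=lambda action: utility(world_model[action]))


def ultimate_effect(utility: Callable[[World], float]) -> World:
    """The world that results when the agent acts on the given utility function."""
    return WORLD_MODEL[choose_action(WORLD_MODEL, utility)]


# Two compact utility functions (the "dense" element in this toy setup).
def prefers_garden(world: World) -> float:
    return {"garden": 3.0, "city": 2.0, "wilderness": 1.0}[world]


def prefers_city(world: World) -> float:
    return {"garden": 1.0, "city": 3.0, "wilderness": 2.0}[world]


# Swapping the utility function redirects the agent's ultimate effect.
print(ultimate_effect(prefers_garden))
print(ultimate_effect(prefers_city))
```

In this sketch the three properties are visible only because the setup grants them for free: the utility function is a short table (densely), the argmax in choose_action can be read directly off the code (knowably), and swapping prefers_garden for prefers_city changes the outcome because both functions already speak the agent's own vocabulary of worlds (wieldily, with the big "if" simply assumed).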
There's a hidden implication in the name "telopheme". The implication is that ultimate effects are speakable.
Telophore
The minimal sufficient preconditions for a mind's telopheme to be a telopheme are the telophore (or telophor) of the telopheme. Here "-phore" means "bearer, carrier" (as in "phosphorus" = "light-bearer", "metapho...