The Nonlinear Library

AF - Counting-down vs. counting-up coherence by Tsvi Benson-Tilsen



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Counting-down vs. counting-up coherence, published by Tsvi Benson-Tilsen on February 27, 2023 on The AI Alignment Forum.
[Metadata: crossposted from. First completed 25 October 2022.]
Counting-down coherence is the coherence of a mind viewed as the absence of deviation downward in capability from ideal, perfectly efficient agency: the utility left on the table, the waste, the exploitability.
Counting-up coherence is the coherence of a mind viewed as the deviation upward in capability from a rock: the elements of the mind, and how they combine to perform tasks.
What determines the effects of a mind?
Supranormally capable minds can have large effects. To control those effects, we'd have to understand what determines the effects of a mind.
Pre-theoretically, we have the idea of "values", "aims", "wants". The more capable a mind is, the more it's the case that what the mind wants is what will happen in the world; so the mind's wants, its values, determine the mind's effect on the world.
A more precise way of describing the situation is: "Coherent decisions imply consistent utilities". A mind like that is incorrigible: if it knows it will eventually be more competent than any other mind at pushing the world towards high-utility possibilities, then it does not defer to any other mind. So to understand how a mind can be corrigible, some assumptions about minds and their values may have to be loosened.
The question remains, what are values? That is, what determines the effects that a mind has on the world, besides what the mind is capable of doing or understanding? This essay does not address this question, but instead describes two complementary standpoints from which to view the behavior of a mind insofar as it has effects.
Counting-down coherence
Counting-down coherence is the coherence of a mind viewed as the absence of deviation downward in capability from ideal, perfectly efficient agency: the utility left on the table, the waste, the exploitability.
Counting-down coherence could also be called anti-waste coherence, since it has a flavor of avoiding visible waste, or universal coherence, since it has a flavor of tracking how much a mind everywhere conforms to certain patterns of behavior.
Some overlapping ways of describing counting-down incoherence:
Exploitable, Dutch bookable, pumpable for resources. That is, someone could make a set of trades with the mind that leaves the mind worse off, and could do so repeatedly to pump the mind for resources. See Garrabrant induction.
VNM violating. Choosing between different outcomes, or different probabilities of different outcomes, in a way that doesn't satisfy the Von Neumann–Morgenstern axioms, leaves a mind open to being exploited by Dutch books. See related LessWrong posts.
Doesn't maximize expected utility. A mind that satisfies the VNM axioms behaves as though it maximizes the expected value of a fixed utility function over atomic (not probabilistic) outcomes. So deviating from that policy exposes a mind to Dutch books.
Missed opportunities. Leaving possible gains on the table; failing to pick up a $20 bill lying on the sidewalk.
Opposing pushes. Working at cross-purposes to oneself; starting to do X one day, and then undoing X the next day; pushing and pulling on the door handle at the same time.
Internal conflict. At war with oneself; having elements of oneself that try to harm each other or interfere with each other's functioning.
Inconsistent beliefs, non-Bayesian beliefs. Sometimes acting as though X and sometimes acting as though not-X, where X is something that is either true or false. Or some more complicated inconsistency, or more generally failing to act as though one has a Bayesian belief state and belief revisions. Any of these also open one up to being Dutch booked.
Inefficient allocation. Choosing to inve...
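The exploitability items in the list above can be made concrete with a small sketch of a money pump. This is an illustration under stated assumptions, not something taken from the essay: the item names, the fee, and the helper `run_money_pump` are all hypothetical. An agent with cyclic preferences (A over B, B over C, C over A) will pay a small fee for each trade up to something it prefers, so a trader can walk it around the cycle indefinitely. An agent with transitive preferences stops trading once it holds its favorite item.

```python
# Hypothetical illustration of a money pump (names and numbers are assumptions).
# Each mapping sends "item currently held" -> "item the agent strictly prefers".

# Cyclic preferences: A > B, B > C, C > A (violates transitivity).
CYCLIC = {"B": "A", "C": "B", "A": "C"}

# Transitive preferences: A > B > C; nothing is preferred to A.
TRANSITIVE = {"C": "B", "B": "A"}


def run_money_pump(upgrade, start_item, money, rounds, fee=1):
    """Offer the agent its preferred swap for a fee, up to `rounds` times.

    The agent accepts whenever a preferred item exists and it can pay.
    Returns the final (item, money) pair.
    """
    item = start_item
    for _ in range(rounds):
        preferred = upgrade.get(item)
        if preferred is None or money < fee:
            break  # no trade the agent wants, or it cannot afford the fee
        item = preferred
        money -= fee
    return item, money


if __name__ == "__main__":
    # The cyclic agent ends up holding what it started with, minus the fees.
    print(run_money_pump(CYCLIC, "A", money=10, rounds=3))
    # The transitive agent pays only until it reaches its favorite item.
    print(run_money_pump(TRANSITIVE, "C", money=10, rounds=10))
```

In this toy setup the cyclic agent returns to item A after three trades having paid three fees, which is the sense in which it is "pumpable for resources"; the transitive agent pays at most two fees and then declines further offers.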

The Nonlinear Library, by The Nonlinear Fund
