Consequentialists: One-Way Pattern Traps, published by David Udell on January 16, 2023 on LessWrong.
Generated during MATS 2.1.
A distillation of my understanding of Eliezer-consequentialism.
Thanks to Jeremy Gillen, Ben Goodman, Paul Colognese, Daniel Kokotajlo, Scott Viteri, Peter Barnett, Garrett Baker, and Olivia Jimenez for discussion and/or feedback; to Eliezer Yudkowsky for briefly chatting about relevant bits in planecrash; to Quintin Pope for causally significant conversation; and to many others that I've bounced my thoughts on this topic off of.
Introduction
What is Eliezer-consequentialism? In a nutshell, I think it's the way that some physical structures monotonically accumulate patterns in the world. Some of these patterns afford influence over other patterns, and some physical structures monotonically accumulate patterns-that-matter in particular -- resources. We call such a resource accumulator a consequentialist -- or, equivalently, an "agent," an "intelligence," etc.
A consequentialist understood in this way is (1) a coherent profile of reflexes (a set of behavioral reflexes that together monotonically take in resources) plus (2) an inventory (some place where accumulated resources can be stored with better-than-background-chance reliability).
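A toy sketch of that two-part picture, in code of my own devising rather than anything from the post: the agent below has only a fixed reflex table (no explicit utility function anywhere) plus an inventory, and the inventory only ever grows.

```python
# Toy illustration (my framing, not the post's) of the two-part definition:
# a "consequentialist" as (1) a coherent profile of reflexes plus (2) an inventory
# holding whatever those reflexes pull in.
from dataclasses import dataclass, field

@dataclass
class Consequentialist:
    inventory: dict[str, int] = field(default_factory=dict)  # (2) stored resources

    def reflex(self, observation: str) -> str:
        """(1) A fixed observation -> action mapping; no explicit utility anywhere."""
        if observation == "resource_nearby":
            return "grab"
        return "keep_searching"

    def step(self, observation: str) -> None:
        action = self.reflex(observation)
        if action == "grab":
            # Accumulation is monotonic: the resource count never decreases.
            self.inventory["resource"] = self.inventory.get("resource", 0) + 1

agent = Consequentialist()
for obs in ["nothing", "resource_nearby", "resource_nearby"]:
    agent.step(obs)
print(agent.inventory)  # {'resource': 2}
```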
Note that an Eliezer-consequentialist is not necessarily a consequentialist in the normative ethics sense of the term. By consequentialists we'll just mean agents, including wholly amoral agents. I'll freely use the terms 'consequentialism' and 'consequentialist' henceforth with this meaning, without fretting any more about this confusion.
Path to Impact
I noticed, hanging around the MATS London office, that even full-time alignment researchers disagree quite a bit about what consequentialism involves.
Since most of the possible positive impact of this effort lives in the fat tail of outcomes where it makes a lot of Eliezerisms click for a lot of alignment workers, I'll make this an effortpost.
The Ideas to be Clarified
I've noticed that Eliezer seems to think the von Neumann-Morgenstern (VNM) theorem is obviously far-reaching in a way that few others do.
Understand the concept of VNM rationality, which I recommend learning from the Wikipedia article... Von Neumann and Morgenstern showed that any agent obeying a few simple consistency axioms acts with preferences characterizable by a utility function.
MIRI Research Guide (2015)
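For reference, the theorem the guide points to can be stated compactly. This is the standard textbook form (my compression, not a quote from the post or the guide); here $\succeq$ is the agent's preference relation over lotteries $L, M$ on an outcome set $X$:

```latex
% Standard statement of the von Neumann--Morgenstern theorem (textbook form).
% \succeq is the agent's preference relation over lotteries L, M on outcomes X.
\[
  \succeq \ \text{satisfies completeness, transitivity, continuity, and independence}
  \;\Longrightarrow\;
  \exists\, u : X \to \mathbb{R}\ \text{such that}\quad
  L \succeq M \iff \mathbb{E}_{L}[u] \ge \mathbb{E}_{M}[u],
\]
% with u unique up to positive affine transformation (u' = a u + b, a > 0).
```

The "few simple consistency axioms" are the four hypotheses on the left, and "preferences characterizable by a utility function" is the expected-utility representation on the right.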
Can you explain a little more what you mean by "have different parts of your thoughts work well together"? Is this something like the capacity for metacognition; or the global workspace; or self-control; or...?
No, it's like when you don't, like, pay five apples for something on Monday, sell it for two oranges on Tuesday, and then trade an orange for an apple.
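To make that apples-and-oranges exchange concrete, here is a minimal sketch (the function name run_money_pump and the exact exchange rates are my illustrative reading of the quote, not code from the post) of how an agent that accepts each of those trades individually bleeds apples every time it goes around the cycle:

```python
# Minimal money-pump sketch (illustrative reading of the quote above).
# The agent accepts three trades, each of which it individually endorses:
#   Monday:    pay 5 apples for a widget
#   Tuesday:   sell the widget for 2 oranges
#   afterward: trade each orange for 1 apple
# Every full cycle leaves it 3 apples poorer.

def run_money_pump(apples: int, cycles: int) -> int:
    for _ in range(cycles):
        if apples < 5:
            break                 # can no longer afford the Monday trade
        apples -= 5               # Monday: 5 apples -> widget
        oranges = 2               # Tuesday: widget -> 2 oranges
        apples += oranges         # afterward: each orange -> 1 apple
        print(f"after this cycle: {apples} apples")
    return apples

run_money_pump(apples=20, cycles=10)
```

The incoherence is that the revealed exchange rates say the widget is worth more than 5 apples, 2 oranges are worth more than the widget, and an apple is worth at least an orange; no single utility function over apples, oranges, and widgets satisfies all three, which is exactly the opening an adversary (or just the environment) can pump resources out of.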
I have still not figured out the homework exercises to convey to somebody the Word of Power which is "coherence" by which they will be able to look at the water, and see "coherence" in places like a cat walking across the room without tripping over itself.
When you do lots of reasoning about arithmetic correctly, without making a misstep, that long chain of thoughts with many different pieces diverging and ultimately converging, ends up making some statement that is... still true and still about numbers! Wow! How do so many different thoughts add up to having this property? Wouldn't they wander off and end up being about tribal politics instead, like on the Internet?
And one way you could look at this, is that even though all these thoughts are taking place in a bounded mind, they are shadows of a higher unbounded structure which is the model identifie...