The Nonlinear Library

LW - United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress by Shoshannah Tekofsky


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: United We Align: Harnessing Collective Human Intelligence for AI Alignment Progress, published by Shoshannah Tekofsky on April 20, 2023 on LessWrong.
Summary: Collective Human Intelligence (CHI) represents both the current height of general intelligence and a model of alignment among intelligent agents. However, CHI's efficiency and inner workings remain underexplored. This research agenda proposes six experimental directions to enhance and understand CHI, in pursuit of potential runway for alignment solutions and a possible paradigm shift akin to the one neural networks brought about in early AI. Through playful, iterative, and collaborative experiments, this broad and exploratory approach can hopefully bring significant incremental progress towards AI alignment while shedding light on an often overlooked area of study.
Introduction
This research agenda is about accelerating alignment research by increasing our Collective Human Intelligence (CHI) while exploring the structure of interhuman alignment along the way. A collective intelligence can be increased by improving the quantity, quality, and coordination of its nodes. At the same time, CHI is a form of interhuman alignment that may contain mechanics that can prove useful for AI alignment. Here "interhuman alignment" refers to the instinctual and emotional desire of humans to help each other. It does not refer to calculated, conscious trade.
In this document, I propose six research directions to explore the potential of CHI to accelerate progress on the alignment problem. Feel free to skip straight to the proposed experiments, or walk with me through definitions, framing, and background literature. Lastly, there will be a short discussion of the possible failure modes of this research agenda. If you find yourself intrigued and would like to discuss these ideas with me in more detail, feel free to reach out.
Definitions: Parameterized Intelligence
Intelligence has at least nine competing definitions. To sidestep a semantic rabbit hole, I will use the word "intelligence" to refer specifically to how good something is at "mathematical optimization", or "the selection of the best element based on a particular criterion from a set of available alternatives". Such intelligence has a direction (motivation), a depth (generality), a type (human or artificial), and a multi-dimensional property of modularity (collectivity). I won’t argue that this is the ground truth of intelligence, but it is my current working hypothesis about its conceptual structure.
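To make this working definition concrete, here is a minimal sketch in Python, assuming intelligence is measured by how well a process selects the best element from a set of alternatives under a given criterion. The names optimize, alternatives, criterion, and the toy data are illustrative assumptions, not terminology from the post.

```python
# A minimal sketch of intelligence-as-optimization: select the best
# element from a set of alternatives according to a criterion.
# All names and values here are illustrative, not from the agenda.

def optimize(alternatives, criterion):
    """Return the alternative that scores highest on the criterion."""
    return max(alternatives, key=criterion)

# The "direction" (motivation) of the optimization is the criterion itself;
# swapping in a different criterion re-aims the same machinery at a new target.
plans = ["plan_a", "plan_b", "plan_c"]
scores = {"plan_a": 0.2, "plan_b": 0.9, "plan_c": 0.5}

best = optimize(plans, criterion=scores.get)
print(best)  # -> plan_b
```

Under this framing, the motivation property discussed below is simply the choice of criterion, while generality concerns how wide a class of alternatives and criteria an optimizer can handle well.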
Let’s have a quick look at each property.
Motivation is assumed to be uncorrelated with intelligence, as stated in the orthogonality thesis. As such, it underlies much of the alignment problem itself. If intelligence were somehow related to motivation, then we could leverage that relationship to aim AI where we want it to go. Unfortunately, nearly everything is an equally valid target for optimization. Thus, the art and science of aiming an artificial intelligence at the goal of (at the very least) not killing us all is what I refer to as "the AI alignment problem". Throughout this research agenda, I'm agnostic between value and intent alignment implementations. Generating runway for solving the problem benefits both approaches, while understanding interhuman alignment will most plausibly result in us recreating whatever implementation of alignment our own brains happen to run.
Generality of intelligence can be conceptualized as the depth parameter of an optimization process. By depth I refer to the deeper patterns in reality that generalize further. Generality is a continuous or rank-ordered categorical variable with extremely many categories, such that one can learn ever more general patterns (up to some unknown limit). It is the opposite of narrow intelligence - the ability t...