The Nonlinear Library: Alignment Forum

AF - A hermeneutic net for agency by Tsvi Benson-Tilsen



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A hermeneutic net for agency, published by Tsvi Benson-Tilsen on January 1, 2024 on The AI Alignment Forum.
[Metadata: crossposted from https://tsvibt.blogspot.com/2023/09/a-hermeneutic-net-for-agency.html. First completed September 4, 2023.]
A hermeneutic net for agency is a natural method to try for resolving a bunch of philosophical difficulties relatively quickly. Not to say that it would work; it's just the obvious thing to try.
Thanks to Sam Eisenstat for related conversations.
Summary
To create AGI that's aligned with human wanting, it's necessary to design deep mental structures and resolve confusions about mind. To design structures and resolve confusions, we want to think in terms of suitable concepts. We don't already have the concepts we'd need to think clearly enough about minds. So we want to modify our concepts and create new concepts. The new concepts have to be selected by the Criterion of providing suitable elements of thinking that will be adequate to create AGI that's aligned with human wanting.
The Criterion of providing suitable elements of thinking is expressed in propositions. These propositions use the concepts we already have. Since the concepts we already have are inadequate, the propositions do not express the Criterion quite rightly. So, we question one concept, with the goal of replacing it with one or more concepts that will more suitably play the role that the current concept is playing.
But when we try to answer the demands of a proposition, we're also told to question the other concepts used by that proposition. The other concepts are not already suitable to be questioned, and they will themselves, if questioned, tell us to question yet more concepts. Lacking all conviction, we give up even before we are really overwhelmed.
The hermeneutic net would brute-force this problem by analyzing all the concepts relevant to AGI alignment "at once". In the hermeneutic net, each concept would be questioned, simultaneously trying to rectify or replace that concept and also trying to preliminarily analyze the concept. The concept is preliminarily analyzed in preparation, so that, even if it is not in its final form, it at least makes itself suitably available for adjacent inquiries.
The preliminary analysis collects examples, lays out intuitions, lays out formal concepts, lays out the relations between these examples, intuitions, and formal concepts, collects desiderata for the concept (such as propositions that use the concept), and finds inconsistencies in the use of the concept and in propositions asserted about it.
Then, when it comes time to think about another related concept (for example, "corrigibility", which involves "trying" and "flaw" and "self" and "agent" and so on), those constituent concepts ("flaw" and so on) have already been prepared to assist the inquiry about "corrigibility". They have been prepared so that they easily offer up, to that inquiry, the rearrangeable conceptual material needed to arrange a novel, suitable idea of "flaw": one that is locally suitable to the inquiry about "corrigibility" (suitable, that is, in the role that the inquiry preliminarily assigned to the preliminary idea of "flaw"), and whose meaning will also be mostly relevant and mostly transferable to other contexts that want to use the idea of "flaw".
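To make the shape of this proposal a bit more concrete, here is a minimal illustrative sketch, not from the original post: it pictures the hermeneutic net as a graph of concepts, each carrying its preliminary analysis, with every concept prepared "at once" before any single concept (such as "corrigibility") is deeply questioned. All names here (ConceptNode, preliminary_analysis, question_concept) are hypothetical stand-ins for open-ended conceptual work, not an actual procedure.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptNode:
    """A concept in the hermeneutic net (hypothetical representation)."""
    name: str
    examples: list[str] = field(default_factory=list)        # collected examples
    intuitions: list[str] = field(default_factory=list)      # laid-out intuitions
    formalizations: list[str] = field(default_factory=list)  # candidate formal concepts
    desiderata: list[str] = field(default_factory=list)      # propositions that use the concept
    inconsistencies: list[str] = field(default_factory=list) # tensions found in its use
    neighbors: set[str] = field(default_factory=set)         # related concepts it relies on

def preliminary_analysis(node: ConceptNode) -> None:
    """Prepare a concept so it is suitably available to adjacent inquiries.
    A stub here; in practice this is open-ended conceptual work."""
    ...

def question_concept(net: dict[str, ConceptNode], target: str) -> None:
    """Question one concept, drawing on the preliminary analyses of its neighbors."""
    node = net[target]
    prepared_material = {n: net[n] for n in node.neighbors if n in net}
    # The prepared material (examples, intuitions, formalizations, desiderata)
    # of neighbors like "flaw" or "trying" is what would get rearranged into a
    # rectified or replaced version of the target concept.
    _ = prepared_material  # placeholder for the actual conceptual work

# "At once": preliminarily analyze every concept before deep inquiry into any one.
net = {name: ConceptNode(name) for name in
       ["corrigibility", "trying", "flaw", "self", "agent"]}
net["corrigibility"].neighbors = {"trying", "flaw", "self", "agent"}
for node in net.values():
    preliminary_analysis(node)
question_concept(net, "corrigibility")
```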
The need for better concepts
Hopeworthy paths start with pretheoretical concepts
The only sort of pathway that appears hopeworthy for working out how to align an AGI with human wanting is the sort of pathway that starts with a pretheoretical idea, one that relies heavily on inexplicit intuitions and is expressed in common language. As an exemplar, take the "Hard problem of corrigibility":
The "hard problem of corrigibility" is to bui...