The Nonlinear Library: Alignment Forum

AF - What is a Tool? by johnswentworth


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is a Tool?, published by johnswentworth on June 25, 2024 on The AI Alignment Forum.
Throughout this post, we're going to follow the Cognition -> Convergence -> Consequences methodology[1]. That means we'll tackle tool-ness in three main stages, each building on the previous:
Cognition: What does it mean, cognitively, to view or model something as a tool?
Convergence: Insofar as different minds (e.g. different humans) tend to convergently model the same things as tools, what are the "real patterns" in the environment which give rise to that convergence?
Consequences: Having characterized the real patterns convergently recognized as tool-ness, what other properties or consequences of tool-ness can we derive? What further predictions does our characterization make?
We're not going to do any math in this post, though we will gesture at the spots where proofs or quantitative checks would ideally slot in.
Cognition: What does it mean, cognitively, to view or model something as a tool?
Let's start with a mental model of (the cognition of) problem solving, then we'll see how "tools" naturally fit into that mental model.
When problem-solving, humans often come up with partial plans - i.e. plans with "gaps" in them: subproblems the human hasn't yet thought through how to solve, but expects to be tractable. For instance, if I'm planning a road trip from San Francisco to Las Vegas, a partial plan might look like "I'll take I-5 down the Central Valley, split off around Bakersfield through the Mojave, then get on the highway between LA and Vegas".
That plan has a bunch of gaps in it: I'm not sure exactly what route I'll take out of San Francisco onto I-5 (including whether to go across or around the Bay), I don't know which specific exits to take in Bakersfield, I don't know where I'll stop for gas, I haven't decided whether I'll stop at the town museum in Boron, I might try to get pictures of the airplane storage or the solar thermal power plant, etc.
But I expect those to be tractable problems which I can solve later, so it's totally fine for my plan to have such gaps in it.
How do tools fit into that sort of problem-solving cognition?
Well, sometimes similar gaps show up in many different plans (or many times in one plan). And if those gaps are similar enough, then it might be possible to solve them all "in the same way". Sometimes we can even build a physical object which makes it easy to solve a whole cluster of similar gaps.
Consider a screwdriver, for instance. There's a whole broad class of problems for which my partial plans involve unscrewing screws. Those partial plans involve a bunch of similar "unscrew the screw" gaps, for which I usually don't think in advance about how I'll unscrew the screw, because I expect it to be tractable to solve that subproblem when the time comes. A screwdriver is a tool for that class of gaps/subproblems[2].
So here's our rough cognitive characterization:
Humans naturally solve problems using partial plans which contain "gaps", i.e. subproblems which we put off solving until later.
Sometimes there are clusters of similar gaps.
A tool makes some such cluster relatively easy to solve (see the sketch below).
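To make that characterization concrete, here's a minimal illustrative sketch in Python. The Plan, Gap, and Tool classes and all the example strings are hypothetical names invented for illustration, not anything from the post: a plan is a sequence of steps, some of which are unresolved gaps; a tool is whatever makes a whole cluster of similar gaps cheap to resolve.

```python
from dataclasses import dataclass

# Hypothetical illustration of the "partial plan with gaps" picture above;
# none of these names come from the post itself.

@dataclass(frozen=True)
class Gap:
    """A subproblem we've deferred, expecting it to be tractable later."""
    kind: str          # e.g. "unscrew-screw", "find-gas-station"
    details: str = ""

@dataclass
class Plan:
    steps: list  # each step is either a concrete action (str) or a Gap

    def gaps(self) -> list:
        return [s for s in self.steps if isinstance(s, Gap)]

@dataclass
class Tool:
    """Something that makes a whole cluster of similar gaps easy to solve."""
    name: str
    solves_kind: str

    def can_solve(self, gap: Gap) -> bool:
        return gap.kind == self.solves_kind

# A partial plan: concrete legs of the road trip plus deferred subproblems.
roadtrip = Plan(steps=[
    "Take I-5 down the Central Valley",
    Gap("find-gas-station", "somewhere past Bakersfield"),
    "Cut through the Mojave toward the LA-Vegas highway",
])

screwdriver = Tool(name="screwdriver", solves_kind="unscrew-screw")
furniture_fix = Plan(steps=[Gap("unscrew-screw", "cabinet hinge"), "replace hinge"])

# The screwdriver is a tool *for* the cluster of unscrew-the-screw gaps,
# but not for the gaps in the road trip plan.
assert all(screwdriver.can_solve(g) for g in furniture_fix.gaps())
assert not any(screwdriver.can_solve(g) for g in roadtrip.gaps())
```

The point of the sketch is just the shape of the concept: gaps are deferred subproblems, similar gaps recur across many plans, and a tool is indexed to a kind of gap rather than to any single plan.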
Convergence: Insofar as different minds (e.g. different humans) tend to convergently model the same things as tools, what are the "real patterns" in the environment which give rise to that convergence?
First things first: there are limits to how much different minds do, in fact, convergently model the same things as tools.
You know that thing where there's some weird object or class of objects, and you're not sure what it is or what it's for, but then one day you see somebody using it for its intended purpose and you're like "oh, that's what it's for"?
From this, we learn several things about tools:
Insofar as different humans convergently model the same thing...