The Nonlinear Library: Alignment Forum

AF - Which Model Properties are Necessary for Evaluating an Argument? by Vojtech Kovarik



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Which Model Properties are Necessary for Evaluating an Argument?, published by Vojtech Kovarik on February 21, 2024 on The AI Alignment Forum.
This post overlaps with our recent paper Extinction Risk from AI: Invisible to Science?.
Summary: If you want to use a mathematical model to evaluate an argument, that model needs to allow for the dynamics that are crucial to the argument. For example, if you want to evaluate the argument that "a rocket will miss the Moon because the Moon moves", then arguably you can't use a model in which the Moon is stationary.
This seems, and is, kind of obvious. However, I think this principle has some less obvious implications, including as a methodology for determining what does and doesn't constitute evidence for AI risk. Additionally, I think some of what I write is quite debatable --- if not the ideas, then definitely the formulations. So I am making this a separate post, to decouple the critiques of these ideas and formulations from the discussion of other ideas that build on top of them.
Epistemic status: I am confident that the ideas the text is trying to point at are valid and important. And I think they are not appreciated widely enough. At the same time, I don't think that I have found the right way to phrase them. So objections are welcome, and if you can figure out how to phrase some of this in a clearer way, that would be even better.
Motivating Story
Consider the following fictitious scenario where Alice and Bob disagree on whether Alice's rocket will succeed at landing on the Moon[1]. (Unsurprisingly, we can view this as a metaphor for disagreements about the success of plans to align AGI. However, the point I am trying to make is supposed to be applicable more generally.)
Alice: Look! I have built a rocket. I am sure that if I launch it, it will land on the Moon.
Bob: I don't think it will work.
Alice: Uhm, why? I don't see a reason why it shouldn't work.
Bob: Actually, I don't think the burden of proof is on me here. And honestly, I don't know what exactly your rocket will do. But I see many arguments I could give you, for why the rocket is unlikely to land on the Moon. So let me try to give you a simple one.
Alice: Sure, go ahead.
Bob: Do I understand it correctly that the rocket is currently pointing at the Moon, and it has no way to steer after launch?
Alice: That's right.
Bob: Ok, so my argument is this: the rocket will miss because the Moon moves.[2] That is, let's say the rocket is pointed at where the Moon is at the time of the launch. Then by the time the rocket would have reached the Moon's position, the Moon will already be somewhere else.
Alice: Ok, fine. Let's say I agree that this argument makes sense. But that still doesn't mean that the rocket will miss!
Bob: That's true. Based on what I said, the rocket could still fly so fast that the Moon wouldn't have time to move away.
Alice: Alright. So let's try to do a straightforward evaluation[3] of your argument. I suppose that would mean building some mathematical model, finding the correct data to plug into it (for example, the distance to the Moon), and calculating whether the model-rocket will model-hit the model-Moon.
Bob: Exactly. Now to figure out which model we should use for this...
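To make Alice's plan concrete, here is a minimal sketch, in Python, of the kind of model Bob's argument calls for. It assumes the rocket flies in a straight line at constant speed toward the Moon's position at launch, while the Moon keeps moving along its orbit; the numbers are rough standard values for the Earth-Moon system, not figures from the dialogue.

```python
# A back-of-envelope model of Bob's argument. Assumptions (not from the
# dialogue): the rocket flies in a straight line at constant speed toward
# the point where the Moon was at launch, while the Moon keeps moving
# along its orbit at its mean speed.

MOON_DISTANCE_M = 384_400e3  # mean Earth-Moon distance, metres
MOON_SPEED_M_S = 1_022.0     # Moon's mean orbital speed, metres per second
MOON_RADIUS_M = 1_737e3      # Moon's radius, metres

def miss_distance_m(rocket_speed_m_s: float) -> float:
    """Approximate distance between the Moon and the rocket's aim point
    at the moment the rocket arrives there."""
    travel_time_s = MOON_DISTANCE_M / rocket_speed_m_s
    # How far the Moon moves along its orbit during the flight; for short
    # flights this arc length is close to the straight-line offset.
    return MOON_SPEED_M_S * travel_time_s

for v in (11_000.0, 100_000.0, 1_000_000.0):  # rocket speed, metres per second
    offset = miss_distance_m(v)
    verdict = "model-hit" if offset < MOON_RADIUS_M else "model-miss"
    print(f"{v:>11,.0f} m/s: Moon moves {offset / 1e3:>9,.0f} km -> {verdict}")
```

On these assumptions, the model-rocket model-hits only above roughly 226 km/s. A model with a stationary Moon, by contrast, could not express Bob's argument at all, which is exactly the point of the summary above.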
Evaluating Arguments, rather than Propositions
An important aspect of the scenario above is that Alice and Bob decided not to focus directly on the proposition "The rocket will hit the Moon.", but instead on a particular argument against that proposition.
This approach has the important advantage that, if the argument is simple, evaluating it can be much easier than evaluating the proposition. For example, suppose the rocket will in fact miss "because" of some other reason, such as exploding soon after launch or colliding with a satellite. Then this might be beyond Alice and Bob's ability t...