The Nonlinear Library

AF - Löb's Theorem for implicit reasoning in natural language: Löbian party invitations by Andrew Critch


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Löb's Theorem for implicit reasoning in natural language: Löbian party invitations, published by Andrew Critch on January 1, 2023 on The AI Alignment Forum.
Related to: Löb's Lemma: an easier approach to Löb's Theorem.
Natural language models are really taking off, and it turns out there's an analogue of Löb's Theorem that occurs entirely in natural language — no math needed. This post will walk you through the details in a simple example: a very implicit party invitation.
Motivation
(Skip this if you just want to see the argument.)
Understanding the structure here may be helpful for anticipating whether Löbian phenomena can, will, or should arise amongst language-based AI systems. For instance, Löb's Theorem has implications for the emergence of cooperation and defection in groups of formally defined agents (LaVictoire et al, 2014; Critch, Dennis, Russell, 2022). The natural language version of Löb could play a similar role amongst agents that use language, which is something I plan to explore in a future post. Aside from being fun, I'm hoping this post will make clear that the phenomenon underlying Löb's Theorem isn't just a feature of formal logic or arithmetic, but of any language that can talk about reasoning and deduction in that language, including English. And as Ben Pace points out here, invitations are often self-referential, such as when people say "You are hereby invited to the party": hereby means "by this utterance" (google search).
So invitations are a natural place to explore the kind of self-reference happening in Löb's Theorem.
This post isn't really intended as an "explanation" of Löb's Theorem in its classical form, which is about arithmetic. Rather, the arguments here stand entirely on their own, are written in natural language, and are about natural language phenomena. That said, this post could still function as an "explanation" of Löb's Theorem because of the tight analogy with it.
Implicitness
Okay, imagine there's a party, and maybe you're invited to it. Or maybe you're implicitly invited to it. Either way, we'll be talking a bunch about things being implicit, with phrasing like this:
"It's implicit that X",
"Implicitly X", or
"X is implicit".
These will all mean "X is implied by things that are known (to you) (via deduction or logical inference)".
Explicit knowledge is also implicit. In this technical sense of the word, "implicit" and "explicit" are not mutually exclusive: X trivially implies X, so if you explicitly observed X in the world, then you also know X implicitly. If you find this bothersome or confusing, just grant me this anyway, or skip to "Why I don't treat 'implicit' and 'inexplicit' as synonyms here" at the end.
Abbreviations. To abbreviate things and to show there's a simple structure at play here, I'll sometimes use the box symbol "□" as shorthand to say things are implicit:
"□(cats love kittens)" will mean "It's implicit that cats love kittens"
"□X" will mean "It's implicit that X"
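To make the definition of "implicit" concrete, here is a minimal Python sketch that is not part of the original post. It assumes that "things that are known" can be modeled as a set of plain strings and that deduction can be modeled as simple if-then rules. The names `deductive_closure`, `box`, `known`, and `rules` are all hypothetical, chosen only for illustration. This toy model cannot express sentences that talk about their own implicitness, so it illustrates only the □ notation, not the Löbian argument itself.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# Hypothetical representation: a rule says that if every premise is known,
# then the conclusion follows.
Rule = Tuple[FrozenSet[str], str]


def deductive_closure(known: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Return everything implied by `known` under `rules` (forward chaining)."""
    closure = set(known)
    rule_list = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            # Fire a rule once all of its premises are already in the closure.
            if premises <= closure and conclusion not in closure:
                closure.add(conclusion)
                changed = True
    return closure


def box(statement: str, known: Iterable[str], rules: Iterable[Rule]) -> bool:
    """Model of "□statement": is the statement implied by what is known?"""
    return statement in deductive_closure(known, rules)


# Illustrative facts and rules (assumptions, not taken from the post).
known = {"cats are mammals"}
rules = [(frozenset({"cats are mammals"}), "cats nurse their kittens")]

print(box("cats are mammals", known, rules))          # explicitly known, so also implicit
print(box("cats nurse their kittens", known, rules))  # implicit via one deduction step
print(box("cats love kittens", known, rules))         # not implied by anything known
```

Under these assumptions, anything explicitly known is automatically in the closure, which matches the point that explicit knowledge is also implicit.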
A peculiar invitation
Okay! Let p be the statement "You're invited to the party". You'd love to receive such a straightforward invitation to the party, like some people did, those poo poo heads, but instead the host just sends you the following intriguing message:
Abbreviation: □p → p
Interesting! Normally, being invited to a party and being implicitly invited are not the same thing, but for you in this case, apparently they are. Seeing this, you might feel like the host is hinting around at implicitly inviting you, and maybe you'll start to wonder if you're implicitly invited by virtue of the kind of hinting around that the host is doing with this very message. Well then, you'd be right! Here's how.
For the moment, forget about the host's message, and consider the following sentence, without assuming its truth (or implicitness):
Ψ:
The sentenc...