The Nonlinear Library

LW - Capabilities and alignment of LLM cognitive architectures by Seth Herd



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Capabilities and alignment of LLM cognitive architectures, published by Seth Herd on April 18, 2023 on LessWrong.
Epistemic status: Hoping for help working through all of these new ideas.
TLDR:
Scaffolded, "agentized" LLMs that combine and extend the approaches in AutoGPT, HuggingGPT, Reflexion, and BabyAGI seem likely to be a focus of near-term AI development. LLMs by themselves are like a human with great automatic language processing, but no goal-directed agency, executive function, episodic memory, or sensory processing. Recent work has added all of these to LLMs, making language model cognitive architectures (LMCAs). These implementations are currently limited but will improve.
Cognitive capacities interact synergistically in human cognition. In addition, this new direction of development will allow individuals and small businesses to contribute to progress on AGI. These compounding factors may accelerate progress in this direction. LMCAs might well become intelligent enough to create X-risk before other forms of AGI. I expect LMCAs to enhance the effective intelligence of LLMs by performing extensive, iterative, goal-directed "thinking" that incorporates topic-relevant web searches.
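The iterative, goal-directed "thinking" described above can be sketched as a simple agent loop. This is a minimal illustration, not any real system's implementation: `llm()` and `web_search()` are hypothetical stand-ins for an LLM API call and a search tool, and the list of past steps plays the role of episodic memory.

```python
def llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call. Here it issues a search
    # action first, then declares the goal done once some history exists.
    if "->" in prompt:
        return "DONE: summarize what was found"
    return "search: LLM cognitive architectures"

def web_search(query: str) -> str:
    # Hypothetical stand-in for a real web-search tool.
    return f"(results for '{query}')"

def run_agent(goal: str, max_steps: int = 5) -> list[str]:
    episodic_memory: list[str] = []  # record of past steps and results
    for _ in range(max_steps):
        context = "\n".join(episodic_memory)
        step = llm(f"Goal: {goal}\nHistory:\n{context}\nWhat is the next step?")
        if step.startswith("DONE"):
            episodic_memory.append(step)
            break
        if step.startswith("search:"):
            result = web_search(step.removeprefix("search:").strip())
            episodic_memory.append(f"{step} -> {result}")
    return episodic_memory
```

The loop structure, not the stubbed responses, is the point: the LLM proposes steps in natural language, the scaffold executes them, and results accumulate in a memory that conditions the next call.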
The possible shortening of timelines-to-AGI is a downside, but the upside may be even larger. LMCAs pursue goals and do much of their “thinking” in natural language, enabling a natural language alignment (NLA) approach. They reason about and balance ethical goals much as humans do. This approach to AGI and alignment has large potential benefits relative to existing approaches to AGI and alignment.
Overview
I still think it's likely that agentized LLMs will change the alignment landscape for the better, although I've tempered my optimism a bit since writing that. A big piece of the logic for that hypothesis is why I expect this approach to become very useful, and possibly become the de-facto standard for AGI progress. The other piece was the potential positive impacts on alignment work. Both of those pieces of logic were compressed in that post. I expand on them here.
Beginning with a caveat may be appropriate since much of the below sounds both speculative and optimistic. I describe many potential improvements and positive-sum synergies between different capabilities. There will surely be difficulties and many things that don’t work as well or as easily as they might, for deep reasons that will slow development. It’s quite possible that there are enough of those things that this direction will be eclipsed by continued development of large models, and that progress in integrating cognitive capacities will take a different route. In particular, this approach relies heavily on calls to large language models (LLMs). Calling cutting-edge LLMs will continue to have nontrivial costs in both time and money, as they require substantial computing resources. These may hamper this direction, or drive progress in substantially less human-like (e.g., parallel) or interpretable (e.g., a move to non-natural language core processing) directions.
With these caveats in mind, I think the potentials for capabilities and alignment are enough to merit serious consideration from the alignment community, even this early in the game.
I think AutoGPT, HuggingGPT, and similar script wrappers and tool extensions for LLMs are just the beginning, and there are low-hanging fruit and synergies that will add capability to LLMs, effectively enhancing their intelligence and usefulness. This approach makes an LLM the natural language cognitive engine at the center of a cognitive architecture. Cognitive architectures are computational models of human brain function, including separate cognitive capacities that work synergistically.
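The "LLM as natural language cognitive engine" idea above can be illustrated with a routing sketch. All names here (the capacities, the keyword routing) are illustrative assumptions, not taken from the post or any particular codebase; a real LMCA would route via LLM output rather than keyword matching.

```python
from typing import Callable

def engine(prompt: str) -> str:
    # Stand-in for the LLM deciding which cognitive capacity to invoke.
    # A real system would parse the LLM's natural-language response.
    if "remember" in prompt:
        return "memory"
    if "plan" in prompt:
        return "executive"
    return "act"

# Separate cognitive capacities wrapped around the engine (illustrative).
CAPACITIES: dict[str, Callable[[str], str]] = {
    "memory": lambda p: f"retrieved notes relevant to: {p}",
    "executive": lambda p: f"plan drafted for: {p}",
    "act": lambda p: f"action taken for: {p}",
}

def cognitive_step(prompt: str) -> str:
    # The architecture consults the engine, then runs the chosen capacity.
    return CAPACITIES[engine(prompt)](prompt)
```

The division of labor mirrors the cognitive-architecture framing: the LLM supplies language-level reasoning, while the surrounding scaffold supplies memory, executive control, and action.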
Cognitive architectures are a longstanding field of research at the conjunctio...

The Nonlinear Library, by The Nonlinear Fund
