Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Updating Drexler's CAIS model, published by Matthew Barnett on June 16, 2023 on LessWrong.
Eric Drexler's report Reframing Superintelligence: Comprehensive AI Services (CAIS) as General Intelligence reshaped how a lot of people think about AI (summary 1, summary 2). I still agree with many parts of it, perhaps even the core elements of the model. However, after looking back on it more than four years later, I think the general picture it gave missed some crucial details about how AI will go.
The problem seems to be that his report neglected a discussion of foundation models, which I think have transformed how we should think about AI services and specialization.
The general vibe I got from CAIS (which may not have been Drexler's intention) was something like the following picture:
For each task in the economy, we will train a model from scratch to automate the task, using the minimum compute necessary to train an AI to do well on the task. Over time, the fraction of tasks automated will slowly expand like a wave, starting with the tasks that are cheapest to automate computationally, and ending with the most expensive tasks. At some point, automation will be so widespread that it will begin to meaningfully feed into itself, increasing AI R&D, and accelerating the rate of technological progress.
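To make this picture concrete, here is a toy simulation of that wave. It's a sketch under made-up assumptions (the cost distribution, growth rates, and feedback strength are all mine, not Drexler's): per-task training budgets grow each year, the cheapest tasks get automated first, and the automated fraction feeds back into how fast budgets grow.

```python
# Toy "wave of automation" model. All numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical compute cost to automate each of 1,000 tasks (arbitrary units),
# spread over several orders of magnitude.
task_costs = np.sort(rng.lognormal(mean=0.0, sigma=2.0, size=1_000))

budget = 0.01  # compute budget available per task, same arbitrary units
for year in range(30):
    # A task counts as automated once the budget covers its training cost.
    automated = np.searchsorted(task_costs, budget) / len(task_costs)
    # Automation feeds back into growth: more automation, faster budget growth.
    budget *= 1.5 + 2.0 * automated
    print(f"year {year:2d}: {automated:6.1%} of tasks automated")
```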
The problem with this approach to automation is that it's extremely wasteful to train models from scratch for each task. It might make sense when training budgets are tiny — as they mostly were in 2018 — but it doesn't make sense when it takes 10^25 FLOP to reach adequate performance on a given set of tasks.
The big obvious-in-hindsight idea that we've gotten over the last several years is that, instead of training from scratch for each new task, we'll train a foundation model on some general distribution, which can then be fine-tuned using small amounts of compute to perform well on any given task. In the CAIS model, "general intelligence" is just the name we can give to the collection of all AI services in the economy. In this new paradigm, "general intelligence" refers to the fact that sufficiently large foundation models can efficiently transfer their knowledge to obtain high performance on almost any downstream task, which is pretty closely analogous to how humans learn to take over new jobs.
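As a concrete illustration of this paradigm, here is a minimal sketch in PyTorch (the shapes, synthetic data, and training setup are stand-ins I made up, not anyone's actual code): an expensive pretrained backbone is reused as-is, and only a small task-specific head is trained with a little compute.

```python
import torch
import torch.nn as nn

# Stand-in for an expensive pretrained foundation model; in practice its
# weights would come from a costly pretraining run, not fresh initialization.
backbone = nn.Sequential(
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
)
for p in backbone.parameters():
    p.requires_grad = False  # reuse the general capabilities unchanged

# Small task-specific head: the only part trained per downstream task.
head = nn.Linear(512, 10)  # e.g. a 10-class downstream task
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-in for a small downstream dataset (256 labeled examples).
X, y = torch.randn(256, 512), torch.randint(0, 10, (256,))

for _ in range(100):
    loss = loss_fn(head(backbone(X)), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point of the sketch is the cost structure: the backbone's parameters are frozen, so per-task compute scales with the tiny head and dataset, not with the foundation model's pretraining run.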
The fact that generalist models can be efficiently adapted to perform well on almost any task is an incredibly important fact about our world, because it implies that a very large fraction of the cost of automation can be amortized across almost all tasks: the expensive pretraining is paid once, and every task shares it.
Let me illustrate this fact with a hypothetical example.
Suppose we previously thought that it would take $1 trillion to automate each task in our economy, such as language translation, box stacking, and driving cars. Since automating each of these tasks costs $1 trillion, you might expect companies would slowly automate all the tasks in the economy, starting with the most profitable ones, and then finally getting around to the least profitable ones once economic growth allowed us to spend enough money on automating not-very-profitable stuff.
But now suppose we think it costs $999 billion to create "general intelligence", which, once built, can be quickly adapted to automate any other task at a cost of $1 billion. In this world, we will go very quickly from being able to automate almost nothing to being able to automate almost anything. In other words, we will get one big innovation "lump", which is the opposite of what Robin Hanson predicted. Even if we won't invent monolithic agents that take over the world by being smarter than everything else, we won't have a gradual decades-long ramp-up to full automation either.
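The arithmetic behind this lump fits in a few lines. Here is a toy model using the hypothetical dollar figures above (the 1,000-task count and the budget schedule are my own illustrative choices):

```python
# Toy comparison of the two cost regimes, using the hypothetical figures
# from the example above. All dollar amounts are in billions.
N_TASKS = 1_000
PER_TASK = 1_000  # $1 trillion to automate each task from scratch
GENERAL = 999     # $999 billion to build "general intelligence"
ADAPT = 1         # $1 billion to adapt it to each additional task

def tasks_automated_per_task(budget):
    """From-scratch regime: every task pays the full cost independently."""
    return min(N_TASKS, budget // PER_TASK)

def tasks_automated_foundation(budget):
    """Foundation-model regime: one big fixed cost, then cheap adaptation."""
    if budget < GENERAL:
        return 0
    return min(N_TASKS, (budget - GENERAL) // ADAPT)

for budget in [500, 999, 1_000, 1_500, 2_000]:
    print(f"${budget}B: per-task regime automates "
          f"{tasks_automated_per_task(budget)} tasks, "
          f"foundation regime automates "
          f"{tasks_automated_foundation(budget)} tasks")
```

Under the per-task regime, doubling total spending from $1 trillion to $2 trillion automates exactly one extra task; under the foundation-model regime, the same doubling jumps from 1 task to all 1,000.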
Of course, the degree of suddenness in the foundation model paradigm is still debatable, because ...