AIs accelerating AI research

By Ajeya Cotra — Apr 4, 2023

Researchers could potentially design the next generation of ML models more quickly by delegating some work to existing models, creating a feedback loop of ever-accelerating progress.

The concept of an “intelligence explosion” has played an important role in discourse about advanced AI for decades. Early computer scientist I.J. Good described it like this in 1965:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.

This presentation, like most other popular presentations of the intelligence explosion concept, focuses on what happens after we have a single AI system that can already do better at every task than any human (which Good calls an “ultraintelligent machine” above, and others have called “an artificial superintelligence”). It calls to mind an image of AI progress with two phases:

In Phase 1, humans are doing all the AI research, and progress ramps up steadily. We can more or less predict the rate of future progress (i.e. how quickly AI systems will improve their capabilities) by extrapolating from past rates of progress.^[1]
Eventually humans succeed at building an artificial superintelligence (or ASI), leading to Phase 2. In Phase 2, this ASI is doing all of the AI research by itself. All of a sudden, progress in AI capabilities is no longer bottlenecked by slow human researchers, and an intelligence explosion is kicked off. The rate of progress in AI research goes up sharply: perhaps years of progress are compressed into days or weeks.

But I think this picture is probably too all-or-nothing. Today’s large language models (LLMs) like GPT-4 are not (yet) capable of completely taking over AI research by themselves, but they are able to write code, come up with ideas for ML experiments, and help troubleshoot bugs and other issues. Anecdotally, several ML researchers I know are starting to delegate simple tasks that come up in their research to these LLMs, and they say that makes them meaningfully more productive. (When ChatGPT went down for 6 hours, I know of one ML researcher who postponed their coding tasks for 6 hours and worked on other things in the meantime.^[2])

If this holds true more broadly, researchers could potentially design and train the next generation of ML models more quickly and easily by delegating to existing LLMs.^[3] This calls to mind a more continuous “intelligence explosion” that begins before we have any single artificial superintelligence:

Currently, human researchers collectively are responsible for almost all of the progress in AI research, but are starting to delegate a small fraction of the work to large language models. This makes it somewhat easier to design and train the next generation of models.
The next generation is able to handle harder tasks and a wider variety of tasks, so human researchers delegate more of their work to it. This makes it significantly easier to train the generation after that. Using models gives a much bigger boost than it did the last time around.
Each round of this process makes the whole field move faster and faster. In each round, human researchers delegate everything they can productively delegate to the current generation of models — and the more powerful those models are, the more they contribute to research and thus the faster AI capabilities can improve.
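
To make the shape of this loop concrete, here is a deliberately crude toy model. It is a sketch under made-up assumptions, not a forecast and not taken from any of the research mentioned in this post. It assumes that some fraction of research work can be delegated to the current generation of models, that delegated work goes faster by a fixed factor, and that the delegable fraction grows with each generation. Every name and number in it (`base_years`, `delegable`, `growth`, `ai_speedup`) is hypothetical and chosen only for illustration.

```python
def generation_times(n_generations, base_years=2.0, delegable=0.05,
                     growth=1.6, ai_speedup=20.0):
    """Return the years needed to build each successive model generation.

    All parameters are illustrative assumptions:
      base_years  - time per generation if humans do all the work
      delegable   - starting fraction of research work handed to models
      growth      - factor by which that fraction grows each generation
      ai_speedup  - how much faster delegated work gets done
    """
    times = []
    for _ in range(n_generations):
        # Amdahl-style speedup: only the delegated fraction runs faster,
        # so the undelegated human work remains the bottleneck.
        speedup = 1.0 / ((1.0 - delegable) + delegable / ai_speedup)
        times.append(base_years / speedup)
        # Assumption: each new generation can take on a larger share of work,
        # capped at 1.0 (models doing all of the research).
        delegable = min(1.0, delegable * growth)
    return times


times = generation_times(8)
humans_only_total = 8 * 2.0  # same 8 generations with no help from models
for gen, years in enumerate(times, start=1):
    print(f"generation {gen}: {years:.2f} years")
print(f"total: {sum(times):.2f} years vs. {humans_only_total:.2f} with humans only")
```

With these particular numbers, each generation takes less time than the one before, and the gap widens as the delegable fraction approaches 1. The qualitative shape is the point here; the specific values carry no weight.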

This feedback loop could be getting started now. If it goes on for enough cycles without hitting any fundamental blockers, at some point our AI systems will have taken over all the work involved in designing more powerful AI systems. And it could keep going beyond that, with a research community consisting entirely of AIs working at an inhuman pace to make yet-more-sophisticated AIs. Once AI systems have automated AI research entirely, I think it’s likely that the full obsolescence regime that we discussed in our first post will come soon after.^[4]

If so, the end state would be similar to what I.J. Good envisioned: we could have “artificial superintelligence”^[5] that improves AI capabilities further and quickly leaves human capabilities far behind. But before we have artificial superintelligence, we might have already vastly accelerated the pace of progress in AI research^[6] with the help of lesser models.

Exactly how much acceleration might happen before we have AI systems that can handle all the AI research by themselves, and how much might happen after? Will it feel like a pretty sudden jump — we spend a while with some neat, mildly useful AI assistants and then all of a sudden we develop AI that obsoletes humanity? Or will we have many years in which AI systems get increasingly impressive and perceptibly accelerate the pace of progress before humans are fully obsolete?

This is a very complicated question that I’m not going to get into in this post, but my colleague Tom Davidson put out a thorough research report exploring takeoff speeds — essentially, how quickly and suddenly we move from the world of today to the obsolescence regime. If you’re interested in this topic, I’d encourage you to check it out.

One important implication of Tom’s analysis: we may hit major milestones of AI progress sooner than you’d guess, and blow past them faster than you’d guess. Suppose you have some intuitions about, say, when an AI system might be able to win a gold medal in the International Math Olympiad. If you were previously picturing human researchers doing all the work of AI research, your guess should move toward “sooner” when you factor in the possibility that AI systems themselves could start helping a lot soon. Similarly, factoring in the possibility of this feedback loop should move your guess for when we might enter the obsolescence regime toward “sooner” as well.

[1] In reality, even if humans are the only ones doing AI research, we can’t always predict future progress by simply extrapolating from past progress. For example, if AI starts to get much more attention from investors and more money floods in, it’s likely that more people will switch into AI research, meaning that future research progress might go a lot faster than recent past progress.
[2] I’d love to see more systematic data collection about this!
[3] Is this actually an interesting or significant observation? After all, lots of tools (from calculators to better programming languages to search engines) have made programmers and researchers more productive historically. What would it matter if we could add LLMs to this list? In my mind, the key difference is that ML models could provide bigger, broader productivity gains than other tools, and these gains could keep increasing massively with each jump in scale.
[4] Specifically, I’d guess this happens in less than a year.
[5] Albeit potentially distributed across multiple systems, rather than housed in one machine.
[6] And potentially in other areas of scientific R&D.