Scale, schlep, and systems This startlingly fast progress in LLMs was driven both by scaling up LLMs and doing schlep to make usable systems out of them. We think scale and schlep will both improve rapidly.
Language models surprised us Most experts were surprised by progress in language models in 2022 and 2023. There may be more surprises ahead, so experts should register their forecasts now about 2024 and 2025.
The costs of caution If you thought we might be able to cure cancer in 2200, then I think you ought to expect there’s a good chance we can do it within years of the advent of AI systems that can do the research work humans can do.
Is it time for a pause? The single most important thing we can do is to pause when the next model we train would be powerful enough to obsolete humans entirely. If it were up to me, I would slow down AI development starting now — and then later slow down even more.
The ethics of AI red-teaming If we’ve decided we’re collectively fine with unleashing millions of spam bots, then the least we can do is actually study what they can – and can’t – do.
Playing the training game We're creating incentives for AI systems to make their behavior look as desirable as possible, while intentionally disregarding human intent when that conflicts with maximizing reward.
Situational awareness AI systems that have a precise understanding of how they’ll be evaluated and what behavior we want them to display will earn more reward than AI systems that don’t.