We are not short of attempts at achieving continual learning — self-distillation, real-time RL, memory scaffolds, replay methods, regularization, gradient projections, KL penalties, on-policy data, and countless more.

My complaint with a lot of these is that they are not even trying to solve the right problem.

This is my attempt to sketch out a principled, ambitious definition of continual learning in LLMs, grounded in both the classical ML literature and some of the discourse on the topic. I will also make my case for in-weights continual learning and lay out some of the open questions ahead.

TLDR: we are interested in an LLM being able to efficiently and compositionally learn new capabilities during sequential exposure to new, differently-distributed data, while at least preserving general capabilities.
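
To make the TLDR a little more concrete, here is a minimal sketch of how one might score a sequential training run against two of its clauses: learning new capabilities, and at least preserving general ones. Everything in it is an assumption made for illustration (the `StageResult` fields, the `summarize` helper, the example scores); it is not the evaluation setup this post goes on to develop, and it says nothing about the "efficiently" or "compositionally" parts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StageResult:
    """Hypothetical scores recorded around one stage of sequential training."""
    new_task_before: float  # score on this stage's new task before training on it
    new_task_after: float   # score on the same task after training on it
    general_after: float    # score on a fixed general benchmark after this stage


def summarize(general_baseline: float, stages: list[StageResult]) -> dict[str, float]:
    """Summarize learning (gain on new tasks) and retention (general capability)."""
    if not stages:
        raise ValueError("need at least one training stage")
    gains = [s.new_task_after - s.new_task_before for s in stages]
    # Largest drop in the general score relative to the starting model;
    # clamped at zero so that improvements do not count as negative forgetting.
    worst_drop = max(general_baseline - s.general_after for s in stages)
    return {
        "mean_new_task_gain": sum(gains) / len(gains),
        "worst_general_drop": max(0.0, worst_drop),
    }


# Made-up numbers for two sequential stages, starting from a general score of 0.60.
example = [StageResult(0.2, 0.7, 0.58), StageResult(0.1, 0.5, 0.55)]
print(summarize(0.60, example))
```

Under this toy framing, a method does well when `mean_new_task_gain` is large while `worst_general_drop` stays near zero. Optimizing only the first number is the failure mode that catastrophic forgetting describes.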

This is part 1; soon we will share some exciting approaches we've been developing with @PrimeIntellect, aimed at actually evaluating methods against this principled definition.

A breakdown of the desiderata

The foundational problem at the core of continual learning (CL) is that of catastrophic forgetting — models trained on new task distributions exhibit worse performance on th...

Defining Continual Learning
