Defining Continual Learning
We are not short of attempts at achieving continual learning: self-distillation, real-time RL, memory scaffolds, replay methods, regularization, gradient projections, KL penalties, on-policy data, and countless more.
My complaint with a lot of these is that they are not even trying to solve the right problem.
This is my attempt to sketch out a principled, ambitious definition of continual learning in LLMs, grounded in both the classical ML literature and some of the discourse on the topic. I will also make my case for in-weights continual learning and lay out some of the open questions ahead.
TLDR: we are interested in an LLM being able to efficiently and compositionally learn new capabilities during sequential exposure to new, differently-distributed data, while at least preserving general capabilities.
This is part 1; soon we will share some exciting approaches we've been developing with @PrimeIntellect aimed at actually evaluating approaches using this principled definition.
A breakdown of the desiderata
The foundational problem at the core of continual learning (CL) is that of catastrophic forgetting — models trained on new task distributions exhibit worse performance on th...