Paper Title
A Theoretical Study on Solving Continual Learning
Paper Authors
Paper Abstract
Continual learning (CL) learns a sequence of tasks incrementally. There are two popular CL settings: class incremental learning (CIL) and task incremental learning (TIL). A major challenge of CL is catastrophic forgetting (CF). While a number of techniques are already available to effectively overcome CF for TIL, CIL remains highly challenging. So far, little theoretical study has been done to provide principled guidance on how to solve the CIL problem. This paper performs such a study. It first shows that, probabilistically, the CIL problem can be decomposed into two sub-problems: Within-task Prediction (WP) and Task-id Prediction (TP). It further proves that TP is correlated with out-of-distribution (OOD) detection, which connects CIL and OOD detection. The key conclusion of this study is that, regardless of whether WP and TP (or OOD detection) are defined explicitly or implicitly by a CIL algorithm, good WP and good TP (or OOD detection) are necessary and sufficient for good CIL performance. Additionally, TIL is simply WP. Based on these theoretical results, new CIL methods are also designed, which outperform strong baselines in both the CIL and TIL settings by a large margin.
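The decomposition stated in the abstract can be illustrated numerically: the CIL probability of class j in task i factorizes as P(class j of task i | x) = P(class j | task i, x) · P(task i | x), i.e. a within-task prediction weighted by a task-id prediction. The sketch below is an illustrative assumption, not the paper's actual method: it uses a random per-task softmax head as WP and normalized per-task scores (e.g. from an OOD detector) as TP.

```python
import numpy as np

# Illustrative sketch (not the paper's method): combine a hypothetical
# within-task predictor (WP) and task-id predictor (TP) into a CIL prediction.
rng = np.random.default_rng(0)
num_tasks, classes_per_task = 3, 4

# WP: P(class j | task i, x) -- a softmax over each task's own classes.
wp_logits = rng.normal(size=(num_tasks, classes_per_task))
wp = np.exp(wp_logits) / np.exp(wp_logits).sum(axis=1, keepdims=True)

# TP: P(task i | x) -- here, arbitrary per-task scores normalized to a
# distribution; in practice these could come from OOD-detection scores.
tp_scores = rng.random(num_tasks)
tp = tp_scores / tp_scores.sum()

# CIL prediction: product of WP and TP, flattened over all classes seen so far.
cil = (wp * tp[:, None]).ravel()

# The result is a valid distribution over all num_tasks * classes_per_task classes.
assert np.isclose(cil.sum(), 1.0)
predicted_class = int(cil.argmax())
```

Because each WP row sums to 1 and TP sums to 1, the combined CIL scores form a proper distribution over all classes seen so far, which is exactly why good WP together with good TP suffices for good CIL.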