Paper Title
Sequential Learning Of Neural Networks for Prequential MDL
Paper Authors
Paper Abstract
Minimum Description Length (MDL) provides a framework and an objective for principled model evaluation. It formalizes Occam's Razor and can be applied to data from non-stationary sources. In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation. It thus closely resembles a continual- or online-learning problem. In this study, we evaluate approaches for computing prequential description lengths for image classification datasets with neural networks. Considering the computational cost, we find that online learning with rehearsal has favorable performance compared to the previously widely used block-wise estimation. We propose forward-calibration to better align the model's predictions with the empirical observations and introduce replay-streams, a minibatch incremental training technique that efficiently implements approximate random replay while avoiding large in-memory replay buffers. As a result, we present description lengths for a suite of image classification datasets that improve upon previously reported results by large margins.
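For reference, the cumulative next-step log-loss mentioned in the abstract is the standard prequential (plug-in) code length. A minimal sketch in notation not taken from the paper (inputs x_t, labels y_t, and a parameter estimate \hat\theta fitted only on earlier observations are assumptions for illustration):

L_preq(y_{1:n} \mid x_{1:n}) = -\sum_{t=1}^{n} \log p\big(y_t \mid x_t, \hat\theta(x_{<t}, y_{<t})\big)

Here \hat\theta(x_{<t}, y_{<t}) denotes parameters estimated from the first t-1 observations only, so each example is scored (encoded) before it is used for training; summing these losses over the whole sequence gives the prequential description length of the labels given the inputs.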