Paper Title

Neural Network Retraining for Model Serving

Paper Authors

Diego Klabjan, Xiaofeng Zhu

Paper Abstract

We propose incremental (re)training of a neural network model to cope with a continuous flow of new data in inference during model serving. As such, this is a life-long learning process. We address two challenges of life-long retraining: catastrophic forgetting and efficient retraining. If we combine all past and new data it can easily become intractable to retrain the neural network model. On the other hand, if the model is retrained using only new data, it can easily suffer catastrophic forgetting and thus it is paramount to strike the right balance. Moreover, if we retrain all weights of the model every time new data is collected, retraining tends to require too many computing resources. To solve these two issues, we propose a novel retraining model that can select important samples and important weights utilizing multi-armed bandits. To further address forgetting, we propose a new regularization term focusing on synapse and neuron importance. We analyze multiple datasets to document the outcome of the proposed retraining methods. Various experiments demonstrate that our retraining methodologies mitigate the catastrophic forgetting problem while boosting model performance.
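
Below is a minimal, self-contained sketch of the two ideas described in the abstract: a multi-armed bandit that picks which part of the network to retrain when new serving data arrives, and a quadratic importance-weighted penalty that discourages important weights from drifting. This is not the authors' exact algorithm: treating each layer as a bandit arm, using UCB1, and the EWC-style squared-difference penalty (names such as `UCB1`, `retrain_step`, `importance`, and `lam` included) are illustrative assumptions for exposition only.

```python
import math
import torch
import torch.nn as nn


class UCB1:
    """UCB1 bandit; each arm is a group of weights (here, one layer)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.t = 0

    def select(self):
        self.t += 1
        for a, c in enumerate(self.counts):
            if c == 0:                      # play every arm once first
                return a
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def retrain_step(model, layers, bandit, batch, old_params, importance, lam=1.0):
    """One incremental retraining step on a batch of newly collected data.

    old_params / importance map parameter names to the pre-retraining weights
    and an importance estimate (e.g., accumulated squared gradients on past
    data), so only important weights are strongly anchored."""
    x, y = batch

    # Bandit chooses which layer (arm) to retrain; all other weights stay frozen.
    arm = bandit.select()
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad_(i == arm)
    opt = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=1e-2
    )

    # Task loss on new data plus importance-weighted penalty against forgetting.
    loss = nn.functional.cross_entropy(model(x), y)
    penalty = sum(
        (importance[n] * (p - old_params[n]) ** 2).sum()
        for n, p in model.named_parameters()
        if p.requires_grad
    )
    (loss + lam * penalty).backward()
    opt.step()
    opt.zero_grad()

    bandit.update(arm, reward=-loss.item())  # lower loss -> higher reward
    return loss.item()
```

A caller would keep `old_params` and `importance` from the previously served model and invoke `retrain_step` as each new batch of serving data arrives, so retraining touches only the bandit-selected weights rather than the full network.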
