Paper Title
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
Paper Authors
Paper Abstract
Low-precision deep neural network (DNN) training is one of the most effective techniques for boosting DNNs' training efficiency, as it trims down the training cost at the finest, bit level. While existing works mostly fix the model precision during the whole training process, a few pioneering works have shown that dynamic precision schedules help DNNs converge to higher accuracy while incurring a lower training cost than their static-precision training counterparts. However, existing dynamic low-precision training methods rely on manually designed precision schedules to achieve advantageous efficiency and accuracy trade-offs, limiting their broader practical applicability and achievable performance. To this end, we propose LDP, a Learnable Dynamic Precision DNN training framework that can automatically learn a temporally and spatially dynamic precision schedule during training towards optimal accuracy and efficiency trade-offs. It is worth noting that LDP-trained DNNs are by nature efficient during inference. Furthermore, we visualize the resulting temporal and spatial precision schedules and distributions of LDP-trained DNNs on different tasks to better understand the corresponding DNNs' characteristics at different training stages and DNN layers, both during and after training, drawing insights for promoting further innovations. Extensive experiments and ablation studies (seven networks, five datasets, and three tasks) show that the proposed LDP consistently outperforms state-of-the-art (SOTA) low-precision DNN training techniques in terms of the trade-off between training efficiency and achieved accuracy. For example, in addition to having the advantage of being automated, our LDP achieves a 0.31\% higher accuracy with a 39.1\% lower computational cost when training ResNet-20 on CIFAR-10, as compared with the best SOTA method.
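To make the notion of a temporally and spatially dynamic precision schedule concrete, below is a minimal NumPy sketch. It uses a generic symmetric uniform quantizer and hard-coded per-layer bit-widths that change across training steps; these helper names and values are illustrative assumptions only — in LDP the schedule is learned during training, and the paper's exact quantizer is not specified in the abstract.

```python
import numpy as np

def quantize(x, bits):
    """Symmetric per-tensor uniform quantization to `bits` bits (bits >= 2).
    Hypothetical quantizer for illustration, not the paper's exact one."""
    if bits >= 32:
        return x  # treat 32 bits as full precision
    scale = np.max(np.abs(x)) + 1e-8          # dynamic range of the tensor
    levels = 2 ** (bits - 1) - 1              # signed quantization levels
    return np.round(x / scale * levels) / levels * scale

# Toy "network": three layers of random weights.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 4)) for _ in range(3)]

# A precision schedule that is temporal (varies with training step) and
# spatial (varies per layer). Here it is a fixed placeholder dict; in LDP
# these bit-widths would be learnable parameters updated during training.
schedule = {0: [4, 6, 8], 100: [6, 8, 8]}     # step -> per-layer bit-widths

for step, bits_per_layer in schedule.items():
    q = [quantize(w, b) for w, b in zip(weights, bits_per_layer)]
    mse = [float(np.mean((w - qw) ** 2)) for w, qw in zip(weights, q)]
    print(f"step {step}: per-layer quantization MSE = "
          f"{[round(e, 5) for e in mse]}")
```

The sketch only shows the mechanics of applying a per-layer, per-step bit-width; the contribution of LDP is making those bit-widths learnable so the schedule emerges from training rather than manual design.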