Paper Title
Impact of Learning Rate on Noise Resistant Property of Deep Learning Models
Paper Authors
Paper Abstract
Interest in analog computation has grown tremendously in recent years owing to its fast computation speed and excellent energy efficiency, which are very important for deep learning inference on edge and IoT devices in the sub-watt power envelope. However, the inherent noise present in analog computation causes significant performance degradation in deep learning models, which can limit their use in mission-critical applications. Hence, there is a need to understand how the choice of critical model hyperparameters affects the noise-resistant property of the resulting model. This need is critical because the insight obtained can be used to design deep learning models that are robust to analog noise. In this paper, the impact of the learning rate, a critical design choice, on the noise-resistant property is investigated. The study first trains deep learning models using different learning rates. Thereafter, analog noise is injected into the models, and the noise-resistant property of the resulting models is examined by measuring the performance degradation due to the analog noise. The results show that there exists a sweet spot of learning rate values that achieves a good balance between model prediction performance and model noise-resistant property. Furthermore, a theoretical justification of the observed phenomenon is provided.
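The evaluation protocol described in the abstract (train with several learning rates, inject noise, measure the drop in performance) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' setup: the abstract does not specify the noise model, architectures, or datasets, so the sketch assumes additive Gaussian noise on the weights (scaled by the standard deviation of the trained weights), a small logistic regression model, and synthetic data. All function names and parameters (`make_data`, `train_logreg`, `noisy_accuracy`, `sweep`, `noise_scale`) are hypothetical, and the sketch is not expected to reproduce the reported sweet spot.

```python
import numpy as np


def make_data(n=600, d=10, seed=0):
    # Synthetic, linearly separable binary classification data (assumption).
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = (X @ w_true > 0).astype(float)
    return X, y


def train_logreg(X, y, lr, epochs=200):
    # Full-batch gradient descent on the logistic loss with learning rate lr.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        logits = np.clip(X @ w, -30.0, 30.0)  # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (X.T @ (p - y)) / len(y)
    return w


def accuracy(w, X, y):
    # Fraction of samples whose predicted label matches the true label.
    return float(np.mean((X @ w > 0) == (y > 0.5)))


def noisy_accuracy(w, X, y, noise_scale, trials=50, seed=1):
    # Assumed noise model: additive Gaussian weight noise whose standard
    # deviation is noise_scale times the standard deviation of the weights.
    rng = np.random.default_rng(seed)
    sigma = noise_scale * np.std(w)
    accs = [
        accuracy(w + rng.normal(scale=sigma, size=w.shape), X, y)
        for _ in range(trials)
    ]
    return float(np.mean(accs))


def sweep(learning_rates, noise_scale=0.5):
    # For each learning rate: train, then record clean accuracy, mean noisy
    # accuracy, and the degradation (clean minus noisy) on held-out data.
    X, y = make_data()
    X_train, y_train = X[:400], y[:400]
    X_test, y_test = X[400:], y[400:]
    results = {}
    for lr in learning_rates:
        w = train_logreg(X_train, y_train, lr)
        clean = accuracy(w, X_test, y_test)
        noisy = noisy_accuracy(w, X_test, y_test, noise_scale)
        results[lr] = (clean, noisy, clean - noisy)
    return results


if __name__ == "__main__":
    for lr, (clean, noisy, drop) in sweep([0.01, 0.1, 1.0]).items():
        print(f"lr={lr}: clean={clean:.3f} noisy={noisy:.3f} drop={drop:.3f}")
```

Comparing the clean accuracy against the degradation across learning rates is the kind of trade-off the abstract refers to. A faithful study would substitute the deep models and the analog noise model actually used in the paper.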