Paper Title

AdaSmooth: An Adaptive Learning Rate Method based on Effective Ratio

Paper Author

Lu, Jun

Paper Abstract

It is well known that we need to choose the hyper-parameters in Momentum, AdaGrad, AdaDelta, and other alternative stochastic optimizers. In many cases, however, these hyper-parameters are tuned tediously based on experience, making the process more of an art than a science. We present a novel per-dimension learning rate method for gradient descent called AdaSmooth. The method is insensitive to hyper-parameters and thus requires none of the manual hyper-parameter tuning needed by the Momentum, AdaGrad, and AdaDelta methods. We show promising results compared to other methods on different convolutional neural networks, multi-layer perceptrons, and alternative machine learning tasks. Empirical results demonstrate that AdaSmooth works well in practice and compares favorably to other stochastic optimization methods in neural networks.
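
The abstract only names the ingredients (a per-dimension learning rate driven by an effective ratio), so the NumPy sketch below illustrates how such an update could be wired up. It is a minimal illustration under stated assumptions, not the paper's exact algorithm: the window length, the constants rho_fast and rho_slow, and the rule for blending them into the squared-gradient average are choices made here for exposition.

```python
import numpy as np

def adasmooth_like_step(x, grad, state, lr=0.1, rho_fast=0.5, rho_slow=0.99,
                        window=10, eps=1e-6):
    """One per-dimension update in the spirit of AdaSmooth (illustrative only)."""
    # Sliding window of past iterates and running average of squared gradients.
    hist = state.setdefault("hist", [x.copy()])
    Eg2 = state.setdefault("Eg2", np.zeros_like(x))

    # Effective ratio per dimension: net movement over the window divided by
    # total path length over the window (a value in [0, 1]).
    net = np.abs(x - hist[0])
    path = np.stack(hist + [x], axis=0)
    total = np.sum(np.abs(np.diff(path, axis=0)), axis=0) + eps
    e = net / total

    # Blend a fast and a slow smoothing constant by the effective ratio:
    # directed progress (e near 1) weights recent gradients more heavily.
    c = (rho_slow - rho_fast) * e + (1.0 - rho_slow)
    Eg2 = (1.0 - c**2) * Eg2 + (c**2) * grad**2
    state["Eg2"] = Eg2

    # RMSProp/AdaDelta-style per-dimension scaling of the step.
    x_new = x - lr * grad / np.sqrt(Eg2 + eps)

    # Keep at most `window` past iterates.
    hist.append(x.copy())
    if len(hist) > window:
        hist.pop(0)
    return x_new

# Usage sketch: minimize f(x) = sum(x**2) from a random start.
x = np.random.randn(5)
state = {}
for _ in range(200):
    x = adasmooth_like_step(x, 2.0 * x, state)
```

The per-dimension scaling mirrors AdaGrad/AdaDelta-style methods described in the abstract; the effective-ratio blending is what lets the smoothing constant adapt without hand-tuned decay rates, which is the property the paper highlights.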
