Title
AdaSmooth: An Adaptive Learning Rate Method based on Effective Ratio
Authors
Abstract
It is well known that hyper-parameters must be chosen for Momentum, AdaGrad, AdaDelta, and other stochastic optimizers, and in many cases they are tuned tediously by experience, making the process more of an art than a science. We present a novel per-dimension learning rate method for gradient descent called AdaSmooth. The method is insensitive to its hyper-parameters and therefore requires no manual tuning, unlike the Momentum, AdaGrad, and AdaDelta methods. We show promising results compared to other methods on various convolutional neural networks, multi-layer perceptrons, and alternative machine learning tasks. Empirical results demonstrate that AdaSmooth works well in practice and compares favorably to other stochastic optimization methods in neural networks.
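To make the motivation concrete, the sketch below shows a standard per-dimension adaptive update in the style of AdaDelta, one of the baselines named above; its decay rate `rho` and smoothing constant `eps` are exactly the kind of hand-tuned hyper-parameters that the abstract argues AdaSmooth avoids. The function name, default values, and toy problem are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """One AdaDelta update (Zeiler, 2012); rho and eps must be hand-tuned.

    x, grad : parameter vector and its gradient (np.ndarray)
    state   : dict holding the running averages E[g^2] and E[dx^2]
    """
    Eg2, Edx2 = state["Eg2"], state["Edx2"]
    # Per-dimension running average of squared gradients.
    Eg2 = rho * Eg2 + (1.0 - rho) * grad**2
    # Per-dimension step: RMS of past updates over RMS of gradients.
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    # Per-dimension running average of squared updates.
    Edx2 = rho * Edx2 + (1.0 - rho) * dx**2
    state["Eg2"], state["Edx2"] = Eg2, Edx2
    return x + dx, state

# Usage sketch on a toy quadratic: minimize 0.5 * ||x||^2, so grad = x.
x = np.array([1.0, -2.0])
state = {"Eg2": np.zeros_like(x), "Edx2": np.zeros_like(x)}
for _ in range(100):
    x, state = adadelta_step(x, x, state)
```

The abstract's claim is that AdaSmooth replaces such a fixed, manually chosen decay constant with a quantity derived from an effective ratio, so that results are not sensitive to the choice; the update rule itself is given in the body of the paper.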