Paper Title

Gradient Descent for Low-Rank Functions

Authors

Romain Cosson, Ali Jadbabaie, Anuran Makur, Amirhossein Reisizadeh, Devavrat Shah

Abstract

Several recent empirical studies demonstrate that important machine learning tasks, e.g., training deep neural networks, exhibit low-rank structure, where the loss function varies significantly in only a few directions of the input space. In this paper, we leverage such low-rank structure to reduce the high computational cost of canonical gradient-based methods such as gradient descent (GD). Our proposed Low-Rank Gradient Descent (LRGD) algorithm finds an $ε$-approximate stationary point of a $p$-dimensional function by first identifying $r \leq p$ significant directions, and then estimating the true $p$-dimensional gradient at every iteration by computing directional derivatives only along those $r$ directions. We establish that the "directional oracle complexities" of LRGD for strongly convex and non-convex objective functions are $\mathcal{O}(r \log(1/ε) + rp)$ and $\mathcal{O}(r/ε^2 + rp)$, respectively. When $r \ll p$, these complexities are smaller than the known complexities of $\mathcal{O}(p \log(1/ε))$ and $\mathcal{O}(p/ε^2)$ of GD in the strongly convex and non-convex settings, respectively. Thus, LRGD significantly reduces the computational cost of gradient-based methods for sufficiently low-rank functions. In the course of our analysis, we also formally define and characterize the classes of exact and approximately low-rank functions.
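The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the directions are identified by Gram-Schmidt orthonormalization of full gradients sampled at $r$ random points (which accounts for roughly $rp$ oracle calls), a central finite difference stands in for the directional-derivative oracle, and the test function `example_f` is an invented exactly rank-2 quadratic in $p = 8$ dimensions. All function names are hypothetical.

```python
import math
import random

def directional_derivative(f, x, u, h=1e-5):
    """Central-difference stand-in for one directional-derivative oracle call."""
    xp = [xi + h * ui for xi, ui in zip(x, u)]
    xm = [xi - h * ui for xi, ui in zip(x, u)]
    return (f(xp) - f(xm)) / (2.0 * h)

def full_gradient(f, x):
    """Full gradient via p oracle calls, one per coordinate axis."""
    p = len(x)
    grad = []
    for i in range(p):
        e = [0.0] * p
        e[i] = 1.0
        grad.append(directional_derivative(f, x, e))
    return grad

def find_directions(f, p, r, rng, tol=1e-8):
    """Assumed identification step: Gram-Schmidt on gradients at r random points."""
    basis = []
    for _ in range(r):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        g = full_gradient(f, x)  # p oracle calls
        for u in basis:  # remove components along directions already found
            c = sum(gi * ui for gi, ui in zip(g, u))
            g = [gi - c * ui for gi, ui in zip(g, u)]
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm > tol:
            basis.append([gi / norm for gi in g])
    return basis

def lrgd(f, x0, basis, step, eps, max_iters=10000):
    """Descent loop using only len(basis) oracle calls per iteration."""
    x = list(x0)
    calls = 0
    for _ in range(max_iters):
        coeffs = [directional_derivative(f, x, u) for u in basis]
        calls += len(basis)
        # The basis is orthonormal, so the estimate's norm equals the norm of coeffs.
        if math.sqrt(sum(c * c for c in coeffs)) <= eps:
            break
        g_hat = [sum(c * u[i] for c, u in zip(coeffs, basis)) for i in range(len(x))]
        x = [xi - step * gi for xi, gi in zip(x, g_hat)]
    return x, calls

def example_f(x):
    """Invented rank-2 quadratic: depends on x only through two linear forms."""
    return 0.5 * (x[0] + x[1] - 1.0) ** 2 + 0.5 * (x[1] - x[2] + 2.0) ** 2

p, r = 8, 2
basis = find_directions(example_f, p, r, random.Random(0))
# step = 1/3 matches the largest Hessian eigenvalue (3) of example_f
x_final, calls = lrgd(example_f, [0.0] * p, basis, step=1.0 / 3.0, eps=1e-6)
grad_norm = math.sqrt(sum(g * g for g in full_gradient(example_f, x_final)))
print("directions found:", len(basis))
print("oracle calls in descent loop:", calls)
print("full gradient norm at result:", grad_norm)
```

In this sketch each descent iteration costs $r = 2$ oracle calls instead of the $p = 8$ a coordinate-wise gradient would need, while the one-time identification step costs $rp = 16$ calls, mirroring the two terms in the stated complexity bounds.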
