Riemannian随机近似方案的收敛分析

论文标题

Riemannian随机近似方案的收敛分析

Convergence Analysis of Riemannian Stochastic Approximation Schemes

论文作者

Durmus, Alain, Jiménez, Pablo, Moulines, Éric, Said, Salem, Wai, Hoi-To

论文摘要

本文分析了大量的Riemannian随机近似（SA）方案的收敛性，该方案旨在解决随机优化问题。特别是，我们研究的递归使用了考虑的歧管（地球方案）的指数图或更一般的缩回函数（缩回方案）作为指数图的代理。这样的近似值引起了人们的极大兴趣，因为它们是地理方案的低复杂性替代品。在假设SA的平均场与光滑的Lyapunov函数（可能是非convex）的梯度相关的假设，我们表明上述Riemannian SA方案找到了$ {\ Mathcal {o}}}（b_ \ \ f_ \ f_ \ f_ \ floty + log n / \ n / \ sqrt {n} n} n} $ - in Inter in Inter inction in Inter（in Inter） $ {\ MATHCAL {O}}（n）$迭代，其中$ b_ \ infty \ geq 0 $是渐近偏差。与以前的工作相比，我们得出的条件要较温和。首先，我们所有的分析都是全球性的，因为我们不认为迭代为A-Priori有限。其次，我们研究有偏见的SA方案。更具体地说，我们考虑了只能估算出较小偏见的平均场函数和/或从受控的马尔可夫链中绘制样品的情况。第三，确保相关SA方案收敛所需的缩回条件较弱，并且对于众所周知的例子。我们在三个机器学习问题上说明了结果。

This paper analyzes the convergence for a large class of Riemannian stochastic approximation (SA) schemes, which aim at tackling stochastic optimization problems. In particular, the recursions we study use either the exponential map of the considered manifold (geodesic schemes) or more general retraction functions (retraction schemes) used as a proxy for the exponential map. Such approximations are of great interest since they are low complexity alternatives to geodesic schemes. Under the assumption that the mean field of the SA is correlated with the gradient of a smooth Lyapunov function (possibly non-convex), we show that the above Riemannian SA schemes find an ${\mathcal{O}}(b_\infty + \log n / \sqrt{n})$-stationary point (in expectation) within ${\mathcal{O}}(n)$ iterations, where $b_\infty \geq 0$ is the asymptotic bias. Compared to previous works, the conditions we derive are considerably milder. First, all our analysis are global as we do not assume iterates to be a-priori bounded. Second, we study biased SA schemes. To be more specific, we consider the case where the mean-field function can only be estimated up to a small bias, and/or the case in which the samples are drawn from a controlled Markov chain. Third, the conditions on retractions required to ensure convergence of the related SA schemes are weak and hold for well-known examples. We illustrate our results on three machine learning problems.

下载PDF全文

下载文献需遵守相关版权规定

论文标题