Title

On the Convergence of Consensus Algorithms with Markovian Noise and Gradient Bias

Authors

Wai, Hoi-To

Abstract

This paper presents a finite-time convergence analysis for a decentralized stochastic approximation (SA) scheme. The scheme generalizes several algorithms for decentralized machine learning and multi-agent reinforcement learning. Our proof technique involves separating the iterates into their respective consensual parts and consensus error. The consensus error is bounded in terms of the stationarity of the consensual part, while the updates of the consensual part can be analyzed as a perturbed SA scheme. Under Markovian noise and time-varying communication graph assumptions, the decentralized SA scheme has an expected convergence rate of ${\cal O}(\log T/ \sqrt{T} )$, where $T$ is the iteration number, in terms of squared norms of the gradient for nonlinear SA with a smooth but non-convex cost function. This rate is comparable to the best known performance of SA in a centralized setting with a non-convex potential function.
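The abstract describes iterates that mix a consensus step over a communication graph with a local stochastic update. As a hedged illustration only (not the paper's algorithm or setting), here is a minimal toy sketch of decentralized SA: four agents with hypothetical local quadratic costs average their iterates through a doubly stochastic mixing matrix on a ring graph and take noisy gradient steps with an $O(1/\sqrt{t})$ step size. The paper allows time-varying graphs and Markovian noise; this sketch uses a fixed graph and i.i.d. noise purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: n agents minimize the average of local
# quadratic costs f_i(x) = 0.5 * (x - b_i)^2 via decentralized SA.
n, T = 4, 2000
b = rng.normal(size=n)   # local optima; the global optimum is b.mean()
x = np.zeros(n)          # one scalar iterate per agent

# Doubly stochastic mixing matrix for a ring graph (kept time-invariant
# here for simplicity; the paper's analysis covers time-varying graphs).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i + 1) % n] = 0.25
    W[i, (i - 1) % n] = 0.25

for t in range(1, T + 1):
    step = 1.0 / np.sqrt(t)                     # diminishing step size
    grad = (x - b) + 0.1 * rng.normal(size=n)   # noisy local gradients
    x = W @ x - step * grad                     # consensus + local SA step

# All agents should end up close to the average of the local optima.
print("max deviation from optimum:", np.max(np.abs(x - b.mean())))
```

The consensus error here is the spread of `x` around its mean, and the consensual part is the trajectory of `x.mean()`, mirroring the decomposition the abstract mentions.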
