论文标题

通过反馈图,几乎最佳的最佳世界学习算法用于在线学习

Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs

论文作者

Ito, Shinji, Tsuchiya, Taira, Honda, Junya

论文摘要

这项研究考虑了使用一般的有向反馈图的在线学习。对于这个问题,我们提出了最佳世界算法,这些算法几乎可以为对抗环境以及随机环境的多同性恋遗憾界定。如Alon等。 [2015]表明,紧密的遗憾界限取决于反馈图的结构:可观察到的图形产生$ \tildeθ(α^{1/2} t^{1/2})$的最小值遗憾,而弱观察到的图形可诱导$ \tildeθ($ \tildeθ)$ \tildeθ(Δ $δ$分别代表图形的独立数和图形特定部分的支配数。我们提出的强烈可观察图的算法的遗憾为$ \ tilde {o}(α^{1/2} t^{1/2})$用于对抗环境,以及$ {o}(o}(o})(\ frac {\ frac {α(\ ln t)^3} $} $} $} $} $δ_ {\ min} $表示最小次优差距。该结果解决了Erez和Koren [2021]提出的一个公开问题。我们还为弱可观察的图提供了一种算法,该算法实现了$ \ tilde {o}(δ^{1/3} t^{2/3})$的遗憾,以实现对抗环境和对随机环境的多洛格里斯遗憾。所提出的算法基于以下规范化的领导方法,并结合新设计的学习率更新规则。

This study considers online learning with general directed feedback graphs. For this problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds for adversarial environments as well as poly-logarithmic regret bounds for stochastic environments. As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: strongly observable graphs yield minimax regret of $\tildeΘ( α^{1/2} T^{1/2} )$, while weakly observable graphs induce minimax regret of $\tildeΘ( δ^{1/3} T^{2/3} )$, where $α$ and $δ$, respectively, represent the independence number of the graph and the domination number of a certain portion of the graph. Our proposed algorithm for strongly observable graphs has a regret bound of $\tilde{O}( α^{1/2} T^{1/2} ) $ for adversarial environments, as well as of $ {O} ( \frac{α(\ln T)^3 }{Δ_{\min}} ) $ for stochastic environments, where $Δ_{\min}$ expresses the minimum suboptimality gap. This result resolves an open question raised by Erez and Koren [2021]. We also provide an algorithm for weakly observable graphs that achieves a regret bound of $\tilde{O}( δ^{1/3}T^{2/3} )$ for adversarial environments and poly-logarithmic regret for stochastic environments. The proposed algorithms are based on the follow-the-regularized-leader approach combined with newly designed update rules for learning rates.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源