Paper Title


Adaptive Bandit Convex Optimization with Heterogeneous Curvature

Authors

Haipeng Luo, Mengxiao Zhang, Peng Zhao

Abstract


We consider the problem of adversarial bandit convex optimization, that is, online learning over a sequence of arbitrary convex loss functions with only one function evaluation for each of them. While all previous works assume known and homogeneous curvature on these loss functions, we study a heterogeneous setting where each function has its own curvature that is only revealed after the learner makes a decision. We develop an efficient algorithm that is able to adapt to the curvature on the fly. Specifically, our algorithm not only recovers or \emph{even improves} existing results for several homogeneous settings, but also leads to surprising results for some heterogeneous settings -- for example, while Hazan and Levy (2014) showed that $\widetilde{O}(d^{3/2}\sqrt{T})$ regret is achievable for a sequence of $T$ smooth and strongly convex $d$-dimensional functions, our algorithm reveals that the same is achievable even if $T^{3/4}$ of them are not strongly convex, and sometimes even if a constant fraction of them are not strongly convex. Our approach is inspired by the framework of Bartlett et al. (2007) who studied a similar heterogeneous setting but with stronger gradient feedback. Extending their framework to the bandit feedback setting requires novel ideas such as lifting the feasible domain and using a logarithmically homogeneous self-concordant barrier regularizer.
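For context, "bandit" feedback means the learner observes only the scalar loss value at the single point it plays, never the gradient. The sketch below illustrates this feedback model with a classic one-point gradient estimator in the style of Flaxman et al. (2005); it is an illustrative baseline, not the paper's curvature-adaptive algorithm, and all names and parameters here are assumptions chosen for the example.

```python
import random

def bandit_gd(loss_fns, eta=0.1, delta=0.1, radius=1.0):
    """Illustrative 1-D bandit convex optimization loop (one-point estimator).

    Each round, the learner queries exactly one point of the current loss
    function and builds an unbiased estimate of a smoothed gradient from
    that single scalar value.
    """
    x = 0.0                                   # current decision
    total_loss = 0.0
    for f in loss_fns:
        u = random.choice([-1.0, 1.0])        # random direction on the 1-D sphere
        y = x + delta * u                     # perturbed play
        val = f(y)                            # the only feedback: one function value
        total_loss += val
        g = (val / delta) * u                 # one-point gradient estimate
        # projected gradient step, shrunk so y = x + delta*u stays feasible
        x = min(max(x - eta * g, -radius + delta), radius - delta)
    return x, total_loss
```

The paper's setting replaces this uniform step-size scheme with an update that adapts to each function's curvature as it is revealed, using follow-the-regularized-leader with a logarithmically homogeneous self-concordant barrier over a lifted domain.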
