Paper Title

No-Regret Learning with Unbounded Losses: The Case of Logarithmic Pooling

Paper Authors

Eric Neyman, Tim Roughgarden

Paper Abstract

For each of $T$ time steps, $m$ experts report probability distributions over $n$ outcomes; we wish to learn to aggregate these forecasts in a way that attains a no-regret guarantee. We focus on the fundamental and practical aggregation method known as logarithmic pooling -- a weighted average of log odds -- which is in a certain sense the optimal choice of pooling method if one is interested in minimizing log loss (as we take to be our loss function). We consider the problem of learning the best set of parameters (i.e. expert weights) in an online adversarial setting. We assume (by necessity) that the adversarial choices of outcomes and forecasts are consistent, in the sense that experts report calibrated forecasts. Imposing this constraint creates a (to our knowledge) novel semi-adversarial setting in which the adversary retains a large amount of flexibility. In this setting, we present an algorithm based on online mirror descent that learns expert weights in a way that attains $O(\sqrt{T} \log T)$ expected regret as compared with the best weights in hindsight.
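
For reference, logarithmic pooling with weights $w_1, \dots, w_m$ (summing to 1) produces the aggregate distribution proportional to $\prod_i p_i^{w_i}$, which is equivalent to averaging log odds and renormalizing. The sketch below is an illustrative Python implementation, not code from the paper; the function names and the example forecasts are hypothetical.

```python
import numpy as np

def logarithmic_pool(forecasts, weights):
    """Logarithmic pooling: the aggregate is proportional to
    prod_i p_i ** w_i, i.e. a weighted average of log odds
    followed by renormalization.

    forecasts: (m, n) array, row i is expert i's distribution over n outcomes
    weights:   (m,) array of nonnegative expert weights (here summing to 1)
    """
    log_pool = weights @ np.log(forecasts)  # weighted sum of log-probabilities
    log_pool -= log_pool.max()              # shift for numerical stability
    pooled = np.exp(log_pool)
    return pooled / pooled.sum()            # renormalize to a probability distribution

def log_loss(distribution, outcome):
    """Log loss of a forecast distribution on the realized outcome index."""
    return -np.log(distribution[outcome])

# Hypothetical example: m = 2 experts, n = 3 outcomes.
forecasts = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3]])
weights = np.array([0.7, 0.3])

pooled = logarithmic_pool(forecasts, weights)
print("pooled forecast:", pooled)
print("log loss if outcome 0 occurs:", log_loss(pooled, 0))
```

In the paper's online setting, the learner would update the weight vector after each time step (there via online mirror descent) based on the log loss incurred; the snippet above only illustrates how a fixed weight vector maps the experts' reports to a pooled forecast.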
