Optimistic and Topological Value Iteration for Simple Stochastic Games

Paper title

Optimistic and Topological Value Iteration for Simple Stochastic Games

Authors

Azeem, Muqsit; Evangelidis, Alexandros; Křetínský, Jan; Slivinskiy, Alexander; Weininger, Maximilian

Abstract

While value iteration (VI) is a standard solution approach to simple stochastic games (SSGs), it suffered from the lack of a stopping criterion. Recently, several solutions have appeared, among them also "optimistic" VI (OVI). However, OVI is applicable only to one-player SSGs with no end components. We lift these two assumptions, making it available to general SSGs. Further, we utilize the idea in the context of topological VI, where we provide an efficient precise solution. In order to compare the new algorithms with the state of the art, we use not only the standard benchmarks, but we also design a random generator of SSGs, which can be biased towards various types of models, aiding in understanding the advantages of different algorithms on SSGs.
