论文标题
零和平衡发现的稀疏线性编程
Sparsified Linear Programming for Zero-Sum Equilibrium Finding
论文作者
论文摘要
大型零和广泛形式不完美的信息游戏中的计算平衡发现导致了最近的AI突破。该问题最快的算法是反事实遗憾最小化的新形式[Brown and Sandholm,2019年]。在本文中,我们为问题提供了一种完全不同的方法,该方法具有竞争力,通常比以前的最新状态更好。均衡调查问题可以作为线性程序(LP)提出[Koller等,1994],但是由于LP求解器的内存需求,将其作为LP求解不可扩展,而LP求解器的内存需求通常比基于CFR的算法更差。我们给出了一种有效的实用算法,该算法将大量的回报矩阵转化为两个矩阵的产物,这些矩阵通常更稀疏。这使我们能够将平衡调查问题表示为一个线性程序,其大小仅比CFR差的对数因子,因此允许线性程序求解器可以在此类游戏上运行。通过在扑克最终游戏上进行实验,我们在实践中首次证明,现代线性程序求解器在解决大型广泛形式的大型游戏方面甚至与游戏特定的现代变体都具有竞争力,并且可以用于计算与CFR这样的迭代算法不同的精确解决方案。
Computational equilibrium finding in large zero-sum extensive-form imperfect-information games has led to significant recent AI breakthroughs. The fastest algorithms for the problem are new forms of counterfactual regret minimization [Brown and Sandholm, 2019]. In this paper we present a totally different approach to the problem, which is competitive and often orders of magnitude better than the prior state of the art. The equilibrium-finding problem can be formulated as a linear program (LP) [Koller et al., 1994], but solving it as an LP has not been scalable due to the memory requirements of LP solvers, which can often be quadratically worse than CFR-based algorithms. We give an efficient practical algorithm that factors a large payoff matrix into a product of two matrices that are typically dramatically sparser. This allows us to express the equilibrium-finding problem as a linear program with size only a logarithmic factor worse than CFR, and thus allows linear program solvers to run on such games. With experiments on poker endgames, we demonstrate in practice, for the first time, that modern linear program solvers are competitive against even game-specific modern variants of CFR in solving large extensive-form games, and can be used to compute exact solutions unlike iterative algorithms like CFR.