在多构型次试验优化中拒绝观察的成本

论文标题

在多构型次试验优化中拒绝观察的成本

The Cost of Denied Observation in Multiagent Submodular Optimization

论文作者

Grimsman, David, Seaton, Joshua H., Marden, Jason R., Brown, Philip N.

论文摘要

多种控制控制的一种流行形式主义应用了游戏理论中的工具，将多种决策问题抛弃为合作风格的游戏，在该游戏中，各个代理商在当地选择以优化自己的本地公用事业功能，以响应其他代理商的可观察选择。当系统级目标是次管的最大化时，众所周知，如果每个代理都可以观察所有其他代理的行动选择，那么大型游戏的所有NASH平衡都在最佳$ 2 $的$ 2倍以下；也就是说，无政府状态的价格为$ 1/2 $。但是，鲜为人知的是代理人是否无法观察其他相关代理的作用选择。为了研究这一点，我们将标准的游戏理论模型扩展到了该模型，其中一部分代理变为\ emph {blind}（无法观察他人的选择）或\ emph {隔离{孤立}（盲人，也不是其他代理人看不见的），并且我们以无政府状态的函数而证明了精确的表达方式，因为它是符合合格的代理人的功能。当$ k $代理被妥协（在盲人或孤立的任何组合中）时，我们表明，大型公用事业功能的无政府状态价格正好为$ 1/（2+k）$。然后，我们表明，如果代理商使用边际成本实用程序功能，至少$ 1 $的折衷代理人是盲人（而不是孤立的），则无政府状态的价格将提高到$ 1/（1+k）$。我们还提供了模拟结果，证明了这些观察否认在动态环境中的影响。

A popular formalism for multiagent control applies tools from game theory, casting a multiagent decision problem as a cooperation-style game in which individual agents make local choices to optimize their own local utility functions in response to the observable choices made by other agents. When the system-level objective is submodular maximization, it is known that if every agent can observe the action choice of all other agents, then all Nash equilibria of a large class of resulting games are within a factor of $2$ of optimal; that is, the price of anarchy is $1/2$. However, little is known if agents cannot observe the action choices of other relevant agents. To study this, we extend the standard game-theoretic model to one in which a subset of agents either become \emph{blind} (unable to observe others' choices) or \emph{isolated} (blind, and also invisible to other agents), and we prove exact expressions for the price of anarchy as a function of the number of compromised agents. When $k$ agents are compromised (in any combination of blind or isolated), we show that the price of anarchy for a large class of utility functions is exactly $1/(2+k)$. We then show that if agents use marginal-cost utility functions and at least $1$ of the compromised agents is blind (rather than isolated), the price of anarchy improves to $1/(1+k)$. We also provide simulation results demonstrating the effects of these observation denials in a dynamic setting.

下载PDF全文

下载文献需遵守相关版权规定

论文标题