Paper Title
Decision-Making Among Bounded Rational Agents
Paper Authors
Paper Abstract
When robots share a workspace with other intelligent agents (e.g., other robots or humans), they must be able to reason about the behaviors of their neighboring agents while accomplishing their designated tasks. In practice, agents frequently do not exhibit perfectly rational behavior because their computational resources are limited. Thus, predicting the optimal agent behaviors is both infeasible (because it demands prohibitive computational resources) and undesirable (because the prediction may be wrong). Motivated by this observation, we remove the assumption of perfectly rational agents and propose incorporating the concept of bounded rationality, taken from an information-theoretic view, into the game-theoretic framework. This allows the robots to reason about other agents' sub-optimal behaviors and act accordingly under their own computational constraints. Specifically, bounded rationality directly models an agent's information-processing ability, which is represented as the KL-divergence between the nominal and optimized stochastic policies, and the bounded-optimal policy can be obtained by an efficient importance sampling approach. Using both simulated and real-world experiments in multi-robot navigation tasks, we demonstrate that the resulting framework allows the robots to reason about different levels of rational behavior in other agents and to compute a reasonable strategy under their computational constraints.
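The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the common information-theoretic form of bounded rationality, in which the agent maximizes expected utility minus (1/beta) times the KL-divergence from a nominal policy, giving a policy proportional to the nominal policy times exp(beta * utility). All function names, the rationality parameter `beta`, and the toy utilities are hypothetical choices made for illustration.

```python
import math
import random


def bounded_rational_policy(prior, utilities, beta):
    """Discrete policy q(a) proportional to prior(a) * exp(beta * U(a)).

    beta = 0 returns the nominal policy; larger beta approaches the
    utility-maximizing action. (Assumed form, not taken from the paper.)
    """
    m = max(utilities)  # subtract the max for numerical stability
    weights = [p * math.exp(beta * (u - m)) for p, u in zip(prior, utilities)]
    z = sum(weights)
    return [w / z for w in weights]


def kl_divergence(q, p):
    """KL(q || p) in nats; interpreted here as the information-processing cost."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def importance_sampled_action(sample_prior, utility, beta, n_samples, rng):
    """Estimate the mean action under q by self-normalized importance sampling.

    Samples are drawn from the nominal policy and reweighted by
    exp(beta * U(a)), so q never has to be normalized explicitly.
    """
    samples = [sample_prior(rng) for _ in range(n_samples)]
    utils = [utility(a) for a in samples]
    m = max(utils)  # subtract the max for numerical stability
    weights = [math.exp(beta * (u - m)) for u in utils]
    z = sum(weights)
    return sum(w * a for w, a in zip(weights, samples)) / z


if __name__ == "__main__":
    prior = [1 / 3, 1 / 3, 1 / 3]
    utilities = [1.0, 0.0, -1.0]
    for beta in (0.0, 1.0, 5.0):
        q = bounded_rational_policy(prior, utilities, beta)
        print(beta, [round(x, 3) for x in q], round(kl_divergence(q, prior), 3))

    # Continuous toy case: nominal policy N(0, 1), utility peaked at a = 2.
    rng = random.Random(0)
    estimate = importance_sampled_action(
        lambda r: r.gauss(0.0, 1.0), lambda a: -(a - 2.0) ** 2, 1.0, 20000, rng
    )
    print(round(estimate, 2))
```

In this toy setup, raising `beta` moves the policy away from the nominal one and increases the KL-divergence, which is one way to read "different levels of rationality". For the continuous case, the reweighted Gaussian has a closed-form mean of 4 * beta / (1 + 2 * beta), which gives a quick sanity check on the sampled estimate.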