Nash Neural Networks: Inferring Utilities from Optimal Behaviour

Paper Title

Nash Neural Networks: Inferring Utilities from Optimal Behaviour

Authors

Molina, John J., Schnyder, Simon K., Turner, Matthew S., Yamamoto, Ryoichi

Abstract

We propose Nash Neural Networks ($N^3$) as a new type of Physics Informed Neural Network that is able to infer the underlying utility from observations of how rational individuals behave in a differential game with a Nash equilibrium. We assume that the dynamics for both the population and the individual are known, but not the payoff function, which specifies the cost per unit time of being in any particular state. We construct our network in such a way that the Euler-Lagrange equations of the corresponding optimal control problem are satisfied and the optimal control is self-consistently determined. In this way, we are able to learn the unknown payoff function in an unsupervised manner. We have applied $N^3$ to study optimal behaviour during epidemics, in which individuals can choose to socially distance depending on the state of the pandemic and the cost of being infected. Training our network against synthetic data for a simple SIR model, we showed that it is possible to accurately reproduce the hidden payoff function, in such a way that the game dynamics are respected. Our approach will have far-reaching applications, as it allows one to infer utilities from behavioural data, and can thus be applied to study a wide array of problems in science, engineering, economics and government planning.
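
The abstract mentions synthetic data from a simple SIR model in which individuals can socially distance, but it does not give the equations. As a rough illustration of the forward problem only, the sketch below integrates a textbook SIR model in which a hypothetical control `k(t) ∈ [0, 1]` scales the infection rate. The function names, parameter values, and the form of the control are all assumptions made for illustration; this is not the authors' model, and it is not the $N^3$ network, which addresses the inverse problem of recovering the payoff function from observed behaviour.

```python
def simulate_sir(beta, gamma, i0, control, t_end=200.0, dt=0.01):
    """Forward-Euler integration of a textbook SIR model with a contact control.

    dS/dt = -k * beta * S * I
    dI/dt =  k * beta * S * I - gamma * I
    dR/dt =  gamma * I

    `control(t, s, i)` returns k, the fraction of normal contacts kept
    (an illustrative assumption, not taken from the paper).
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps = int(round(t_end / dt))
    trajectory = [(0.0, s, i, r)]
    for n in range(steps):
        t = n * dt
        k = control(t, s, i)
        new_infections = k * beta * s * i
        recoveries = gamma * i
        s -= new_infections * dt
        i += (new_infections - recoveries) * dt
        r += recoveries * dt
        trajectory.append(((n + 1) * dt, s, i, r))
    return trajectory


def no_distancing(t, s, i):
    """Baseline behaviour: contacts are not reduced."""
    return 1.0


def fixed_distancing(t, s, i):
    """Hypothetical behaviour: contacts are halved at all times."""
    return 0.5


# Illustrative parameter values only.
baseline = simulate_sir(beta=0.4, gamma=0.1, i0=0.001, control=no_distancing)
distanced = simulate_sir(beta=0.4, gamma=0.1, i0=0.001, control=fixed_distancing)

peak_baseline = max(point[2] for point in baseline)
peak_distanced = max(point[2] for point in distanced)
print(f"peak infected fraction without distancing: {peak_baseline:.3f}")
print(f"peak infected fraction with distancing:    {peak_distanced:.3f}")
```

In the setting the abstract describes, the control would not be fixed in advance as it is here; it would be the optimal response of rational individuals to a payoff function, and trajectories of this general kind would serve as the observations from which that payoff is inferred.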
