Inferring Strategies from Observations in Long Iterated Prisoner's Dilemma Experiments

Paper title

Inferring Strategies from Observations in Long Iterated Prisoner's Dilemma Experiments

Authors

Eladio Montero-Porras, Jelena Grujic, Elias Fernandez-Domingos, Tom Lenaerts

Abstract

While many theoretical studies have revealed the strategies that could lead to and maintain cooperation in the Iterated Prisoner's Dilemma, less is known about what human participants actually do in this game and how their strategies change when they are confronted with anonymous partners in each round. Previous attempts used short experiments, made different assumptions about the possible strategies, and led to very different conclusions. We present here two long treatments that differ in the partner matching strategy used, i.e. fixed or shuffled partners. We use unsupervised methods to cluster the players based on their actions and then Hidden Markov Models to infer the strategies in each cluster. Analysis of the inferred strategies reveals that fixed partner interaction leads to behavioral self-organization. Shuffled partners generate subgroups of strategies that remain entangled, apparently blocking the self-selection process that leads to fully cooperating participants in the fixed partner treatment. Analyzing the latter in more detail shows that AllC-, AllD-, TFT- and WSLS-like behavior can be observed. This study also reveals that long treatments are needed, as experiments of fewer than 25 rounds mostly capture the learning phase that participants go through in these kinds of experiments.
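For readers unfamiliar with the strategy labels in the abstract, the following is a minimal sketch of textbook, deterministic versions of AllC, AllD, TFT (tit-for-tat) and WSLS (win-stay, lose-shift). It is not the authors' inference method. The function names, the cooperative opening move and the payoff values (the conventional 5/3/1/0 matrix) are assumptions made for illustration, since the abstract does not state the payoffs used in the experiments.

```python
# Illustrative sketch only: deterministic memory-one versions of the four
# strategy types named in the abstract. The payoff values are the
# conventional textbook ones and are an ASSUMPTION, not taken from the paper.
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def all_c(own_last, other_last):
    # Always cooperate, regardless of history.
    return C


def all_d(own_last, other_last):
    # Always defect, regardless of history.
    return D


def tft(own_last, other_last):
    # Tit-for-tat: cooperate first (assumed), then copy the partner's last move.
    return C if other_last is None else other_last


def wsls(own_last, other_last):
    # Win-stay, lose-shift: cooperate first (assumed); afterwards cooperate
    # exactly when both players chose the same action in the previous round.
    if own_last is None:
        return C
    return C if own_last == other_last else D


def play(strategy_a, strategy_b, rounds=10):
    """Play a fixed-partner game; return the move history and total payoffs."""
    history, score_a, score_b = [], 0, 0
    last_a = last_b = None
    for _ in range(rounds):
        move_a = strategy_a(last_a, last_b)
        move_b = strategy_b(last_b, last_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history.append((move_a, move_b))
        last_a, last_b = move_a, move_b
    return history, (score_a, score_b)
```

Calling `play(tft, all_d, rounds=5)`, for example, returns the sequence of joint moves and both players' total payoffs. Human behavior in the experiments is described only as "-like" these strategies, so real play would be noisier than these deterministic rules.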
