Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration

Paper title

Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration

Authors

Mesut Yang, Micah Carroll, Anca Dragan

Abstract

AI agents designed to collaborate with people benefit from models that enable them to anticipate human behavior. However, realistic models tend to require vast amounts of human data, which is often hard to collect. A good prior or initialization could make for more data-efficient training, but what makes for a good prior on human behavior? Our work leverages a very simple assumption: people generally act closer to optimal than to random chance. We show that using optimal behavior as a prior for human models makes these models vastly more data-efficient and able to generalize to new environments. Our intuition is that such a prior enables the training to focus one's precious real-world data on capturing the subtle nuances of human suboptimality, instead of on the basics of how to do the task in the first place. We also show that using these improved human models often leads to better human-AI collaboration performance compared to using models based on real human data alone.
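
The intuition in the abstract can be illustrated with a toy tabular example. The sketch below is not the paper's method: it swaps in a simple pseudo-count (Dirichlet-style) prior to show why a near-optimal prior might need less human data than an uninformed one. The two-state task, the human observations, and all names (`fit_human_model`, `softened_optimal_prior`, `strength`, `eps`) are made up for illustration.

```python
from collections import Counter

ACTIONS = ["left", "right", "stay"]


def softened_optimal_prior(optimal_action, eps=0.1):
    """Prior over actions: 1 - eps on the optimal action, eps shared by the rest."""
    other = eps / (len(ACTIONS) - 1)
    return {a: (1.0 - eps if a == optimal_action else other) for a in ACTIONS}


def uniform_prior():
    """Uninformed prior: every action is equally likely."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}


def fit_human_model(human_data, prior, strength):
    """Estimate p(action | state) from (state, action) pairs plus prior pseudo-counts.

    `strength` is the number of pseudo-observations the prior is worth
    (a hypothetical hyperparameter chosen for this sketch).
    """
    counts = Counter(human_data)
    model = {}
    for state, action_probs in prior.items():
        total = sum(counts[(state, a)] for a in action_probs) + strength
        model[state] = {
            a: (counts[(state, a)] + strength * p) / total
            for a, p in action_probs.items()
        }
    return model


# Toy task (assumed): the optimal action is "right" in s0 and "left" in s1.
optimal_prior = {
    "s0": softened_optimal_prior("right"),
    "s1": softened_optimal_prior("left"),
}
flat_prior = {"s0": uniform_prior(), "s1": uniform_prior()}

# Scarce, made-up human data: three observations in s0, none in s1.
human_data = [("s0", "right"), ("s0", "right"), ("s0", "stay")]

opt_model = fit_human_model(human_data, optimal_prior, strength=2)
uni_model = fit_human_model(human_data, flat_prior, strength=2)

# In the unobserved state s1, each model falls back on its prior.
print("optimal prior, s1:", opt_model["s1"])
print("uniform prior, s1:", uni_model["s1"])
```

In this toy setting, the model with the uniform prior knows nothing about the unobserved state, while the model with the near-optimal prior already predicts task-competent behavior there. In the observed state, the human data pull both models toward the occasional suboptimal "stay" action.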
