(When) Are Contrastive Explanations of Reinforcement Learning Helpful?

Paper Title

(When) Are Contrastive Explanations of Reinforcement Learning Helpful?

Authors

Sanjana Narayanan, Isaac Lage, Finale Doshi-Velez

Abstract

Global explanations of a reinforcement learning (RL) agent's expected behavior can make it safer to deploy. However, such explanations are often difficult to understand because of the complicated nature of many RL policies. Effective human explanations are often contrastive, referencing a known contrast (policy) to reduce redundancy. At the same time, these explanations also require the additional effort of referencing that contrast when evaluating an explanation. We conduct a user study to understand whether and when contrastive explanations might be preferable to complete explanations that do not require referencing a contrast. We find that complete explanations are generally more effective when they are the same size or smaller than a contrastive explanation of the same policy, and no worse when they are larger. This suggests that contrastive explanations are not sufficient to solve the problem of effectively explaining reinforcement learning policies, and require additional careful study for use in this context.
