Paper Title

Outcome-Guided Counterfactuals for Reinforcement Learning Agents from a Jointly Trained Generative Latent Space

Paper Authors

Eric Yeh, Pedro Sequeira, Jesse Hostetler, Melinda Gervasio

Paper Abstract

We present a novel generative method for producing unseen and plausible counterfactual examples for reinforcement learning (RL) agents based upon outcome variables that characterize agent behavior. Our approach uses a variational autoencoder to train a latent space that jointly encodes information about the observations and outcome variables pertaining to an agent's behavior. Counterfactuals are generated using traversals in this latent space, via gradient-driven updates as well as latent interpolations against cases drawn from a pool of examples. These include updates to raise the likelihood of generated examples, which improves the plausibility of generated counterfactuals. From experiments in three RL environments, we show that these methods produce counterfactuals that are more plausible and proximal to their queries compared to purely outcome-driven or case-based baselines. Finally, we show that a latent jointly trained to reconstruct both the input observations and behavioral outcome variables produces higher-quality counterfactuals than latents trained solely to reconstruct the observation inputs.
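
To make the described approach more concrete, below is a minimal, hypothetical PyTorch sketch of the two ingredients the abstract names: a VAE whose latent jointly encodes an observation and an outcome vector, and latent-space traversals, both gradient-driven updates toward a target outcome and interpolation against a retrieved case. This is not the authors' implementation; the module and function names, network sizes, and the unit-Gaussian prior penalty used here as the plausibility term are all assumptions for illustration.

```python
# Hypothetical sketch of a jointly trained latent space and outcome-guided
# counterfactual traversals (not the paper's code).
import torch
import torch.nn as nn


class JointVAE(nn.Module):
    """VAE that encodes (observation, outcome) pairs and decodes both."""

    def __init__(self, obs_dim=32, outcome_dim=4, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim + outcome_dim, 64), nn.ReLU())
        self.mu_head = nn.Linear(64, latent_dim)
        self.logvar_head = nn.Linear(64, latent_dim)
        self.obs_decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, obs_dim))
        self.outcome_decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, outcome_dim))

    def encode(self, obs, outcome):
        h = self.encoder(torch.cat([obs, outcome], dim=-1))
        return self.mu_head(h), self.logvar_head(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return self.obs_decoder(z), self.outcome_decoder(z)

    def loss(self, obs, outcome):
        # Joint reconstruction of observation and outcome plus a KL term.
        mu, logvar = self.encode(obs, outcome)
        z = self.reparameterize(mu, logvar)
        obs_hat, outcome_hat = self.decode(z)
        recon = nn.functional.mse_loss(obs_hat, obs) + nn.functional.mse_loss(outcome_hat, outcome)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl


def outcome_guided_counterfactual(vae, query_obs, query_outcome, target_outcome,
                                  steps=200, lr=0.05, prior_weight=0.1):
    """Gradient-driven latent traversal: nudge the decoded outcome toward
    `target_outcome` while a prior penalty keeps the latent plausible."""
    with torch.no_grad():
        mu, _ = vae.encode(query_obs, query_outcome)
    z = mu.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        _, outcome_hat = vae.decode(z)
        outcome_loss = nn.functional.mse_loss(outcome_hat, target_outcome)
        prior_loss = prior_weight * z.pow(2).mean()  # stay near the unit-Gaussian prior
        opt.zero_grad()
        (outcome_loss + prior_loss).backward()
        opt.step()
    with torch.no_grad():
        return vae.decode(z)  # (counterfactual observation, counterfactual outcome)


def interpolate_toward_case(vae, query_obs, query_outcome, case_obs, case_outcome, alpha=0.5):
    """Case-based traversal: linearly interpolate in latent space between the
    query and a case drawn from an example pool, then decode."""
    with torch.no_grad():
        zq, _ = vae.encode(query_obs, query_outcome)
        zc, _ = vae.encode(case_obs, case_outcome)
        return vae.decode((1 - alpha) * zq + alpha * zc)
```

The key design point the abstract emphasizes is that the outcome decoder shares the latent with the observation decoder, so gradients on the outcome loss move the latent through regions that still decode to realistic observations; this is what a sketch like the one above relies on when it optimizes only in latent space.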
