Paper Title
Deep Reinforcement Learning for On-line Dialogue State Tracking
Paper Authors
Paper Abstract
Dialogue state tracking (DST) is a crucial module in dialogue management. It is usually cast as a supervised training problem, which is not convenient for on-line optimization. In this paper, a novel companion-teaching-based deep reinforcement learning (DRL) framework for on-line DST optimization is proposed. To the best of our knowledge, this is the first effort to optimize the DST module within a DRL framework for on-line task-oriented spoken dialogue systems. In addition, the dialogue policy can be further jointly updated. Experiments show that on-line DST optimization can effectively improve the dialogue manager's performance while keeping the flexibility of using a predefined policy. Joint training of both DST and policy can further improve performance.
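To make the contrast drawn in the abstract concrete, the toy sketch below shows how a belief tracker that is normally fit with supervised state labels could instead be nudged by a scalar dialogue-level reward in a REINFORCE-style update. This is only an illustrative assumption: the `ToyTracker` class, its linear parameterization, the feature and slot-value sizes, and the reward signal are all hypothetical and do not reproduce the paper's companion-teaching DRL framework.

```python
# Illustrative sketch only: contrasts a supervised DST update with a
# reward-driven (RL-style) update. All names and dimensions are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # toy per-turn features (e.g. ASR/SLU confidence scores)
N_VALUES = 3     # toy slot values the tracker distributes belief over


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class ToyTracker:
    """Linear belief tracker: turn features -> distribution over slot values."""

    def __init__(self):
        self.W = rng.normal(scale=0.1, size=(N_VALUES, N_FEATURES))

    def belief(self, x):
        return softmax(self.W @ x)

    def supervised_update(self, x, y_true, lr=0.1):
        # Standard cross-entropy gradient step against a labelled state
        # (the usual "supervised training problem" the abstract mentions).
        b = self.belief(x)
        grad = np.outer(b - np.eye(N_VALUES)[y_true], x)
        self.W -= lr * grad

    def reinforce_update(self, x, sampled_value, reward, lr=0.1):
        # RL-style update: no state label is needed, only a scalar dialogue
        # reward; it pushes the probability of the sampled value up or down.
        b = self.belief(x)
        grad = np.outer(np.eye(N_VALUES)[sampled_value] - b, x)
        self.W += lr * reward * grad


tracker = ToyTracker()
x = rng.normal(size=N_FEATURES)                    # features from one dialogue turn
value = rng.choice(N_VALUES, p=tracker.belief(x))  # sample a tracked value
tracker.reinforce_update(x, value, reward=+1.0)    # reward from dialogue outcome
print(tracker.belief(x))
```

In this toy form, the on-line setting needs only the end-of-dialogue reward rather than turn-level state annotations, which is the convenience the abstract attributes to optimizing DST within a DRL framework.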