Paper Title
Deep Reinforcement Learning for On-line Dialogue State Tracking
Paper Authors
Paper Abstract
Dialogue state tracking (DST) is a crucial module in dialogue management. It is usually cast as a supervised training problem, which is not convenient for on-line optimization. In this paper, a novel companion-teaching-based deep reinforcement learning (DRL) framework for on-line DST optimization is proposed. To the best of our knowledge, this is the first effort to optimize the DST module within a DRL framework for on-line task-oriented spoken dialogue systems. In addition, the dialogue policy can be further jointly updated. Experiments show that on-line DST optimization can effectively improve the dialogue manager's performance while keeping the flexibility of using a predefined policy. Joint training of both DST and policy can further improve performance.
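To make the contrast drawn in the abstract concrete, the toy sketch below shows how a belief tracker that is normally fit with supervised state labels could instead be nudged by a scalar dialogue-level reward in a REINFORCE-style update. This is only an illustrative assumption: the `ToyTracker` class, its linear parameterization, the feature and slot-value sizes, and the reward signal are all hypothetical and do not reproduce the paper's companion-teaching DRL framework.

```python
# Illustrative sketch only: contrasts a supervised DST update with a
# reward-driven (RL-style) update. All names and dimensions are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8   # toy per-turn features (e.g. ASR/SLU confidence scores)
N_VALUES = 3     # toy slot values the tracker distributes belief over


def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class ToyTracker:
    """Linear belief tracker: turn features -> distribution over slot values."""

    def __init__(self):
        self.W = rng.normal(scale=0.1, size=(N_VALUES, N_FEATURES))

    def belief(self, x):
        return softmax(self.W @ x)

    def supervised_update(self, x, y_true, lr=0.1):
        # Standard cross-entropy gradient step against a labelled state
        # (the usual "supervised training problem" the abstract mentions).
        b = self.belief(x)
        grad = np.outer(b - np.eye(N_VALUES)[y_true], x)
        self.W -= lr * grad

    def reinforce_update(self, x, sampled_value, reward, lr=0.1):
        # RL-style update: no state label is needed, only a scalar dialogue
        # reward; it pushes the probability of the sampled value up or down.
        b = self.belief(x)
        grad = np.outer(np.eye(N_VALUES)[sampled_value] - b, x)
        self.W += lr * reward * grad


tracker = ToyTracker()
x = rng.normal(size=N_FEATURES)                    # features from one dialogue turn
value = rng.choice(N_VALUES, p=tracker.belief(x))  # sample a tracked value
tracker.reinforce_update(x, value, reward=+1.0)    # reward from dialogue outcome
print(tracker.belief(x))
```

In this toy form, the on-line setting needs only the end-of-dialogue reward rather than turn-level state annotations, which is the convenience the abstract attributes to optimizing DST within a DRL framework.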