Paper title
Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes
Paper authors
Paper abstract
In multi-stage processes, decisions occur in an ordered sequence of stages. Early stages usually have more observations carrying general information (easier and cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual funnel structure, in which the sample size decreases from one stage to the next while the information increases. Training classifiers in this scenario is challenging because the information in the early stages may not contain distinct patterns to learn (underfitting), whereas the small sample size in later stages can cause overfitting. We address both cases by introducing a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We extend the decoder of the AAE with an MTL component so that it jointly reconstructs the original input and uses feature nets to predict the features of the next stages. We also introduce a sequence constraint in the output of the MLSSL classifier to guarantee the sequential pattern in the predictions. Using real-world data from different domains (selection process, medical diagnosis), we show that our approach outperforms other state-of-the-art methods.
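The abstract does not say how the sequence constraint is implemented, so the following is only an illustrative sketch of one way such a constraint could be enforced, not the authors' method. It assumes that each stage has a binary "passed" label, that a sample cannot pass stage k+1 without passing stage k, and that the classifier emits one logit per stage. The function names `sequential_stage_probs` and `is_sequential` are hypothetical.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sequential_stage_probs(stage_logits):
    """Map per-stage logits to probabilities that never increase across stages.

    Assumption (not from the paper): each logit is read as the conditional
    probability of passing a stage given that the previous stage was passed.
    The running product then gives the probability of reaching each stage.
    """
    probs = []
    running = 1.0
    for logit in stage_logits:
        running *= sigmoid(logit)
        probs.append(running)
    return probs


def is_sequential(labels):
    """Check that a label vector never switches from 0 back to 1."""
    return all(later <= earlier for earlier, later in zip(labels, labels[1:]))


# The raw logits alone would suggest failing stage 1 but passing stages 2 and 3.
logits = [-0.5, 3.0, 3.0]
probs = sequential_stage_probs(logits)
preds = [int(p >= 0.5) for p in probs]
```

Because the running product can only shrink, thresholding every stage at the same value always yields a label vector of ones followed by zeros, which is the sequential pattern the abstract refers to.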