Paper Title
Model Reduction of Shallow CNN Model for Reliable Deployment of Information Extraction from Medical Reports
Paper Authors
Paper Abstract
The shallow convolutional neural network (CNN) is a time-tested tool for information extraction from cancer pathology reports. The shallow CNN performs competitively on this task against other deep learning models, including BERT, which holds the state of the art for many NLP tasks. The main insight behind this surprising result is that information extraction from cancer pathology reports requires only a small number of domain-specific text segments, rendering most of the text and context superfluous for the task. The shallow CNN model is well suited to identifying these key short text segments from the labeled training set; however, the identified text segments remain obscure to humans. In this study, we fill this gap by developing a model reduction tool that establishes reliable connections between CNN filters and relevant text segments by discarding spurious connections. We reduce the complexity of the shallow CNN representation by approximating it with a linear transformation of an n-gram presence representation, with non-negativity and sparsity priors on the transformation weights, to obtain an interpretable model. Through model reduction, our approach bridges the conventionally perceived trade-off between accuracy on one side and explainability on the other.
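To make the core idea concrete, the following is a minimal sketch (not the authors' code) of approximating CNN filter activations with a sparse, non-negative linear map over an n-gram presence representation. The example documents and the `cnn_activations` array are hypothetical placeholders standing in for real pathology reports and a trained shallow CNN's max-pooled filter outputs; the use of Lasso with `positive=True` is one plausible way to realize the sparsity and non-negativity priors described in the abstract.

```python
# Sketch: approximate shallow-CNN filter activations with a sparse,
# non-negative linear model over binary n-gram presence features.
# All data below is synthetic and for illustration only.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Lasso

documents = [
    "invasive ductal carcinoma of the left breast",
    "adenocarcinoma identified in the sigmoid colon",
    "squamous cell carcinoma, lung, upper lobe",
]

# Binary n-gram presence representation: 1 if the n-gram occurs, else 0.
vectorizer = CountVectorizer(binary=True, ngram_range=(1, 3))
X = vectorizer.fit_transform(documents).toarray()

# Stand-in for max-pooled shallow CNN filter activations (one column per filter);
# in practice these would come from the trained model.
rng = np.random.default_rng(0)
cnn_activations = rng.random((len(documents), 4))

# Fit each filter with an L1-penalized (sparse) regression constrained to
# non-negative weights, then read off the top-weighted n-grams as the
# interpretable text segments associated with that filter.
ngrams = vectorizer.get_feature_names_out()
for j in range(cnn_activations.shape[1]):
    model = Lasso(alpha=0.01, positive=True, max_iter=10000)
    model.fit(X, cnn_activations[:, j])
    top = np.argsort(model.coef_)[::-1][:3]
    keep = [(ngrams[i], round(model.coef_[i], 3)) for i in top if model.coef_[i] > 0]
    print(f"filter {j}: {keep}")
```

The non-zero weights that survive the sparsity penalty map each filter to a small set of n-grams, which is the kind of reliable filter-to-text-segment connection the model reduction tool aims to expose.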