Paper Title
A Difference-of-Convex Programming Approach With Parallel Branch-and-Bound For Sentence Compression Via A Hybrid Extractive Model
Paper Authors
Paper Abstract
Sentence compression is an important problem in natural language processing, with wide applications in text summarization, search engines, and human-AI interaction systems. In this paper, we design a hybrid extractive sentence compression model that combines a probability language model with a parse tree language model, compressing sentences while guaranteeing the syntactic correctness of the compressed results. Our compression model is formulated as an integer linear programming problem, which can be rewritten as a Difference-of-Convex (DC) programming problem using the exact penalty technique. We use DCA, a well-known and efficient DC algorithm, to find locally optimal solutions of the penalized problem. A hybrid global optimization algorithm that combines DCA with a parallel branch-and-bound framework, namely PDCABB, is then used to find globally optimal solutions. Numerical results show that our sentence compression model yields excellent compression results as measured by F-score, and indicate that PDCABB is a promising algorithm for solving the model.
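To make the exact-penalty and DCA steps concrete, here is a minimal sketch on a toy problem. It is not the paper's model. Everything in it is an assumption chosen for illustration: the function name dca_binary_selection, the toy constraint (keep at most k words), the penalty p(x) = sum of x_i(1 - x_i), and the penalty weight t. The binary problem min c.x over x in {0,1}^n with sum(x) <= k is relaxed to the box [0,1]^n and penalized as c.x + t*p(x). Writing this objective as g - h, with g(x) = c.x restricted to the relaxed feasible set and h(x) = t * sum(x_i^2 - x_i), each DCA step linearizes h at the current point and minimizes the resulting linear program. For this particular constraint the linear program has a greedy closed-form solution, so no solver is needed.

```python
def dca_binary_selection(c, k, t=10.0, x0=None, max_iter=50):
    """Toy DCA sketch (illustrative assumptions, not the paper's model).

    Targets  min c.x  s.t.  sum(x) <= k,  x in {0,1}^n
    via the penalized relaxation  min c.x + t * sum(x_i * (1 - x_i))
    over  K = {x in [0,1]^n : sum(x) <= k}.
    """
    n = len(c)
    x = list(x0) if x0 is not None else [0.5] * n
    for _ in range(max_iter):
        # Gradient of the convex part h(x) = t * sum(x_i^2 - x_i).
        y = [t * (2.0 * xi - 1.0) for xi in x]
        # Convex subproblem: minimize (c - y).x over K, a linear program.
        d = [ci - yi for ci, yi in zip(c, y)]
        # Greedy solution: set to 1 the (at most k) entries with the most
        # negative coefficients; all other entries are 0.
        order = sorted(range(n), key=lambda i: d[i])
        x_new = [0.0] * n
        for i in order[:k]:
            if d[i] < 0:
                x_new[i] = 1.0
        if x_new == x:  # fixed point: DCA has converged
            break
        x = x_new
    return x


if __name__ == "__main__":
    costs = [-3.0, 1.0, -2.0, -0.5]  # hypothetical per-word costs
    print(dca_binary_selection(costs, k=2))
    print(dca_binary_selection(costs, k=2, x0=[0, 1, 0, 0]))
```

Started from the midpoint of the box, the sketch selects the two words with the most negative cost. Started from x0 = [0, 1, 0, 0], it stays at that point, which is feasible but not globally optimal. This dependence on the starting point is the reason a local method such as DCA is paired with a global scheme such as branch-and-bound.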