CompactIE: Compact Facts in Open Information Extraction

Paper title

CompactIE: Compact Facts in Open Information Extraction

Authors

Farima Fatahi Bayat, Nikita Bhutani, H. V. Jagadish

Abstract

A major drawback of modern neural OpenIE systems and benchmarks is that they prioritize high coverage of information in extractions over compactness of their constituents. This severely limits the usefulness of OpenIE extractions in many downstream tasks. The utility of extractions can be improved if extractions are compact and share constituents. To this end, we study the problem of identifying compact extractions with neural-based methods. We propose CompactIE, an OpenIE system that uses a novel pipelined approach to produce compact extractions with overlapping constituents. It first detects constituents of the extractions and then links them to build extractions. We train our system on compact extractions obtained by processing existing benchmarks. Our experiments on CaRB and Wire57 datasets indicate that CompactIE finds 1.5x-2x more compact extractions than previous systems, with high precision, establishing a new state-of-the-art performance in OpenIE.
