Advanced Semantics for Commonsense Knowledge Extraction

Paper title

Advanced Semantics for Commonsense Knowledge Extraction

Authors

Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum

Abstract

Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. A web interface, data and code can be found at https://ascent.mpi-inf.mpg.de/.
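
The representation described in the abstract (composite subjects with subgroups and aspects, plus semantic facets that qualify an assertion) can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class names, field names, facet labels and the example assertion are hypothetical and are not taken from the Ascent code or data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Concept:
    """A subject: a primary concept, optionally narrowed to a subgroup or an aspect.

    Hypothetical structure for illustration, not the Ascent schema.
    """
    name: str
    subgroup: Optional[str] = None  # e.g. a more specific kind of the concept
    aspect: Optional[str] = None    # e.g. a part or property of the concept

    def label(self) -> str:
        # Prefer the subgroup phrase, then "<concept> <aspect>", then the bare name.
        if self.subgroup:
            return self.subgroup
        if self.aspect:
            return f"{self.name} {self.aspect}"
        return self.name


@dataclass
class Assertion:
    """An SPO assertion refined by semantic facets (label -> qualifier text)."""
    subject: Concept
    predicate: str
    obj: str
    facets: Dict[str, str] = field(default_factory=dict)

    def as_triple(self) -> Tuple[str, str, str]:
        # Dropping the facets falls back to a plain SPO triple.
        return (self.subject.label(), self.predicate, self.obj)


# Made-up example data, not drawn from the Ascent KB.
example = Assertion(
    subject=Concept("elephant", aspect="trunk"),
    predicate="is used for",
    obj="picking up food",
    facets={"temporal": "while feeding"},
)
print(example.as_triple())
print(example.facets)
```

Keeping the facets in a separate mapping means an assertion still reduces to a plain triple when the qualifiers are ignored, which is one simple way to compare such a representation against triple-only collections.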
