Paper title
Expressiveness and machine processability of Knowledge Organization Systems (KOS): An analysis of concepts and relations
Authors
Abstract
This study considers the expressiveness (that is, the expressive power or expressivity) of different types of Knowledge Organization Systems (KOS) and discusses their potential to be machine-processable in the context of the Semantic Web. For this purpose, the theoretical foundations of KOS are reviewed based on the conceptualizations introduced by the Functional Requirements for Subject Authority Data (FRSAD) and the Simple Knowledge Organization System (SKOS); natural language processing techniques are also implemented. A comparative analysis is applied to a dataset comprising a thesaurus (Eurovoc), a subject headings system (LCSH) and a classification scheme (DDC). These are compared with an ontology (CIDOC-CRM) by focusing on how they define and handle concepts and relations. LCSH and DDC are observed to focus on the formalism of character strings (nomens) rather than on the modelling of semantics; their definition of what constitutes a concept is quite fuzzy, and they comprise a large number of complex concepts. By contrast, thesauri have a coherent definition of what constitutes a concept and apply a systematic approach to the modelling of relations. Ontologies explicitly define diverse types of relations and are by their nature machine-processable. The paper concludes that the potential for both expressiveness and machine processability of each KOS is extensively regulated by its structural rules. Subject headings and classification schemes are harder to represent as semantic networks with nodes and arcs, while thesauri are more suitable for such a representation. In addition, a paradigm shift is revealed, one that focuses on the modelling of relations between concepts rather than on the concepts themselves.