Paper Title

What do complexity measures measure? Correlating and validating corpus-based measures of morphological complexity

Authors

Çağrı Çöltekin, Taraka Rama

Abstract

We present an analysis of eight measures used for quantifying the morphological complexity of natural languages. The measures we study are corpus-based measures of morphological complexity with varying requirements for corpus annotation. We present similarities and differences between these measures visually and through correlation analyses, as well as their relation to the relevant typological variables. Our analysis focuses on whether these 'measures' are measures of the same underlying variable, or whether they measure more than one dimension of morphological complexity. A principal component analysis shows that the first principal component explains 92.62% of the variation in the eight measures, indicating a strong linear dependence between the complexity measures studied.
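
To make the principal component claim concrete, the following is a minimal sketch of how the share of variance explained by the first principal component can be computed for a languages-by-measures score matrix. It is not the authors' code: the function names, the standardization step, and the synthetic data are all assumptions made for illustration, and the numbers it prints have no relation to the paper's results.

```python
import numpy as np


def explained_variance_ratio(scores):
    """Fraction of total variance explained by each principal component.

    `scores` has shape (n_languages, n_measures). Columns are standardized
    first (an assumption; the abstract does not say how measures were scaled).
    """
    scores = np.asarray(scores, dtype=float)
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    # Squared singular values of the standardized matrix are proportional
    # to the variances along the principal components (sorted descending).
    singular_values = np.linalg.svd(z, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def synthetic_scores(n_languages=50, n_measures=8, noise=0.3, seed=0):
    """Hypothetical data: eight noisy readings of one latent variable."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_languages, 1))
    loadings = rng.uniform(0.5, 1.5, size=(1, n_measures))
    return latent @ loadings + noise * rng.normal(size=(n_languages, n_measures))


ratios = explained_variance_ratio(synthetic_scores())
print(f"PC1 share of variance (synthetic data): {ratios[0]:.2%}")
```

When all columns are driven by a single latent variable, as in this synthetic setup, the first component dominates. That is the pattern the abstract reports for the eight measures.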
