Paper Title

Hyperbolic Centroid Calculations for Text Classification

Authors

Aydın Gerek, Cüneyt Ferahlar, Bilge Şipal Sert, Mehmet Can Yüney, Onur Taşdemir, Zeynep Billur Kalafat, Mert Kelkit, Murat Can Ganiz

Abstract

A new development in NLP is the construction of hyperbolic word embeddings. As opposed to their Euclidean counterparts, hyperbolic embeddings are represented not by vectors, but by points in hyperbolic space. This makes the most common basic scheme for constructing document representations, namely the averaging of word vectors, meaningless in the hyperbolic setting. We reinterpret the vector mean as the centroid of the points represented by the vectors, and investigate various hyperbolic centroid schemes and their effectiveness at text classification.
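
To make the idea of a hyperbolic centroid concrete, here is a minimal sketch of one well-known scheme, the Einstein midpoint. The abstract does not say which centroid schemes the paper evaluates, so this is an illustration under assumptions, not the authors' method. It assumes word embeddings are points in the Poincaré ball of curvature -1, and all function names are hypothetical.

```python
import math

def poincare_to_klein(p):
    """Map a Poincaré-ball point to the Klein model of the same hyperbolic space."""
    sq = sum(x * x for x in p)
    return [2.0 * x / (1.0 + sq) for x in p]

def klein_to_poincare(k):
    """Inverse of poincare_to_klein."""
    sq = sum(x * x for x in k)
    return [x / (1.0 + math.sqrt(1.0 - sq)) for x in k]

def einstein_midpoint(points):
    """Hyperbolic centroid of Poincaré-ball points via the Einstein midpoint.

    Assumption: every point has Euclidean norm strictly below 1.
    """
    klein = [poincare_to_klein(p) for p in points]
    # Lorentz factor of each Klein point; it grows without bound near the boundary.
    gammas = [1.0 / math.sqrt(1.0 - sum(x * x for x in k)) for k in klein]
    total = sum(gammas)
    dim = len(points[0])
    mid = [sum(g * k[i] for g, k in zip(gammas, klein)) / total for i in range(dim)]
    return klein_to_poincare(mid)

def euclidean_mean(points):
    """Plain coordinate-wise average, shown only for comparison."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

# Hypothetical two-word "document" embedded in a 2-D Poincaré ball.
word_points = [[0.0, 0.0], [0.9, 0.0]]
hyperbolic_centroid = einstein_midpoint(word_points)
naive_average = euclidean_mean(word_points)
```

For these two points the Einstein midpoint coincides with the geodesic midpoint, whose first coordinate is tanh(artanh(0.9) / 2), roughly 0.627, while the coordinate-wise average gives 0.45. The gap illustrates why averaging coordinates does not respect hyperbolic geometry.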
