Paper title
Predicting Typological Features in WALS using Language Embeddings and Conditional Probabilities: ÚFAL Submission to the SIGTYP 2020 Shared Task
Paper authors
Abstract
We present our submission to the SIGTYP 2020 Shared Task on the prediction of typological features. We submit a constrained system that predicts typological features based only on the WALS database. We investigate two approaches. The simpler of the two estimates the correlation of feature values within languages by computing conditional probabilities and mutual information. The second trains a neural predictor that operates on precomputed language embeddings based on WALS features. Our submitted system combines the two approaches based on their self-estimated confidence scores. We reach an accuracy of 70.7% on the test data and rank first in the shared task.
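
To make the simpler approach concrete, the following is a minimal sketch of predicting a missing feature value from conditional probabilities, together with a confidence-based choice between two predictors. It is not the submitted system: the function names (`fit_conditional_counts`, `predict`, `combine`), the feature names, and the toy languages are all assumptions made for illustration, the confidence score here is simply the estimated conditional probability, and the mutual information weighting mentioned in the abstract is left out.

```python
from collections import Counter, defaultdict


def fit_conditional_counts(languages):
    """Count how often each target value co-occurs with a known feature=value.

    `languages` is a list of dicts mapping feature name -> value
    (hypothetical toy format, not the actual WALS file layout).
    """
    counts = defaultdict(Counter)
    for feats in languages:
        for f_known, v_known in feats.items():
            for f_target, v_target in feats.items():
                if f_known != f_target:
                    counts[(f_known, v_known, f_target)][v_target] += 1
    return counts


def predict(counts, known, target):
    """Return (value, confidence) for `target` given the known features.

    Each known feature=value yields an estimate of
    P(target = v | known feature = value); the single most confident
    estimate is returned. This selection rule is an assumption.
    """
    best_value, best_conf = None, 0.0
    for f_known, v_known in known.items():
        dist = counts.get((f_known, v_known, target))
        if not dist:
            continue  # this feature=value was never seen with the target
        value, n = dist.most_common(1)[0]
        conf = n / sum(dist.values())
        if conf > best_conf:
            best_value, best_conf = value, conf
    return best_value, best_conf


def combine(pred_a, pred_b):
    """Pick whichever (value, confidence) pair is more confident."""
    return pred_a if pred_a[1] >= pred_b[1] else pred_b


# Toy training data (invented values, not taken from WALS).
train = [
    {"word_order": "SOV", "adposition": "postposition"},
    {"word_order": "SOV", "adposition": "postposition"},
    {"word_order": "SVO", "adposition": "preposition"},
    {"word_order": "SOV", "adposition": "preposition"},
]
counts = fit_conditional_counts(train)
value, conf = predict(counts, {"word_order": "SOV"}, "adposition")
```

In this toy setting, two of the three SOV languages use postpositions, so the sketch predicts `postposition` with confidence 2/3. A second predictor, such as a neural one, could return its own `(value, confidence)` pair and be merged with `combine`.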