Key Phrase Classification in Complex Assignments

Paper Title

Key Phrase Classification in Complex Assignments

Authors

Ravikiran, Manikandan

Abstract

Complex assignments typically consist of open-ended questions with large and diverse content, in the context of both classroom and online graduate programs. The sheer scale of these programs brings a variety of problems in peer and expert feedback, including rogue reviews. With the aim of identifying the important content needed for review, we present a first work on key phrase classification, with a detailed empirical study of traditional and recent language modeling approaches. From this study, we find that the task of classifying key phrases is ambiguous even at the human level, with a Cohen's kappa of 0.77 on a new data set. Pretrained language models and simple TF-IDF SVM classifiers produce similar results, with the former averaging 0.6 F1 higher than the latter. Finally, we derive practical advice from our extensive empirical and model interpretability results for those interested in key phrase classification from educational reports.
