Stacking Neural Network Models for Automatic Short Answer Scoring

Paper Title

Stacking Neural Network Models for Automatic Short Answer Scoring

Authors

Rajagede, Rian Adam; Hastuti, Rochana Prih

Abstract

Automatic short answer scoring is a text classification problem in which students' exam answers are assessed automatically. Building such a system poses several challenges, one of which is the quantity and quality of the data. Labeling data is not easy because it requires a human annotator who is an expert in the field. Class imbalance is a further challenge, because correct answers are always far fewer than wrong answers. In this paper, we propose a stacking model based on a neural network and XGBoost that classifies answers using sentence embedding features. We also propose using a data upsampling method to handle the imbalanced classes and a hyperparameter optimization algorithm to find a robust model automatically. On the Ukara 1.0 Challenge dataset, our best model obtained an F1-score of 0.821, exceeding previous work on the same dataset.
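The abstract mentions upsampling the minority class and reporting an F1-score. As a rough illustration only, the sketch below shows one common form of upsampling (random duplication of minority-class samples) and a plain binary F1 computation, using only the Python standard library. The abstract does not say which upsampling method the authors used, so random oversampling is an assumption here, and the function names `upsample_minority` and `f1_score` are hypothetical rather than taken from the paper.

```python
import random


def upsample_minority(samples, labels, seed=0):
    """Randomly duplicate samples so every class matches the largest class.

    Assumption: plain random oversampling with replacement; the paper's
    actual upsampling method is not described in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw extra copies only for classes smaller than the largest one.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: four "wrong" answers (0) and one "correct" answer (1).
answers = ["a", "b", "c", "d", "e"]
scores = [0, 0, 0, 0, 1]
balanced_x, balanced_y = upsample_minority(answers, scores)
```

In practice, upsampling of this kind is usually applied to the training split only, so that duplicated samples do not leak into the data used to compute the F1-score.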

Paper Title