Paper Title

OLGA : An Ontology and LSTM-based approach for generating Arithmetic Word Problems (AWPs) of transfer type

Authors

Kumar, Suresh; Kumar, P Sreenivasa

Abstract

Machine generation of Arithmetic Word Problems (AWPs) is challenging because they express quantities and mathematical relationships that must be consistent. ML-solvers require a large annotated training set of consistent problems with language variations. Consistency checking requires exploiting domain knowledge, whereas LSTM-based approaches are good at producing text with language variations. Combining the two, we propose a system, OLGA, to generate consistent word problems of TC (Transfer-Case) type, which involve object transfers among agents. Although we provide a dataset of consistent 2-agent TC-problems for training, only about 36% of the outputs of an LSTM-based generator are found to be consistent. We use an extension of the TC-Ontology, which we proposed previously, to determine the consistency of problems. Among the remaining 64%, about 40% have minor errors, which we repair using the same ontology. To check consistency and to carry out the repair, we construct an instance-specific representation (ABox) of each auto-generated problem. We use a sentence classifier and BERT models for this task. The training set for these LMs consists of problem texts in which sentence parts are annotated with ontology class names. Because 3-agent problems are longer, the percentage of consistent problems generated by an LSTM-based approach drops further. Hence, we propose an ontology-based method that extends consistent 2-agent problems into consistent 3-agent problems. Overall, our approach generates a large number of consistent TC-type AWPs involving 2 or 3 agents. Because the ABox holds all the information about a problem, any annotations can also be generated. Adapting the proposed approach to generate other types of AWPs is interesting future work.
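
To make the notion of "consistency" concrete, the following is a minimal Python sketch of the kind of checks a 2-agent transfer problem might need to pass. It is an illustration only and not the authors' TC-Ontology or ABox: the `TransferProblem` record, its field names, and the specific rules in `find_inconsistencies` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TransferProblem:
    """Hypothetical flat record of a 2-agent transfer problem (not the paper's ABox)."""
    giver: str
    receiver: str
    owned_object: str        # object type the giver holds before the transfer
    giver_before: int        # quantity the giver holds before the transfer
    transferred_object: str  # object type named in the transfer sentence
    transferred: int         # quantity handed over
    asked_agent: str         # agent the question asks about


def find_inconsistencies(p: TransferProblem) -> list[str]:
    """Return human-readable reasons why the problem is inconsistent (empty if none)."""
    issues = []
    if p.giver == p.receiver:
        issues.append("giver and receiver are the same agent")
    if p.transferred_object != p.owned_object:
        issues.append("transferred object differs from the object owned")
    if p.giver_before < 0 or p.transferred <= 0:
        issues.append("quantities must be positive")
    if p.transferred > p.giver_before:
        issues.append("giver transfers more than they hold")
    if p.asked_agent not in (p.giver, p.receiver):
        issues.append("question asks about an unknown agent")
    return issues


# Example: "Ana has 5 apples. She gives 2 apples to Ben. How many does Ana have now?"
problem = TransferProblem("Ana", "Ben", "apple", 5, "apple", 2, "Ana")
print(find_inconsistencies(problem))
```

In this sketch an empty list means the problem passed every assumed rule; a generated problem in which the giver hands over more objects than they hold, or in which the object type changes mid-problem, would be flagged and could then be considered for repair.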
