Generation of complex database queries and API calls from natural language utterances

Paper title

Generation of complex database queries and API calls from natural language utterances

Authors

Amol Kelkar, Nachiketa Rajpurohit, Utkarsh Mittal, Peter Relan

Abstract

Generating queries corresponding to natural language questions is a long-standing problem. Traditional methods lack language flexibility, while newer sequence-to-sequence models require large amounts of data. Schema-agnostic sequence-to-sequence models can be fine-tuned for a specific schema using a small dataset, but these models have relatively low accuracy. We present a method that transforms the query generation problem into an intent classification and slot filling problem. This method can work using small datasets. For questions similar to the ones in the training dataset, it produces complex queries with high accuracy. For other questions, it can use a template-based approach or predict query pieces to construct the queries, still at a higher accuracy than sequence-to-sequence models. On a real-world dataset, a schema fine-tuned state-of-the-art generative model had 60% exact match accuracy for the query generation task, while our method resulted in 92% exact match accuracy.
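
The idea of recasting query generation as intent classification plus slot filling can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `orders` schema, the intent names, the keyword rules, and the regex slot extractors are all hypothetical stand-ins for the trained models the abstract refers to.

```python
import re

# Hypothetical intent -> query template mapping; the `orders` schema is
# invented for illustration and does not come from the paper.
TEMPLATES = {
    "count_orders_by_status": "SELECT COUNT(*) FROM orders WHERE status = '{status}'",
    "orders_by_customer": "SELECT * FROM orders WHERE customer = '{customer}'",
}

# Toy keyword rules standing in for a trained intent classifier (assumption).
INTENT_KEYWORDS = {
    "count_orders_by_status": ("how many", "count"),
    "orders_by_customer": ("orders for", "orders from"),
}

# Toy regex extractors standing in for a trained slot-filling model (assumption).
SLOT_PATTERNS = {
    "status": re.compile(r"\b(shipped|pending|cancelled)\b"),
    "customer": re.compile(r"\borders (?:for|from) (\w+)"),
}


def classify_intent(utterance):
    """Return the first intent whose keywords appear in the utterance, else None."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def fill_slots(utterance):
    """Extract every slot value that one of the patterns can find."""
    text = utterance.lower()
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            slots[name] = match.group(1)
    return slots


def generate_query(utterance):
    """Map an utterance to a query by filling the template of its intent."""
    intent = classify_intent(utterance)
    if intent is None:
        return None
    try:
        # String interpolation keeps the sketch short; real code should use
        # parameterized queries rather than formatting values into SQL.
        return TEMPLATES[intent].format(**fill_slots(utterance))
    except KeyError:
        return None  # a slot required by the template was not found
```

Calling `generate_query("How many orders are shipped?")` selects the counting template and fills its `status` slot, while an utterance that matches no intent returns `None`. In this framing the query structure lives in the templates, so the learned components only have to pick a class and extract values, which is consistent with the abstract's claim that the method can work with small datasets.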
