Paper Title

Kratt: Developing an Automatic Subject Indexing Tool for The National Library of Estonia

Authors

Marit Asula, Jane Makke, Linda Freienthal, Hele-Andra Kuulmets, Raul Sirel

Abstract

Manual subject indexing in libraries is a time-consuming and costly process, and the quality of the assigned subjects is affected by the cataloguer's knowledge of the specific topics contained in the book. To address these issues, we exploited the opportunities arising from artificial intelligence to develop Kratt: a prototype of an automatic subject indexing tool. Kratt is able to subject index a book, independent of its extent and genre, with a set of keywords present in the Estonian Subject Thesaurus. It takes Kratt approximately 1 minute to subject index a book, which is 10-15 times faster than a human cataloguer. Although the resulting keywords were not considered satisfactory by the cataloguers, the ratings of a small sample of regular library users showed more promise. We also argue that the results can be enhanced by including a bigger corpus for training the model and applying more careful preprocessing techniques.
