Paper Title
Mining Documentation to Extract Hyperparameter Schemas
Paper Authors
Paper Abstract
AI automation tools need machine-readable hyperparameter schemas to define their search spaces. At the same time, AI libraries often come with good human-readable documentation. While such documentation contains most of the necessary information, it is unfortunately not ready to be consumed by tools. This paper describes how to automatically mine Python docstrings in AI libraries to extract JSON Schemas for their hyperparameters. We evaluate our approach on 119 transformers and estimators from three different libraries and find that it is effective at extracting machine-readable schemas. Our vision is to reduce the burden of manually creating and maintaining such schemas for AI automation tools and to broaden the reach of automation to larger libraries and richer schemas.
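To make the idea of turning a docstring into a JSON Schema concrete, here is a minimal sketch. It is not the paper's extraction method: it assumes a simplified numpydoc-style "Parameters" section, and the example docstring, the function names `docstring_to_schema` and `parse_default`, and the type mapping are all hypothetical and written only for illustration.

```python
import json
import re

# Hypothetical numpydoc-style docstring, written for this illustration only.
EXAMPLE_DOC = """
Parameters
----------
n_components : int, default=2
    Number of components to keep.
whiten : bool, default=False
    Whether to whiten the output.
solver : {'auto', 'full', 'randomized'}, default='auto'
    Solver to use.

Attributes
----------
mean_ : array
    Per-feature mean.
"""

# Assumed mapping from Python type names to JSON Schema type names.
PY_TO_JSON = {"int": "integer", "float": "number", "bool": "boolean", "str": "string"}
# Matches unindented "name : spec" lines; indented description lines do not match.
PARAM_LINE = re.compile(r"^(\w+) : (.+)$")


def parse_default(text):
    """Turn a default such as 2, False, or 'auto' into a JSON-compatible value."""
    text = text.strip()
    if text in ("True", "False"):
        return text == "True"
    if text == "None":
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text.strip("'\"")


def docstring_to_schema(doc):
    """Build a JSON Schema object from the Parameters section of a docstring."""
    lines = doc.splitlines()
    properties = {}
    in_params = False
    for i, line in enumerate(lines):
        # A section header is a line followed by a line made only of dashes.
        underlined = i + 1 < len(lines) and set(lines[i + 1].strip()) == {"-"}
        if underlined:
            in_params = line.strip() == "Parameters"
            continue
        match = PARAM_LINE.match(line)
        if not in_params or not match:
            continue
        name, spec = match.groups()
        spec, _, default = spec.partition(", default=")
        prop = {}
        enum = re.fullmatch(r"\{(.*)\}", spec.strip())
        if enum:
            # A braced list of choices becomes a JSON Schema enum.
            prop["enum"] = [v.strip().strip("'\"") for v in enum.group(1).split(",")]
        elif spec.strip() in PY_TO_JSON:
            prop["type"] = PY_TO_JSON[spec.strip()]
        if default:
            prop["default"] = parse_default(default)
        properties[name] = prop
    return {"type": "object", "properties": properties}


if __name__ == "__main__":
    print(json.dumps(docstring_to_schema(EXAMPLE_DOC), indent=2))
```

Real docstrings are far less regular than this example (free-form type descriptions, ranges stated in prose, constraints between hyperparameters), which is where a line-based parser like this one would fall short.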