Paper Title
Toward a Flexible Metadata Pipeline for Fish Specimen Images
Paper Authors
Paper Abstract
Flexible metadata pipelines are crucial for supporting the FAIR data principles. Despite this need, researchers seldom report how they identify the metadata standards and protocols that best support flexibility. This paper reports on an initiative to develop a flexible metadata pipeline for a collection of over 300,000 digital fish specimen images, harvested from multiple data repositories and fish collections. The images and their associated metadata are being used for AI-related scientific research involving automated species identification, segmentation, and trait extraction. The paper provides contextual background, followed by the presentation of a four-phase approach involving: 1. Assessment of the Problem, 2. Investigation of Solutions, 3. Implementation, and 4. Refinement. The work is part of the NSF Harnessing the Data Revolution, Biology Guided Neural Networks (NSF/HDR-BGNN) project and the HDR Imageomics Institute. An RDF graph prototype pipeline is presented, followed by a discussion of research implications and a conclusion summarizing the results.