Paper Title

Modeling community standards for metadata as templates makes data FAIR

Authors

Musen, Mark A., O'Connor, Martin J., Schultes, Erik, Martinez-Romero, Marcos, Hardi, Josef, Graybeal, John

Abstract

It is challenging to determine whether datasets are findable, accessible, interoperable, and reusable (FAIR) because the FAIR Guiding Principles refer to highly idiosyncratic criteria regarding the metadata used to annotate datasets. Specifically, the FAIR principles require metadata to be "rich" and to adhere to "domain-relevant" community standards. Scientific communities should be able to define their own machine-actionable templates for metadata that encode these "rich," discipline-specific elements. We have explored this template-based approach in the context of two software systems. One system is the CEDAR Workbench, which investigators use to author new metadata. The other is the FAIRware Workbench, which evaluates the metadata of archived datasets for their adherence to community standards. Benefits accrue when templates for metadata become central elements in an ecosystem of tools to manage online datasets--both because the templates serve as a community reference for what constitutes FAIR data, and because they embody that perspective in a form that can be distributed among a variety of software applications to assist with data stewardship and data sharing.
