Paper title
ObjTables: structured spreadsheets that promote data quality, reuse, and integration
Paper authors
Abstract
A central challenge in science is to understand how system behaviors emerge from complex networks. This often requires aggregating, reusing, and integrating heterogeneous information. Spreadsheets published as supplements to articles are a key data source. Spreadsheets are popular because they are easy to read and write. However, they are often difficult to reanalyze because they capture data ad hoc, without schemas that define the objects, relationships, and attributes that they represent. To help researchers reuse and compose spreadsheets, we developed ObjTables, a toolkit that makes spreadsheets both human- and machine-readable by combining them with schemas and an object-relational mapping system. ObjTables includes a format for schemas; markup for indicating the class and attribute represented by each spreadsheet and column; numerous data types for scientific information; and high-level software for using schemas to read, write, validate, compare, merge, revise, and analyze spreadsheets. By making spreadsheets easier to reuse, ObjTables could enable unprecedented secondary meta-analyses. By making it easy to build new formats and associated software for new types of data, ObjTables can also accelerate emerging scientific fields.
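
To make the idea of pairing a table with a schema more concrete, the following is a minimal Python sketch using only the standard library. It does not use the ObjTables software or its markup. The `Metabolite` class, the `COLUMN_TYPES` mapping, the `read_table` helper, and the sample data are all hypothetical names and values introduced here for illustration. The sketch checks a table's header against a schema, converts each cell to its declared type, and maps valid rows to objects while collecting validation errors.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Metabolite:
    """Hypothetical schema class: one object per row of a 'Metabolite' table."""
    id: str
    name: str
    charge: int


# Hypothetical schema: column name -> type each cell must convert to.
COLUMN_TYPES = {"id": str, "name": str, "charge": int}


def read_table(text, cls, column_types):
    """Parse comma-separated text into cls instances and a list of error messages."""
    reader = csv.DictReader(io.StringIO(text))
    expected = list(column_types)
    # Reject tables whose header does not match the schema exactly.
    if reader.fieldnames != expected:
        return [], [f"header {reader.fieldnames} does not match schema {expected}"]

    objects, errors = [], []
    for row_num, row in enumerate(reader, start=2):  # row 1 is the header
        values = {}
        for column, convert in column_types.items():
            try:
                values[column] = convert(row[column])
            except (TypeError, ValueError):
                errors.append(
                    f"row {row_num}: {column}={row[column]!r} is not {convert.__name__}"
                )
        # Only rows in which every cell validated become objects.
        if len(values) == len(expected):
            objects.append(cls(**values))
    return objects, errors


# Made-up sample table; the last row has an invalid charge.
TABLE = "id,name,charge\natp,ATP,-4\nh2o,water,0\nglc,glucose,neutral\n"

objects, errors = read_table(TABLE, Metabolite, COLUMN_TYPES)
for obj in objects:
    print(obj)
for message in errors:
    print("validation error:", message)
```

Because the schema lives outside the spreadsheet, the same table stays easy to edit by hand, while software can reject malformed rows before analysis. A fuller toolkit of the kind the abstract describes would also need relationships between tables and richer scientific data types, which this sketch omits.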