Paper Title
The Principles of Data-Centric AI (DCAI)
Paper Authors
Paper Abstract
Data is a crucial infrastructure to how artificial intelligence (AI) systems learn. However, these systems to date have been largely model-centric, putting a premium on the model at the expense of the data quality. Data quality issues beset the performance of AI systems, particularly in downstream deployments and in real-world applications. Data-centric AI (DCAI) as an emerging concept brings data, its quality and its dynamism to the forefront in considerations of AI systems through an iterative and systematic approach. As one of the first overviews, this article brings together data-centric perspectives and concepts to outline the foundations of DCAI. It specifically formulates six guiding principles for researchers and practitioners and gives direction for future advancement of DCAI.