Paper Title
Real-Time Streaming and Event-driven Control of Scientific Experiments
Paper Authors
Abstract
Advancements in scientific instrument sensors and connected devices provide unprecedented insight into ongoing experiments and present new opportunities for control, optimization, and steering. However, the diversity of sensors and the heterogeneity of their data make it challenging to fully realize these new opportunities. Organizing and synthesizing diverse data streams in near real time requires both rich automation and Machine Learning (ML). To efficiently utilize ML during an experiment, the entire ML lifecycle must be addressed, including refining experiment configurations, retraining models, and applying decisions. These tasks require an equally diverse array of computational resources, ranging from centralized HPC to accelerators at the edge. Here we present the Manufacturing Data and Machine Learning platform (MDML). The MDML is designed to standardize the research and operational environment for advanced data analytics and ML-enabled automated process optimization by providing the cyberinfrastructure to integrate sensor data streams and AI in cyber-physical systems for in-situ analysis. To achieve this, the MDML provides a fabric to receive and aggregate IoT data and simultaneously orchestrate remote computation across the computing continuum. In this paper, we describe the MDML and show how it is used in advanced manufacturing to act on IoT data and orchestrate distributed ML to guide experiments.