Paper Title

Correlation-wise Smoothing: Lightweight Knowledge Extraction for HPC Monitoring Data

Authors

Alessio Netti, Daniele Tafani, Michael Ott, Martin Schulz

Abstract

Modern High-Performance Computing (HPC) and data center operators increasingly rely on data analytics techniques to improve the efficiency and reliability of their operations. They employ models that ingest time-series monitoring sensor data and transform it into actionable knowledge for system tuning: a process known as Operational Data Analytics (ODA). However, monitoring data is high-dimensional, hardware-dependent and difficult to interpret. This, coupled with the strict requirements of ODA, makes most traditional data mining methods impractical and in turn renders this type of data cumbersome to process. Most current ODA solutions use ad-hoc processing methods that are not generic, are sensitive to the sensors' features and are not fit for visualization. In this paper we propose a novel method, called Correlation-wise Smoothing (CS), to extract descriptive signatures from time-series monitoring data in a generic and lightweight way. Our CS method exploits correlations between data dimensions to form groups and produces image-like signatures that can be easily manipulated, visualized and compared. We evaluate the CS method on HPC-ODA, a collection of datasets that we release with this work, and show that it matches the performance of most state-of-the-art methods while producing signatures that are up to ten times smaller and doing so up to ten times faster, and while additionally gaining visualizability, portability across systems and clear scaling properties.
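
The idea of grouping correlated dimensions and producing an image-like signature can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy ordering, the min-max normalization, and the function names `correlation_order` and `signature` are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def correlation_order(data):
    """Greedily order sensors (rows) so that correlated ones end up adjacent.

    Hypothetical helper: the abstract does not describe the actual ordering.
    """
    corr = np.nan_to_num(np.corrcoef(data))  # constant rows yield NaN; treat as 0
    n = corr.shape[0]
    # Start from the sensor that is most correlated with all others overall.
    order = [int(np.argmax(np.abs(corr).sum(axis=1)))]
    remaining = set(range(n)) - set(order)
    while remaining:
        last = order[-1]
        # Append the unused sensor most correlated with the last one placed.
        nxt = max(remaining, key=lambda j: abs(corr[last, j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def signature(data, order, n_blocks):
    """Average each group of adjacent sensors into one row of the signature.

    Assumes n_blocks <= number of sensors. Returns an (n_blocks, time) array
    with values in [0, 1], which can be shown as a grayscale image.
    """
    lo = data.min(axis=1, keepdims=True)
    hi = data.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero
    norm = (data - lo) / span               # per-sensor min-max normalization
    groups = np.array_split(norm[order], n_blocks)
    return np.vstack([g.mean(axis=0) for g in groups])


# Synthetic example: six sensors driven by two independent hidden signals.
rng = np.random.default_rng(0)
a, b = rng.standard_normal(50), rng.standard_normal(50)
data = np.vstack([(a if i % 2 == 0 else b) + 0.05 * rng.standard_normal(50)
                  for i in range(6)])

order = correlation_order(data)
sig = signature(data, order, n_blocks=2)
print(order, sig.shape)
```

In this synthetic case the six sensors collapse into a 2 x 50 array, one row per group of correlated sensors. The number of rows is chosen by the caller rather than fixed by the sensor count, which is one plausible reading of the compactness and portability the abstract claims.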
