Paper Title
Transparently Capturing Request Execution Path for Anomaly Detection
Paper Authors
Paper Abstract
As cloud systems and big data analytics platforms grow in scale and complexity, it is becoming increasingly challenging to understand and diagnose how a service request is processed in such distributed platforms. One way to address this problem is to accurately capture the complete end-to-end execution path of service requests across all involved components. This paper presents REPTrace, a generic methodology for capturing such execution paths transparently. We analyze a comprehensive list of execution scenarios and propose principles and algorithms for generating the end-to-end request execution path in all of these scenarios. Moreover, this paper presents an anomaly detection approach that exploits request execution paths to detect execution anomalies during request processing. Experiments on four popular distributed platforms with different workloads show that REPTrace can transparently capture accurate request execution paths with reasonable latency and negligible network overhead. Fault injection experiments show that execution anomalies are detected with high recall (96%).