Paper Title

ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases

Paper Authors

Guang Chao Wang, Kenny Gross, Akshay Subramaniam

Paper Abstract

Deploying big-data Machine Learning (ML) services in a cloud environment presents a challenge to the cloud vendor with respect to cloud container configuration sizing for any given customer use case. OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale customer ML use cases of any size across the range of cloud CPU-GPU "Shapes" (configurations of CPUs and/or GPUs in cloud containers available to end customers). Moreover, the OracleLabs and NVIDIA authors have collaborated on an ML benchmark study that analyzes the compute cost and GPU acceleration of any ML prognostic algorithm and assesses the reduction of compute cost in a cloud container comprising conventional CPUs and NVIDIA GPUs.
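The abstract describes a nested-loop Monte Carlo approach: an outer loop over candidate CPU-GPU "Shapes" and an inner loop of simulated runs of the customer workload, from which the cheapest configuration meeting a service target can be selected. The paper itself gives no implementation details, so the following is only a minimal illustrative sketch of that idea: the shape names, cost figures, and toy runtime model are all invented assumptions, not Oracle's actual framework.

```python
import random

# Hypothetical catalog of cloud "Shapes" (names and prices are illustrative,
# not actual cloud offerings): (name, n_cpus, n_gpus, cost_per_hour_usd)
SHAPES = [
    ("CPU.4",   4, 0, 0.20),
    ("CPU.16", 16, 0, 0.80),
    ("GPU.1",   8, 1, 1.50),
    ("GPU.2",  16, 2, 3.00),
]

def simulate_runtime_hours(n_cpus, n_gpus, workload_gb, rng):
    """Toy runtime model with run-to-run noise. A real framework would
    benchmark the customer's ML algorithm on each shape instead."""
    base = workload_gb / (n_cpus * 0.5)       # assumed CPU-bound throughput
    speedup = 1.0 + 4.0 * n_gpus              # assumed GPU acceleration factor
    noise = rng.lognormvariate(0.0, 0.15)     # stochastic variability per run
    return (base / speedup) * noise

def scope_shape(workload_gb, deadline_hours, n_trials=1000, seed=42):
    """Nested-loop Monte Carlo scoping: the outer loop walks candidate
    shapes, the inner loop draws Monte Carlo runtime samples. Returns the
    cheapest (shape_name, expected_cost) meeting the runtime deadline."""
    rng = random.Random(seed)
    best = None
    for name, cpus, gpus, cost_per_hour in SHAPES:   # outer loop: shapes
        runs = [simulate_runtime_hours(cpus, gpus, workload_gb, rng)
                for _ in range(n_trials)]            # inner loop: MC trials
        mean_runtime = sum(runs) / n_trials
        expected_cost = mean_runtime * cost_per_hour
        if mean_runtime <= deadline_hours and (
                best is None or expected_cost < best[1]):
            best = (name, expected_cost)
    return best

print(scope_shape(workload_gb=100, deadline_hours=10))
```

Under these made-up parameters, CPU-only shapes miss the 10-hour deadline while the GPU shapes meet it, and the two-GPU shape wins on expected cost despite its higher hourly rate, illustrating the cost-reduction trade-off the benchmark study quantifies.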
