Paper Title
Towards Dynamic and Safe Configuration Tuning for Cloud Databases
Authors
Abstract
Configuration knobs of database systems are essential to achieving high throughput and low latency. Recently, automatic tuning systems based on machine learning (ML) have been shown to find better configurations than experienced database administrators (DBAs). However, gaps remain in applying the existing systems in production environments, especially in the cloud. First, they tune for a given workload within a limited time window and ignore the dynamicity of workloads and data. Second, they rely on a copied instance and do not consider the availability of the database when sampling configurations, which makes tuning expensive, delayed, and unsafe. To fill these gaps, we propose OnlineTune, which tunes online databases safely in changing cloud environments. To accommodate the dynamicity, OnlineTune embeds the environmental factors as context features and adopts contextual Bayesian Optimization with context space partitioning to optimize the database adaptively and scalably. To pursue safety during tuning, we leverage both black-box and white-box knowledge to evaluate the safety of configurations and propose a safe exploration strategy via subspace adaptation. We conduct evaluations on dynamic workloads from benchmarks and on real-world workloads. Compared with state-of-the-art methods, OnlineTune improves cumulative performance by 14.4% to 165.3% while reducing unsafe configuration recommendations by 91.0% to 99.5%.
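As a rough illustration of what combining black-box and white-box knowledge for safety evaluation could look like, the sketch below filters candidate configurations in two steps: hand-written rules (white-box) must accept the configuration, and a pessimistic lower confidence bound from a surrogate model (black-box) must clear a safety threshold. This is not OnlineTune's implementation. The names `Prediction`, `is_safe`, `safe_candidates`, `toy_predict`, `memory_rule`, the knob `buffer_gb`, and all numbers are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Config = Dict[str, float]


@dataclass
class Prediction:
    mean: float  # predicted performance (e.g. throughput) from a surrogate model
    std: float   # predictive uncertainty of that estimate


def is_safe(config: Config,
            predict: Callable[[Config], Prediction],
            rules: List[Callable[[Config], bool]],
            threshold: float,
            beta: float = 2.0) -> bool:
    """Return True if the configuration passes both safety checks."""
    # White-box check: every hand-written rule must accept the configuration.
    if not all(rule(config) for rule in rules):
        return False
    # Black-box check: the lower confidence bound (mean - beta * std)
    # must stay at or above the safety threshold.
    p = predict(config)
    return p.mean - beta * p.std >= threshold


def safe_candidates(candidates: List[Config],
                    predict: Callable[[Config], Prediction],
                    rules: List[Callable[[Config], bool]],
                    threshold: float,
                    beta: float = 2.0) -> List[Config]:
    """Keep only the candidates that are judged safe, preserving order."""
    return [c for c in candidates
            if is_safe(c, predict, rules, threshold, beta)]


def toy_predict(config: Config) -> Prediction:
    # Hypothetical surrogate: performance peaks at buffer_gb == 8, and
    # uncertainty grows with distance from the well-explored value 4.
    mean = 100.0 - (config["buffer_gb"] - 8.0) ** 2
    std = abs(config["buffer_gb"] - 4.0)
    return Prediction(mean, std)


def memory_rule(config: Config) -> bool:
    # Hypothetical white-box rule: the buffer must fit in 16 GB of memory.
    return config["buffer_gb"] <= 16.0


candidates = [{"buffer_gb": g} for g in (2.0, 4.0, 8.0, 12.0, 20.0)]
safe = safe_candidates(candidates, toy_predict, [memory_rule], threshold=80.0)
print([c["buffer_gb"] for c in safe])  # prints [4.0, 8.0]
```

In this toy setup the value 12.0 is rejected even though its predicted mean (84) is above the threshold, because its uncertainty is high; 20.0 is rejected by the rule before the model is consulted. A real tuner would replace `toy_predict` with a fitted surrogate such as a Gaussian process.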