Paper title
DeepSampling: Selectivity Estimation with Predicted Error and Response Time
Paper authors
Paper abstract
The rapid growth of spatial data urges the research community to find efficient processing techniques for interactive queries on large volumes of data. Approximate Query Processing (AQP) is the most prominent technique that can provide real-time answers to ad-hoc queries based on a random sample. Unfortunately, existing AQP methods provide an answer without any accuracy metrics because of the complex relationship between the sample size, the query parameters, the data distribution, and the result accuracy. This paper proposes DeepSampling, a deep-learning-based model that predicts the accuracy of a sample-based AQP algorithm, specifically selectivity estimation, given the sample size, the input distribution, and the query parameters. The model can also be reversed to estimate the sample size that would produce a desired accuracy. DeepSampling is the first system that provides a reliable tool for existing spatial databases to control the accuracy of AQP.
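To make the setting concrete, the following is a minimal sketch of sample-based selectivity estimation for a rectangular range query, the kind of AQP algorithm whose accuracy the abstract says the model predicts. It is not the DeepSampling model itself. The uniform synthetic data, the query rectangle, the sample sizes, and the function names `true_selectivity` and `estimate_selectivity` are all assumptions chosen for illustration.

```python
import random

def true_selectivity(points, query):
    """Fraction of the given points that fall inside the query rectangle."""
    x1, y1, x2, y2 = query
    hits = sum(1 for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2)
    return hits / len(points)

def estimate_selectivity(points, query, sample_size, rng):
    """Estimate selectivity from a uniform random sample drawn without replacement."""
    sample = rng.sample(points, sample_size)
    return true_selectivity(sample, query)

rng = random.Random(42)
# Assumption: synthetic, uniformly distributed points in the unit square.
points = [(rng.random(), rng.random()) for _ in range(20000)]
query = (0.2, 0.2, 0.6, 0.7)  # hypothetical range query (x1, y1, x2, y2)

exact = true_selectivity(points, query)
errors = {}
for n in (100, 1000, 10000):
    # Mean absolute error over repeated samples of size n.
    trials = [abs(estimate_selectivity(points, query, n, rng) - exact)
              for _ in range(30)]
    errors[n] = sum(trials) / len(trials)
    print(n, round(errors[n], 4))
```

In this toy setup the error shrinks as the sample grows, but how fast it shrinks depends on the data distribution and on the query. That dependence is the relationship the abstract describes as too complex for existing AQP methods to report.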