论文标题
计算与加入计数查询的本地敏感性
Computing Local Sensitivities of Counting Queries with Joins
论文作者
论文摘要
查询Q的局部灵敏度给定数据库实例D,即,当元组添加到D或从D中删除时,输出Q(D)的变化发生了多少,具有许多应用程序,包括查询分析,离群值检测和差异隐私。但是,即使是针对无环的查询,也可以从查询的大小上找到结合查询的局部灵敏度。尽管当查询大小固定时,复杂性是多项式的,但对于涉及多个连接的大型数据库和查询,幼稚算法并不有效。在本文中,我们提出了一种新颖的方法来计算通过跟踪和汇总元组敏感性来计算涉及加入操作的局部敏感性 - 当添加或删除时,元组可能会导致元组引起元素的最大变化。我们为完整的无环的敏感性问题提供了算法,使用树木的敏感性问题,这些查询在数据库的大小和查询中都以多项式时间运行,并查询了一个有趣的查询子类,我们将其称为“ doubly acyclic查询”,其中包括路径查询,以及在最大程度的复杂性中,在最大程度上是在同一范围内的综合度。我们的算法可以使用通用的HyperTree分解来扩展到某些非囊泡查询。我们通过实验评估我们的方法,并显示我们算法的应用,以通过数量级获得更好的差异隐私结果。
Local sensitivity of a query Q given a database instance D, i.e. how much the output Q(D) changes when a tuple is added to D or deleted from D, has many applications including query analysis, outlier detection, and in differential privacy. However, it is NP-hard to find local sensitivity of a conjunctive query in terms of the size of the query, even for the class of acyclic queries. Although the complexity is polynomial when the query size is fixed, the naive algorithms are not efficient for large databases and queries involving multiple joins. In this paper, we present a novel approach to compute local sensitivity of counting queries involving join operations by tracking and summarizing tuple sensitivities -- the maximum change a tuple can cause in the query result when it is added or removed. We give algorithms for the sensitivity problem for full acyclic join queries using join trees, that run in polynomial time in both the size of the database and query for an interesting sub-class of queries, which we call 'doubly acyclic queries' that include path queries, and in polynomial time in combined complexity when the maximum degree in the join tree is bounded. Our algorithms can be extended to certain non-acyclic queries using generalized hypertree decompositions. We evaluate our approach experimentally, and show applications of our algorithms to obtain better results for differential privacy by orders of magnitude.