Paper Title
Estimation of Entropy in Constant Space with Improved Sample Complexity
Paper Authors
Paper Abstract
Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal{D}$ over an alphabet of size $k$ up to $\pm \varepsilon$ additive error by streaming over $(k/\varepsilon^3) \cdot \text{polylog}(1/\varepsilon)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant-memory scheme that reduces the sample complexity to $(k/\varepsilon^2) \cdot \text{polylog}(1/\varepsilon)$. We conjecture that this is optimal up to $\text{polylog}(1/\varepsilon)$ factors.
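For context, the sketch below illustrates the estimation task itself with the standard plug-in (empirical) entropy estimator. This is *not* the paper's algorithm: the plug-in estimator stores all $k$ symbol counts and therefore uses $O(k)$ words of memory, whereas the scheme above achieves $O(1)$ words. The distribution, seed, and sample budget here are illustrative assumptions only.

```python
import numpy as np

def plugin_entropy(samples, k):
    """Plug-in (empirical) entropy estimate in nats.

    Stores all k counts, so this baseline uses O(k) memory --
    unlike the constant-memory streaming scheme of the paper.
    """
    counts = np.bincount(samples, minlength=k)
    p = counts[counts > 0] / len(samples)
    return float(-(p * np.log(p)).sum())

# Hypothetical usage: draw n i.i.d. samples from a random
# distribution over k symbols and compare estimate to truth.
rng = np.random.default_rng(0)
k = 1000
true_p = rng.dirichlet(np.ones(k))
true_H = float(-(true_p * np.log(true_p)).sum())
n = 50 * k  # on the order of k / eps^2 samples for eps ~ 0.14
samples = rng.choice(k, size=n, p=true_p)
print(f"true H = {true_H:.3f}, plug-in = {plugin_entropy(samples, k):.3f}")
```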