Paper Title

Sample-Optimal and Efficient Learning of Tree Ising models

Authors

Constantinos Daskalakis, Qinxuan Pan

Abstract

We show that $n$-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance $\epsilon$ from an optimal $O(n \ln n/\epsilon^2)$ samples, where $O(\cdot)$ hides an absolute constant which, importantly, does not depend on the model being learned: neither its tree nor the magnitude of its edge strengths, on which we place no assumptions. Our guarantees hold, in fact, for the celebrated Chow-Liu [1968] algorithm, using the plug-in estimator for estimating mutual information. While this (or any other) algorithm may fail to identify the structure of the underlying model correctly from a finite sample, we show that it will still learn a tree-structured model that is $\epsilon$-close to the true one in total variation distance, a guarantee called "proper learning." Our guarantees do not follow from known results for the Chow-Liu algorithm or the ensuing literature on learning graphical models, including a recent renaissance of algorithms for this learning challenge, which only yield asymptotic consistency results, or sample-inefficient and/or time-inefficient algorithms, unless further assumptions are placed on the graphical model, such as bounds on the "strengths" of the model's edges/hyperedges. While we establish guarantees for a widely known and simple algorithm, the analysis showing that this algorithm succeeds and is sample-optimal is quite complex, requiring a hierarchical classification of the edges into layers with different reconstruction guarantees depending on their strength, combined with delicate uses of the subadditivity of the squared Hellinger distance over graphical models to control error accumulation.
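For intuition, here is a minimal sketch of the Chow-Liu procedure the abstract refers to, applied to binary ($\pm 1$) samples: estimate the mutual information of every pair of variables with the plug-in (empirical) estimator, then return a maximum-weight spanning tree under those weights. This is an illustrative implementation, not the paper's code; all function names are ours, and the analysis in the paper concerns the statistical guarantees of this estimator, not this particular implementation.

```python
# Sketch of the Chow-Liu algorithm with the plug-in mutual-information
# estimator, for n binary (+1/-1) variables. `samples` is an (m, n) array.
import numpy as np
from itertools import combinations

def plug_in_mutual_information(x, y):
    """Empirical (plug-in) mutual information between two +1/-1 columns."""
    mi = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a = np.mean(x == a)
            p_b = np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(samples):
    """Edges of a maximum-weight spanning tree where each pair (i, j) is
    weighted by its plug-in mutual information (Kruskal's algorithm)."""
    m, n = samples.shape
    weights = sorted(
        ((plug_in_mutual_information(samples[:, i], samples[:, j]), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    parent = list(range(n))  # union-find forest for cycle detection
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    edges = []
    for w, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) creates no cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

The paper's point is that even when the empirical weights rank the edges incorrectly (so the returned tree has the wrong structure), the tree-structured distribution fitted on the returned tree is still $\epsilon$-close in total variation to the true model.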
