Title
Equivalence of Convergence Rates of Posterior Distributions and Bayes Estimators for Functions and Nonparametric Functionals
Authors
Abstract
We study the posterior contraction rates of a Bayesian method with Gaussian process priors in nonparametric regression and its plug-in property for differential operators. For a general class of kernels, we establish convergence rates of the posterior measure of the regression function and its derivatives, which are both minimax optimal up to a logarithmic factor for functions in certain classes. Our calculation shows that the rate-optimal estimation of the regression function and its derivatives share the same choice of hyperparameter, indicating that the Bayes procedure remarkably adapts to the order of derivatives and enjoys a generalized plug-in property that extends real-valued functionals to function-valued functionals. This leads to a practically simple method for estimating the regression function and its derivatives, whose finite sample performance is assessed using simulations. Our proof shows that, under certain conditions, to any convergence rate of Bayes estimators there corresponds the same convergence rate of the posterior distributions (i.e., posterior contraction rate), and vice versa. This equivalence holds for a general class of Gaussian processes and covers the regression function and its derivative functionals, under both the $L_2$ and $L_{\infty}$ norms. In addition to connecting these two fundamental large sample properties in Bayesian and non-Bayesian regimes, such equivalence enables a new routine to establish posterior contraction rates by calculating convergence rates of nonparametric point estimators. At the core of our argument is an operator-theoretic framework for kernel ridge regression and equivalent kernel techniques. We derive a range of sharp non-asymptotic bounds that are pivotal in establishing convergence rates of nonparametric point estimators and the equivalence theory, which may be of independent interest.
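The "practically simple method" described above amounts to fitting a kernel ridge regression (the posterior mean under a Gaussian process prior) and reusing the same fitted coefficients and hyperparameters to estimate derivatives by differentiating the kernel. The sketch below illustrates this plug-in idea with a Gaussian kernel; the kernel choice, bandwidth `h`, and regularization `lam` are illustrative assumptions, not the paper's specific tuning.

```python
import numpy as np

def rbf(x, xp, h):
    # Gaussian kernel k(x, x') = exp(-(x - x')^2 / (2 h^2))
    return np.exp(-(x - xp) ** 2 / (2 * h ** 2))

def rbf_dx(x, xp, h):
    # derivative of k(x, x') with respect to its first argument x
    return -(x - xp) / h ** 2 * rbf(x, xp, h)

def krr_fit(X, y, h, lam):
    # solve (K + n*lam*I) alpha = y once; the same alpha serves
    # both the function estimate and its derivative (plug-in property)
    n = len(X)
    K = rbf(X[:, None], X[None, :], h)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(x, X, alpha, h, deriv=False):
    # f_hat(x) = sum_i alpha_i k(x, X_i); differentiate k for f_hat'
    kern = rbf_dx if deriv else rbf
    return kern(x[:, None], X[None, :], h) @ alpha

# toy check on f(x) = sin(2*pi*x), whose derivative is 2*pi*cos(2*pi*x)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=200)
alpha = krr_fit(X, y, h=0.1, lam=1e-4)
xs = np.linspace(0.2, 0.8, 50)          # interior points, away from the boundary
f_hat = krr_predict(xs, X, alpha, h=0.1)
df_hat = krr_predict(xs, X, alpha, h=0.1, deriv=True)
```

Note that no separate smoothing step is needed for the derivative: the estimator of $f'$ is literally the derivative of the estimator of $f$, which is the generalized plug-in property the abstract refers to.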