Paper Title
Preferential Sampling for Bivariate Spatial Data
Paper Authors
Paper Abstract
Preferential sampling provides a formal modeling specification to capture the effect of bias in a set of sampling locations on inference when a geostatistical model is used to explain observed responses at the sampled locations. In particular, it enables spatial prediction to be adjusted for the bias. Its original presentation in the literature addressed assessment of the presence of such sampling bias, while follow-on work focused on regression specification to improve spatial interpolation under such bias. All of the work in the literature to date considers the case of a univariate response variable at each location, either continuous or modeled through a latent continuous variable. The contribution here is to extend the notion of preferential sampling to the case of a bivariate response at each location. This exposes sampling scenarios where both responses are observed at a given location as well as scenarios where, for some locations, only one of the responses is recorded. That is, there may be different sampling bias for one response than for the other. It leads to assessing the impact of such bias on co-kriging. It also exposes the possibility that preferential sampling can bias inference regarding dependence between responses at a location. We develop the idea of bivariate preferential sampling through various model specifications and illustrate the effect of these specifications on prediction and dependence behavior. We do this through simulation examples and with a forestry dataset that provides mean diameter at breast height (MDBH) and trees per hectare (TPH) as the point-referenced bivariate responses.
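To make the notion of sampling bias concrete, the following is a minimal illustrative sketch and not the model specification developed in the paper. It assumes a 1-D transect instead of a 2-D region, a moving-average construction for spatial smoothness, and a site-selection weight proportional to `exp(beta * y1)`. All names (`simulate_field`, `preferential_sample`, `beta`, `rho`, `window`) are hypothetical. The sketch only compares naive sample means with the means over all sites; it does not perform kriging or co-kriging.

```python
import math
import random


def simulate_field(n, rho, window, rng):
    """Simulate two correlated, spatially smooth responses at n sites.

    Assumption: a moving average of white noise stands in for a
    geostatistical (Gaussian process) model.
    """
    def smooth_noise():
        raw = [rng.gauss(0.0, 1.0) for _ in range(n + window - 1)]
        # Averaging neighbouring noise terms makes nearby sites similar;
        # dividing by sqrt(window) keeps the variance close to 1.
        return [sum(raw[i:i + window]) / math.sqrt(window) for i in range(n)]

    z1, z2 = smooth_noise(), smooth_noise()
    y1 = z1
    # y2 shares a fraction rho of its variation with y1.
    y2 = [rho * a + math.sqrt(1.0 - rho ** 2) * b for a, b in zip(z1, z2)]
    return y1, y2


def preferential_sample(y1, m, beta, rng):
    """Draw m site indices (with replacement), favouring sites where y1 is high."""
    weights = [math.exp(beta * v) for v in y1]
    return rng.choices(range(len(y1)), weights=weights, k=m)


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
y1, y2 = simulate_field(n=5000, rho=0.7, window=10, rng=rng)
sites = preferential_sample(y1, m=500, beta=1.0, rng=rng)

# Naive sample mean minus the mean over all sites, for each response.
bias_y1 = mean([y1[i] for i in sites]) - mean(y1)
bias_y2 = mean([y2[i] for i in sites]) - mean(y2)
print(f"bias in mean of y1: {bias_y1:.3f}")
print(f"bias in mean of y2: {bias_y2:.3f}")
```

Under these assumptions both differences should come out positive: selecting sites where the first response is high also inflates the naive mean of the second response, because the two are positively correlated. Setting `beta=0.0` corresponds to non-preferential sampling under this sketch.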