Title
How to choose between different Bayesian posterior indices for hypothesis testing in practice
Authors
Abstract
Hypothesis testing is an essential statistical method in psychology and the cognitive sciences. The problems of traditional null hypothesis significance testing (NHST) have been discussed widely, and among the proposed solutions to the replication problems caused by the inappropriate use of significance tests and $p$-values is a shift towards Bayesian data analysis. However, Bayesian hypothesis testing involves a variety of posterior indices for assessing significance and the size of an effect. This complicates Bayesian hypothesis testing in practice, because the availability of multiple Bayesian alternatives to the traditional $p$-value causes confusion about which one to select and why. In this paper, we compare various Bayesian posterior indices that have been proposed in the literature and discuss their benefits and limitations. Our comparison shows that, conceptually, not all proposed Bayesian alternatives to NHST and $p$-values are beneficial, and that the usefulness of some indices strongly depends on the study design and research goal. However, our comparison also reveals that at least two of the available Bayesian posterior indices have appealing theoretical properties and are, to the best of our knowledge, widely underused among psychologists.