Statistical Models for Repeated Categorical Ratings: The R Package rater

Paper title

Statistical Models for Repeated Categorical Ratings: The R Package rater

Authors

Pullin, Jeffrey M., Gurrin, Lyle C., Vukcevic, Damjan

Abstract

A common problem in many disciplines is the need to assign a set of items into categories or classes with known labels. This is often done by one or more expert raters, or sometimes by an automated process. If these assignments or 'ratings' are difficult to make accurately, a common tactic is to repeat them by different raters, or even by the same rater multiple times on different occasions. We present an R package `rater`, available on CRAN, that implements Bayesian versions of several statistical models for analysis of repeated categorical rating data. Inference is possible for the true underlying (latent) class of each item, as well as the accuracy of each rater. The models are extensions of, and include, the Dawid-Skene model, and we implemented them using the Stan probabilistic programming language. We illustrate the use of `rater` through a few examples. We also discuss in detail the techniques of marginalisation and conditioning, which are necessary for these models but also apply more generally to other models implemented in Stan.
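
As a rough illustration of the marginalisation and conditioning mentioned in the abstract, the sketch below works through a single item under a basic Dawid-Skene setup. It is not the `rater` package's code or its Stan programs. The function names, the `(rater, rating)` pair layout, and the toy parameter values are all assumptions made for this example, and every probability is assumed to be strictly positive. Marginalisation sums the latent class out of the likelihood; conditioning then recovers the class probabilities given the ratings.

```python
import math

def class_log_terms(ratings, pi, theta):
    """For each class k: log pi[k] + sum of log theta[j][k][y] over the item's ratings.

    ratings: list of (rater j, rating y) pairs for one item (hypothetical layout)
    pi:      class prevalences, pi[k] = P(true class = k)
    theta:   theta[j][k][y] = P(rater j gives rating y | true class = k)
    """
    terms = []
    for k in range(len(pi)):
        t = math.log(pi[k])
        for j, y in ratings:
            t += math.log(theta[j][k][y])
        terms.append(t)
    return terms

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def marginal_log_lik(ratings, pi, theta):
    """Marginalisation: sum the latent class out of the item's likelihood."""
    return log_sum_exp(class_log_terms(ratings, pi, theta))

def class_posterior(ratings, pi, theta):
    """Conditioning: P(true class = k | ratings) for fixed pi and theta."""
    terms = class_log_terms(ratings, pi, theta)
    norm = log_sum_exp(terms)
    return [math.exp(t - norm) for t in terms]

# Toy values (assumed, not taken from the paper): two classes, two raters
# who share the same confusion matrix.
pi = [0.5, 0.5]
confusion = [[0.9, 0.1], [0.2, 0.8]]
theta = [confusion, confusion]
ratings = [(0, 0), (1, 0)]  # both raters assign class 0 to the item

log_lik = marginal_log_lik(ratings, pi, theta)
posterior = class_posterior(ratings, pi, theta)
```

With these toy values the marginal likelihood is 0.5 × 0.9 × 0.9 + 0.5 × 0.2 × 0.2 = 0.425, and the posterior probability of class 0 is 0.405 / 0.425. In a sampler that cannot handle discrete parameters directly, the marginal log-likelihood is the quantity that would be accumulated over items, with the class probabilities recovered afterwards by conditioning.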
