Conditional probability tensor decompositions for multivariate categorical response regression

Title

Conditional probability tensor decompositions for multivariate categorical response regression

Authors

Aaron J. Molstad, Xin Zhang

Abstract

In many modern regression applications, the response consists of multiple categorical random variables whose probability mass is a function of a common set of predictors. In this article, we propose a new method for modeling such a probability mass function in settings where the number of response variables, the number of categories per response, and the dimension of the predictor are large. Our method relies on a functional probability tensor decomposition: a decomposition of a tensor-valued function such that its range is a restricted set of low-rank probability tensors. This decomposition is motivated by the connection between the conditional independence of responses, or lack thereof, and their probability tensor rank. We show that the model implied by such a low-rank functional probability tensor decomposition can be interpreted in terms of a mixture of regressions and can thus be fit using maximum likelihood. We derive an efficient and scalable penalized expectation maximization algorithm to fit this model and examine its statistical properties. We demonstrate the encouraging performance of our method through both simulation studies and an application to modeling the functional classes of genes.
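
The link between low tensor rank and a mixture of regressions can be illustrated with a small numerical sketch. The code below is not the authors' implementation. It assumes constant mixture weights `delta` and multinomial logistic (softmax) factors for each response within each mixture component; neither choice is stated in the abstract. All names (`conditional_probability_tensor`, `betas`, `delta`) are hypothetical. Within each component the responses are independent given the predictor, so the component contributes a rank-one tensor, and the weighted sum of R components has rank at most R.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def conditional_probability_tensor(x, delta, betas):
    """Joint pmf of M categorical responses given predictor x (illustrative).

    x      : (p,) predictor vector
    delta  : (R,) mixture weights, nonnegative and summing to 1
             (assumed constant in x for this sketch)
    betas  : list of M arrays; betas[m] has shape (R, p, c_m) and holds the
             assumed softmax coefficients for response m in each component
    returns: array of shape (c_1, ..., c_M) whose entries sum to 1
    """
    shape = tuple(b.shape[2] for b in betas)
    tensor = np.zeros(shape)
    for r in range(delta.shape[0]):
        # Outer product of per-response probability vectors: a rank-one
        # tensor encoding independence of the responses within component r.
        component = np.array(1.0)
        for b in betas:
            theta = softmax(x @ b[r])  # (c_m,) category probabilities
            component = np.multiply.outer(component, theta)
        tensor += delta[r] * component
    return tensor


# Toy usage: three responses with 3, 4, and 2 categories, rank R = 2.
rng = np.random.default_rng(0)
p, R = 5, 2
categories = (3, 4, 2)
x = rng.normal(size=p)
delta = np.array([0.6, 0.4])
betas = [rng.normal(size=(R, p, c)) for c in categories]
P = conditional_probability_tensor(x, delta, betas)
print(P.shape, P.sum())
```

With R = 1 the responses are conditionally independent given the predictor. Larger R allows dependence among responses while the parameter count grows with the sum, rather than the product, of the category counts.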
