Multi-task Learning for Gaussian Graphical Regressions with High Dimensional Covariates

Paper Title

Multi-task Learning for Gaussian Graphical Regressions with High Dimensional Covariates

Authors

Zhang, Jingfei; Li, Yi

Abstract

Gaussian graphical regression is a powerful tool that regresses the precision matrix of a Gaussian graphical model on covariates, allowing both the number of response variables and the number of covariates to far exceed the sample size. Model fitting is typically carried out via separate node-wise lasso regressions, which ignore the network-induced structure among these regressions. Consequently, the error rate is high, especially when the number of nodes is large. We propose a multi-task learning estimator for fitting Gaussian graphical regression models. We design a cross-task group sparsity penalty and a within-task element-wise sparsity penalty, which govern the sparsity of active covariates and of their effects on the graph, respectively. For computation, we consider an efficient augmented Lagrangian algorithm, which solves its subproblems with a semi-smooth Newton method. For theory, we show that the error rate of the multi-task learning based estimates improves substantially on that of the separate node-wise lasso estimates, because the cross-task penalty borrows information across tasks. To address the main challenge that the tasks are entangled in a complicated correlation structure, we establish a new tail probability bound for correlated heavy-tailed (sub-exponential) variables with an arbitrary correlation structure, a useful theoretical result in its own right. Finally, the utility of our method is demonstrated through simulations and an application to a gene co-expression network study of brain cancer patients.
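
The abstract names two penalties (a cross-task group penalty and a within-task element-wise penalty) but does not give their exact form. As a rough illustration only, the sketch below assumes a sparse-group-lasso style combination, with an L1 term on every coefficient plus an L2 norm on each group, and shows the penalty value together with its proximal operator. The function names, the parameters `lam_elem` and `lam_group`, and the row-per-group layout are hypothetical choices made for this example. This is not the authors' estimator or their augmented Lagrangian solver.

```python
import numpy as np

def soft_threshold(v, lam):
    """Element-wise soft-thresholding: proximal operator of lam * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_group_penalty(B, lam_elem, lam_group):
    """Assumed penalty: lam_elem * sum |B_ij| + lam_group * sum_i ||B_i||_2.

    B has shape (n_groups, group_size); each row is one hypothetical group,
    e.g. all coefficients tied to one covariate across tasks.
    """
    return lam_elem * np.abs(B).sum() + lam_group * np.linalg.norm(B, axis=1).sum()

def prox_sparse_group(B, lam_elem, lam_group):
    """Proximal operator of the assumed penalty, applied row by row.

    For this penalty the prox is element-wise soft-thresholding followed by
    group-wise (block) soft-thresholding.
    """
    S = soft_threshold(B, lam_elem)
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    # Rows whose norm is at most lam_group are set to zero as a whole;
    # the small floor only guards against division by zero.
    scale = np.maximum(1.0 - lam_group / np.maximum(norms, 1e-12), 0.0)
    return scale * S

# Toy input: the first row is a strong group, the second a weak one.
B = np.array([[3.0, -4.0, 0.5],
              [0.2, -0.1, 0.3]])
P = prox_sparse_group(B, lam_elem=0.5, lam_group=1.0)
print(P)
```

In this toy input the weak second row is zeroed out as a whole (group sparsity), while the first row survives with its smallest entry set to zero (element-wise sparsity), which mirrors the two levels of sparsity described in the abstract.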
