Causal Clustering for 1-Factor Measurement Models on Data with Various Types

Paper Title

Causal Clustering for 1-Factor Measurement Models on Data with Various Types

Authors

Wang, Shuyan

Abstract

The tetrad constraint is a condition whose satisfaction signals a rank reduction of a covariance submatrix; it is used to design causal discovery algorithms, such as FOFC, that detect the existence of latent (unmeasured) variables. Initially, such algorithms worked only for cases where the measured and latent variables are all Gaussian and linearly related (Gaussian-Gaussian case). It has been shown that a unidimensional latent variable model implies tetrad constraints when the measured and latent variables are all binary (Binary-Binary case). This paper proves that the tetrad constraint can also be entailed when the measured variables are of mixed data types, and when the measured variables are discrete and the latent common causes are continuous, which implies that any clustering algorithm relying on this constraint can be applied in those cases. Each case is illustrated with an example and a proof. The performance of FOFC on mixed data is evaluated in simulation studies and compared with that of algorithms with similar functions.
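
To make the constraint concrete, here is a minimal sketch of the linear Gaussian-Gaussian case only, not of the paper's proofs for mixed or discrete data. It assumes a single latent variable with four measured indicators, builds the population covariance matrix the model implies, and checks that the three tetrad differences vanish and that the corresponding 2x2 covariance submatrix has rank 1. The function names, loadings, and noise variances are hypothetical values chosen for illustration.

```python
import numpy as np


def one_factor_covariance(loadings, noise_variances, latent_variance=1.0):
    """Population covariance of X_i = a_i * L + e_i with independent noise.

    Assumption: one latent L with variance `latent_variance`; off-diagonal
    entries are a_i * a_j * Var(L), and noise only affects the diagonal.
    """
    loadings = np.asarray(loadings, dtype=float)
    cov = latent_variance * np.outer(loadings, loadings)
    cov[np.diag_indices_from(cov)] += np.asarray(noise_variances, dtype=float)
    return cov


def tetrad_differences(cov, i, j, k, l):
    """Return the three tetrad differences among variables i, j, k, l."""
    return (
        cov[i, j] * cov[k, l] - cov[i, k] * cov[j, l],
        cov[i, j] * cov[k, l] - cov[i, l] * cov[j, k],
        cov[i, k] * cov[j, l] - cov[i, l] * cov[j, k],
    )


# Hypothetical loadings and noise variances for four indicators.
cov = one_factor_covariance([0.9, 0.7, 1.2, 0.5], [1.0, 0.5, 0.8, 1.5])
tetrads = tetrad_differences(cov, 0, 1, 2, 3)

# The third tetrad difference is the determinant of this submatrix
# (rows {0, 1}, columns {2, 3}), so a vanishing tetrad means rank < 2.
sub = cov[np.ix_([0, 1], [2, 3])]

print("tetrad differences:", tetrads)
print("rank of submatrix:", np.linalg.matrix_rank(sub))
```

With sample data the differences would only be approximately zero, so an algorithm that relies on the constraint needs a statistical test rather than an exact comparison.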
