论文标题
分布式矩阵乘法的稀疏随机Khatri-Rao产品代码
Sparse Random Khatri-Rao Product Codes for Distributed Matrix Multiplication
论文作者
论文摘要
我们将两个概括介绍给使用随机Khatri-Rao产品(RKRP)代码进行分布式矩阵乘法的范式。我们首先引入了一类称为稀疏随机Khatri-Rao产品(SRKRP)代码的代码,该代码具有稀疏发生器矩阵。当输入矩阵稀疏时,SRKRP代码比RKRP代码较低,计算和通信成本较低,而它们的数值稳定性与其他最先进的方案相似。我们从经验上研究了随机选择的SRKRP代码的生成器矩阵(仅限于非stragglers的集合)的概率之间的关系是等级缺陷与编码方案的各种参数,包括生成器矩阵的稀疏程度和非stragglers的数量。其次,我们表明,如果主节点除了工人执行的计算外,可以执行少量的矩阵产品计算,则可以显着提高故障概率。
We introduce two generalizations to the paradigm of using Random Khatri-Rao Product (RKRP) codes for distributed matrix multiplication. We first introduce a class of codes called Sparse Random Khatri-Rao Product (SRKRP) codes which have sparse generator matrices. SRKRP codes result in lower encoding, computation and communication costs than RKRP codes when the input matrices are sparse, while they exhibit similar numerical stability to other state of the art schemes. We empirically study the relationship between the probability of the generator matrix (restricted to the set of non-stragglers) of a randomly chosen SRKRP code being rank deficient and various parameters of the coding scheme including the degree of sparsity of the generator matrix and the number of non-stragglers. Secondly, we show that if the master node can perform a very small number of matrix product computations in addition to the computations performed by the workers, the failure probability can be substantially improved.