Paper Title

Alternating Direction Method of Multipliers for Quantization

Paper Authors

Tianjian Huang, Prajwal Singhania, Maziar Sanjabi, Pabitra Mitra, Meisam Razaviyayn

Paper Abstract

Quantization of the parameters of machine learning models, such as deep neural networks, requires solving constrained optimization problems, where the constraint set is formed by the Cartesian product of many simple discrete sets. For such optimization problems, we study the performance of the Alternating Direction Method of Multipliers for Quantization ($\texttt{ADMM-Q}$) algorithm, which is a variant of the widely used ADMM method applied to our discrete optimization problem. We establish the convergence of the iterates of $\texttt{ADMM-Q}$ to certain $\textit{stationary points}$. To the best of our knowledge, this is the first analysis of an ADMM-type method for problems with discrete variables/constraints. Based on our theoretical insights, we develop a few variants of $\texttt{ADMM-Q}$ that can handle inexact update rules and achieve improved performance via "soft projection" and injecting randomness into the algorithm. We empirically evaluate the efficacy of our proposed approaches.
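To make the iteration concrete, below is a minimal sketch of an ADMM-Q-style loop on a toy quadratic objective with a per-coordinate $\{-1, +1\}$ constraint set. This is an illustration under stated assumptions, not the paper's exact $\texttt{ADMM-Q}$ updates: the objective, penalty parameter `rho`, closed-form x-update, and the helper `project_discrete` are all illustrative choices.

```python
# Sketch of an ADMM-Q-style iteration (assumed form, not the authors' exact method).
# Toy problem: minimize f(x) = 0.5 * ||A x - b||^2
# subject to each coordinate of x lying in the discrete set {-1, +1}.
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((20, n))
b = rng.standard_normal(20)

def project_discrete(v):
    """Elementwise projection onto {-1, +1} (nearest point per coordinate)."""
    return np.where(v >= 0.0, 1.0, -1.0)

rho = 1.0                      # ADMM penalty parameter (illustrative value)
x = rng.standard_normal(n)     # continuous iterate
q = project_discrete(x)        # quantized iterate, q in the constraint set C
lam = np.zeros(n)              # dual variable

# x-update: argmin_x f(x) + <lam, x - q> + (rho/2)||x - q||^2 has the
# closed form (A^T A + rho I) x = A^T b + rho q - lam for this quadratic f.
H = A.T @ A + rho * np.eye(n)

for k in range(100):
    x = np.linalg.solve(H, A.T @ b + rho * q - lam)   # exact x-update
    q = project_discrete(x + lam / rho)               # q-update: projection onto C
    lam = lam + rho * (x - q)                         # dual ascent step

print("quantized solution q:", q)
print("objective at q:", 0.5 * np.linalg.norm(A @ q - b) ** 2)
```

Because the constraint set is a Cartesian product of simple discrete sets, the q-update decomposes into cheap elementwise projections; the "soft projection" and randomized variants described in the abstract would modify this projection step, and the inexact variants would replace the exact x-update with, e.g., a gradient step.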
