NonConvex本地和凸全局约束的在线决策

论文标题

NonConvex本地和凸全局约束的在线决策

Online Decision Making with Nonconvex Local and Convex Global Constraints

论文作者

Chen, Rui, Gunluk, Oktay, Lodi, Andrea, Wang, Guanyi

论文摘要

我们将在线决策问题（ODMP）研究为在线线性编程的自然概括。在ODMP中，单个决策者在$ t $时间步长上执行了一系列决策。在每个时间步骤中，决策者根据到达这一点的可用信息做出了本地可行的决定。目的是最大化累积的奖励，同时满足一些称为目标约束的凸全球约束。在每个步骤中做出的决定都会产生$ M $维的矢量，代表了该本地决策对目标限制的贡献。在在线环境中，这些目标限制是可以适度侵犯的软限制。为了处理ODMP中潜在的非概念性和非线性，我们提出了一种基于双重双重的在线算法。在每个时间步骤中，该算法需要在本地可行集合上解决潜在的非凸优化问题，并在目标集上解决凸优化问题。在某些随机输入模型下，我们表明该算法可以确定地违反目标约束$ o（\ sqrt {mt}）$，并且$ \ tilde {o}（\ sqrt {mt}）$ recarter在预期的奖励中遗憾。进行了关于在线背包问题和分类优化问题的数值实验，以证明我们提出的在线算法的潜力。

We study the online decision making problem (ODMP) as a natural generalization of online linear programming. In ODMP, a single decision maker undertakes a sequence of decisions over $T$ time steps. At each time step, the decision maker makes a locally feasible decision based on information available up to that point. The objective is to maximize the accumulated reward while satisfying some convex global constraints called goal constraints. The decision made at each step results in an $m$-dimensional vector that represents the contribution of this local decision to the goal constraints. In the online setting, these goal constraints are soft constraints that can be violated moderately. To handle potential nonconvexity and nonlinearity in ODMP, we propose a Fenchel dual-based online algorithm. At each time step, the algorithm requires solving a potentially nonconvex optimization problem over the local feasible set and a convex optimization problem over the goal set. Under certain stochastic input models, we show that the algorithm achieves $O(\sqrt{mT})$ goal constraint violation deterministically, and $\tilde{O}(\sqrt{mT})$ regret in expected reward. Numerical experiments on an online knapsack problem and an assortment optimization problem are conducted to demonstrate the potential of our proposed online algorithm.

下载PDF全文

下载文献需遵守相关版权规定

论文标题