Robustness of Stochastic Optimal Control to Approximate Diffusion Models under Several Cost Evaluation Criteria

Paper Title

Robustness of Stochastic Optimal Control to Approximate Diffusion Models under Several Cost Evaluation Criteria

Authors

Somnath Pradhan, Serdar Yuksel

Abstract

In control theory, typically a nominal model is assumed, based on which an optimal control is designed and then applied to an actual (true) system. This gives rise to the problem of performance loss due to the mismatch between the true model and the assumed model. A robustness problem in this context is to show that the error due to the mismatch between a true model and an assumed model decreases to zero as the assumed model approaches the true model. We study this problem when the state dynamics of the system are governed by controlled diffusion processes. In particular, we will discuss continuity and robustness properties of finite-horizon and infinite-horizon $\alpha$-discounted/ergodic optimal control problems for a general class of non-degenerate controlled diffusion processes, as well as for optimal control up to an exit time. Under a general set of assumptions and a convergence criterion on the models, we first establish that the optimal value of the approximate model converges to the optimal value of the true model. We then establish that the error due to mismatch that occurs by application of a control policy, designed for an incorrectly estimated model, to a true model decreases to zero as the incorrect model approaches the true model. We will see that, compared to related results in the discrete-time setup, the continuous-time theory will let us utilize the strong regularity properties of solutions to optimality (HJB) equations, via the theory of uniformly elliptic PDEs, to arrive at strong continuity and robustness properties.
