Paper Title

Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter

Authors

Villarreal, Ruben, Vlassis, Nikolaos N., Phan, Nhon N., Catanach, Tommie A., Jones, Reese E., Trask, Nathaniel A., Kramer, Sharlotte L. B., Sun, WaiChing

Abstract

Experimental data is costly to obtain, which makes it difficult to calibrate complex models. For many models an experimental design that produces the best calibration given a limited experimental budget is not obvious. This paper introduces a deep reinforcement learning (RL) algorithm for design of experiments that maximizes the information gain measured by Kullback-Leibler (KL) divergence obtained via the Kalman filter (KF). This combination enables experimental design for rapid online experiments where traditional methods are too costly. We formulate possible configurations of experiments as a decision tree and a Markov decision process (MDP), where a finite choice of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash-Sutcliffe efficiency (NSE) index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the values of new data acquired through additional actions. In this work our applications focus on mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.
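To make the reward described above concrete, the sketch below shows one belief update and the resulting information-gain reward: a Kalman update of the parameter belief after a new measurement, followed by the KL divergence between posterior and prior Gaussians. This is a minimal illustration assuming a standard linear-Gaussian observation model, not the paper's enhanced KF for history-dependent models; the function names (kalman_update, kl_gaussian) and the observation matrix H, noise covariance R, and measurement y are hypothetical.

```python
import numpy as np

def kalman_update(mu, Sigma, H, R, y):
    """One linear-Gaussian Kalman update of the parameter belief.

    mu, Sigma -- prior mean and covariance over the model parameters
    H         -- observation matrix mapping parameters to measurements
    R         -- measurement-noise covariance
    y         -- measurement obtained after the latest experimental action
    """
    S = H @ Sigma @ H.T + R                         # innovation covariance
    K = Sigma @ H.T @ np.linalg.inv(S)              # Kalman gain
    mu_post = mu + K @ (y - H @ mu)                 # posterior mean
    Sigma_post = (np.eye(len(mu)) - K @ H) @ Sigma  # posterior covariance
    return mu_post, Sigma_post

def kl_gaussian(mu0, S0, mu1, S1):
    """D_KL( N(mu0, S0) || N(mu1, S1) ) between two Gaussians."""
    d = len(mu0)
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0) + diff @ S1_inv @ diff - d
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

# Toy usage: belief over two material parameters, one scalar measurement.
mu, Sigma = np.zeros(2), np.eye(2)
H = np.array([[1.0, 0.5]])   # hypothetical measurement sensitivity
R = np.array([[0.1]])        # hypothetical measurement noise
y = np.array([0.8])

mu_post, Sigma_post = kalman_update(mu, Sigma, H, R, y)
reward = kl_gaussian(mu_post, Sigma_post, mu, Sigma)  # information gain as RL reward
```

In the MDP formulation, a reward of this form would be collected each time the agent takes an action (e.g., a new load increment in a mechanical test) and the resulting measurement triggers a Bayesian update of the parameter belief.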
