Paper Title
Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems
Paper Authors
Paper Abstract
Autonomous agents acting in the real world often operate based on models that ignore certain aspects of the environment. The incompleteness of any given model, whether handcrafted or machine-acquired, is inevitable due to the practical limitations of any modeling technique for complex real-world settings. Due to the limited fidelity of its model, an agent's actions may have unexpected, undesirable consequences during execution. Learning to recognize and avoid such negative side effects of an agent's actions is critical to improving the safety and reliability of autonomous systems. Mitigating negative side effects is an emerging research topic that is attracting increased attention due to the rapid growth in the deployment of AI systems and their broad societal impacts. This article provides a comprehensive overview of different forms of negative side effects and the recent research efforts to address them. We identify key characteristics of negative side effects, highlight the challenges in avoiding them, and discuss recently developed approaches, contrasting their benefits and limitations. The article concludes with a discussion of open questions and suggestions for future research directions.