Paper Title

AI Research Considerations for Human Existential Safety (ARCHES)

Authors

Andrew Critch and David Krueger

Abstract

Framed in positive terms, this report examines how technical AI research might be steered in a manner that is more attentive to humanity's long-term prospects for survival as a species. In negative terms, we ask what existential risks humanity might face from AI development in the next century, and by what principles contemporary technical research might be directed to address those risks. A key property of hypothetical AI technologies is introduced, called "prepotence", which is useful for delineating a variety of potential existential risks from artificial intelligence, even as AI paradigms might shift. A set of contemporary research directions are then examined for their potential benefit to existential safety. Each research direction is explained with a scenario-driven motivation, and examples of existing work from which to build. The research directions present their own risks and benefits to society that could occur at various scales of impact, and in particular are not guaranteed to benefit existential safety if major developments in them are deployed without adequate forethought and oversight. As such, each direction is accompanied by a consideration of potentially negative side effects.
