Paper Title

Mitigating Covertly Unsafe Text within Natural Language Systems

Authors

Alex Mei, Anisha Kabir, Sharon Levy, Melanie Subbiah, Emily Allaway, John Judge, Desmond Patton, Bruce Bimber, Kathleen McKeown, William Yang Wang

Abstract

An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences. However, the degree of explicitness of a generated statement that can cause physical harm varies. In this paper, we distinguish types of text that can lead to physical harm and establish one particularly underexplored category: covertly unsafe text. Then, we further break down this category with respect to the system's information and discuss solutions to mitigate the generation of text in each of these subcategories. Ultimately, our work defines the problem of covertly unsafe language that causes physical harm and argues that this subtle yet dangerous issue needs to be prioritized by stakeholders and regulators. We highlight mitigation strategies to inspire future researchers to tackle this challenging problem and help improve safety within smart systems.
