Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making

Paper title

Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making

Authors

Kori Inkpen, Shreya Chappidi, Keri Mallari, Besmira Nushi, Divya Ramesh, Pietro Michelucci, Vani Mandava, Libuše Hannah Vepřek, Gabrielle Quinn

Abstract

Human-AI collaboration for decision-making strives to achieve team performance that exceeds the performance of humans or AI alone. However, many factors can affect the success of Human-AI teams, including a user's domain expertise, mental models of an AI system, trust in recommendations, and more. This work examines users' interaction with three simulated algorithmic models, all with similar accuracy but different tuning on their true positive and true negative rates. Our study examined user performance in a non-trivial blood vessel labeling task where participants indicated whether a given blood vessel was flowing or stalled. Our results show that while recommendations from an AI assistant can aid user decision making, factors such as users' baseline performance relative to the AI and complementary tuning of AI error types significantly impact overall team performance. Novice users improved, but not to the accuracy level of the AI. Highly proficient users were generally able to discern when they should follow the AI recommendation and typically maintained or improved their performance. Mid-performers, who had a similar level of accuracy to the AI, were most variable in terms of whether the AI recommendations helped or hurt their performance. In addition, we found that users' perception of the AI's performance relative to their own also had a significant impact on whether their accuracy improved when given AI recommendations. This work provides insights on the complexity of factors related to Human-AI collaboration and provides recommendations on how to develop human-centered AI algorithms to complement users in decision-making tasks.
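
To illustrate how models can share similar accuracy while being tuned differently on their true positive and true negative rates, here is a minimal sketch. The three rate pairs, the model names, and the 50% share of positive cases are illustrative assumptions, not values reported in the paper. It relies only on the standard identity that accuracy is the prevalence-weighted average of the two rates.

```python
# Illustrative only: the rates and the 0.5 positive share below are
# hypothetical, not the settings used in the study.

def accuracy(tpr: float, tnr: float, positive_rate: float) -> float:
    """Overall accuracy as the prevalence-weighted mean of TPR and TNR."""
    return positive_rate * tpr + (1.0 - positive_rate) * tnr

# Hypothetical models: (true positive rate, true negative rate)
models = {
    "balanced": (0.80, 0.80),
    "high_tpr": (0.90, 0.70),
    "high_tnr": (0.70, 0.90),
}

for name, (tpr, tnr) in models.items():
    acc = accuracy(tpr, tnr, positive_rate=0.5)
    print(f"{name}: TPR={tpr:.2f} TNR={tnr:.2f} accuracy={acc:.2f}")
```

With a balanced mix of positive and negative cases, all three hypothetical models reach the same accuracy, yet they make different kinds of errors. That difference in error type is what the abstract refers to as tuning.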
