Paper Title
Deepfake Text Detection: Limitations and Opportunities
Authors
Abstract
Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.