Paper Title

Differentially Private Decoding in Large Language Models

Paper Authors

Jimit Majmudar, Christophe Dupuy, Charith Peris, Sami Smaili, Rahul Gupta, Richard Zemel

Paper Abstract

Recent large-scale natural language processing (NLP) systems use a pre-trained Large Language Model (LLM) on massive and diverse corpora as a headstart. In practice, the pre-trained model is adapted to a wide array of tasks via fine-tuning on task-specific datasets. LLMs, while effective, have been shown to memorize instances of training data thereby potentially revealing private information processed during pre-training. The potential leakage might further propagate to the downstream tasks for which LLMs are fine-tuned. On the other hand, privacy-preserving algorithms usually involve retraining from scratch, which is prohibitively expensive for LLMs. In this work, we propose a simple, easy to interpret, and computationally lightweight perturbation mechanism to be applied to an already trained model at the decoding stage. Our perturbation mechanism is model-agnostic and can be used in conjunction with any LLM. We provide theoretical analysis showing that the proposed mechanism is differentially private, and experimental results showing a privacy-utility trade-off.
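
To make the idea of a decoding-stage perturbation concrete, below is a minimal, illustrative sketch: the model's next-token distribution is mixed with the uniform distribution over the vocabulary before sampling. This is a sketch under assumptions, not the paper's reference implementation; the mixing weight lam, the function name perturbed_sampling, and the toy logits are hypothetical choices made for illustration only.

import numpy as np

def perturbed_sampling(logits: np.ndarray, lam: float, rng: np.random.Generator) -> int:
    """Sample one token id from a perturbed next-token distribution.

    logits: unnormalized next-token scores from any already trained LLM.
    lam:    weight on the model distribution (hypothetical parameter for this sketch);
            lam = 1.0 recovers ordinary sampling, lam = 0.0 samples uniformly at random.
    """
    # Stable softmax: turn logits into the model's next-token distribution.
    shifted = logits - logits.max()
    probs = np.exp(shifted) / np.exp(shifted).sum()

    # Perturbation: linear interpolation with the uniform distribution over the vocabulary.
    vocab_size = probs.shape[0]
    uniform = np.full(vocab_size, 1.0 / vocab_size)
    perturbed = lam * probs + (1.0 - lam) * uniform

    # Sample the next token from the perturbed distribution.
    return int(rng.choice(vocab_size, p=perturbed))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_logits = np.array([2.0, 1.0, 0.5, -1.0])  # stand-in for real LLM logits
    token_id = perturbed_sampling(toy_logits, lam=0.8, rng=rng)
    print("sampled token id:", token_id)

Because the perturbation only touches the output distribution at each decoding step, a sketch like this is model-agnostic and requires no retraining; moving lam toward 0 pushes the sampling distribution toward uniform (more privacy, less utility), which mirrors the privacy-utility trade-off discussed in the abstract.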
