Paper title
Spectral Probing
Paper authors
Paper abstract
Linguistic information is encoded at varying timescales (subwords, phrases, etc.) and communicative levels, such as syntax and semantics. Contextualized embeddings have analogously been found to capture these phenomena at distinctive layers and frequencies. Leveraging these findings, we develop a fully learnable frequency filter to identify spectral profiles for any given task. It enables far more granular analyses than prior handcrafted filters and is more efficient. After demonstrating the informativeness of spectral probing over manual filters in a monolingual setting, we investigate its multilingual characteristics across seven diverse NLP tasks in six languages. Our analyses identify distinctive spectral profiles that quantify cross-task similarity in a linguistically intuitive manner while remaining consistent across languages, highlighting their potential as robust, lightweight task descriptors.
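To make the idea of a learnable frequency filter concrete, the following is a minimal sketch of the forward pass only, not the authors' implementation. It assumes that the filter acts along the token axis of a sequence of contextualized embeddings, that the transform is an orthonormal DCT-II (the abstract does not name the transform), and that each frequency gets one gain in (0, 1) produced by a sigmoid over a trainable logit. The names `dct_matrix`, `spectral_filter`, and `filter_logits` are hypothetical.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix of shape (n, n); its transpose is its inverse."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    t = np.arange(n)[None, :]  # token position (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # rescale the constant row for orthonormality
    return m


def spectral_filter(embeddings: np.ndarray, filter_logits: np.ndarray) -> np.ndarray:
    """Scale each frequency of a token sequence by a gain in (0, 1).

    embeddings:    (seq_len, dim) array, one row per token.
    filter_logits: (seq_len,) array, one trainable logit per frequency
                   (assumption: sequence length equals the number of frequencies).
    """
    n = embeddings.shape[0]
    d = dct_matrix(n)
    gains = 1.0 / (1.0 + np.exp(-filter_logits))  # sigmoid keeps gains in (0, 1)
    spectrum = d @ embeddings                     # to the frequency domain
    filtered = gains[:, None] * spectrum          # per-frequency scaling
    return d.T @ filtered                         # back to the token domain


# Usage: a random 8-token sequence with 4-dimensional embeddings.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
logits = np.zeros(8)  # untrained filter: every gain is 0.5
smoothed = spectral_filter(tokens, logits)
```

In a probing setup, the logits would be trained jointly with a task classifier on top of the filtered embeddings, and the learned gains would then be read off as that task's spectral profile. The training loop is omitted here because the abstract gives no details about it.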