Paper Title

Enhancing Audio Perception of Music By AI Picked Room Acoustics

Authors

Prateek Verma, Jonathan Berger

Abstract

Every sound that we hear is the result of successive convolutional operations (e.g., room acoustics, microphone characteristics, resonant properties of the instrument itself, not to mention characteristics and limitations of the sound reproduction system). In this work we seek to determine, using AI, the best room in which to perform a particular piece. Additionally, we use room acoustics as a way to enhance the perceptual qualities of a given sound. Historically, rooms (particularly churches and concert halls) were designed to host and serve specific musical functions. In some cases the architectural acoustical qualities enhanced the music performed there. We try to mimic this, as a first step, by designating room impulse responses that would correlate to producing enhanced sound quality for particular music. A convolutional architecture is first trained to take in an audio sample and mimic the ratings of experts, with about 78% accuracy, for various instrument families and notes for perceptual qualities. This gives us a scoring function for any audio sample, which can rate the perceptual pleasantness of a note automatically. Now, via a library of about 60,000 synthetic impulse responses mimicking all kinds of rooms, materials, etc., we use a simple convolution operation to transform the sound as if it were played in a particular room. The perceptual evaluator is used to rank the musical sounds and yield the "best room or the concert hall" in which to play a sound. As a byproduct, it can also use room acoustics to turn a poor-quality sound into a "good" sound.
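The room-selection step described in the abstract (convolve a sound with each impulse response, score the result, keep the top-ranked room) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `apply_room`, `pick_best_room`, and `toy_score` are hypothetical, the two impulse responses are made-up toy arrays, and `toy_score` is only a stand-in for the trained convolutional evaluator, which the abstract does not specify in detail.

```python
import numpy as np


def apply_room(dry, rir):
    """Convolve a dry signal with a room impulse response, then peak-normalize."""
    wet = np.convolve(dry, rir)  # full convolution: len(dry) + len(rir) - 1 samples
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


def pick_best_room(dry, rirs, score_fn):
    """Score the sound in every room; return the top-scoring room and all scores."""
    scores = {name: float(score_fn(apply_room(dry, rir))) for name, rir in rirs.items()}
    best = max(scores, key=scores.get)
    return best, scores


def toy_score(signal):
    """Placeholder scorer (an assumption, NOT the paper's learned evaluator):
    fraction of signal energy that arrives after the first sample."""
    energy = signal ** 2
    total = energy.sum()
    return float(energy[1:].sum() / total) if total > 0 else 0.0


# Toy data: a unit impulse as the "note" and two invented impulse responses.
dry = np.array([1.0, 0.0, 0.0, 0.0])
rirs = {
    "dead": np.array([1.0]),             # no reflections at all
    "live": np.array([1.0, 0.5, 0.25]),  # a short decaying tail
}

best, scores = pick_best_room(dry, rirs, toy_score)
print(best, scores)
```

In a realistic setting, `score_fn` would be the trained perceptual model and `rirs` would hold the full library of synthetic impulse responses. For long signals, an FFT-based convolution would likely be preferable to direct convolution.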
