Deep Learning for Black-Box Modeling of Audio Effects

Cited by: 25
Authors
Ramirez, Marco A. Martinez [1]
Benetos, Emmanouil [1]
Reiss, Joshua D. [1]
Affiliations
[1] Queen Mary Univ London, Ctr Digital Mus, Mile End Rd, London E1 4NS, England
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 02
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
black-box modeling; nonlinear; time-varying; audio effects; deep learning; tube amplifier; transistor-based limiter; Leslie speaker;
DOI
10.3390/app10020638
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Virtual analog modeling of audio effects consists of emulating the sound of an audio processor reference device. This digital simulation is normally done by designing mathematical models of these systems. It is often difficult because it seeks to accurately model all components within the effect unit, which usually contains various nonlinearities and time-varying components. Most existing methods for audio effects modeling are either simplified or optimized to a very specific circuit or type of audio effect and cannot be efficiently translated to other types of audio effects. Recently, deep neural networks have been explored as black-box modeling strategies to solve this task, i.e., by using only input-output measurements. We analyse different state-of-the-art deep learning models based on convolutional and recurrent neural networks, feedforward WaveNet architectures and we also introduce a new model based on the combination of the aforementioned models. Through objective perceptual-based metrics and subjective listening tests we explore the performance of these models when modeling various analog audio effects. Thus, we show virtual analog models of nonlinear effects, such as a tube preamplifier; nonlinear effects with memory, such as a transistor-based limiter and nonlinear time-varying effects, such as the rotating horn and rotating woofer of a Leslie speaker cabinet.
Pages: 25
Related papers
61 in total
[51]  
Reiss J. D., 2014, AUDIO EFFECTS THEORY
[52]   U-Net: Convolutional Networks for Biomedical Image Segmentation [J].
Ronneberger, Olaf ;
Fischer, Philipp ;
Brox, Thomas .
MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, PT III, 2015, 9351 :234-241
[53]   Integrated Substrate Gap Waveguide for 5G Microwave and Millimeter-Wave Components [J].
Shen, Dongya ;
Wang, Ke ;
Zhang, Xiupu ;
Chen, Jianpei ;
Lin, Liangjie ;
You, Dandan ;
Ruan, Zhidong .
2019 INTERNATIONAL CONFERENCE ON MICROWAVE AND MILLIMETER WAVE TECHNOLOGY (ICMMT 2019), 2019,
[54]  
Smith J., 2002, P 5 INT C DIG AUD EF
[55]  
Smith J.O, 2010, Physical Audio Signal Processing: For Virtual Musical Instruments and Audio Effects
[56]   Modulation-scale analysis for content identification [J].
Sukittanon, S ;
Atlas, LE ;
Pitton, JW .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2004, 52 (10) :3023-3035
[57]  
v.d. Oord A., 2016, WAVENET GENERATIVE M
[58]   Numerical methods for simulation of guitar distortion circuits [J].
Yeh, David T. ;
Abel, Jonathan S. ;
Vladimirescu, Andrei ;
Smith, Julius O. .
COMPUTER MUSIC JOURNAL, 2008, 32 (02) :23-42
[59]   Automated Physical Modeling of Nonlinear Audio Circuits for Real-Time Audio Effects-Part II: BJT and Vacuum Tube Examples [J].
Yeh, David T. .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (04) :1207-1216
[60]   Automated Physical Modeling of Nonlinear Audio Circuits For Real-Time Audio Effects-Part I: Theoretical Development [J].
Yeh, David T. ;
Abel, Jonathan S. ;
Smith, Julius O., III .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (04) :728-737