Reverse engineering of a recording mix with differentiable digital signal processing

Cited by: 10
Authors
Colonel, Joseph T. [1]
Reiss, Joshua [1]
Affiliations
[1] Queen Mary Univ London, Ctr Digital Mus, London, England
Keywords
TIMBRE; DIMENSIONS; SEPARATION
DOI
10.1121/10.0005622
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
A method to retrieve the parameters used to create a multitrack mix using only raw tracks and the stereo mixdown is presented. This method is able to model linear time-invariant effects such as gain, pan, equalisation, delay, and reverb. Nonlinear effects, such as distortion and compression, are not considered in this work. The optimization procedure used is stochastic gradient descent, aided by differentiable digital signal processing modules. This method allows for a fully interpretable representation of the mixing signal chain by explicitly modelling the audio effects rather than using differentiable black-box modules. Two reverb module architectures are proposed, a "stereo reverb" model and an "individual reverb" model, and each is discussed. Objective feature measures are taken of the outputs of the two architectures when tasked with estimating a target mix and compared against a stereo gain mix baseline. A listening study is performed to measure how closely the two architectures can perceptually match a reference mix when compared to a stereo gain mix. Results show that the stereo reverb model performs best on objective measures and that there is no statistically significant difference between the participants' perception of the stereo reverb model and reference mixes.
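To make the idea in the abstract concrete, the sketch below recovers only the simplest mix parameters, one left and one right gain per track (roughly the "stereo gain mix" baseline), by minibatch stochastic gradient descent on the squared error between the estimated and target mixdowns. It is an illustration under stated assumptions, not the authors' implementation: the function names `mix_down` and `estimate_stereo_gains`, the step size, the batch size, and the synthetic noise tracks are all hypothetical, the gradient is written out by hand instead of coming from an automatic differentiation framework, and equalisation, delay, and reverb are left out entirely.

```python
import random


def mix_down(gains, tracks):
    """Stereo gain mix: gains[c][k] scales track k into channel c (0 = left, 1 = right)."""
    n = len(tracks[0])
    return [
        [sum(g * t[i] for g, t in zip(ch_gains, tracks)) for i in range(n)]
        for ch_gains in gains
    ]


def estimate_stereo_gains(tracks, target, steps=300, lr=0.25, batch=64, seed=0):
    """Estimate per-track left/right gains from raw tracks and a stereo mixdown.

    Hypothetical helper: minimises the mean squared error between the
    estimated mix and the target with minibatch SGD. All hyperparameters
    are assumptions chosen for this toy example.
    """
    rng = random.Random(seed)
    n_tracks, n = len(tracks), len(tracks[0])
    gains = [[0.0] * n_tracks for _ in range(2)]  # start from silence
    for _ in range(steps):
        idx = [rng.randrange(n) for _ in range(batch)]  # random sample indices
        for c in range(2):
            # Residual between estimated and target mix on the minibatch.
            err = [
                sum(gains[c][k] * tracks[k][i] for k in range(n_tracks)) - target[c][i]
                for i in idx
            ]
            # Analytic gradient of the mean squared error w.r.t. each gain.
            grads = [
                2.0 / batch * sum(e * tracks[k][i] for e, i in zip(err, idx))
                for k in range(n_tracks)
            ]
            for k in range(n_tracks):
                gains[c][k] -= lr * grads[k]
    return gains


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "raw tracks": three unit-variance noise signals (an assumption).
    tracks = [[rng.gauss(0.0, 1.0) for _ in range(512)] for _ in range(3)]
    true_gains = [[0.8, 0.1, 0.5], [0.2, 0.9, 0.5]]  # hypothetical mix settings
    target = mix_down(true_gains, tracks)
    estimate = estimate_stereo_gains(tracks, target)
    print([[round(g, 3) for g in channel] for channel in estimate])
```

Because each estimated parameter is an explicit gain rather than a weight inside a black-box network, the result can be read directly as a mixing setting, which is the interpretability property the abstract emphasises. Extending the sketch to the other linear time-invariant effects would require differentiable filter and reverb modules, which are not shown here.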
Pages: 608-619
Page count: 12