Model-Based Deep Learning for Music Information Research: Leveraging diverse knowledge sources to enhance explainability, controllability, and resource efficiency [Special Issue On Model-Based and Data-Driven Audio Signal Processing]

Cited by: 1
Authors
Richard, Gael [1]
Lostanlen, Vincent [2,3]
Yang, Yi-Hsuan [4]
Mueller, Meinard [5]
Affiliations
[1] Telecom Paris, F-91120 Palaiseau, France
[2] Ctr Natl Rech Sci, F-44000 Nantes, France
[3] Lab Sci Numer Nantes, Nantes, France
[4] Natl Taiwan Univ, Taipei, Taiwan
[5] Int Audio Labs Erlangen, Erlangen, Germany
Funding
European Research Council
Keywords
Deep learning; Knowledge engineering; Special issues and sections; Computational modeling; Neural networks; Knowledge based systems; Music; Production; Machine listening; Multiple signal classification;
DOI
10.1109/MSP.2024.3415569
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline codes
0808; 0809
Abstract
In this article, we investigate the notion of model-based deep learning in the realm of music information research (MIR). Loosely speaking, we use the term model-based deep learning for approaches that combine traditional knowledge-based methods with data-driven techniques, especially those based on deep learning, within a differentiable computing framework. In music, prior knowledge, for instance related to sound production, music perception, or music composition theory, can be incorporated into the design of neural networks and associated loss functions. We outline three specific scenarios to illustrate the application of model-based deep learning in MIR, demonstrating the implementation of such concepts and their potential.
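One recurring ingredient of such approaches is a loss function built from signal-processing knowledge, e.g., the multiscale spectral loss revisited in reference [7] below. The following NumPy sketch is an illustration, not code from the article: it compares magnitude spectrograms of two signals at several FFT sizes. In a real model-based deep learning pipeline, the STFT would be written in an autodiff framework (e.g., PyTorch or JAX) so the loss is differentiable with respect to model parameters; the window choice and hop ratio here are assumptions for the example.

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=-1))

def multiscale_spectral_loss(x, y, fft_sizes=(256, 512, 1024)):
    """Sum of mean L1 distances between magnitude spectrograms of x and y,
    computed at several time-frequency resolutions (one per FFT size)."""
    loss = 0.0
    for n_fft in fft_sizes:
        hop = n_fft // 4  # assumed 75% overlap between analysis frames
        loss += np.mean(np.abs(stft_mag(x, n_fft, hop) - stft_mag(y, n_fft, hop)))
    return loss
```

Because each scale trades time resolution against frequency resolution differently, summing over several FFT sizes penalizes both transient and tonal mismatches, which is why this family of losses is popular for neural audio synthesis.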
Pages: 51-59
Page count: 9
References
Total: 33
[1] Shlezinger N., Whang J., Eldar Y.C., Dimakis A.G., Model-based deep learning, Proc. IEEE, 111, 5, pp. 465-499, (2023)
[2] Muller M., Ellis D.P.W., Klapuri A., Richard G., Signal processing for music analysis, IEEE J. Sel. Topics Signal Process., 5, 6, pp. 1088-1110, (2011)
[3] Bittner R.M., McFee B., Salamon J., Li P., Bello J.P., Deep salience representations for F0 tracking in polyphonic music, Proc. Int. Soc. Music Inf. Retrieval Conf. (ISMIR), pp. 63-70, (2017)
[4] Peeters G., Richard G., Deep learning for audio and music, Multi-Faceted Deep Learning: Models and Data, pp. 231-266, (2021)
[5] Durand S., Bello J.P., David B., Richard G., Robust downbeat tracking using an ensemble of convolutional networks, IEEE/ACM Trans. Audio Speech Lang. Process., 25, 1, pp. 76-89, (2017)
[6] Wu Y.-K., Chiu C.-Y., Yang Y.-H., JukeDrummer: Conditional beat-aware audio-domain drum accompaniment generation via Transformer VQ-VAE, Proc. Int. Soc. Music Inf. Retrieval Conf. (ISMIR), pp. 193-200, (2022)
[7] Schwar S., Muller M., Multiscale spectral loss revisited, IEEE Signal Process. Lett., 30, pp. 1712-1716, (2023)
[8] Wright A., Valimaki V., Perceptual loss function for neural modeling of audio systems, Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 251-255, (2020)
[9] Muller M., Pardo B., Mysore G.J., Valimaki V., Recent advances in music signal processing, IEEE Signal Process. Mag., 36, 1, pp. 17-19, (2019)
[10] Daw A., Karpatne A., Watkins W., Read J., Kumar V., Physics-guided neural networks (PGNN): An application in lake temperature modeling, (2017)