Multi-task learning-based temporal pattern matching network for guitar tablature transcription

Cited by: 0
Authors
Kim, Taehyeon [1]
Kim, Man-Je [2]
Ahn, Chang Wook [1,3]
Affiliations
[1] Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, South Korea
[2] Convergence of AI, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju, South Korea
[3] GIST Institute for Artificial Intelligence, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, South Korea
Funding
National Research Foundation of Singapore
关键词
Automatic music transcription; Guitar tablature transcription; Multi-task learning; Temporal pattern matching;
DOI
10.1007/s00521-025-11148-y
Abstract
Guitar tablature transcription poses unique challenges in automatic music transcription, as it requires capturing both pitch and string usage on a multi-string instrument played with various expressive techniques. Although guitar tablature is widely used by guitarists, neural architecture design for this task remains underexplored, particularly for accurately mapping pitches to their respective strings. In this work, we propose a multi-task learning-based temporal pattern-matching network (TPMNet) that effectively captures temporal information from guitar recordings, improving the alignment of its predictions. The key contribution of this work is an advance in neural network architecture that yields notable improvements in prediction performance for guitar tablature transcription. Additionally, we explore how to select the pooling layer best suited to each task, addressing a long-standing point of confusion in the field. TPMNet’s efficacy was validated through experiments on the GuitarSet dataset, and its generalizability was confirmed via cross-evaluation on the EGDB dataset. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.
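To make the multi-task formulation in the abstract concrete, the sketch below shows one hypothetical way such a model can be structured in PyTorch: a shared encoder over a spectrogram feeds six per-string classifiers, each predicting a fret class (or "string not played") per time frame, with the per-string cross-entropies summed into a single training loss. All names (MultiTaskTabHead, NUM_CLASSES, tab_loss) and the specific layer choices are illustrative assumptions, not the paper's actual TPMNet architecture.

```python
# Hypothetical multi-task tablature sketch (illustrative; NOT the authors' TPMNet).
import torch
import torch.nn as nn

NUM_STRINGS = 6   # standard guitar
NUM_CLASSES = 21  # assumption: 19 frets + open string + "string not played"

class MultiTaskTabHead(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Shared frame-level encoder over a spectrogram-like input
        # of shape (batch, 1, freq_bins, time_frames).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),     # pool away frequency, keep time
            nn.Flatten(start_dim=1, end_dim=2),  # -> (batch, 32, time)
            nn.Conv1d(32, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # One classification head per string: the "multi-task" part.
        self.heads = nn.ModuleList(
            nn.Conv1d(feat_dim, NUM_CLASSES, kernel_size=1)
            for _ in range(NUM_STRINGS)
        )

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        z = self.encoder(spec)                     # (batch, feat, time)
        logits = [head(z) for head in self.heads]  # per-string fret logits
        return torch.stack(logits, dim=1)          # (batch, string, class, time)

def tab_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Sum of per-string cross-entropies; targets: (batch, string, time) class indices."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(logits[:, s], targets[:, s]) for s in range(NUM_STRINGS))
```

For example, `MultiTaskTabHead()(torch.randn(2, 1, 192, 100))` returns a tensor of shape (2, 6, 21, 100): two clips, six strings, 21 fret classes, 100 frames. The frequency-axis average pooling here is only one of the pooling options whose task-dependent selection the abstract discusses.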
Pages: 12083-12102
Page count: 19