Sheet Music Transformer: End-To-End Optical Music Recognition Beyond Monophonic Transcription

Citations: 0
Authors
Rios-Vila, Antonio [1]
Calvo-Zaragoza, Jorge [1]
Paquet, Thierry [2]
Affiliations
[1] Univ Alicante, Pattern Recognit & Artificial Intelligence Grp, San Vicente Del Raspeig, Spain
[2] Rouen Univ, LITIS Lab, EA 4108, St Etienne Du Rouvray, France
Source
DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT VI | 2024 / Vol. 14809
Keywords
Optical Music Recognition; SMT; Transformer; Polyphonic music transcription; GrandStaff; Quartets;
DOI
10.1007/978-3-031-70552-6_2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date, primarily been carried out using monophonic transcription techniques to handle complex score layouts, such as polyphony, often by resorting to simplifications or specific adaptations. Despite their efficacy, these approaches imply challenges related to scalability and limitations. This paper presents the Sheet Music Transformer (SMT), the first end-to-end OMR model designed to transcribe complex musical scores without relying solely on monophonic strategies. Our model employs a Transformer-based image-to-sequence framework that predicts score transcriptions in a standard digital music encoding format from input images. Our model has been tested on two polyphonic music datasets and has proven capable of handling these intricate music structures effectively. The experimental outcomes not only indicate the competence of the model, but also show that it is better than the state-of-the-art methods, thus contributing to advancements in end-to-end OMR transcription.
Pages: 20-37
Number of pages: 18