Automatic Music Transcription: An Overview

Cited by: 114
Authors
Benetos, Emmanouil [1, 2, 3]
Dixon, Simon [1, 4]
Duan, Zhiyao [5]
Ewert, Sebastian [6]
Affiliations
[1] Queen Mary Univ London, Ctr Digital Mus, London, England
[2] Alan Turing Inst, London, England
[3] City Univ London, Dept Comp Sci, London, England
[4] ISMIR, London, England
[5] Univ Rochester, Elect & Comp Engn Dept, Rochester, NY USA
[6] Spotify, Luxembourg, Luxembourg
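Keywords
POLYPHONIC MUSIC; MULTIPITCH ESTIMATION
DOI
10.1109/MSP.2018.2869928
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract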
The capability of transcribing music audio into music notation is a fascinating example of human intelligence. It involves perception (analyzing complex auditory scenes), cognition (recognizing musical objects), knowledge representation (forming musical structures), and inference (testing alternative hypotheses). Automatic music transcription (AMT), i.e., the design of computational algorithms to convert acoustic music signals into some form of music notation, is a challenging task in signal processing and artificial intelligence. It comprises several subtasks, including multipitch estimation (MPE), onset and offset detection, instrument recognition, beat and rhythm tracking, interpretation of expressive timing and dynamics, and score typesetting. © 1991-2012 IEEE.
Pages: 20-30
Number of pages: 11
Related papers
36 in total
  • [1] Unsupervised analysis of polyphonic music by sparse coding
    Abdallah, SA
    Plumbley, MD
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2006, 17 (01) : 179 - 196
  • [2] Music Information Retrieval: Recent Developments and Applications
    Unknown
    [J]. FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, 2014, 8 (2-3) : 128 - +
  • [3] [Anonymous], 2011, P 12 INT SOC MUS INF
  • [4] Multiple F0 Estimation and Source Clustering of Polyphonic Music Audio Using PLCA and HMRFs
    Arora, Vipul
    Behera, Laxmidhar
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (02) : 278 - 287
  • [5] Automatic music transcription: challenges and future directions
    Benetos, Emmanouil
    Dixon, Simon
    Giannoulis, Dimitrios
    Kirchhoff, Holger
    Klapuri, Anssi
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2013, 41 (03) : 407 - 434
  • [6] Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model
    Benetos, Emmanouil
    Dixon, Simon
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2013, 133 (03) : 1727 - 1741
  • [7] Böck S, 2012, INT CONF ACOUST SPEE, P121, DOI 10.1109/ICASSP.2012.6287832
  • [8] Boulanger-Lewandowski N., 2012, P 29 INT C MACH LEAR, P1881, DOI 10.1109/TAP.2021.3111639
  • [9] Pitch spelling: A computational model
    Cambouropoulos, E
    [J]. MUSIC PERCEPTION, 2003, 20 (04) : 411 - 429
  • [10] Context-Dependent Piano Music Transcription With Convolutional Sparse Coding
    Cogliati, Andrea
    Duan, Zhiyao
    Wohlberg, Brendt
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (12) : 2218 - 2230