Polyphonic pitch tracking with deep layered learning

Cited by: 6
Authors
Elowsson, Anders [1 ,2 ]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Stockholm, Sweden
[2] Univ Oslo, RITMO Ctr Interdisciplinary Studies Rhythm Time &, Oslo, Norway
Source
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2020, Vol. 148, Issue 01
Funding
Swedish Research Council;
Keywords
FUNDAMENTAL-FREQUENCY ESTIMATION; MULTIPITCH ESTIMATION; MUSIC TRANSCRIPTION;
DOI
10.1121/10.0001468
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
This article presents a polyphonic pitch tracking system that is able to extract both framewise and note-based estimates from audio. The system uses several artificial neural networks trained individually in a deep layered learning setup. First, cascading networks are applied to a spectrogram for framewise fundamental frequency (f0) estimation. A sparse receptive field is learned by the first network and then used as a filter kernel for parameter sharing throughout the system. The f0 activations are connected across time to extract pitch contours. These contours define a framework within which subsequent networks perform onset and offset detection, operating across time and smaller pitch fluctuations simultaneously. As input, the networks use, e.g., variations of latent representations from the f0 estimation network. Finally, erroneous tentative notes are removed one by one in an iterative procedure that allows a network to classify notes within a correct context. The system was evaluated on four public test sets (MAPS, Bach10, TRIOS, and the MIREX Woodwind quintet) and achieved state-of-the-art results on all four. It performs well across all subtasks: f0, pitched onset, and pitched offset tracking.
Pages: 446-468
Number of pages: 23
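The abstract describes a staged pipeline: framewise f0 estimation on a spectrogram, connection of activations into pitch contours, onset/offset detection within those contours, and iterative pruning of erroneous tentative notes. The following is a minimal, hypothetical Python sketch of that layered structure only; the function names and the simple heuristics that stand in for the paper's trained neural networks are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of a "deep layered learning" pipeline in the spirit of the
# abstract: separate stages, each handled by its own module, where later stages
# consume the outputs of earlier ones. Names and heuristics are illustrative only.
import numpy as np

def framewise_f0_activations(spectrogram):
    """Stage 1 stand-in: map a (frames x bins) spectrogram to framewise f0
    activations. In the paper this is a cascade of networks with a learned
    sparse receptive field; here we simply normalize each frame."""
    return spectrogram / (spectrogram.max(axis=1, keepdims=True) + 1e-9)

def extract_pitch_contours(activations, threshold=0.5):
    """Stage 2 stand-in: connect f0 activations across time into contours.
    A contour here is a run of consecutive frames whose peak activation
    stays above a threshold."""
    contours = []
    peaks = activations.argmax(axis=1)
    strong = activations.max(axis=1) > threshold
    start = None
    for t, ok in enumerate(strong):
        if ok and start is None:
            start = t
        elif not ok and start is not None:
            contours.append((start, t, int(np.median(peaks[start:t]))))
            start = None
    if start is not None:
        contours.append((start, len(strong), int(np.median(peaks[start:]))))
    return contours  # (onset_frame, offset_frame, pitch_bin)

def prune_notes(contours, min_len=3):
    """Stage 3/4 stand-in: onset/offset refinement and iterative removal of
    erroneous tentative notes, reduced here to a simple duration filter."""
    return [c for c in contours if c[1] - c[0] >= min_len]

# Toy usage: a random "spectrogram" of 100 frames x 88 pitch bins.
rng = np.random.default_rng(0)
spec = rng.random((100, 88))
notes = prune_notes(extract_pitch_contours(framewise_f0_activations(spec)))
print(len(notes), "tentative note(s)")

In the actual system each of these stand-ins is a separately trained network, and later stages also receive latent representations from the f0 estimation network rather than only its final activations.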
Related papers
50 records in total
  • [41] Real-Time Polyphonic Pitch Detection on Acoustic Musical Signals
    Goodman, Thomas A.
    Batten, Ian
    2018 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2018, : 656 - 661
  • [42] A COMPARATIVE ANALYSIS OF TIME-FREQUENCY DECOMPOSITIONS IN POLYPHONIC PITCH ESTIMATION
    Canadas-Quesada, F. J.
    Vera-Candeas, P.
    Ruiz-Reyes, N.
    Carabias, J.
    Cabanas, P.
    Rodriguez, F.
    SIGMAP 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATION, 2010, : 145 - 150
  • [43] Tracking the Race Between Deep Reinforcement Learning and Imitation Learning
    Gros, Timo P.
    Hoeller, Daniel
    Hoffmann, Joerg
    Wolf, Verena
    QUANTITATIVE EVALUATION OF SYSTEMS (QEST 2020), 2020, 12289 : 11 - 17
  • [44] Deep Learning and Preference Learning for Object Tracking: A Combined Approach
    Pang, Shuchao
    Jose del Coz, Juan
    Yu, Zhezhou
    Luaces, Oscar
    Diez, Jorge
    NEURAL PROCESSING LETTERS, 2018, 47 (03) : 859 - 876
  • [46] Formant estimation and tracking: A deep learning approach
    Dissen, Yehoshua
    Goldberger, Jacob
    Keshet, Joseph
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2019, 145 (02): : 642 - 653
  • [47] Deep mutual learning for visual object tracking
    Zhao, Haojie
    Yang, Gang
    Wang, Dong
    Lu, Huchuan
    PATTERN RECOGNITION, 2021, 112 (112)
  • [48] Nonlinear Motion Tracking by Deep Learning Architecture
    Verma, Arnav
    Samaiya, Devesh
    Gupta, Karunesh K.
    3RD INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS (ICCS-2017), 2018, 331
  • [49] Deep learning in multiple animal tracking: A survey
    Liu, Yeqiang
    Li, Weiran
    Liu, Xue
    Li, Zhenbo
    Yue, Jun
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2024, 224
  • [50] Deep learning for multiple object tracking: a survey
    Xu, Yingkun
    Zhou, Xiaolong
    Chen, Shengyong
    Li, Fenfen
    IET COMPUTER VISION, 2019, 13 (04) : 355 - 368