Automatic lyrics alignment for Cantonese popular music

Cited by: 11
Authors
Wong, Chi Hang [1]
Szeto, Wai Man [1]
Wong, Kin Hong [1]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Engn & Sci, Shatin, Hong Kong, Peoples R China
DOI
10.1007/s00530-006-0055-8
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
From lyrics display on electronic music players and karaoke videos to surtitles for live Chinese opera performances, one feature is common to all these everyday functionalities: temporal synchronization of the written text and its corresponding musical phrase. Our goal is to automate the process of lyrics alignment, a procedure which, to date, is still handled manually in the Cantonese popular song (Cantopop) industry. In our system, a vocal signal enhancement algorithm extracts the vocal signal from a CD recording so that the onsets of the sung syllables can be detected and their pitches determined. The proposed system is specifically designed for Cantonese, in which the contour of the musical melody and the tonal contour of the lyrics must match perfectly. With this prerequisite, we use a dynamic time warping algorithm to align the lyrics. The robustness of this approach is supported by experimental results. The system was evaluated with 70 twenty-second music segments, and most samples had their lyrics aligned correctly.
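The alignment step named in the abstract can be illustrated with a minimal dynamic time warping sketch. The abstract does not specify the features or the cost function used, so everything here is an assumption made for illustration: the function name `dtw_align`, the inputs `lyric_levels` (one relative pitch level per syllable, derived from its lexical tone) and `note_pitches` (one detected pitch per sung onset), and the absolute-difference local cost are all hypothetical and are not taken from the paper.

```python
def dtw_align(lyric_levels, note_pitches):
    """Align two numeric sequences with classic DTW.

    Hypothetical inputs: lyric_levels holds one relative pitch level per
    syllable, note_pitches holds one detected pitch per sung onset.
    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(lyric_levels), len(note_pitches)
    inf = float("inf")
    # cost[i][j] = cheapest cost of aligning the first i levels with the first j pitches
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference between the two values
            d = abs(lyric_levels[i - 1] - note_pitches[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # match
                                 cost[i - 1][j],      # two syllables on one note
                                 cost[i][j - 1])      # one syllable across two notes
    # Backtrack from the end to recover the warping path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

Each pair `(i, j)` in the returned path maps syllable `i` to detected note `j`, so the onset time of note `j` could serve as the display time for syllable `i`. A real system would need a cost that compares contours rather than raw values, since tone levels and sung pitches are on different scales.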
Pages: 307-323
Page count: 17