Spoken language resources for Cantonese speech processing

Cited by: 74
Authors
Lee, T [1]
Lo, WK
Ching, PC
Meng, H
Affiliations
[1] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Ind Engn & Management Syst, Shatin, Hong Kong, Peoples R China
Keywords
speech databases development; Chinese dialects; Chinese phonology and phonetics; annotation of speech data; applications of speech technology; speech recognition; text-to-speech synthesis
DOI
10.1016/S0167-6393(00)00101-1
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes the development of CU Corpora, a series of large-scale speech corpora for Cantonese. Cantonese is the most commonly spoken Chinese dialect in Southern China and Hong Kong. CU Corpora are the first of their kind and are intended to serve as an important infrastructure for the advancement of speech recognition and synthesis technologies for this widely used Chinese dialect. They contain a large amount of speech data covering various linguistic units of spoken Cantonese, including isolated syllables, polysyllabic words and continuous sentences. While some of the corpora are created for specific applications of common interest, the others are designed with emphasis on the coverage and distribution of different phonetic units, including contextual ones. The speech data are annotated manually so as to provide sufficient orthographic and phonetic information for the development of different applications. Statistical analysis of the annotated data shows that CU Corpora contain rich and balanced phonetic content. The usefulness of the corpora is also demonstrated with a number of speech recognition and speech synthesis applications. (C) 2002 Elsevier Science B.V. All rights reserved.
Pages: 327-342
Page count: 16