A STANDARDIZATION PROGRAM OF SPEECH CORPUS COLLECTION

Cited by: 0
Authors
Yin, Zhigang [1]
Li, Aijun [1]
Affiliations
[1] Chinese Academy of Social Sciences, Institute of Linguistics, Beijing, People's Republic of China
Source
2017 20TH CONFERENCE OF THE ORIENTAL CHAPTER OF THE INTERNATIONAL COORDINATING COMMITTEE ON SPEECH DATABASES AND SPEECH I/O SYSTEMS AND ASSESSMENT (O-COCOSDA) | 2017
Keywords
standardization; speech corpus; specification
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Speech corpora are the basis of linguistic research and natural language processing. To collect speech corpora more efficiently and to make them easier to use and share, a standardization scheme for speech corpus projects is needed. This paper proposes a standardization program that covers all aspects of data collection, annotation, and distribution. The specifications for constructing a speech corpus are also introduced. Finally, a telephone speech corpus, TSC973, is used as an example to illustrate the standardization program.
Pages: 218-222
Number of pages: 5