The SWARA Speech Corpus: A Large Parallel Romanian Read Speech Dataset

Cited by: 0
Authors
Stan, Adriana [1]
Dinescu, Florina [2]
Tiple, Cristina [2]
Meza, Serban [1]
Orza, Bogdan [1]
Chirila, Magdalena [2]
Giurgiu, Mircea [1]
Affiliations
[1] Tech Univ Cluj Napoca, Commun Dept, Cluj Napoca, Romania
[2] Iuliu Hatieganu Univ Med & Pharm, Dept Otorhinolaryngol, Cluj Napoca, Romania
Source
2017 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED) | 2017
Keywords
speech corpus; Romanian; phone-level annotation; read data; speech synthesis
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces one of the largest Romanian speech datasets freely available for both academic and commercial use. The dataset comprises speech data recorded over the last year from 12 speakers, along with 5 other speakers previously recorded in a separate environment. The data was manually segmented at the utterance level and semi-automatically labelled at the phone level. The resulting corpus amounts to approximately 21 hours of high-quality read speech, split into over 19,000 utterances. Each speaker read between 921 and 1,493 utterances, of which 880 are common to all speakers and add up to over 16 hours of parallel data. We present the recording and data segmentation steps, as well as a first use of this corpus in synthetic voice development.
Pages: 6