Tone Classification in Mandarin Chinese using Convolutional Neural Networks

Cited by: 19
Authors
Chen, Charles [1]
Bunescu, Razvan [1]
Xu, Li [2]
Liu, Chang [1]
Affiliations
[1] Ohio Univ, Elect Engn & Comp Sci, Athens, OH 45701 USA
[2] Ohio Univ, Commun Sci & Disorders, Athens, OH 45701 USA
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
tone classification; Mandarin Chinese; feature learning; convolutional neural networks; FUNDAMENTAL-FREQUENCY CONTOURS; SPEECH; RECOGNITION; INTELLIGIBILITY; PERCEPTION
DOI
10.21437/Interspeech.2016-528
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In tone languages, different tone patterns of the same syllable may convey different meanings. Tone perception is important for sentence recognition in noisy conditions, especially for children with cochlear implants (CI). We propose a method that fully automates tone classification of syllables in Mandarin Chinese. Our model takes the raw tone data as input and uses convolutional neural networks to classify each syllable into one of the four Mandarin tones. When evaluated on syllables recorded from normal-hearing children, our method achieves substantially higher accuracy than previous tone classification techniques based on manually edited F0. The new approach is also more efficient, as it does not require manual checking of F0. The new tone classification system could have significant clinical applications in the speech evaluation of the hearing-impaired population.
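The abstract describes a convolutional network that maps a syllable to one of four tones. The following is a minimal illustrative sketch of that general idea, not the authors' architecture: the layer sizes, the random untrained weights, the synthetic sine-wave input, and every function name (`conv1d`, `classify`, and so on) are assumptions made only for illustration. It shows one convolution layer, a ReLU, global max pooling, and a softmax over four tone classes.

```python
import math
import random

TONES = ("tone 1", "tone 2", "tone 3", "tone 4")

def conv1d(signal, kernel):
    """Valid 1-D cross-correlation of a sequence with a single kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(signal, kernels, weights, biases):
    """Return a probability for each of the four tones.

    Each kernel yields one feature: convolution, ReLU, then global max pooling.
    A linear layer followed by softmax turns the features into probabilities.
    """
    features = [max(relu(conv1d(signal, kernel))) for kernel in kernels]
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

# Hypothetical, untrained parameters: 8 kernels of width 5, 4 output classes.
rng = random.Random(0)
kernels = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(8)]
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
biases = [0.0] * 4

# Synthetic stand-in for one syllable's input sequence (not real speech data).
signal = [math.sin(0.1 * t) for t in range(200)]

probs = classify(signal, kernels, weights, biases)
predicted = TONES[probs.index(max(probs))]
```

Because the weights are random, `predicted` carries no meaning here; in a real system the kernels and output weights would be learned from labeled syllables.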
Pages: 2150-2154
Number of pages: 5