Research on tone recognition in Chinese spontaneous speech

Cited by: 1
Authors
Liu Zhao-Jie [1, 2]
Shao Jian [2]
Zhang Peng-Yuan [2]
Zhao Qing-Wei [2]
Yan Yong-Hong [2]
Feng Ji [1]
Affiliations
[1] Institute of Physics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
[2] Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
Keywords
tone recognition; spontaneous speech; real-context model; clustering
DOI
10.7498/aps.56.7064
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
Chinese is a tonal language, and tone information is important for Chinese speech recognition. Conventional methods generally focus on the normal tones of read speech rather than on the more complicated tones of spontaneous speech. This paper proposes the real-context model as a new concept for tone unit selection, and then applies a method based on hierarchical clustering to generate a more refined tone model. Experimental results demonstrate the effectiveness of both methods.
Pages: 7064-7069
Page count: 6