Artificially intelligent recognition of Arabic speaker using voice print-based local features

Cited by: 4
Authors
Mahmood, Awais [1]
Alsulaiman, Mansour [2]
Muhammad, Ghulam [2]
Akram, Sheeraz [3]
Affiliations
[1] King Saud Univ, Dept Comp Sci, Coll Comp & Informat Sci, Al Muzahmiyyah Branch, Riyadh, Saudi Arabia
[2] King Saud Univ, Dept Comp Engn, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
[3] Fdn Univ, Dept Software Engn, Islamabad, Pakistan
Keywords
voice print-based local features; local features; speaker recognition system; GMM; feature extraction
DOI
10.1080/0952813X.2015.1055827
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Local features in a pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. The feature was extracted in the time-frequency plane by taking a moving average along the diagonal directions of that plane. It captured time-frequency events, producing a pattern unique to each speaker that can be viewed as the speaker's voice print; hence, the technique is referred to as the voice print-based local feature. The proposed feature was compared with other features, including mel-frequency cepstral coefficients (MFCC), for speaker recognition on two different databases. One of these databases is a subset of an LDC database consisting of two short sentences uttered by 182 speakers. On this LDC subset, the proposed feature attained a 98.35% recognition rate, compared with 96.7% for MFCC.
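The diagonal moving average mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the time-frequency representation, the window length, or how the two diagonal directions are combined. The function name `diagonal_moving_average`, the `length` parameter, and the edge padding are all assumptions made for illustration.

```python
import numpy as np


def diagonal_moving_average(tf, length=3):
    """Smooth a time-frequency matrix along its two diagonal directions.

    tf     : 2-D array of shape (n_freq, n_frames), e.g. a log spectrogram.
    length : odd window length along each diagonal (assumed, not from the paper).

    Returns an array of shape (2, n_freq, n_frames):
    index 0 holds the main-diagonal average, index 1 the anti-diagonal average.
    """
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")

    tf = np.asarray(tf, dtype=float)
    n_freq, n_frames = tf.shape
    half = length // 2

    # Repeat border values so every cell has a full diagonal neighbourhood
    # (an assumption; the abstract does not describe edge handling).
    padded = np.pad(tf, half, mode="edge")

    main = np.zeros_like(tf)
    anti = np.zeros_like(tf)
    for k in range(-half, half + 1):
        # Main diagonal: frequency index and frame index shift together.
        main += padded[half + k:half + k + n_freq, half + k:half + k + n_frames]
        # Anti-diagonal: frequency index falls as the frame index advances.
        anti += padded[half - k:half - k + n_freq, half + k:half + k + n_frames]

    return np.stack([main, anti]) / length
```

Calling `diagonal_moving_average(spec, length=5)` on a hypothetical spectrogram `spec` yields two smoothed maps, one per diagonal direction. A rising or falling spectral trajectory is preserved in the map whose diagonal matches its slope and attenuated in the other, which is one plausible reading of how diagonal averaging captures time-frequency events.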
Pages: 1009-1020
Page count: 12