Speaker age classification and regression using i-vectors

Cited by: 34
Authors
Grzybowska, Joanna [1]
Kacprzak, Stanislaw [1]
Affiliations
[1] AGH Univ Sci & Technol, Krakow, Poland
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
speaker age recognition; regression; classification; computational paralinguistics; RECOGNITION
DOI
10.21437/Interspeech.2016-1118
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we examine the use of i-vectors for both age regression and age classification. Although i-vectors have previously been used for the age regression task, we extend this approach by applying a fusion of i-vector and acoustic-feature regression to estimate speaker age. This fusion yields a relative improvement of 12.6% compared to the i-vector-only system. We also use i-vectors for age classification, which to our knowledge is the first attempt to do so. Our best results reach an unweighted accuracy of 62.9%, a relative improvement of 16.7% compared to the best results obtained in the age classification task of the Age Sub-Challenge at Interspeech 2010.
Pages: 1402-1406
Page count: 5