Speaker age classification and regression using i-vectors

Cited by: 34
Authors
Grzybowska, Joanna [1]
Kacprzak, Stanislaw [1]
Affiliations
[1] AGH Univ Sci & Technol, Krakow, Poland
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
speaker age recognition; regression; classification; computational paralinguistics; RECOGNITION;
DOI
10.21437/Interspeech.2016-1118
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, we examine the use of i-vectors for both age regression and age classification. Although i-vectors have previously been used for the age regression task, we extend this approach by fusing i-vector-based and acoustic-feature-based regression to estimate speaker age. This fusion yields a relative improvement of 12.6% compared to the i-vector-only system. We also use i-vectors for age classification, which to our knowledge is the first attempt to do so. Our best result reaches an unweighted accuracy of 62.9%, a relative improvement of 16.7% compared to the best result obtained in the age classification task of the Age Sub-Challenge at Interspeech 2010.
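The abstract reports its results as an unweighted accuracy and as relative improvements. As a minimal sketch of how such figures are commonly computed, the snippet below assumes that unweighted accuracy means the mean of per-class recalls and that relative improvement is the change divided by the baseline value; neither definition is spelled out in this record. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so each class counts equally
    regardless of how many samples it has (assumed definition)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def relative_improvement(baseline, new, higher_is_better=True):
    """Change relative to the baseline. Use higher_is_better=False
    for error measures such as a mean absolute error in years."""
    delta = (new - baseline) if higher_is_better else (baseline - new)
    return delta / baseline


# Hypothetical, imbalanced toy labels (not taken from the paper).
y_true = ["child", "child", "young", "young", "young", "young", "young", "senior"]
y_pred = ["child", "young", "young", "young", "young", "young", "young", "young"]

ua = unweighted_accuracy(y_true, y_pred)  # per-class recalls: 0.5, 1.0, 0.0
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"unweighted accuracy: {ua:.3f}, plain accuracy: {plain:.3f}")
print(f"relative improvement: {relative_improvement(0.50, 0.60):.1%}")
```

On imbalanced data the two accuracies can differ noticeably, which is presumably why the class-balanced variant is the one reported for the age classification task.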
Pages: 1402-1406
Number of pages: 5
Related papers
21 in total
  • [1] [Anonymous], 2006, LINGUISTICS PHONETIC
  • [2] [Anonymous], 2010, P INTERSPEECH
  • [3] [Anonymous], 2010, P 11 ANN C INT SPEEC
  • [4] [Anonymous], 2010, P LANG RES C LREC
  • [5] Speaker age estimation using i-vectors
    Bahari, Mohamad Hasan
    McLaren, Mitchell
    Hugo Van Hamme
    van Leeuwen, David A.
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2014, 34 : 99 - 108
  • [6] i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition
    Behravan, Hamid
    Hautamaki, Ville
    Siniscalchi, Sabato Marco
    Kinnunen, Tomi
    Lee, Chin-Hui
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (01) : 29 - 41
  • [7] Cole R., 1994, LINGUISTIC DATA CONS
  • [8] Front-End Factor Analysis for Speaker Verification
    Dehak, Najim
    Kenny, Patrick J.
    Dehak, Reda
    Dumouchel, Pierre
    Ouellet, Pierre
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04) : 788 - 798
  • [9] Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal
    Dobry, Gil
    Hecht, Ron M.
    Avigal, Mireille
    Zigel, Yaniv
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (07) : 1975 - 1985
  • [10] Eyben F., 2009, AFF COMP INT INT WOR, P1