Towards the Prediction of Human Speaker Identification Performance from Measured Speech Quality

Cited by: 0

Authors:

Gallardo, Laura Fernandez [1,2]

Moeller, Sebastian [1,2]

Affiliations:

[1] Univ Canberra, Fac ESTeM, Canberra, ACT 2601, Australia

[2] TU Berlin, Telekom Innovat Labs, Qual & Usabil Lab, Berlin, Germany

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

human speaker identification; speech quality; instrumental measures; prediction model;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Speech communication channels and their components (e.g., codecs) are generally designed for optimum perceived speech quality. However, transmission channels should also preserve the principal speaker-specific characteristics that enable acceptable speaker identification performance by end listeners. This paper proposes a first step towards effective approaches for predicting human speaker identification performance from instrumental quality measures. Correspondences between speech quality and speaker identification accuracy are shown by fitting linear curves to data points involving different channel transmissions. Narrowband, wideband, and super-wideband channels are considered, along with other typically associated distortions. Our analyses show that Coloration, one of the perceptual quality dimensions, can be a better predictor of human speaker identification performance than overall quality predictions in terms of Mean Opinion Scores. This suggests that the speaker-specific properties of the voice are mainly impaired by the distortion of frequency components in the transmission path.
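The linear-fit comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the per-channel values are made-up placeholders (not data from the paper), the helper names `fit_line` and `r_squared` are hypothetical, and using the coefficient of determination to compare predictors is an assumption.

```python
# Illustrative sketch only: all numbers are made-up placeholders,
# not measurements from the paper.

def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values, one entry per transmission channel.
mos = [2.1, 2.9, 3.4, 4.0, 4.4]          # overall quality estimate (assumed)
coloration = [1.8, 2.5, 3.6, 4.1, 4.5]   # Coloration dimension score (assumed)
accuracy = [0.55, 0.63, 0.78, 0.84, 0.90]  # listener identification accuracy (assumed)

for name, predictor in (("MOS", mos), ("Coloration", coloration)):
    slope, intercept = fit_line(predictor, accuracy)
    fit_quality = r_squared(predictor, accuracy, slope, intercept)
    print(f"{name}: slope={slope:.3f}, intercept={intercept:.3f}, R^2={fit_quality:.3f}")
```

Under these assumptions, the predictor whose fitted line yields the higher R^2 would be the better linear predictor of identification accuracy for the given set of channels.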


Pages: 443-447

Page count: 5
