An objective metric of human subjective audio quality optimized for a wide range of audio fidelities

Cited by: 22

Authors:

Creusere, Charles D. [1]

Kallakuri, Kumar D. [2]

Vanam, Rahul [3]

Affiliations:

[1] New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003 USA

[2] Hughes Network Syst, Germantown, MD 20876 USA

[3] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2008 / Vol. 16 / No. 01

Funding:

National Science Foundation (USA);

Keywords:

audio quality metrics; metric optimization; objective metrics; perceptual audio analysis; quality evaluation; universal quality metrics;

DOI:

10.1109/TASL.2007.907571

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The goal of this paper is to develop an audio quality metric that can accurately quantify subjective quality over audio fidelities ranging from highly impaired to perceptually lossless. As one example of its utility, such a metric would allow scalable audio coding algorithms to be easily optimized over their entire operating ranges. We have found that the ITU-recommended objective quality metric, ITU-R BS.1387, does not accurately predict subjective audio quality over the wide range of fidelity levels of interest to us. In developing the desired universal metric, we use as a starting point the model output variables (MOVs) that make up BS.1387 as well as the energy equalization truncation threshold which has been found to be particularly useful for highly impaired audio. To combine these MOVs into a single quality measure that is both accurate and robust, we have developed a hybrid least-squares/minimax optimization procedure. Our test results show that the minimax-optimized metric is up to 36% lower in maximum absolute error compared to a similar metric designed using the conventional least-squares procedure.
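To illustrate the difference between the least-squares and minimax criteria mentioned in the abstract, here is a minimal sketch. It is not the authors' hybrid procedure, which this record does not describe. It assumes a hypothetical one-feature linear model (a single weight applied to one MOV-like feature), and the data values and function names are invented for illustration only.

```python
# Toy comparison of two fitting criteria for a one-weight model: score ~ w * feature.
# All data and names below are hypothetical; this is not the paper's method.

def least_squares_weight(x, y):
    """Weight w that minimizes the sum of squared errors sum((y_i - w*x_i)**2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def max_abs_error(w, x, y):
    """Largest absolute prediction error for weight w."""
    return max(abs(b - w * a) for a, b in zip(x, y))

def minimax_weight(x, y, lo=-100.0, hi=100.0, iters=200):
    """Weight w that minimizes max|y_i - w*x_i|.

    The objective is convex in w, so a ternary search over the
    assumed bracket [lo, hi] converges to a minimizer.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if max_abs_error(m1, x, y) < max_abs_error(m2, x, y):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return (lo + hi) / 2.0

# Invented feature values and subjective scores, with one outlying point.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 6.0]

w_ls = least_squares_weight(x, y)
w_mm = minimax_weight(x, y)

print("least-squares weight:", w_ls, "max abs error:", max_abs_error(w_ls, x, y))
print("minimax weight:      ", w_mm, "max abs error:", max_abs_error(w_mm, x, y))
```

On this toy data the minimax fit trades a slightly larger average error for a smaller worst-case error, which is the kind of robustness the abstract attributes to minimax optimization.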

Citation

Pages: 129-136

Number of pages: 8

29 references in total

[1] Aggarwal A, 2002, INT CONF ACOUST SPEE, P1833

[2] AGGARWAL A, 2002, P 112 CONV AES

[3] BEERENDS JG, 1992, J AUDIO ENG SOC, V40, P963

[4] BRANDENBURG K, 1987, P CONTR 82 AES CONV

[5] Brandfonbrener AG, 2000, MED PROBL PERFORM AR, V15, P1

[6] COLOMES C, 1995, J AUDIO ENG SOC, V43, P233

[7] Creusere, CD. Understanding perceptual distortion in MPEG scalable audio coding [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (03): 422-431

[8] *EBU, 2003, 3296 EUB

[9] GUARD DR, 1994, P 96 CONV AUD ENG SO

[10] *ISO IEC, 1999, ISOIECJTC1SC29WG11
