Speech and audio coding using temporal masking

Cited by: 0
Authors
Gunawan, TS [1]
Ambikairajah, E [1]
Senn, D [1]
Affiliations
[1] Univ New S Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2052, Australia
Source
SIGNAL PROCESSING FOR TELECOMMUNICATIONS AND MULTIMEDIA | 2005 / Vol. 27
Keywords
temporal masking model; simultaneous masking model; Gammatone filters; wavelet packet; PESQ; subjective listening test;
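DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract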
This paper presents a comparison of three auditory temporal masking models for speech and audio coding applications. The first model was developed based upon the existing forward masking psychoacoustic data with an assumption of approximately 200 ms. The model's dynamic parameters were derived from this data. The previously developed second model was based upon the principle of an exponential decay following higher energy stimuli, where the masking effects have a relatively short duration. The existing third model best matches the previously reported forward masking data using an exponential curve, but the effects of the forward masking are restricted to 100-200 ms. Objective assessments employing the PESQ measure reveal that these three temporal models have potential for removing perceptually redundant information in speech and audio coding applications. Results show that the incorporation of temporal masking along with simultaneous masking into a speech/audio coding algorithm results in a further bit rate reduction of approximately 17% compared with simultaneous masking alone, while preserving perceptual quality.
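The exponential-decay principle mentioned in the abstract can be illustrated with a minimal sketch. This is not any of the three models compared in the paper: the function names, the time constant `tau_ms`, and the quiet floor `floor_db` are hypothetical values chosen for illustration. Only the 200 ms cut-off echoes the duration range quoted in the abstract.

```python
import math


def forward_masking_threshold(masker_db, t_ms, tau_ms=50.0,
                              floor_db=0.0, max_ms=200.0):
    """Illustrative masked threshold (dB) t_ms after the masker ends.

    Assumption: the threshold decays exponentially from the masker level
    toward floor_db with time constant tau_ms (hypothetical value), and
    masking is ignored beyond max_ms (about 200 ms, per the abstract).
    """
    if t_ms < 0:
        raise ValueError("t_ms must be non-negative")
    if t_ms > max_ms:
        return floor_db
    return floor_db + (masker_db - floor_db) * math.exp(-t_ms / tau_ms)


def is_masked(signal_db, masker_db, t_ms, **kwargs):
    """True if a component at signal_db lies below the masked threshold."""
    return signal_db < forward_masking_threshold(masker_db, t_ms, **kwargs)


if __name__ == "__main__":
    # A 20 dB component 50 ms after an 80 dB masker falls under the
    # threshold in this toy model, so a coder could skip encoding it.
    print(is_masked(20.0, 80.0, 50.0))
```

In a coder, components flagged this way would be candidates for coarser quantisation or removal, which is the mechanism behind the bit rate reduction the abstract reports.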
Pages: 31-42
Number of pages: 12
Related papers
16 in total
  • [1] *AES, 2001, PERC AUD COD WHAT LI
  • [2] Ambikairajah E, 2001, INT CONF ACOUST SPEE, P773, DOI 10.1109/ICASSP.2001.941029
  • [3] BLACK M, 1995, INT CONF ACOUST SPEE, P3075, DOI 10.1109/ICASSP.1995.479495
  • [4] Bosi M., 2012, Introduction to digital audio coding and standards
  • [5] BRANDENBURG K, 1994, J AUDIO ENG SOC, V42, P780
  • [6] *ITU, 1997, BS11161 ITU
  • [7] FORWARD MASKING AS A FUNCTION OF FREQUENCY, MASKER LEVEL, AND SIGNAL DELAY
    JESTEADT, W
    BACON, SP
    LEHMAN, JR
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1982, 71 (04) : 950 - 962
  • [8] LYNCH M, 1997, P 5 EUR C SPEECH COM, P1495
  • [9] Najaf-Zadeh H, 2003, P 114 CONV AUD ENG S
  • [10] Incorporation of temporal masking effects into bark spectral distortion measure
    Novorita, B
    [J]. ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 665 - 668