Using Long-Term Information to Improve Robustness in Speaker Identification

Times cited: 0
Authors
Lyons, James G. [1]
O'Connell, James G. [1]
Paliwal, Kuldip K. [1]
Affiliations
[1] Griffith Univ, Griffith Sch Engn, Signal Proc Lab, Brisbane, Qld 4111, Australia
Source
2010 4TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS) | 2010
Keywords
Feature averaging; analysis window duration; long window; speaker recognition; automatic speaker identification
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
In this paper, we propose two new methods of improving the robustness of Automatic Speaker Identification systems. Both methods use long-term information in the speech signal to make the features more robust. The first method averages filterbank parameters from consecutive short-time frames over a longer window. The second method uses frame lengths longer than the duration over which speech is generally assumed to be stationary. We show that both methods improve on standard Mel Frequency Cepstral Coefficients for speaker identification in the presence of additive white Gaussian noise. Furthermore, additional improvements are observed at mid-range SNR when the proposed methods are used in combination.
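As a rough illustration of the first method, the sketch below averages each filterbank channel over a sliding window of consecutive short-time frames. It is not the authors' implementation: the abstract does not give the window length, the hop, or the point in the feature pipeline where averaging is applied, so the function name `average_filterbank_frames`, the `span` parameter, the one-frame hop, and the toy values are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_filterbank_frames(
    frames: Sequence[Sequence[float]], span: int
) -> List[List[float]]:
    """Average each filterbank channel over `span` consecutive frames.

    frames[t][k] is the energy of filterbank channel k in short-time frame t.
    The sliding window with a hop of one frame is an assumption of this
    sketch, not a detail taken from the paper.
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    if len(frames) < span:
        return []  # not enough frames to fill one long window
    n_channels = len(frames[0])
    averaged = []
    for start in range(len(frames) - span + 1):
        window = frames[start:start + span]
        # Per-channel mean across the frames inside the long window.
        averaged.append(
            [sum(frame[k] for frame in window) / span for k in range(n_channels)]
        )
    return averaged


# Toy input: 4 short-time frames, 2 filterbank channels (made-up values).
toy_frames = [[1.0, 10.0], [3.0, 10.0], [5.0, 40.0], [7.0, 20.0]]
smoothed = average_filterbank_frames(toy_frames, span=2)
```

In a typical MFCC pipeline the averaged filterbank values would then go through the usual logarithm and discrete cosine transform stages; whether the paper averages before or after the logarithm is not stated in the abstract.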
Pages: 4
Related papers
50 in total
  • [1] Speaker Characterization Using Long-Term and Temporal Information
    Huang, Chien-Lin
    Sun, Hanwu
    Ma, Bin
    Li, Haizhou
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010, : 370 - 373
  • [2] LONG-TERM AUDITORY MEMORY - SPEAKER IDENTIFICATION
    SASLOVE, H
    YARMEY, AD
    JOURNAL OF APPLIED PSYCHOLOGY, 1980, 65 (01) : 111 - 116
  • [3] Speaker Discrimination Using Long-Term Spectrum of Speech
    Sigmund, Milan
    INFORMATION TECHNOLOGY AND CONTROL, 2019, 48 (03): : 446 - 453
  • [4] Using cohorts to improve speaker identification
    Mashao, DJ
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XIII, PROCEEDINGS: INDUSTRIAL SYSTEMS, 2004, : 261 - 266
  • [5] Sparse Ensemble Machine Learning to Improve Robustness of Long-Term Decoding in iBMIs
    Shaikh, Shoeb
    So, Rosa
    Sibindi, Tafadzwa
    Libedinsky, Camilo
    Basu, Arindam
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2020, 28 (02) : 380 - 389
  • [6] Novel Long-term Information based Language Identification
    Xu, Jiaxin
    Wei, Yan
    Qiu, Feng
    Sun, Bo
    ENGINEERING SOLUTIONS FOR MANUFACTURING PROCESSES, PTS 1-3, 2013, 655-657 : 1805 - 1808
  • [7] SPEAKER IDENTIFICATION BY LONG-TERM SPECTRA UNDER NORMAL, STRESS, AND DISGUISE CONDITIONS
    HOLLIEN, H
    MAJEWSKI, W
    HOLLIEN, P
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1974, 55 : S20 - S20
  • [8] CROSS-CORRELATION OF LONG-TERM SPEECH SPECTRA AS A SPEAKER IDENTIFICATION TECHNIQUE
    ZALEWSKI, J
    MAJEWSKI, W
    HOLLIEN, H
    ACUSTICA, 1975, 34 (01): : 20 - 24
  • [9] SPEAKER IDENTIFICATION BY LONG-TERM SPECTRA UNDER NORMAL AND DISTORTED SPEECH CONDITIONS
    HOLLIEN, H
    MAJEWSKI, W
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1977, 62 (04): : 975 - 980
  • [10] LONG-TERM FEATURE AVERAGING FOR SPEAKER RECOGNITION
    MARKEL, JD
    OSHIKA, BT
    GRAY, AH
    IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1977, 25 (04): : 330 - 337