Using Long-Term Information to Improve Robustness in Speaker Identification

Cited by: 0

Authors:

Lyons, James G. [1]

O'Connell, James G. [1]

Paliwal, Kuldip K. [1]

Affiliations:

[1] Griffith Univ, Griffith Sch Engn, Signal Proc Lab, Brisbane, Qld 4111, Australia

Source:

2010 4TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS) | 2010

Keywords:

Feature averaging; analysis window duration; long window; speaker recognition; automatic speaker identification;

DOI:

Not available

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this paper we propose two new methods of improving the robustness of Automatic Speaker Identification systems. These methods rely on using long-term information in the speech signal to improve the robustness of the features. The first method involves averaging filterbank parameters from consecutive short-time frames over a longer window. The second method investigates the use of frame lengths longer than the duration over which speech is generally assumed to be stationary. We show that these two methods result in an improvement over standard Mel Frequency Cepstral Coefficients in the presence of additive white Gaussian noise in speaker identification applications. Furthermore, additional improvements are observed at mid-range SNR when the proposed methods are used in combination.
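The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no frame lengths, window spans, or filterbank details, so the helper names `frame_signal` and `average_over_window` and all numeric values below are hypothetical, and the filterbank step itself is left out (any per-frame filterbank energies could be passed in as `features`).

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row.

    Assumes len(signal) >= frame_len. Choosing a larger frame_len
    corresponds to the "longer analysis frame" idea in the abstract.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def average_over_window(features, span):
    """Average each feature dimension over `span` consecutive frames.

    `features` has shape (n_frames, n_dims), e.g. filterbank energies
    computed from short-time frames. Only positions where the full span
    fits are kept, so the output has n_frames - span + 1 rows.
    """
    kernel = np.ones(span) / span
    columns = [np.convolve(col, kernel, mode="valid") for col in features.T]
    return np.stack(columns, axis=1)


if __name__ == "__main__":
    # Hypothetical values: 1 s of noise at 8 kHz, 20 ms frames, 10 ms hop.
    rng = np.random.default_rng(0)
    signal = rng.standard_normal(8000)
    frames = frame_signal(signal, frame_len=160, hop=80)
    # Stand-in for filterbank energies: power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    smoothed = average_over_window(power, span=5)
    print(frames.shape, power.shape, smoothed.shape)
```

Averaging is applied here to spectral energies before any log or cepstral transform, which matches the abstract's wording that filterbank parameters are averaged. The exact point in the feature pipeline where the paper applies the average is an assumption.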


Pages: 4

