Using Approximate Entropy as a Speech Quality Measure for a Speaker Recognition System

Cited by: 0

Authors:

Metzger, Richard A. [1]

Doherty, John F. [1]

Jenkins, David M. [2]

Affiliations:

[1] Penn State Univ, Dept Elect Engn, University Pk, PA 16802 USA

[2] Appl Res Lab, University Pk, PA 16802 USA

Source:

2016 ANNUAL CONFERENCE ON INFORMATION SCIENCE AND SYSTEMS (CISS) | 2016

Keywords:

Approximate Entropy; Speaker Recognition; Voice Activity Detection; Speech Activity Detection; VOICE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

In this paper, we show that Approximate Entropy (ApEn) can be used to detect high-quality speech frames in an otherwise distorted speech signal. By exploiting the quasi-periodicity of speech, ApEn detects small aberrations in speech frames that would otherwise degrade the performance of an automatic speaker recognition (ASR) system. In addition, we obtain the statistics of ApEn values representative of clean speech and propose threshold bounds that maximize recognition rates. Compared with other popular voice activity detector (VAD) algorithms, our simulation results show that ApEn outperforms them in discerning clean speech from noisy speech. This ability to detect clean speech allows a speaker recognition system to reach a recognition rate close to 87%, nearly matching the performance of the system when noise is not present.
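The abstract describes scoring speech frames by ApEn and keeping those whose value falls inside threshold bounds. A minimal sketch of that idea follows, using the standard ApEn definition. The embedding dimension `m`, the tolerance `r_factor`, the frame length, the helper name `select_frames`, and the `lower`/`upper` bounds are all illustrative assumptions; this record does not give the parameter values or bounds used by the authors.

```python
import numpy as np


def approximate_entropy(x, m=2, r_factor=0.2):
    """Standard approximate entropy of a 1-D sequence.

    m and r_factor are common illustrative defaults, not values from the paper.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * np.std(x)  # tolerance scaled by the frame's standard deviation

    def phi(dim):
        # All overlapping templates of length `dim`, one per row.
        count = n - dim + 1
        templates = np.array([x[i:i + dim] for i in range(count)])
        # Chebyshev distance between every pair of templates.
        diff = np.abs(templates[:, None, :] - templates[None, :, :])
        dist = diff.max(axis=2)
        # Fraction of templates within r of each template (self-match included,
        # so the fraction is never zero and the logarithm is always defined).
        c = np.mean(dist <= r, axis=1)
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)


def select_frames(signal, frame_len, lower, upper):
    """Indices of non-overlapping frames whose ApEn lies in [lower, upper].

    The bounds are hypothetical inputs; the paper's bounds are not stated here.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = signal.size // frame_len
    keep = []
    for k in range(n_frames):
        frame = signal[k * frame_len:(k + 1) * frame_len]
        if lower <= approximate_entropy(frame) <= upper:
            keep.append(k)
    return keep
```

A strictly regular frame scores low and an irregular one scores high, so a sinusoidal frame yields a smaller ApEn than a white-noise frame of the same length. The pairwise distance matrix costs memory quadratic in the frame length, which is acceptable for short frames but not for whole recordings.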


Pages: 6

References (20 total):

[1] Voice Activity Detection Using Entropy in Spectrum Domain [J].

Asgari, Meysam ;

Sayadian, Abolghasem ;

Farhadloo, Mohsen ;

Mehrizi, Elahe Abouie .

ATNAC: 2008 AUSTRALASIAN TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE, 2008, :407-+

[2] ITU-T recommendation G.729 Annex B: A silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications [J].

Benyassine, A ;

Shlomot, E ;

Su, HY ;

Massaloux, D ;

Lamblin, C ;

Petit, JP .

IEEE COMMUNICATIONS MAGAZINE, 1997, 35 (09) :64-73

[3] Childers D.G., 1999, Speech Processing, V1st

[4] Fu L, 2009, IEEE POW ENER SOC GE, P1062

[5] Godfrey E.H. John., 1993, Switchboard-1 Release 2 LDC97S62

[6] Hautamaki V., 2007, P 12 INT C SPEECH CO, V2, P645

[7] The effect of time delay on Approximate & Sample Entropy calculations [J].

Kaffashi, Farhad ;

Foglyano, Ryan ;

Wilson, Christopher G. ;

Loparo, Kenneth A. .

PHYSICA D-NONLINEAR PHENOMENA, 2008, 237 (23) :3069-3074

[8] An overview of text-independent speaker recognition: From features to supervectors [J].

Kinnunen, Tomi ;

Li, Haizhou .

SPEECH COMMUNICATION, 2010, 52 (01) :12-40

[9] Transformation and Decomposition of the Speech Signal for Coding [J].

Kleijn, W. Bastiaan ;

Haagen, Jesper .

IEEE SIGNAL PROCESSING LETTERS, 1994, 1 (09) :136-138

[10] Automatic selection of the threshold value r for approximate entropy [J].

Lu, Sheng ;

Chen, Xinnian ;

Kanters, Jorgen K. ;

Solomon, Irene C. ;

Chon, Ki H. .

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2008, 55 (08) :1966-1972
