On Real Time Q-Log-based Feature Normalization for Distant Speech Recognition

Cited by: 0

Authors:

Zilvan, Vicky [1]

Ni'mah, Iftitahu [1]

Yuliani, Asri R. [1]

Pardede, Hilman F. [1]

Affiliation:

[1] Indonesian Inst Sci LIPI, Res Ctr Informat, Bandung, Indonesia

Source:

PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI) | 2016

Keywords:

distant speech recognition; real-time feature normalization; q-logarithm; non-extensive statistics; reverberation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

Computing the long-term mean in feature normalization methods requires information from future frames, which makes these methods inapplicable to real-time implementations. Previously, q-log spectral mean normalization (q-LSMN) was proposed as a feature normalization method, and it was shown to be more effective than conventional normalization methods. However, q-LSMN has not yet been implemented in real time. In this paper, we propose a real-time implementation of q-LSMN. In this method, the mean is updated recursively based only on previous frames, so no future frame information is needed. Experiments on the Aurora-5 database showed that, while real-time q-LSMN performed slightly worse than non-real-time q-LSMN, as expected, it improved recognition accuracy by up to 54.22% compared with non-real-time conventional normalization methods such as cepstral mean normalization (CMN) and log spectral mean normalization (LSMN).
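The recursive, causal mean update described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact update rule, which the abstract does not give. It assumes the standard Tsallis q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), and swaps in a simple exponential moving average as the recursion. The function names, the smoothing factor `alpha`, the value of `q`, and the initialisation with the first frame are all hypothetical choices made for illustration.

```python
import numpy as np


def q_log(x, q):
    """Tsallis q-logarithm; reduces to the natural log when q == 1."""
    x = np.asarray(x, dtype=float)
    if abs(q - 1.0) < 1e-12:
        return np.log(x)
    return (np.power(x, 1.0 - q) - 1.0) / (1.0 - q)


def causal_q_log_mean_norm(spectra, q=0.9, alpha=0.95):
    """Subtract a running mean in the q-log domain, frame by frame.

    spectra: array of shape (frames, bins) with strictly positive values.
    q, alpha: hypothetical settings, not taken from the paper.
    The mean at frame t depends only on frames 0..t, never on later ones.
    """
    feats = q_log(spectra, q)
    out = np.empty_like(feats)
    mean = feats[0].copy()  # assumption: start the running mean at frame 0
    for t, frame in enumerate(feats):
        if t > 0:
            # Exponential moving average, an assumed form of the recursion.
            mean = alpha * mean + (1.0 - alpha) * frame
        out[t] = frame - mean
    return out
```

Because the running mean never looks ahead, changing later frames leaves the normalized output of earlier frames unchanged, which is the property that makes such a scheme usable in real time.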

Pages: 5
