Compressed domain speech enhancement based on Gaussian mixture model

Cited by: 0

Authors:

Liang, Yan [1]

Bao, Chang-Chun [1]

Xia, Bing-Yin [1]

He, Yu-Wen [1]

Zhou, Xuan [1]

Li, Na [1]

Affiliation:

[1] School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing 100124, China

Source:

Tien Tzu Hsueh Pao/Acta Electronica Sinica | 2012 / Vol. 40 / No. 10

Keywords:

Bayesian networks - Probability density function - Gaussian distribution - Mean square error - Signal to noise ratio

DOI:

10.3969/j.issn.0372-2112.2012.10.022

Abstract:

A Gaussian Mixture Model (GMM) based speech enhancement method in the compressed domain is proposed for the ITU-T G.722.2 wideband speech codec. It is designed to take full advantage of prior knowledge of the Immittance Spectral Frequencies (ISFs) of clean speech. First, a GMM is adopted to model the joint probability density of feature vectors composed of the ISFs of noisy speech, the ISFs of clean speech, and the corresponding gain scaling factor. Second, an optimal Bayesian estimate of the feature parameters derived from clean speech is obtained under the minimum mean square error (MMSE) criterion. For compatibility with the DTX (Discontinuous Transmission) mode, when a SID (Silence Insertion Descriptor) frame is received, the logarithmic energy is attenuated and the ISFs are left unchanged. Furthermore, if an erased frame is received, the bit stream is left unchanged and the proposed method is applied to the recovered parameters to update the memory. The evaluation is conducted according to ITU-T G.160. The results indicate that, compared with the reference method, the proposed method achieves a larger noise level reduction and better objective speech quality, while the SNR improvement remains acceptable.
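The MMSE step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar noisy feature y and a scalar clean feature x in place of the ISF vectors and gain factor, and the function names and dictionary keys (`weight`, `mean_y`, `mean_x`, `var_y`, `cov_xy`) are hypothetical. Under a joint GMM over (y, x), the MMSE estimate of x given y is the posterior-weighted sum of the per-component conditional means.

```python
import math


def gaussian_pdf(y, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_mmse_estimate(y, components):
    """Return E[x | y] under a joint GMM over the pair (y, x).

    Each component is a dict with hypothetical keys:
      weight  - mixture weight
      mean_y  - mean of the noisy feature
      mean_x  - mean of the clean feature
      var_y   - variance of the noisy feature
      cov_xy  - covariance between clean and noisy features
    """
    # Unnormalized posterior of each component given the observation y.
    scores = [c["weight"] * gaussian_pdf(y, c["mean_y"], c["var_y"])
              for c in components]
    total = sum(scores)
    if total == 0.0:
        raise ValueError("observation has zero likelihood under the model")

    estimate = 0.0
    for c, score in zip(components, scores):
        # Conditional mean of x given y within this Gaussian component.
        cond_mean = c["mean_x"] + c["cov_xy"] / c["var_y"] * (y - c["mean_y"])
        # Weight it by the component's posterior probability.
        estimate += (score / total) * cond_mean
    return estimate
```

With vector-valued features the same formula applies, with the division by `var_y` replaced by multiplication with the inverse of the noisy-feature covariance matrix.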

Pages: 2031 - 2038

50 items in total

[1] Speech enhancement based on speech spectral complex Gaussian Mixture Model
Ding, GH
Wang, X
Cao, Y
Ding, F
Tang, YZ
2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 165 - 168
[2] Subspace Based Speech Enhancement Using Gaussian Mixture Model
Kundu, Achintya
Chatterjee, Saikat
Sreenivas, T. V.
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 395 - 398
[3] Speech wideband extension based on Gaussian mixture model
ZHANG Yong HU Ruimin (National Engineering Research Center for Multimedia software
Chinese Journal of Acoustics, 2009, 28 (04) : 362 - 377
[4] SPEECH ENHANCEMENT USING A MODULATION DOMAIN KALMAN FILTER POST-PROCESSOR WITH A GAUSSIAN MIXTURE NOISE MODEL
Wang, Yu
Brookes, Mike
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[5] Speech wideband extension based on Gaussian mixture model
Zhang, Yong
Hu, Ruimin
Shengxue Xuebao/Acta Acustica, 2009, 34 (05) : 471 - 480
[6] Gaussian mixture model-based contrast enhancement
Abdoli, Mohsen
Sarikhani, Hossein
Ghanbari, Mohammad
Brault, Patrice
IET IMAGE PROCESSING, 2015, 9 (07) : 569 - 577
[7] Compressed Domain Speech Enhancement based on the Joint Modification of Codebook Gains
Xia, Bing-yin
Bao, Chang-chun
Liang, Yan
Zhou, Xuan
He, Yu-wen
Li, Ru-wei
2011 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2011, : 207 - 211
[8] Quality enhancement of CELP coded speech by using a voicing gaussian mixture model
Raza, DG
Chan, CF
2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 452 - 455
[9] The Research of Speech Emotion Recognition Based on Gaussian Mixture Model
Zhang, Wanli
Li, Guoxin
Gao, Wei
MECHANICAL COMPONENTS AND CONTROL ENGINEERING III, 2014, 668-669 : 1126 - +
[10] Speech Enhancement Using Gaussian Scale Mixture Models
Hao, Jiucang
Lee, Te-Won
Sejnowski, Terrence J.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06) : 1127 - 1136
