ON USING SPECTRAL GRADIENT IN CONDITIONAL MAP CRITERION FOR ROBUST VOICE ACTIVITY DETECTION

Citations: 0
Authors
Choi, Jae-Hun [1]
Chang, Joon-Hyuk [1]
Affiliation
[1] Hanyang Univ, Sch Elect Engn, Seoul 133791, South Korea
Source
PROCEEDINGS OF THE 3RD IEEE INTERNATIONAL CONFERENCE ON NETWORK INFRASTRUCTURE AND DIGITAL CONTENT (IEEE IC-NIDC 2012) | 2012
Keywords
Voice activity detection; Spectral gradient; Conditional MAP; Likelihood ratio test
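DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract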
In this paper, we propose a novel approach to improve a statistical model-based voice activity detection (VAD) method, based on a modified conditional maximum a posteriori (MAP) criterion that incorporates a spectral gradient scheme. The proposed conditional MAP criterion takes into account not only the voice activity decision in the previous frame, as in Ref. [1], but also the spectral gradient of the observed spectra between the current frame and the past frames, so as to efficiently exploit the inter-frame correlation of voice activity. As a result, the proposed VAD uses six separate thresholds in the likelihood ratio test (LRT), adaptively selected according to both the previous VAD result and the estimated spectral gradient parameter. Experimental results demonstrate that the proposed approach outperforms the previous conditional MAP-based method.
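The threshold-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the six thresholds are organized, so the sketch assumes they arise from two previous decisions crossed with three gradient classes (rising, flat, falling). The threshold values, the energy-difference gradient measure, the `eps` tolerance, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical threshold table (assumption, not from the paper):
# 2 previous VAD decisions x 3 spectral-gradient classes = 6 thresholds.
# Key: (previous decision, gradient class) -> LRT threshold on the log-likelihood ratio.
THRESHOLDS = {
    (0, "rising"): 0.5,
    (0, "flat"): 1.0,
    (0, "falling"): 1.5,
    (1, "rising"): 0.2,
    (1, "flat"): 0.5,
    (1, "falling"): 0.8,
}


def gradient_class(curr_energy, past_energy, eps=0.1):
    """Classify a crude spectral gradient (assumed here: a frame-energy difference)."""
    slope = curr_energy - past_energy
    if slope > eps:
        return "rising"
    if slope < -eps:
        return "falling"
    return "flat"


def vad_decision(log_lr, prev_decision, grad):
    """Likelihood ratio test with a threshold conditioned on the previous
    decision and the gradient class. Returns 1 for speech, 0 for non-speech."""
    return 1 if log_lr > THRESHOLDS[(prev_decision, grad)] else 0


def run_vad(log_lrs, energies):
    """Apply the conditional test frame by frame, starting from non-speech."""
    decisions = []
    prev = 0
    prev_energy = energies[0]
    for log_lr, energy in zip(log_lrs, energies):
        grad = gradient_class(energy, prev_energy)
        prev = vad_decision(log_lr, prev, grad)
        decisions.append(prev)
        prev_energy = energy
    return decisions
```

In this sketch, a frame with the same log-likelihood ratio can be classified differently depending on context: a value of 0.7 is accepted as speech when the energy is rising after a non-speech frame, but would be rejected if the energy were flat or falling after a non-speech frame.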
Pages: 370-374
Page count: 5