FUSION OF STANDARD AND ALTERNATIVE ACOUSTIC SENSORS FOR ROBUST AUTOMATIC SPEECH RECOGNITION

Cited by: 0

Authors:

Heracleous, Panikos [1]

Even, Jani [1]

Ishi, Carlos T. [1]

Miyashita, Takahiro [1]

Hagita, Norihiro [1]

Affiliation:

[1] ATR, Intelligent Robot & Commun Labs, Tokyo, Japan

Source:

2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012

Keywords:

Alternative sensors; ear bone microphone; throat microphone; fusion; robust speech recognition;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper addresses the problem of environmental noise in human-human communication and in automatic speech recognition. To deal with this problem, it investigates alternative acoustic sensors, which are attached to the talker and receive the uttered speech through skin or bone. In the current study, throat microphones and ear bone microphones are integrated with standard microphones using several fusion methods. The results show that recognition rates in noisy environments increase drastically when these sensors are integrated with standard microphones. Moreover, the system shows no recognition degradation in clean environments; in fact, recognition rates also increase slightly there. Using late fusion to integrate a throat microphone, an ear bone microphone, and a standard microphone, we achieved a 44% relative improvement in recognition rate in a noisy environment and a 24% relative improvement in recognition rate in a clean environment.
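The abstract does not say how the late fusion is computed. As an illustration only, the sketch below shows one common form of late (decision-level) fusion: each sensor stream is scored separately, and the per-hypothesis log-likelihoods are combined with a weighted sum. The function name `late_fusion`, the stream weights, and the scores are all hypothetical and are not taken from the paper.

```python
def late_fusion(stream_scores, weights):
    """Combine per-hypothesis log-likelihoods from several streams.

    stream_scores: list of dicts mapping hypothesis -> log-likelihood,
                   one dict per sensor stream (assumed to share keys).
    weights:       one weight per stream (hypothetical values).
    Returns the best hypothesis and the fused scores.
    """
    hypotheses = stream_scores[0].keys()
    fused = {
        h: sum(w * scores[h] for w, scores in zip(weights, stream_scores))
        for h in hypotheses
    }
    best = max(fused, key=fused.get)
    return best, fused


# Made-up scores for a noisy utterance: the standard microphone alone
# prefers "no", while the body-attached sensors prefer "yes".
standard = {"yes": -12.0, "no": -10.0}
throat = {"yes": -8.0, "no": -11.0}
ear_bone = {"yes": -9.0, "no": -10.5}

best, fused = late_fusion([standard, throat, ear_bone], [0.2, 0.4, 0.4])
print(best, fused)
```

With these made-up numbers the fused decision follows the throat and ear bone streams rather than the standard microphone. In practice the weights would have to be tuned, for example on held-out data for each noise condition.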


Pages: 4837-4840

Page count: 4
