Computational auditory scene analysis and its application to robot audition

Cited by: 0
Authors
Okuno, Hiroshi G. [1 ]
Nakadai, Kazuhiro [2 ]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
[2] Honda Res Inst Japan Co Ltd, Saitama 3510188, Japan
Source
2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008
Keywords
robot audition; computational auditory scene analysis; missing feature theory; simultaneous speakers
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The capability of a robot to hear sounds, in particular a mixture of sounds, with its own microphones, that is, robot audition, is important for improving human-robot interaction. This paper presents HARK (HRI-JP Audition for Robots with Kyoto University), open-source robot audition software that provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Because separated sounds suffer from spectral distortion caused by the separation process, HARK generates a time-spectral map of reliability, called a "missing feature mask", for the features of each separated sound. The separated sounds are then recognized by automatic speech recognition (ASR) based on the Missing-Feature Theory (MFT), using these masks. HARK is implemented on the FlowDesigner middleware, which lets the modules share intermediate audio data and enables near-real-time processing.
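As an informal illustration of the "missing feature mask" idea described in the abstract, the sketch below builds a binary time-spectral reliability map by thresholding the per-cell SNR between a separated source and an estimated interference (leakage) spectrum. This is a generic reconstruction of the concept only, not the actual HARK or FlowDesigner API; the function name, the SNR-threshold rule, and the 3 dB default are assumptions.

```python
import numpy as np

def missing_feature_mask(separated_power, interference_power, threshold_db=3.0):
    """Binary time-spectral reliability map for one separated source.

    A time-frequency cell is marked reliable (1.0) when the separated
    source's power exceeds the estimated interference/leakage power by
    `threshold_db`; otherwise it is marked unreliable (0.0) and would be
    ignored or down-weighted by an MFT-based recognizer.
    Hypothetical illustration, not the HARK implementation.
    """
    snr_db = 10.0 * np.log10(np.maximum(separated_power, 1e-12) /
                             np.maximum(interference_power, 1e-12))
    return (snr_db > threshold_db).astype(np.float32)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames, bins = 100, 257                     # e.g. STFT frames x frequency bins
    separated = rng.random((frames, bins))      # power spectrum of a separated source
    leakage = 0.2 * rng.random((frames, bins))  # estimated residual interference power
    mask = missing_feature_mask(separated, leakage)
    print("fraction of reliable cells:", float(mask.mean()))
```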
Pages: 125 / +
Page count: 2