Computational auditory scene analysis and its application to robot audition

Cited by: 0

Authors:

Okuno, Hiroshi G. [1]

Nakadai, Kazuhiro [2]

Affiliations:

[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Kyoto 6068501, Japan

[2] Honda Res Inst Japan Co Ltd, Saitama 3510188, Japan

Source:

2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008

Keywords:

robot audition; computational auditory scene analysis; missing feature theory; simultaneous speakers

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Robot audition, the ability of a robot to hear sounds, and in particular mixtures of sounds, with its own microphones, is important for improving human-robot interaction. This paper presents open-source robot audition software called "HARK" (HRI-JP Audition for Robots with Kyoto University), which provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Because separation introduces spectral distortion into the separated sounds, HARK generates a time-spectral map of reliability, called a "missing feature mask", for the features of each separated sound. The separated sounds are then recognized by automatic speech recognition (ASR) based on Missing-Feature Theory (MFT), using these missing feature masks. HARK is implemented on middleware called "FlowDesigner" so that intermediate audio data can be shared, which enables near real-time processing.
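The abstract does not say how HARK computes its missing feature masks or how the recognizer consumes them, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a hypothetical reliability score (the energy of the separated source divided by the sum of that energy and the estimated leakage energy), a hard threshold, and a diagonal-Gaussian acoustic score that skips unreliable features (marginalization). The function names, the score, and the threshold are illustrative and are not taken from HARK.

```python
import math


def missing_feature_mask(separated_energy, leak_energy, threshold=0.5):
    """Build a binary time-spectral mask: 1 = reliable, 0 = unreliable.

    Both inputs are lists of frames, each frame a list of per-band energies.
    The reliability score s / (s + l) is an assumption made for illustration.
    """
    mask = []
    for sep_frame, leak_frame in zip(separated_energy, leak_energy):
        row = []
        for s, l in zip(sep_frame, leak_frame):
            total = s + l
            reliability = s / total if total > 0 else 0.0
            row.append(1 if reliability >= threshold else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(features, mask_row, means, variances):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Features whose mask entry is 0 are skipped, so a distorted band
    cannot dominate the acoustic score.
    """
    total = 0.0
    for x, m, mu, var in zip(features, mask_row, means, variances):
        if m:
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total


if __name__ == "__main__":
    separated = [[4.0, 1.0], [0.0, 3.0]]
    leak = [[1.0, 3.0], [0.0, 1.0]]
    mask = missing_feature_mask(separated, leak)
    print(mask)
    # The second feature is badly distorted but masked out, so it is ignored.
    print(masked_log_likelihood([0.0, 100.0], mask[0], [0.0, 0.0], [1.0, 1.0]))
```

A hard 0/1 mask keeps the sketch short; a soft mask with values between 0 and 1 would weight each feature's contribution instead of dropping it.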

Pages: 125 / +

Page count: 2

50 items in total

[1] Computational auditory scene analysis and its application to robot audition
Okuno, HG
Ogata, T
Komatani, K
Nakadai, K
INTERNATIONAL CONFERENCE ON INFORMATICS RESEARCH FOR DEVELOPMENT OF KNOWLEDGE SOCIETY INFRASTRUCTURE, PROCEEDINGS, 2004: 73 - 80
[2] Robot Audition and Computational Auditory Scene Analysis
Nakadai, Kazuhiro
Okuno, Hiroshi G.
ADVANCED INTELLIGENT SYSTEMS, 2020, 2 (09)
[3] Computational auditory scene analysis and its application to robot audition: Five years experience
Okuno, Hiroshi G.
Ogata, Tetsuya
Komatani, Kazunori
ICKS 2007: SECOND INTERNATIONAL CONFERENCE ON INFORMATICS RESEARCH FOR DEVELOPMENT OF KNOWLEDGE SOCIETY INFRASTRUCTURE, PROCEEDINGS, 2007: 69 - +
[4] Robot audition from the viewpoint of computational auditory scene analysis
Okuno, Hiroshi G.
Ogata, Tetsuya
Komatani, Kazunori
INTERNATIONAL CONFERENCE ON INFORMATICS EDUCATION AND RESEARCH FOR KNOWLEDGE-CIRCULATING SOCIETY, PROCEEDINGS, 2008: 35 - 40
[5] COMPUTATIONAL AUDITORY SCENE ANALYSIS
BROWN, GJ
COOKE, M
COMPUTER SPEECH AND LANGUAGE, 1994, 8 (04): 297 - 336
[6] DEVELOPMENT OF ZONAL BEAMFORMER AND ITS APPLICATION TO ROBOT AUDITION
Tanaka, Nobuaki
Ogawa, Tetsuji
Akagiri, Kenzo
Kobayashi, Tetsunori
18TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2010), 2010: 1529 - 1533
[7] A blackboard architecture for computational auditory scene analysis
Godsmark, D
Brown, GJ
SPEECH COMMUNICATION, 1999, 27 (3-4): 351 - 366
[8] Computational Models of Auditory Scene Analysis: A Review
Szabo, Beata T.
Denham, Susan L.
Winkler, Istvan
FRONTIERS IN NEUROSCIENCE, 2016, 10
[9] Sound ontology for computational auditory scene analysis
Nakatani, T
Okuno, HG
FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, 1998: 1004 - 1010
[10] Blackboard architecture for computational auditory scene analysis
Godsmark, Darryl
Brown, Guy J.
Speech Communication, 1999, 27 (03): 351 - 366