Computational auditory scene analysis and its application to robot audition

Cited by: 0
Authors
Okuno, Hiroshi G. [1]
Nakadai, Kazuhiro [2]
Affiliations
[1] Kyoto Univ, Grad Sch Informat, Sakyo Ku, Kyoto 6068501, Japan
[2] Honda Res Inst Japan Co Ltd, Saitama 3510188, Japan
Source
2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS | 2008
Keywords
robot audition; computational auditory scene analysis; missing feature theory; simultaneous speakers
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Robot audition, the ability of a robot to hear sounds, and in particular mixtures of sounds, with its own microphones, is important for improving human-robot interaction. This paper presents open-source robot audition software called "HARK" (HRI-JP Audition for Robots with Kyoto University), which provides the primitive functions of computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Because separation introduces spectral distortion into the separated sounds, HARK generates a time-spectral map of reliability, called a "missing feature mask", for the features of those sounds. The separated sounds are then recognized by automatic speech recognition (ASR) based on Missing-Feature Theory (MFT), using the missing feature masks. HARK is implemented on middleware called "FlowDesigner", which lets processing modules share intermediate audio data and thereby enables near real-time processing.
Pages: 125 / +
Page count: 2
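The abstract's idea of a missing feature mask, a per-bin reliability map that the recognizer uses to discount distorted features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the reliability rule, the threshold, the function names, and the diagonal-Gaussian scoring are hypothetical and are not taken from HARK or from the paper. The sketch builds a hard 0/1 mask from a separated spectrum and an estimate of the distortion energy, then scores a feature frame while ignoring the bins marked unreliable, which is a simplified form of the marginalization used in missing-feature recognition.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = time frames, columns = frequency bins


def missing_feature_mask(separated: Matrix, distortion: Matrix,
                         threshold: float = 0.5) -> List[List[int]]:
    """Return a hard 0/1 reliability mask for each time-frequency bin.

    Hypothetical rule (an assumption, not HARK's documented formula):
    reliability = separated / (separated + distortion), and a bin is
    marked reliable (1) when its reliability exceeds `threshold`.
    """
    mask = []
    for sep_frame, dist_frame in zip(separated, distortion):
        row = []
        for s, d in zip(sep_frame, dist_frame):
            total = s + d
            reliability = s / total if total > 0 else 0.0
            row.append(1 if reliability > threshold else 0)
        mask.append(row)
    return mask


def masked_log_likelihood(frame: List[float], mask_row: List[int],
                          mean: List[float], var: List[float]) -> float:
    """Diagonal-Gaussian log-likelihood summed over reliable bins only.

    Bins with mask 0 are skipped, so a distorted feature cannot
    penalize the score (hard-mask marginalization, simplified).
    """
    total = 0.0
    for x, m, mu, v in zip(frame, mask_row, mean, var):
        if m == 1:
            total += -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
    return total
```

With this rule, a bin dominated by distortion energy receives mask 0 and contributes nothing to the score, so an outlier in that bin does not pull the recognizer away from the correct model. A soft mask (a reliability value between 0 and 1 used as a weight) would be a natural variant of the same idea.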