Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech

Cited by: 11
Authors
Miki, Madoka [1]
Kitaoka, Norihide [1]
Miyajima, Chiyomi [1]
Nishino, Takanori [2]
Takeda, Kazuya [1]
Affiliations
[1] Nagoya Univ, Dept Med Sci, Nagoya, Aichi 4648603, Japan
[2] Mie Univ, Dept Informat Engn, Tsu, Mie 5148507, Japan
Source
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2014
Keywords
Speech Recognition; Recognition Performance; Gesture Recognition; Prosodic Feature; Candidate Pair
DOI
10.1186/1687-4722-2014-2
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We propose an integrative method for recognizing gestures, such as pointing, that accompany speech. Speech produced simultaneously with gestures can assist in gesture recognition and, because the relationship is complementary, gestures can also assist in speech recognition. Our integrative recognition method uses a probability distribution over the time interval between the starting times of gestures and of the corresponding utterances. We evaluate the rate of improvement achieved by the proposed integrative recognition method on a task involving the solution of a geometry problem.
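The idea of scoring gesture and utterance candidates jointly with a time-interval distribution can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian form of the interval distribution, its mean and standard deviation, the weight, and all names (Candidate, interval_log_prob, best_pair) are assumptions made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str         # recognized gesture or utterance hypothesis
    log_score: float   # recognizer log-score (hypothetical scale)
    start_time: float  # start time in seconds


def interval_log_prob(dt, mean=0.0, std=0.5):
    """Log density of the start-time interval dt.

    A Gaussian is assumed here purely for illustration; the abstract
    only states that a probability distribution over the interval is used.
    """
    return -0.5 * math.log(2 * math.pi * std ** 2) - (dt - mean) ** 2 / (2 * std ** 2)


def best_pair(gestures, utterances, weight=1.0):
    """Return (score, gesture, utterance) for the highest-scoring pair.

    Each pair's score is the sum of both recognizer log-scores plus the
    weighted log-probability of the interval between their start times.
    Returns None if either candidate list is empty.
    """
    best = None
    for g in gestures:
        for u in utterances:
            dt = u.start_time - g.start_time
            score = g.log_score + u.log_score + weight * interval_log_prob(dt)
            if best is None or score > best[0]:
                best = (score, g, u)
    return best
```

Under these assumptions, a gesture candidate with a slightly lower recognizer score can still be selected when its start time lies much closer to the start of the utterance, which is the complementary effect the abstract describes.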
Pages: 7
Related papers
50 in total
  • [1] Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech
    Madoka Miki
    Norihide Kitaoka
    Chiyomi Miyajima
    Takanori Nishino
    Kazuya Takeda
    EURASIP Journal on Audio, Speech, and Music Processing, 2014
  • [2] Speech ReaLLM - Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time
    Seide, Frank
    Doulaty, Morrie
    Shi, Yangyang
    Gaur, Yashesh
    Jia, Junteng
    Wu, Chunyang
    INTERSPEECH 2024, 2024, : 1900 - 1904
  • [3] From a Wizard of Oz experiment to a real time speech and gesture multimodal interface
    Carbini, S.
    Delphin-Poulat, L.
    Perron, L.
    Viallet, J. E.
    SIGNAL PROCESSING, 2006, 86 (12) : 3559 - 3577
  • [4] Operator-Friendly UAV Control System with HMI Using Speech and Gesture Recognition
    Lee, Yerang
    Choi, Dahui
    Kim, Sangho
    PROCEEDINGS OF THE 2021 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY (APISAT 2021), VOL 2, 2023, 913 : 1035 - 1048
  • [5] A Multimodal Communication Aid for Persons with Cerebral Palsy Using Head Movement and Speech Recognition
    Ikeda, Tomoka
    Hirokawa, Masakazu
    Suzuki, Kenji
    COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS, ICCHP 2020, PT II, 2020, 12377 : 429 - 436
  • [6] Improvement of Speech Recognition for Robots Using Blind Signal Separation
    Bicher, Daniel
    Kroll-Peters, Olaf
    Lee, Thebin
    Tiotuico, Natascha
    Wilhelm, Mathias
    ISCGAV'08: PROCEEDINGS OF THE 8TH WSEAS INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMPUTATIONAL GEOMETRY AND ARTIFICIAL VISION, 2008, : 52 - 55
  • [7] Improvement of the speech recognition in noisy environments using a nonparametric regression
    Amrouche, A.
    Taleb-Ahmed, A.
    Rouvaen, J. M.
    Yagoub, M. C. E.
    INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2009, 24 (01) : 49 - 67
  • [8] Using prosody in fixed stress languages for improvement of speech recognition
    Szaszak, Gyoergy
    Vicsi, Klara
    VERBAL AND NONVERBAL COMMUNICATION BEHAVIOURS, 2007, 4775 : 138 - +
  • [9] Performance Prediction of Speech Recognition Using Average-Voice-Based Speech Synthesis
    Saito, Tatsuhiko
    Nose, Takashi
    Kobayashi, Takao
    Okato, Yohei
    Horii, Akio
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1964 - +
  • [10] Speech recognition using Dynamic Time Warping (DTW)
    Permanasari, Yurika
    Harahap, Erwin H.
    Ali, Erwin Prayoga
    2ND INTERNATIONAL CONFERENCE ON APPLIED & INDUSTRIAL MATHEMATICS AND STATISTICS, 2019, 1366