Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech

Cited by: 11

Authors:

Miki, Madoka ^{[1]}

Kitaoka, Norihide ^{[1]}

Miyajima, Chiyomi ^{[1]}

Nishino, Takanori ^{[2]}

Takeda, Kazuya ^{[1]}

Affiliations:

[1] Nagoya Univ, Dept Med Sci, Nagoya, Aichi 4648603, Japan

[2] Mie Univ, Dept Informat Engn, Tsu, Mie 5148507, Japan

Source:

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING | 2014

Keywords:

Speech Recognition; Recognition Performance; Gesture Recognition; Prosodic Feature; Candidate Pair;

DOI:

10.1186/1687-4722-2014-2

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

We propose an integrative method for recognizing gestures, such as pointing, together with the speech that accompanies them. Speech produced simultaneously with gestures can assist in the recognition of those gestures, and because the relationship is complementary, gestures can also assist in the recognition of speech. Our integrative recognition method uses a probability distribution of the time interval between the starting times of gestures and of the corresponding utterances. We evaluate the rate of improvement achieved by the proposed integrative recognition method on a task that involves solving a geometry problem.
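To make the idea in the abstract concrete, the following is a minimal sketch of how a time-interval distribution could be used to rescore gesture and speech candidate pairs. It is not the authors' implementation: the abstract does not state the form of the distribution or how scores are combined, so the Gaussian model, the log-linear combination, the function names `log_gaussian` and `best_pair`, and the values of `mean`, `std`, and `weight` are all assumptions made for illustration.

```python
import math
from itertools import product


def log_gaussian(x, mean, std):
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def best_pair(gesture_hyps, speech_hyps, mean=0.0, std=0.5, weight=1.0):
    """Return the labels of the (gesture, speech) pair with the highest combined score.

    Each hypothesis is a tuple (label, log_score, start_time_in_seconds).
    ASSUMPTION: the interval between start times is modeled as Gaussian;
    mean, std, and weight are hypothetical values, not taken from the paper.
    """
    def combined(pair):
        (_, g_score, g_start), (_, s_score, s_start) = pair
        interval = s_start - g_start  # speech onset relative to gesture onset
        return g_score + s_score + weight * log_gaussian(interval, mean, std)

    gesture, speech = max(product(gesture_hyps, speech_hyps), key=combined)
    return gesture[0], speech[0]


# Hypothetical candidates: "circle" has the better recognizer score on its own,
# but starts 2.8 s after the utterance, so the timing term favors "point".
gestures = [("point", -1.0, 0.0), ("circle", -0.9, 3.0)]
utterances = [("this angle", -1.0, 0.2)]
print(best_pair(gestures, utterances))
```

In this sketch, a gesture hypothesis that scores slightly higher in isolation can still lose to one whose start time lies closer to the utterance, which is the kind of mutual correction between modalities that the abstract describes.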


Pages: 7

