Learning Individual Styles of Conversational Gesture

Cited by: 182
Authors
Ginosar, Shiry [1]
Bar, Amir [2]
Kohavi, Gefen [1]
Chan, Caroline [3]
Owens, Andrew [1]
Malik, Jitendra [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Zebra Med Vis, Kibutz Shfayim, Israel
[3] MIT, Cambridge, MA 02139 USA
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
Keywords
SPEECH;
DOI
10.1109/CVPR.2019.00361
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human speech is often accompanied by hand and arm gestures. We present a method for cross-modal translation from "in-the-wild" monologue speech of a single speaker to their conversational gesture motion. We train on unlabeled videos for which we only have noisy pseudo ground truth from an automatic pose detection system. Our proposed model significantly outperforms baseline methods in a quantitative comparison. To support research toward obtaining a computational understanding of the relationship between gesture and speech, we release a large video dataset of person-specific gestures.
Pages: 3492-3501
Page count: 10