Towards Robust Human-Robot Collaborative Manufacturing: Multimodal Fusion

Cited by: 60
Authors
Liu, Hongyi [1]
Fang, Tongtong [2]
Zhou, Tianyu [2]
Wang, Lihui [1]
Affiliations
[1] KTH Royal Inst Technol, Dept Prod Engn, SE-10044 Stockholm, Sweden
[2] KTH Royal Inst Technol, Dept Software & Comp Syst, SE-10044 Stockholm, Sweden
Funding
EU Horizon 2020;
Keywords
Deep learning; human-robot collaboration; multimodal fusion; intelligent manufacturing systems; NEURAL-NETWORKS; RECOGNITION; INTERFACE;
DOI
10.1109/ACCESS.2018.2884793
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Intuitive and robust multimodal robot control is key to human-robot collaboration (HRC) in manufacturing systems. Multimodal robot control methods introduced in previous studies allow human operators to control a robot intuitively without programming brand-specific code. However, most of these methods are unreliable because feature representations are not shared across modalities. To address this problem, this paper proposes a deep learning-based multimodal fusion architecture for robust multimodal HRC manufacturing systems. The proposed architecture covers three modalities: speech command, hand motion, and body motion. Three unimodal models are first trained to extract features, which are then fused for representation sharing. Experiments show that the proposed multimodal fusion model outperforms the three unimodal models. The results indicate great potential for applying the proposed multimodal fusion architecture to robust HRC manufacturing systems.
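The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the unimodal features are combined, so everything here is an assumption made for illustration: the features are fused by simple concatenation, the classifier is a single dense layer with softmax, the weights are random and untrained, and the feature sizes, the number of commands, and the function names are hypothetical.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(speech_feat, hand_feat, body_feat, weights, bias):
    """Concatenate the three unimodal feature vectors, then classify.

    Concatenation is an assumed fusion operator, not one confirmed
    by the abstract.
    """
    fused = speech_feat + hand_feat + body_feat  # list concatenation
    return softmax(linear(fused, weights, bias))


# Hypothetical feature vectors standing in for the outputs of the three
# pre-trained unimodal models (speech command, hand motion, body motion).
rng = random.Random(0)
speech_feat = [rng.random() for _ in range(4)]
hand_feat = [rng.random() for _ in range(3)]
body_feat = [rng.random() for _ in range(5)]

num_commands = 6  # hypothetical number of robot commands
fused_dim = len(speech_feat) + len(hand_feat) + len(body_feat)

# Untrained random weights; a real system would learn these from data.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(fused_dim)]
           for _ in range(num_commands)]
bias = [0.0] * num_commands

probs = fuse_and_classify(speech_feat, hand_feat, body_feat, weights, bias)
predicted = max(range(num_commands), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

Because the fused layer sees all three feature vectors at once, it can in principle learn cross-modal cues that a single unimodal classifier cannot, which is the intuition behind the representation sharing the abstract mentions.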
Pages: 74762-74771
Number of pages: 10