MMM-GCN: Multi-Level Multi-Modal Graph Convolution Network for Video-Based Person Identification

Cited by: 0

Authors:

Liao, Ziyan [1]

Di, Dening [1]

Hao, Jingsong [1]

Zhang, Jiang [1]

Zhu, Shulei [1]

Yin, Jun [1]

Affiliations:

[1] Dahua Technol Co Ltd, Hangzhou, Peoples R China

Source:

MULTIMEDIA MODELING, MMM 2023, PT I | 2023 / Vol. 13833

Keywords:

Person identification; Multi-modal; Multi biometrics; GCN; Feature fusion;

DOI:

10.1007/978-3-031-27077-2_1

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Video-based multi-modal person identification has recently attracted growing research interest as a way to address the inadequacies of single-modal identification in unconstrained scenes. Most existing methods model the video-level and multi-modal-level information of the target video separately, and therefore suffer from the separation of these levels and from the limited information contained in a single video. In this paper, we introduce additional neighbor-level information for the first time to enrich the information available for the target video. We then propose a Multi-Level (neighbor-level, multi-modal-level, and video-level) and Multi-Modal GCN model that captures correlations among the different levels and achieves adaptive fusion in a unified model. Experiments on the iQIYI-VID-2019 dataset show that MMM-GCN significantly outperforms current state-of-the-art methods, demonstrating its effectiveness. In addition, we point out that feature fusion is heavily polluted by noisy nodes, which leads to suboptimal results. Further improvements could be explored on this basis to approach the performance upper bound of our paradigm.


Pages: 3-15

Page count: 13
