MMM-GCN: Multi-Level Multi-Modal Graph Convolution Network for Video-Based Person Identification

Cited by: 0
Authors
Liao, Ziyan [1]
Di, Dening [1]
Hao, Jingsong [1]
Zhang, Jiang [1]
Zhu, Shulei [1]
Yin, Jun [1]
Affiliation
[1] Dahua Technol Co Ltd, Hangzhou, Peoples R China
Source
MULTIMEDIA MODELING, MMM 2023, PT I | 2023 / Vol. 13833
Keywords
Person identification; Multi-modal; Multi biometrics; GCN; Feature fusion
DOI
10.1007/978-3-031-27077-2_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video-based multi-modal person identification has recently attracted growing research interest as a way to address the inadequacies of single-modal identification in unconstrained scenes. Most existing methods model the video-level and multi-modal-level information of the target video separately, and therefore suffer from a separation between levels and from the limited information contained in a single video. In this paper, we introduce extra neighbor-level information for the first time to enhance the informativeness of the target video. We then propose a Multi-Level (neighbor-level, multi-modal-level, and video-level) and Multi-Modal GCN model that captures correlations among the different levels and achieves adaptive fusion in a unified model. Experiments on the iQIYI-VID-2019 dataset show that MMM-GCN significantly outperforms current state-of-the-art methods, demonstrating its superiority and effectiveness. In addition, we point out that feature fusion is heavily polluted by noisy nodes, which leads to suboptimal results. Further improvements could be explored on this basis to approach the performance upper bound of our paradigm.
Pages: 3-15
Page count: 13