Multi-label video classification via coupling attentional multiple instance learning with label relation graph

Cited by: 12

Authors:

Li, Xuewei [1]

Wu, Hongjun [1]

Li, Mengzhu [1]

Liu, Hongzhe [1]

Affiliations:

[1] Beijing Union Univ, Beijing Key Lab Informat Serv Engn, Beijing 100101, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2022 / Vol. 156

Keywords:

Multi-label video classification; Multiple instance learning; Attentional feature learning; Label relation graph

DOI:

10.1016/j.patrec.2022.01.003

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-label video classification is a challenging problem in the pattern recognition field, as it is difficult to localize where a huge number of labels occur in videos. To solve this problem, we propose a general framework named MALL-CNN, i.e., Multi-Attention Label Relation Learning Convolutional Neural Network. MALL-CNN not only builds the correspondences between labels and videos through an attention mechanism, but also captures label co-occurrence through a graph learning approach. Specifically, we introduce multiple instance learning to aggregate a set of frame-level features into a video-level feature. Then, the video-level feature is mapped into content-aware category representations in an improved attentional manner. Further, these representations are enhanced by a series of label relation graphs, which transform global label relationships into the label relationships of the current video. With these three processes, frame feature aggregation, video feature mapping, and label relationship construction are achieved in MALL-CNN for multi-label video classification. Extensive experiments on the real-world benchmark YouTube-8M verify that MALL-CNN with only frame features surpasses state-of-the-art methods that use multi-modal features as well as ensemble models. (c) 2022 Elsevier B.V. All rights reserved.
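The abstract names two generic building blocks: attention-weighted aggregation of frame-level features into a video-level feature, and refinement of label scores through a label relation graph. The following is a minimal illustrative sketch of those two ideas in plain Python, not the authors' MALL-CNN implementation. The function names, the scoring vector `w`, the tanh scoring rule, and the single row-normalised propagation step are all assumptions chosen for brevity; the paper's actual layers may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(frames, w):
    """Aggregate a bag of frame features into one video-level feature.

    frames: list of equal-length feature vectors (the MIL "bag").
    w: hypothetical scoring vector (an assumption, not from the paper).
    Returns (video_feature, attention_weights).
    """
    # Score each frame, then normalise the scores into attention weights.
    scores = [math.tanh(sum(wi * xi for wi, xi in zip(w, f))) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Video-level feature = attention-weighted sum of the frame features.
    video = [sum(a * f[d] for a, f in zip(weights, frames)) for d in range(dim)]
    return video, weights


def propagate_label_scores(label_scores, adjacency):
    """One smoothing step over a label relation graph.

    adjacency[i][j] > 0 means labels i and j tend to co-occur.
    Self-loops are added and each row is normalised to sum to 1,
    so every label keeps part of its own score.
    """
    n = len(label_scores)
    out = []
    for i in range(n):
        row = [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        row_sum = sum(row)
        out.append(sum(r * s for r, s in zip(row, label_scores)) / row_sum)
    return out
```

A typical use would pool the frame features of one video with `attention_mil_pool`, feed the result to a classifier that yields per-label scores, and then pass those scores through `propagate_label_scores` so that co-occurring labels reinforce each other.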


Pages: 53-59

Page count: 7

