Multi-Modal Multi-Instance Multi-Label Learning with Graph Convolutional Network

Cited by: 3

Authors:

Hang, Cheng [1]

Wang, Wei [1]

Zhan, De-Chuan [1]

Affiliations:

[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China

Source:

2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2021

Keywords:

Multi-modal; Multi-instance; Multi-label; Graph Convolutional Network; Deep Learning;

DOI:

10.1109/IJCNN52387.2021.9534428

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When applying machine learning to real-world problems, objects often carry multiple labels rather than a single label. In addition, complex objects can comprise multiple modalities; for example, a post on social media may contain both text and images. Previous approaches typically treat each modality as a whole, which does not hold in the real world, since a post may contain multiple images and texts with quite diverse semantic meanings. Multi-modal Multi-instance Multi-label (M3) learning was proposed to address this. A previous attempt at M3 learning argues that exploiting label correlations is crucial. In this paper, we show that M3 problems can be handled with a graph convolutional network (GCN). Specifically, a graph is built over all labels, and each label is initially represented by its word embedding. The main goal of the GCN is to map these label embeddings into inter-correlated label classifiers. Moreover, multi-instance aggregation is based on an attention mechanism, which makes the model more interpretable because it naturally learns to discover which patterns trigger the labels. Empirical studies on both benchmark and industrial datasets validate the effectiveness of our method, and ablation studies demonstrate that its components are essential.
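
The two mechanisms named in the abstract (a GCN that turns label word embeddings into label classifiers, and attention-based aggregation of instances) can be illustrated with a minimal sketch. This is not the authors' implementation. The two-layer depth, the symmetric adjacency normalization, the dot-product attention scorer, the sigmoid output, the single-modality bag, and every name and dimension below are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, a common GCN propagation matrix (assumed here)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[d[i] * a_hat[i][j] * d[j] for j in range(n)] for i in range(n)]


def label_classifiers(label_emb, adj, w1, w2):
    """Two graph-convolution layers: label embeddings -> one classifier vector per label."""
    a_norm = normalize_adjacency(adj)
    hidden = [[max(h, 0.0) for h in row] for row in matmul(matmul(a_norm, label_emb), w1)]
    return matmul(matmul(a_norm, hidden), w2)


def attention_pool(instances, v):
    """Softmax-weighted average of instance features; also returns the weights."""
    scores = [sum(x * y for x, y in zip(inst, v)) for inst in instances]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    bag = [sum(w * inst[k] for w, inst in zip(weights, instances)) for k in range(dim)]
    return bag, weights


def predict(instances, label_emb, adj, w1, w2, v):
    """Score one bag of instances against every GCN-generated label classifier."""
    bag, weights = attention_pool(instances, v)
    classifiers = label_classifiers(label_emb, adj, w1, w2)
    logits = [sum(c * b for c, b in zip(clf, bag)) for clf in classifiers]
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return probs, weights


def rand_matrix(rows, cols):
    """Random stand-in for learned parameters or extracted features."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # hypothetical label graph over 3 labels
label_emb = rand_matrix(3, 4)            # 3 labels, 4-dim word embeddings
w1, w2 = rand_matrix(4, 5), rand_matrix(5, 6)
instances = rand_matrix(4, 6)            # one bag with 4 instances, 6-dim features
v = [random.uniform(-0.5, 0.5) for _ in range(6)]

probs, weights = predict(instances, label_emb, adj, w1, w2, v)
```

Here `weights` holds one attention weight per instance, which is the quantity that would be inspected to see which instance drives a prediction, and `probs` holds one independent probability per label.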

Citations

Pages: 8

50 entries in total

[1] Learning to Annotate Clothes in Everyday Photos: Multi-Modal, Multi-Label, Multi-Instance Approach
Nogueira, Keiller
Veloso, Adriano Alonso
dos Santos, Jefersson A.
2014 27TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2014, : 327 - 334
[2] Multi-instance multi-label learning
Zhou, Zhi-Hua
Zhang, Min-Ling
Huang, Sheng-Jun
Li, Yu-Feng
ARTIFICIAL INTELLIGENCE, 2012, 176 (01) : 2291 - 2320
[3] A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification
Song, Lingyun
Liu, Jun
Qian, Buyue
Sun, Mingxuan
Yang, Kuan
Sun, Meng
Abbas, Samar
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (12) : 6025 - 6038
[4] Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport
Yang, Yang
Fu, Zhao-Yang
Zhan, De-Chuan
Liu, Zhi-Bin
Jiang, Yuan
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (02) : 696 - 709
[5] Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport
Yang, Yang
Wu, Yi-Feng
Zhan, De-Chuan
Liu, Zhi-Bin
Jiang, Yuan
KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 2594 - 2603
[6] Instance Annotation for Multi-Instance Multi-Label Learning
Briggs, Forrest
Fern, Xiaoli Z.
Raich, Raviv
Lou, Qi
ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2013, 7 (03)
[7] Learnability of multi-instance multi-label learning
Wang Wei
Zhou ZhiHua
CHINESE SCIENCE BULLETIN, 2012, 57 (19) : 2488 - 2491
[8] Learnability of multi-instance multi-label learning
Wang Wei
Zhou ZhiHua
CHINESE SCIENCE BULLETIN, 2012, 57 (19) : 2492 - 2495
[9] Fast Multi-Instance Multi-Label Learning
Huang, Sheng-Jun
Gao, Wei
Zhou, Zhi-Hua
PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1868 - 1874
[10] Multi-Instance Multi-Label Active Learning
Huang, Sheng-Jun
Gao, Nengneng
Chen, Songcan
PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1886 - 1892
