Implicit and Explicit Concept Relations in Deep Neural Networks for Multi-Label Video/Image Annotation

Cited by: 45
Authors
Markatopoulou, Foteini [1, 2]
Mezaris, Vasileios [1]
Patras, Ioannis [2]
Affiliations
[1] Ctr Res & Technol Hellas, Informat Technol Inst, Thermi 57001, Greece
[2] Queen Mary Univ London, Mile End Campus, London E1 4NS, England
Funding
EU Horizon 2020;
Keywords
Video/image concept annotation; deep learning; multi-task learning; structured outputs; multi-label learning; concept correlations; video analysis; IMAGE ANNOTATION; GRAPH;
DOI
10.1109/TCSVT.2018.2848458
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
In this paper, we propose a deep convolutional neural network (DCNN) architecture that addresses the problem of video/image concept annotation by exploiting concept relations at two different levels. At the first level, we build on ideas from multi-task learning, and propose an approach to learn concept-specific representations that are sparse, linear combinations of representations of latent concepts. By enforcing the sharing of the latent concept representations, we exploit the implicit relations between the target concepts. At a second level, we build on ideas from structured output learning and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts. By doing so, we explicitly model the structure in the output space (i.e., the concept labels). Both of the above are implemented using standard convolutional layers and are incorporated in a single DCNN architecture that can then be trained end-to-end with standard back-propagation. Experiments on four large-scale video and image data sets show that the proposed DCNN improves concept annotation accuracy and outperforms the related state-of-the-art methods.
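The two levels described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the L1 penalty, and the pairwise form of the correlation cost are all assumptions chosen for illustration, and the paper realises both parts as convolutional layers inside a DCNN rather than as standalone functions.

```python
import numpy as np

def concept_probabilities(features, latent_basis, mixing):
    """Score target concepts from shared latent-concept representations.

    features:     (n_samples, d) input feature vectors
    latent_basis: (d, k) one column per latent concept (shared by all targets)
    mixing:       (k, n_concepts) combination weights, intended to be sparse
    """
    # Each target concept's weight vector is a linear combination of latent ones.
    weights = latent_basis @ mixing            # (d, n_concepts)
    logits = features @ weights                # (n_samples, n_concepts)
    return 1.0 / (1.0 + np.exp(-logits))       # sigmoid for multi-label scores

def l1_sparsity(mixing):
    """L1 penalty that encourages sparse combinations (assumed regulariser)."""
    return float(np.abs(mixing).sum())

def correlation_cost(probs, label_corr):
    """Hypothetical extra cost term on the output space.

    Penalises score differences between concept pairs in proportion to their
    label correlation, so positively correlated concepts get similar scores.
    """
    diff = probs[:, :, None] - probs[:, None, :]          # (n, c, c)
    per_sample = (label_corr[None, :, :] * diff ** 2).sum(axis=(1, 2))
    return float(per_sample.mean())
```

In a training loop, one would add `l1_sparsity` and `correlation_cost`, each scaled by a hyperparameter, to the usual per-concept classification loss; the weighting scheme is likewise left open here.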
Pages: 1631-1644
Page count: 14