Implicit and Explicit Concept Relations in Deep Neural Networks for Multi-Label Video/Image Annotation

Cited by: 44
Authors
Markatopoulou, Foteini [1, 2]
Mezaris, Vasileios [1]
Patras, Ioannis [2]
Affiliations
[1] Ctr Res & Technol Hellas, Informat Technol Inst, Thermi 57001, Greece
[2] Queen Mary Univ London, Mile End Campus, London E1 4NS, England
Funding
European Union Horizon 2020;
Keywords
Video/image concept annotation; deep learning; multi-task learning; structured outputs; multi-label learning; concept correlations; video analysis; IMAGE ANNOTATION; GRAPH;
DOI
10.1109/TCSVT.2018.2848458
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
In this paper, we propose a deep convolutional neural network (DCNN) architecture that addresses the problem of video/image concept annotation by exploiting concept relations at two different levels. At the first level, we build on ideas from multi-task learning and propose an approach to learn concept-specific representations that are sparse linear combinations of representations of latent concepts. By enforcing the sharing of the latent concept representations, we exploit the implicit relations between the target concepts. At the second level, we build on ideas from structured output learning and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts. By doing so, we explicitly model the structure in the output space (i.e., the concept labels). Both of the above are implemented using standard convolutional layers and are incorporated in a single DCNN architecture that can then be trained end-to-end with standard back-propagation. Experiments on four large-scale video and image data sets show that the proposed DCNN improves concept annotation accuracy and outperforms the related state-of-the-art methods.
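The two levels described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the function names, the L1 sparsity penalty, the specific form of the correlation penalty, and the weights `l1_weight` and `corr_weight` are all assumptions chosen for illustration. Plain matrix products stand in for the convolutional layers of the actual network.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def concept_weights(latent, combos):
    """Level 1 (implicit relations): each concept's weight vector is a linear
    combination of k shared latent-concept vectors.
    latent: (d, k), combos: (k, T) -> returns (d, T)."""
    return latent @ combos


def annotation_loss(features, labels, latent, combos, label_corr,
                    l1_weight=0.01, corr_weight=0.1):
    """Multi-label loss with two illustrative extra terms (both assumed forms)."""
    probs = sigmoid(features @ concept_weights(latent, combos))  # (n, T)
    eps = 1e-12
    # Standard per-concept binary cross-entropy.
    bce = -np.mean(labels * np.log(probs + eps)
                   + (1.0 - labels) * np.log(1.0 - probs + eps))
    # Assumed sparsity penalty: L1 norm on the combination coefficients.
    sparsity = l1_weight * np.abs(combos).sum()
    # Level 2 (explicit relations), assumed form: positively correlated
    # concepts are penalized when their predicted scores disagree.
    pos_corr = np.clip(label_corr, 0.0, None)                    # (T, T)
    diffs = probs[:, :, None] - probs[:, None, :]                # (n, T, T)
    corr_pen = corr_weight * np.mean(np.sum(pos_corr * diffs ** 2, axis=(1, 2)))
    return bce + sparsity + corr_pen


# Toy data: n samples, d-dimensional features, k latent concepts, T concepts.
rng = np.random.default_rng(0)
n, d, k, T = 8, 16, 3, 5
features = rng.normal(size=(n, d))
labels = (rng.random(size=(n, T)) > 0.5).astype(float)
latent = rng.normal(size=(d, k)) * 0.1
combos = rng.normal(size=(k, T)) * 0.1
# Label correlations estimated from the training labels; NaNs (constant
# columns) are replaced by zero.
label_corr = np.nan_to_num(np.corrcoef(labels, rowvar=False))

print(annotation_loss(features, labels, latent, combos, label_corr))
```

In a real network both `latent` and `combos` would be learned jointly by back-propagation, and the correlation term would only be active at training time, as the abstract states.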
Pages: 1631-1644
Page count: 14