Implicit and Explicit Concept Relations in Deep Neural Networks for Multi-Label Video/Image Annotation

Cited by: 44

Authors:

Markatopoulou, Foteini [1,2]

Mezaris, Vasileios [1]

Patras, Ioannis [2]

Affiliations:

[1] Ctr Res & Technol Hellas, Informat Technol Inst, Thermi 57001, Greece

[2] Queen Mary Univ London, Mile End Campus, London E1 4NS, England

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2019 / Vol. 29 / Issue 06

Funding:

European Union Horizon 2020;

Keywords:

Video/image concept annotation; deep learning; multi-task learning; structured outputs; multi-label learning; concept correlations; video analysis; IMAGE ANNOTATION; GRAPH;

DOI:

10.1109/TCSVT.2018.2848458

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In this paper, we propose a deep convolutional neural network (DCNN) architecture that addresses the problem of video/image concept annotation by exploiting concept relations at two different levels. At the first level, we build on ideas from multi-task learning, and propose an approach to learn concept-specific representations that are sparse, linear combinations of representations of latent concepts. By enforcing the sharing of the latent concept representations, we exploit the implicit relations between the target concepts. At a second level, we build on ideas from structured output learning and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts. By doing so, we explicitly model the structure in the output space (i.e., the concept labels). Both of the above are implemented using standard convolutional layers and are incorporated in a single DCNN architecture that can then be trained end-to-end with standard back-propagation. Experiments on four large-scale video and image data sets show that the proposed DCNN improves concept annotation accuracy and outperforms the related state-of-the-art methods.
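The two levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' DCNN or their cost function: the function names, the L1 penalty used for sparsity, the pairwise form of the correlation term, and the weights `l1_weight` and `corr_weight` are all assumptions chosen for illustration. The sketch only shows concept-specific weights built as linear combinations of shared latent-concept weights, plus an extra training-time term driven by a given label-correlation matrix.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def concept_weights(latent, mix):
    """Build one weight vector per target concept.

    latent: k latent-concept weight vectors, each of length d (shared).
    mix:    T rows of k coefficients; row t combines the latent vectors
            into the weights of concept t (kept sparse by the L1 term below).
    """
    d = len(latent[0])
    return [
        [sum(c * latent[j][i] for j, c in enumerate(row)) for i in range(d)]
        for row in mix
    ]


def predict(x, latent, mix):
    """Per-concept scores in (0, 1) for one feature vector x."""
    return [
        sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, x)))
        for w in concept_weights(latent, mix)
    ]


def annotation_loss(samples, labels, latent, mix, label_corr,
                    l1_weight=0.01, corr_weight=0.1):
    """Binary cross-entropy + assumed correlation term + L1 sparsity on mix."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(samples, labels):
        s = predict(x, latent, mix)
        n_concepts = len(s)
        bce = -sum(
            y_t * math.log(s_t + eps) + (1 - y_t) * math.log(1 - s_t + eps)
            for s_t, y_t in zip(s, y)
        ) / n_concepts
        # Assumed explicit term: positively correlated concepts are pushed
        # toward similar scores, negatively correlated ones apart.
        corr = sum(
            label_corr[a][b] * (s[a] - s[b]) ** 2
            for a in range(n_concepts)
            for b in range(n_concepts)
        )
        total += bce + corr_weight * corr
    sparsity = sum(abs(c) for row in mix for c in row)
    return total / len(samples) + l1_weight * sparsity


if __name__ == "__main__":
    latent = [[1.0, 0.0], [0.0, 1.0]]      # k = 2 latent concepts, d = 2
    mix = [[2.0, 0.0], [1.0, -1.0]]        # T = 2 target concepts
    label_corr = [[0.0, 0.5], [0.5, 0.0]]  # hypothetical label correlations
    samples = [[1.0, 0.0], [0.0, 1.0]]
    labels = [[1, 1], [0, 0]]
    print(annotation_loss(samples, labels, latent, mix, label_corr))
```

In this sketch the sharing of `latent` across all rows of `mix` stands in for the implicit relations, while `label_corr` enters only the loss, mirroring the abstract's statement that the correlation term is introduced at training time.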

Pages: 1631-1644

Page count: 14

50 entries in total

[1] A Novel Model for Multi-label Image Annotation
Wu, Xinjian
Zhang, Li
Li, Fanzhang
Wang, Bangjun
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018 : 1953 - 1958
[2] Deep Multi-Instance Multi-Label Learning for Image Annotation
Guo, Hai-Feng
Han, Lixin
Su, Shoubao
Sun, Zhou-Bao
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2018, 32 (03)
[3] Multi-view multi-label learning for image annotation
Zou, Fuhao
Liu, Yu
Wang, Hua
Song, Jingkuan
Shao, Jie
Zhou, Ke
Zheng, Sheng
MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (20) : 12627 - 12644
[4] Multi-view multi-label learning for image annotation
Fuhao Zou
Yu Liu
Hua Wang
Jingkuan Song
Jie Shao
Ke Zhou
Sheng Zheng
Multimedia Tools and Applications, 2016, 75 : 12627 - 12644
[5] MULTI-LABEL IMAGE ANNOTATION VIA MAXIMUM CONSISTENCY
Wang, Hua
Hu, Jian
2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010 : 2337 - 2340
[6] Multi-label Text Classification with Deep Neural Networks
Chen, Yun
Xiao, Bo
Lin, Zhiqing
Dai, Cheng
Li, Zuochao
Yang, Liping
PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON NETWORK INFRASTRUCTURE AND DIGITAL CONTENT (IEEE IC-NIDC), 2018 : 409 - 413
[7] Deep multi-label learning for image distortion identification
Liang, Dong
Gao, Xinbo
Lu, Wen
He, Lihuo
SIGNAL PROCESSING, 2020, 172
[8] Multi-Label Dictionary Learning for Image Annotation
Jing, Xiao-Yuan
Wu, Fei
Li, Zhiqiang
Hu, Ruimin
Zhang, David
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (06) : 2712 - 2725
[9] Semi-supervised robust deep neural networks for multi-label image classification
Cevikalp, Hakan
Benligiray, Burak
Gerek, Omer Nezih
PATTERN RECOGNITION, 2020, 100
[10] MULTI-TASK DEEP NEURAL NETWORK FOR MULTI-LABEL LEARNING
Huang, Yan
Wang, Wei
Wang, Liang
Tan, Tieniu
2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013 : 2897 - 2900
