Real-Time MDNet

Cited by: 192

Authors:

Jung, Ilchae [1]

Son, Jeany [1]

Baek, Mooyeol [1]

Han, Bohyung [2]

Affiliations:

[1] POSTECH, Dept CSE, Pohang, South Korea

[2] Seoul Natl Univ, Dept ECE & ASRI, Seoul, South Korea

Source:

COMPUTER VISION - ECCV 2018, PT IV | 2018 / Vol. 11208

Keywords:

Visual tracking; Multi-domain learning; RoIAlign; Instance embedding loss

DOI:

10.1007/978-3-030-01225-0_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We present a fast and accurate visual tracking algorithm based on the multi-domain convolutional neural network (MDNet). The proposed approach accelerates the feature extraction procedure and learns more discriminative models for instance classification; it enhances the representation quality of target and background by maintaining a high-resolution feature map with a large receptive field per activation. We also introduce a novel loss term to differentiate foreground instances across multiple domains and learn a more discriminative embedding of target objects with similar semantics. The proposed techniques are integrated into the pipeline of a well-known CNN-based visual tracking algorithm, MDNet. We achieve an approximately 25-fold speed-up with almost identical accuracy compared to MDNet. Our algorithm is evaluated on multiple popular tracking benchmark datasets, including OTB2015, UAV123, and TempleColor, and consistently outperforms the state-of-the-art real-time tracking methods even without dataset-specific parameter tuning.


Pages: 89-104

Number of pages: 16
