Improving Multispectral Pedestrian Detection by Addressing Modality Imbalance Problems

Cited by: 182
Authors
Zhou, Kailai [1]
Chen, Linsen [1]
Cao, Xun [1]
Affiliations
[1] Nanjing Univ, Nanjing, Peoples R China
Source
COMPUTER VISION - ECCV 2020, PT XVIII | 2020 / Vol. 12363
Funding
National Natural Science Foundation of China;
Keywords
Multispectral pedestrian detection; Modality imbalance problems; Multimodal feature fusion; DEEP NEURAL-NETWORKS; FUSION;
DOI
10.1007/978-3-030-58523-5_46
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multispectral pedestrian detection can adapt to insufficient illumination by leveraging the color and thermal modalities. However, in-depth insight into how to fuse the two modalities effectively is still lacking. Compared with traditional pedestrian detection, we find that multispectral pedestrian detection suffers from modality imbalance problems, which hinder the optimization of the dual-modality network and degrade the detector's performance. Inspired by this observation, we propose the Modality Balance Network (MBNet), which facilitates the optimization process in a much more flexible and balanced manner. First, we design a novel Differential Modality Aware Fusion (DMAF) module to make the two modalities complement each other. Second, an illumination-aware feature alignment module selects complementary features according to the illumination conditions and aligns the two modality features adaptively. Extensive experimental results demonstrate that MBNet outperforms state-of-the-art methods on both the challenging KAIST and CVC-14 multispectral pedestrian datasets in terms of accuracy and computational efficiency. Code is available at https://github.com/CalayZhou/MBNet.
Pages: 787-803
Number of pages: 17