Dual-Mode Learning for Multi-Dataset X-Ray Security Image Detection

Cited by: 5
Authors
Yang, Fenghong [1]
Jiang, Runqing [1]
Yan, Yan [1]
Xue, Jing-Hao [2]
Wang, Biao [3]
Wang, Hanzi [1]
Affiliations
[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361102, Peoples R China
[2] UCL, Dept Stat Sci, London WC1E 6BT, England
[3] Zhejiang Lab, Hangzhou 311101, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
X-ray security image detection; domain discrepancy; occlusion; feature distillation; multi-dataset learning;
DOI
10.1109/TIFS.2024.3364368
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
With recent advances in deep learning, a large number of methods have been developed for prohibited item detection in X-ray security images. Generally, these methods train models on a single X-ray image dataset that may contain only limited categories of prohibited items. To detect more prohibited items, it is desirable to train a model on a multi-dataset constructed by combining multiple datasets. However, directly applying existing methods to the multi-dataset cannot guarantee good performance because of the large domain discrepancy between datasets and the occlusion in images. To address these problems, we propose a novel Dual-Mode Learning Network (DML-Net) to effectively detect all the prohibited items in the multi-dataset. In particular, we develop an enhanced RetinaNet as the architecture of DML-Net, where we introduce a lattice appearance enhanced sub-net to enhance appearance representations. This benefits the detection of occluded prohibited items. Based on the enhanced RetinaNet, the learning process of DML-Net involves both common mode learning (detecting the common prohibited items across datasets) and unique mode learning (detecting the unique prohibited items in each dataset). For common mode learning, we introduce an adversarial prototype alignment module to align the feature prototypes from different datasets in the domain-invariant feature space. For unique mode learning, we take advantage of feature distillation to force the student model to mimic the features extracted by multiple pre-trained teacher models. By tightly combining and jointly training the dual modes, our DML-Net method successfully eliminates the domain discrepancy and exhibits superior model capacity on the multi-dataset. Extensive experimental results on several combined X-ray image datasets demonstrate the effectiveness of our method against several state-of-the-art methods. Our code is available at https://github.com/vampirename/dmlnet.
Pages: 3510-3524
Page count: 15