An off-policy deep reinforcement learning-based active learning for crime scene investigation image classification

Cited by: 1
Authors
Zhang, Yixin [1 ]
Liu, Yang [2 ]
Jiang, Guofan [3 ]
Yang, Yuchen [4 ]
Zhang, Jian [5 ]
Jing, Yang [6 ]
Alizadehsani, Roohallah [7]
Tadeusiewicz, Ryszard [8]
Plawiak, Pawel [9,10]
Affiliations
[1] Univ Birmingham, Law Sch, Birmingham B15 2TT, England
[2] Guangxi Police Coll, Law Sch, Nanning 530000, Peoples R China
[3] Peoples Publ Secur Univ China, Dept Natl Secur, Beijing 10038, Peoples R China
[4] China Agr Univ, Sch Law, Beijing 10038, Peoples R China
[5] Univ Malaya, Sch Law, Kuala Lumpur 50603, Malaysia
[6] Univ Malaya, Dept Comp Syst & Technol, Wilayah Persekutuan 50603, Malaysia
[7] Deakin Univ, Inst Intelligent Syst Res & Innovat IISRI, Waurn Ponds, Australia
[8] AGH Univ Sci & Technol, Dept Biocybernet & Biomed Engn, Krakow, Poland
[9] Cracow Univ Technol, Fac Comp Sci & Telecommun, Dept Comp Sci, Warszawska 24, PL-31155 Krakow, Poland
[10] Polish Acad Sci, Inst Theoret & Appl Informat, Baltycka 5, PL-44100 Gliwice, Poland
Keywords
Crime scene investigation; Active learning; Deep reinforcement learning; Generative adversarial network; Convolutional neural network;
DOI
10.1016/j.ins.2025.122074
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Crime scene investigation (CSI) image classification is crucial in forensic analysis, significantly boosting the efficiency of police investigations. Conventional CSI image classification approaches depend heavily on convolutional neural networks (CNNs) trained on extensive pre-labeled image data, which is time-consuming and costly to assemble. To address this issue, we present an active learning method that improves model performance with fewer labeled examples. Unlike traditional active learning methods, which apply heuristic selection techniques independently of the training process and thereby compromise their effectiveness, our strategy integrates off-policy deep reinforcement learning (DRL) to make strategic data selections. The off-policy approach learns from a broader range of experiences than on-policy methods, improving adaptability and accelerating learning. Our model uses multiple CNNs to extract features from different layers of the images, which a softmax layer then uses for categorization. Starting from a minimal labeled dataset, the classifier employs DRL to decide which unlabeled images should be annotated next; the newly labeled images are added to the training pool, and the classifier is periodically retrained to improve its accuracy. Additionally, our framework incorporates a generative adversarial network (GAN) for online data augmentation and introduces a novel regularization technique that stabilizes GAN training and prevents mode collapse. Finally, the model employs the Random Key method, optimized by the differential evolution (DE) algorithm, to reduce its dependence on hyperparameters.
Our comprehensive testing across diverse datasets, including the Center for Image and Information Processing CSI dataset (CIIP-CSID), the global human image dataset with 10,000 images (GHIM-10K), and Corel 1,000 (Corel-1K), shows that our model achieves F-measures ranging from 86.758% to 92.611%. This underscores the superior performance and versatility of the model across varied CSI image classification tasks.
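The select-annotate-retrain loop described in the abstract (train on a small labeled pool, let the agent pick the most informative unlabeled image, query its label, retrain) can be illustrated schematically. The sketch below is not the paper's implementation: the actual system uses CNN feature extractors, an off-policy DRL agent, and GAN-based augmentation, all of which are replaced here with hypothetical toy stand-ins (`train_classifier`, `q_value`, `oracle` are illustrative names) so the control flow is runnable on its own.

```python
# Minimal sketch of a DRL-guided active learning loop.
# Toy stand-ins: examples are scalars, the "model" scores familiarity
# by distance from the labeled mean, and the "Q-value" prefers the
# least familiar (most informative) example.

def train_classifier(labeled):
    """Stand-in for retraining the CNN classifier on the labeled pool.
    Returns a scorer: higher means more familiar to the model."""
    mean = sum(x for x, _ in labeled) / len(labeled)
    return lambda x: -abs(x - mean)

def q_value(model, x):
    """Stand-in for the agent's action value: reward querying
    examples the current model finds least familiar."""
    return -model(x)

def active_learning(labeled, unlabeled, oracle, budget):
    """Core loop: pick the highest-Q unlabeled example, query its
    label from the oracle (annotator), add it to the pool, retrain."""
    for _ in range(budget):
        model = train_classifier(labeled)
        best = max(unlabeled, key=lambda x: q_value(model, x))
        unlabeled.remove(best)
        labeled.append((best, oracle(best)))
    return train_classifier(labeled), labeled
```

With an annotation budget of 2, the loop queries the two outliers (5.0 and -4.0) before the examples near the labeled mean, mirroring the informativeness-driven selection the abstract describes.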
Pages: 34