An Attention-Based Convolutional Neural Network With Spatial Transformer Module for Automated Optical Inspection of Small Objects

Cited by: 0
Authors
Kim, Hyun Yong [1 ]
Yi, Taek Joon [2 ]
Lee, Jong Yun [2 ]
Affiliations
[1] Chungbuk Natl Univ, Dept Smart Factory Management, Cheongju 28644, South Korea
[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju 28644, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Feature extraction; Accuracy; Product codes; Adaptation models; YOLO; Convolutional neural networks; Computational modeling; Optical imaging; Optical character recognition; Cameras; Attention module; automated optical inspection (AOI); convolutional neural network (CNN); Jetson Nano; region of interest (ROI); small object;
DOI
10.1109/TIM.2025.3548240
CLC classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
In the manufacturing industry, automated optical inspection (AOI) systems play a crucial role in accurate and efficient inspection. Convolutional neural networks (CNNs) have recently become popular for AOI because they automatically extract optimal features through convolutional and pooling operations. In this article, we aim to classify product images obtained from an AOI system based on the product codes they contain. However, because the product codes are very small relative to the image, it is challenging to achieve sufficient classification accuracy with a standard CNN architecture. To address this issue, we propose an attention-based CNN (A-CNN) model that focuses on small objects within images. The A-CNN model integrates an attention module that adaptively extracts regions of interest (ROIs) centered on small objects and a classification network (CN) that classifies these ROIs, thereby increasing the object-to-image area ratio (OAR) and improving classification accuracy for small objects. The A-CNN model can be trained effectively end-to-end with minimal data labeling compared with object detection methods. Experimental results show that the proposed A-CNN model achieves a classification accuracy of 99.92% and an inference speed of 33 frames/s on the NVIDIA Jetson Nano platform, outperforming the smallest variants of the state-of-the-art YOLOv5, YOLOv7, YOLOv8, YOLOv9, and YOLOv10 object detectors in both accuracy and latency. Notably, our model is $2.0\times $ faster than the fastest YOLO model, underscoring its efficiency in real-time applications. These findings highlight the potential of the A-CNN model as an accurate and practical solution for small-object classification.
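The core idea in the abstract can be illustrated with a minimal sketch: an attention step predicts where the small object sits, an ROI is cropped around it, and the crop raises the object-to-image area ratio (OAR) before classification. This is only a toy illustration, not the paper's implementation; the learned attention module and classification network are replaced here by a hypothetical brightness-based locator (`attention_module`) and a simple crop (`extract_roi`).

```python
import numpy as np

def attention_module(image):
    # Stand-in for the paper's learned attention module: here we just
    # take the brightest pixel's position as the predicted ROI center.
    y, x = np.unravel_index(np.argmax(image), image.shape)
    return y, x

def extract_roi(image, center, size):
    # Crop a size x size window around the predicted center, clamped
    # to the image borders so the ROI always lies inside the image.
    y, x = center
    h, w = image.shape
    half = size // 2
    y0 = min(max(y - half, 0), h - size)
    x0 = min(max(x - half, 0), w - size)
    return image[y0:y0 + size, x0:x0 + size]

# Toy 64x64 "product image": an 8x8 "product code" covers ~1.6% of it.
img = np.zeros((64, 64))
img[40:48, 20:28] = 1.0

roi = extract_roi(img, attention_module(img), size=16)
oar_full = img.sum() / img.size   # OAR in the full image (~0.016)
oar_roi = roi.sum() / roi.size    # OAR in the ROI (0.25): 16x larger
```

Cropping to the ROI before classifying is what lets a small classification network see the object at a usable scale, while requiring only image-level labels rather than the bounding-box annotations a YOLO-style detector would need.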
Pages: 12