Arabic text detection and recognition in video using deep learning

Cited by: 0
Authors
Bouchakour, Lallouani [1, 2]
Bettayeb, Nadjla [3]
Affiliations
[1] Sci & Tech Res Ctr Dev Arab Language CRSTDLA, Algiers, Algeria
[2] Univ Sci & Technol Houari Boumediene USTHB, Speech & Signal Proc Lab, Algiers 16111, Algeria
[3] Univ Kasdi Merbah, Fac New Technol Informat & Commun, Dept Elect & Telecommun, Ouargla 30000, Algeria
Keywords
Arabic Text detection; Character recognition; RCNN; YOLO; Gabor filter; U-Net;
DOI
10.1007/s11760-025-04295-1
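CLC classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract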
Detection and recognition of text in video content have become crucial research areas in computer vision due to the rapid expansion of multimedia data and the growing demand for automated information extraction. The dynamic nature of video introduces challenges such as varying lighting conditions, motion blur, complex backgrounds, and text appearing at different scales. This paper explores recent deep learning techniques for detecting and recognizing Arabic text in videos, focusing on enhancing the precision of text localization. We review state-of-the-art methods, particularly YOLO (You Only Look Once) and RCNN (Region-based Convolutional Neural Networks), for text detection. In addition, we investigate fine-tuning techniques and the U-Net deep model to improve the accuracy and robustness of text detection and recognition in video frames, utilizing CNN (Convolutional Neural Network) architectures for character recognition. Our research includes a comparative analysis of these methods with alternative approaches, incorporating temporal information from consecutive frames to enhance text consistency and recognition accuracy using Gabor filters. Experimental results show the superiority of YOLO in text detection and the U-Net model in character recognition.
Pages: 9