GPGPU-based High Throughput Image Pre-processing Towards Large-Scale Optical Character Recognition

Cited by: 0
Authors
Gener, Serhan [1]
Dattilo, Parker [1]
Gajaria, Dhruv [1]
Fusco, Alexander [1]
Akoglu, Ali [1]
Affiliation
[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA
Source
2022 IEEE/ACS 19TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA) | 2022
Funding
U.S. National Science Foundation;
Keywords
Optical Character Recognition (OCR); Tesseract; Leptonica; Image Processing; CUDA; GPU;
DOI
10.1109/AICCSA56895.2022.10017481
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Studies have shown that pre-processing digital images with operations such as scaling, rotation, and blurring allows optical character recognition (OCR) to focus on the key features in the image and improves recognition accuracy. We leverage the open-source Tesseract OCR engine and show, on a dataset of 560 phone screen images, that its accuracy can be improved through a pre-processing flow that includes thresholding, rotation, rescaling, erosion, dilation, and noise removal steps. However, the serial CPU-based implementation of this flow introduces an average latency of 48.32 ms per image. Although this latency is small for a single image, it becomes a barrier when processing millions of images with OCR. To address this, we parallelize the entire pre-processing flow on the Nvidia P100 GPU, implement streaming-based execution, and reduce the latency to 0.846 ms. This streaming-enabled implementation makes it possible to set up a GPU-based OCR engine for large-scale workloads.
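To put the two reported latencies in perspective, the short Python sketch below works through the arithmetic. Only the 48.32 ms and 0.846 ms figures come from the abstract. The workload size of one million images is a hypothetical value chosen for illustration, and the calculation assumes that total time scales linearly with the average per-image latency, which may not hold exactly in a real deployment.

```python
# Per-image pre-processing latencies reported in the abstract (milliseconds).
CPU_LATENCY_MS = 48.32
GPU_LATENCY_MS = 0.846

# Hypothetical workload size, chosen only for illustration (not from the paper).
NUM_IMAGES = 1_000_000


def total_seconds(latency_ms: float, num_images: int) -> float:
    """Total time in seconds, assuming each image costs latency_ms on average."""
    return latency_ms * num_images / 1000.0


def images_per_second(latency_ms: float) -> float:
    """Sustained throughput implied by an average per-image latency."""
    return 1000.0 / latency_ms


# Ratio of the two reported latencies.
speedup = CPU_LATENCY_MS / GPU_LATENCY_MS

# Projected wall-clock time for the hypothetical workload.
cpu_hours = total_seconds(CPU_LATENCY_MS, NUM_IMAGES) / 3600.0
gpu_minutes = total_seconds(GPU_LATENCY_MS, NUM_IMAGES) / 60.0

print(f"Speedup:        {speedup:.1f}x")
print(f"CPU throughput: {images_per_second(CPU_LATENCY_MS):.1f} images/s")
print(f"GPU throughput: {images_per_second(GPU_LATENCY_MS):.1f} images/s")
print(f"CPU time for {NUM_IMAGES:,} images: {cpu_hours:.1f} hours")
print(f"GPU time for {NUM_IMAGES:,} images: {gpu_minutes:.1f} minutes")
```

Under these assumptions, the reported latencies correspond to a speedup of roughly 57x, which would shorten the pre-processing of one million images from about 13.4 hours to about 14.1 minutes.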
Pages: 7