Ensemble deep learning model for optical character recognition

Cited by: 3

Authors:

Shetty, Ashish [1]

Sharma, Sanjeev [1]

Affiliation:

[1] Indian Institute of Information Technology, Pune, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2024 / Vol. 83 / Issue 04

Keywords:

Character recognition; OCR; Convolution Neural Network; CNN; Deep learning; The Chars74K dataset; Ensemble model;

DOI:

10.1007/s11042-023-16018-0

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Character recognition in images is an important field of study in modern deep learning because it has many real-life applications. The goal of this paper is to create a state-of-the-art character recognition model using a stacking ensemble of convolutional neural networks (CNNs). To develop the proposed ensemble model, we evaluated several CNN models, judged by their performance on the Chars74K dataset. The dataset contains 74,103 images divided into 62 classes with labels [A-Z], [a-z], and [0-9]. The results report the accuracy distribution across the dataset's subgroups (uppercase, lowercase, and digit). The proposed ensemble model achieves state-of-the-art performance, with a maximum accuracy of 92.31% on the complete dataset, 99.22% on uppercase letters, 98.66% on lowercase letters, 99.77% on digits, and 91.97% on uppercase and lowercase letters combined. A comparison between the proposed model and existing approaches, on both the complete and partial datasets, is also presented to demonstrate the effectiveness of the proposed work.
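The stacking idea named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the base CNNs or the meta-learner, so everything below is an assumption: the base models are represented only by their class-probability outputs, and a linear least-squares meta-learner stands in for whatever the paper actually uses. The function names (`stack_features`, `fit_meta_learner`, `predict`) are hypothetical.

```python
import numpy as np


def stack_features(base_probs):
    """Concatenate per-model class probabilities.

    base_probs: list of m arrays, each of shape (n_samples, n_classes).
    Returns an array of shape (n_samples, m * n_classes).
    """
    return np.concatenate(base_probs, axis=1)


def _with_bias(features):
    """Append a constant column so the linear meta-learner has a bias term."""
    ones = np.ones((features.shape[0], 1))
    return np.hstack([features, ones])


def fit_meta_learner(features, labels, num_classes):
    """Fit a linear meta-learner on one-hot targets by least squares.

    This is an illustrative stand-in, not the paper's meta-learner.
    In practice the features should come from held-out predictions
    of the base models to avoid leaking training labels.
    """
    targets = np.eye(num_classes)[labels]
    weights, *_ = np.linalg.lstsq(_with_bias(features), targets, rcond=None)
    return weights


def predict(weights, features):
    """Return the class index with the highest meta-learner score."""
    return np.argmax(_with_bias(features) @ weights, axis=1)
```

For the dataset described above, `num_classes` would be 62 and each base model would contribute a 62-column probability matrix, so three base models would give the meta-learner 186 input features per image.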


Pages: 11411-11431

Page count: 21
