Deep features based convolutional neural network model for text and non-text region segmentation from document images

Cited by: 16

Authors:

Umer, Saiyed

Mondal, Ranjan

Pandey, Hari Mohan [1,3]

Rout, Ranjeet Kumar [2]

Affiliations:

[1] Aliah Univ, Dept Comp Sci & Engn, Kolkata, India

[2] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata, India

[3] Edge Hill Univ, Dept Comp Sci, Ormskirk, Lancs, England

Source:

APPLIED SOFT COMPUTING | 2021 / Vol. 113

Keywords:

Complex layout; Document image; Text and Non-text region; Segmentation; Patch-based approach; Deep Learning Method; EXTRACTION;

DOI:

10.1016/j.asoc.2021.107917

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a deep convolutional neural network (CNN) model that uses deep learning features to segment text and non-text regions in document images. The key objective is to extract text regions from document images with complex layouts without any prior knowledge of the segmentation. In real-world scenarios, document and magazine images contain text alongside non-text regions such as symbols, logos, pictures, and graphics, which makes separating text regions from non-text regions challenging. To address this, the paper proposes an efficient and robust segmentation technique. The implementation of the proposed model is divided into three phases: (a) a pre-processing method that uses different patch sizes to handle variations in text fonts and sizes within the image; (b) a deep convolutional neural network model that predicts whether a region of the image is text, non-text, or ambiguous; (c) a post-processing method that handles images with complex ambiguous regions by recursively partitioning those regions into their proper classes (i.e., text or non-text), after which the system accumulates the responses of the predicted patches at varying resolutions to handle variations in text fonts within the image. Extensive computer simulations were conducted using a collection of complex-layout magazine images from Google sites and the ICDAR 2015 database. The results, compared against state-of-the-art methods, indicate that the proposed model is robust and more effective than those methods. © 2021 Elsevier B.V. All rights reserved.
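The three phases described in the abstract (multi-size patching, three-way patch prediction, and recursive partitioning of ambiguous patches followed by accumulation across resolutions) can be illustrated with a rough sketch. This is not the authors' implementation: `toy_classify` is a hypothetical stand-in for the CNN that simply thresholds mean intensity, and the patch sizes, thresholds, quadrant splitting, and per-pixel voting rule are all assumptions chosen only to make the control flow concrete.

```python
import numpy as np

TEXT, NON_TEXT, AMBIGUOUS = 0, 1, 2

def toy_classify(patch):
    """Hypothetical stand-in for the CNN: thresholds the mean intensity.
    The thresholds 64 and 192 are arbitrary illustrative values."""
    mean = patch.mean()
    if mean < 64:
        return TEXT
    if mean > 192:
        return NON_TEXT
    return AMBIGUOUS

def label_region(image, classify, top, left, size, min_size, out):
    """Label one square patch; split ambiguous patches into four quadrants.
    Assumes `size` is a power of two so the quadrants tile the patch."""
    label = classify(image[top:top + size, left:left + size])
    if label == AMBIGUOUS and size > min_size:
        half = size // 2
        for dt in (0, half):
            for dl in (0, half):
                label_region(image, classify, top + dt, left + dl,
                             half, min_size, out)
        return
    out[top:top + size, left:left + size] = label

def segment(image, classify, patch_sizes=(32, 16), min_size=4):
    """Accumulate per-pixel votes over several patch sizes (assumed rule).
    Assumes image height and width are multiples of every patch size."""
    h, w = image.shape
    text_votes = np.zeros((h, w), dtype=int)
    non_text_votes = np.zeros((h, w), dtype=int)
    for size in patch_sizes:
        labels = np.full((h, w), AMBIGUOUS, dtype=int)
        for top in range(0, h - size + 1, size):
            for left in range(0, w - size + 1, size):
                label_region(image, classify, top, left, size, min_size, labels)
        text_votes += labels == TEXT
        non_text_votes += labels == NON_TEXT
    # Ties (including pixels that stayed ambiguous at every scale) go to TEXT.
    return np.where(text_votes >= non_text_votes, TEXT, NON_TEXT)
```

For example, calling `segment(img, toy_classify)` on a 32x32 array whose left half is 0 and right half is 255 first sees one ambiguous 32x32 patch, splits it into 16x16 quadrants, and ends with the left half labeled `TEXT` and the right half `NON_TEXT`. In a real system the classifier would be a trained network and the accumulation rule would follow the paper's own definition.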


Pages: 14
