Text and non-text separation in offline document images: a survey

Cited by: 0
Authors
Showmik Bhowmik
Ram Sarkar
Mita Nasipuri
David Doermann
Affiliations
[1] Jadavpur University
[2] University of Maryland, Institute for Advanced Computer Studies
Source
International Journal on Document Analysis and Recognition (IJDAR) | 2018 / Vol. 21
Keywords
Text/non-text separation; Segmentation; Offline document images; Engineering drawing; Map; Unconstrained handwritten document; Newspaper; Journal; Magazine; Check; Form; Survey
DOI
Not available
Abstract
Separation of text and non-text is an essential processing step for any document analysis system. A clear understanding of the state of the art in text/non-text separation is therefore important for developing efficient document processing systems. This paper first summarizes the technical challenges of performing text/non-text separation. It then categorizes offline document images into different classes according to the nature of the challenges one faces, in an attempt to provide insight into the various techniques presented in the literature. The pros and cons of these techniques are explained wherever possible. Along with evaluation protocols and benchmark databases, the paper also presents a performance comparison of different methods. Finally, it highlights future research challenges and directions in this domain.
Pages: 1 - 20
Number of pages: 19
Related papers
  • [1] Text and non-text separation in offline document images: a survey
    Bhowmik, Showmik
    Sarkar, Ram
    Nasipuri, Mita
    Doermann, David
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2018, 21 (1-2) : 1 - 20
  • [2] Text/non-text classification of connected components in document images
    Julca-Aguilar, Frank D.
    Maia, Ana L. L. M.
    Hirata, Nina S. T.
    2017 30TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 2017 : 450 - 455
  • [3] Text and Non-text Separation in Handwritten Document Images Using Local Binary Pattern Operator
    Bhowmik, Showmik
    Sarkar, Ram
    Nasipuri, Mita
    PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND COMMUNICATION, 2017, 458 : 507 - 515
  • [4] Separation of Text and Non-text in Document Layout Analysis using a Recursive Filter
    Tuan-Anh Tran
    Na, In-Seop
    Kim, Soo-Hyung
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2015, 9 (10) : 4072 - 4091
  • [5] Text/Non-Text Separation from Handwritten Document Images Using LBP Based Features: An Empirical Study
    Ghosh, Sourav
    Lahiri, Dibyadwati
    Bhowmik, Showmik
    Kavallieratou, Ergina
    Sarkar, Ram
    JOURNAL OF IMAGING, 2018, 4 (04)
  • [6] Text and Non-text Separation in Scanned Color-Official Documents
    Nandedkar, Amit Vijay
    Mukherjee, Jayanta
    Sural, Shamik
    COMPUTER VISION, GRAPHICS, AND IMAGE PROCESSING, ICVGIP 2016, 2017, 10481 : 231 - 242
  • [7] Context Modeling for Text/Non-Text Separation in Freeform Online Handwritten Documents
    Delaye, Adrien
    Liu, Cheng-Lin
    DOCUMENT RECOGNITION AND RETRIEVAL XX, 2013, 8658
  • [8] Deep features based convolutional neural network model for text and non-text region segmentation from document images
    Umer, Saiyed
    Mondal, Ranjan
    Pandey, Hari Mohan
    Rout, Ranjeet Kumar
    APPLIED SOFT COMPUTING, 2021, 113
  • [9] Application of texture-based features for text non-text classification in printed document images with novel feature selection algorithm
    Soulib Ghosh
    S. K. Khalid Hassan
    Ali Hussain Khan
    Ankur Manna
    Showmik Bhowmik
    Ram Sarkar
    Soft Computing, 2022, 26 : 891 - 909
  • [10] Application of texture-based features for text non-text classification in printed document images with novel feature selection algorithm
    Ghosh, Soulib
    Hassan, S. K. Khalid
    Khan, Ali Hussain
    Manna, Ankur
    Bhowmik, Showmik
    Sarkar, Ram
    SOFT COMPUTING, 2022, 26 (02) : 891 - 909