Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks

Cited by: 2
Authors
Fizaine, Florian Come [1 ,2 ]
Bard, Patrick [1 ]
Paindavoine, Michel [1 ]
Robin, Cecile [2 ,3 ]
Bouye, Edouard [2 ]
Lefevre, Raphael [4 ]
Vinter, Annie [1 ]
Affiliations
[1] Univ Bourgogne, LEAD CNRS, F-21000 Dijon, France
[2] Arch Dept Cote dOr, F-21000 Dijon, France
[3] Inst Natl Patrimoine, F-75002 Paris, France
[4] Soc Natl Chemins Fer Francais, F-93200 St Denis, France
Keywords
deep learning; line segmentation; instance segmentation; Mask-RCNN; U-Net; historical document analysis; DOCUMENTS;
DOI
10.3390/jimaging10030065
Chinese Library Classification number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient, but fall under the same concept, requiring a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted, one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN using relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that a light mask processing is an efficient and simple solution to improve evaluation, and that Mask-RCNN leads to better HTR performance.
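The abstract notes that U-Net-style networks need a post-processing step to turn their output into text line instances. As a rough illustration only, the sketch below labels connected foreground regions of a binary mask so that each region becomes one line instance. It is not the post-processing used by ARU-Net, dhSegment, or Doc-UFCN, nor the method of the paper; the function name `label_instances` and the toy mask are hypothetical and chosen for this example.

```python
from collections import deque


def label_instances(mask):
    """Label 4-connected foreground regions of a binary mask.

    mask is a list of rows holding 0 (background) or 1 (text line pixel).
    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count for each connected region.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new instance here.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    # Visit the four direct neighbours (up, down, left, right).
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical toy mask: two text lines separated by background rows.
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
labels, n_lines = label_instances(toy_mask)
print(n_lines)  # number of separate line instances found
```

In this simple scheme, two lines whose masks touch anywhere are merged into a single region, so the instance count depends on the quality of the semantic mask. An instance segmentation network such as Mask-RCNN returns one mask per detected object and does not need this labelling step.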
Pages: 18