GAN-based text line segmentation method for challenging handwritten documents

Cited by: 0
Authors
Ozseker, Ibrahim [1]
Demir, Ali Alper [2]
Ozkaya, Ufuk [2]
Affiliations
[1] Yildiz Tech Univ, Kafein Technol Solut, Davutpasa Campus A2, TR-34220 Istanbul, Turkiye
[2] Suleyman Demirel Univ, Elect Elect Engn Dept, TR-32260 Isparta, Turkiye
Keywords
Text line segmentation; Generative adversarial networks; Document analysis; Handwritten document; EXTRACTION;
DOI
10.1007/s10032-024-00488-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text line segmentation (TLS) is an essential step in end-to-end document analysis systems. Its main purpose is to extract the individual text lines of a handwritten document with high accuracy. Handwritten and historical documents often contain touching and overlapping characters, heavy diacritics, and footnotes and side notes added over the years. In this work, we present a new TLS method based on generative adversarial networks (GANs). The TLS problem is treated as an image-to-image translation problem, and the GAN model is trained to learn the spatial relationship between the individual text lines and the corresponding masks that contain them. To evaluate the segmentation performance of the proposed GAN model, two challenging datasets, VML-AHTE and VML-MOC, were used. According to the qualitative and quantitative results, the proposed GAN model achieved the best segmentation accuracy on the VML-MOC dataset and showed competitive performance on the VML-AHTE dataset.
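For readers unfamiliar with GAN-based image-to-image translation, the following is a minimal sketch of a commonly used conditional GAN objective (the pix2pix-style formulation). The abstract does not state which loss the authors use, so this is an assumption for illustration only. The symbols are likewise assumptions: x is a document image, y its ground-truth text line mask, G the generator, D the discriminator, and \lambda a hypothetical weight on the L1 term.

```latex
% Assumed pix2pix-style objective; NOT confirmed by the abstract.
% x: document image, y: ground-truth text line mask,
% G: generator (image -> mask), D: discriminator, \lambda: hypothetical L1 weight.
\begin{align}
\mathcal{L}_{\mathrm{cGAN}}(G, D) &= \mathbb{E}_{x,y}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big] \\
\mathcal{L}_{L1}(G) &= \mathbb{E}_{x,y}\big[\lVert y - G(x) \rVert_{1}\big] \\
G^{*} &= \arg\min_{G}\max_{D}\; \mathcal{L}_{\mathrm{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
\end{align}
```

Under this formulation the discriminator judges image and mask pairs rather than masks alone, which is what would let the generator learn where text lines sit relative to the page content.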
Pages: 11