A deep learning framework for historical manuscripts writer identification using data-driven features

Cited by: 10
Authors
Bennour, Akram [1]
Boudraa, Merouane [1]
Siddiqi, Imran [2]
Al-Sarem, Mohammad [3]
Al-Shaby, Mohammad [4]
Ghabban, Fahad [3]
Affiliations
[1] Echahid Cheikh Larbi Tebessi Univ, Lab Math Informat & Syst LAMIS, Tebessa, Algeria
[2] Xynoptik Pty Ltd, Melbourne, Vic, Australia
[3] Taibah Univ, Coll Comp Sci & Engn, Medina, Saudi Arabia
[4] Taibah Univ, Coll Business Adm, Dept Management Informat Syst, Medina, Saudi Arabia
Keywords
Writer identification; Historical manuscripts; Feature extraction; Key-points detection; Clustering; Deep learning;
DOI
10.1007/s11042-024-18187-y
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Writer identification from historical manuscripts is a challenging problem with significant implications for understanding the authorship of ancient texts. In this paper, we propose a novel deep learning framework tailored to writer identification in historical manuscripts. Our approach leverages data-driven features, using neural networks to extract and learn discriminative patterns from handwritten historical documents. The key innovation of our framework lies in its ability to automatically discover and use relevant features from the data to profile the writer, eliminating the need for manual feature engineering. Our methodology comprises three well-defined steps. First, manuscript preprocessing denoises the image using techniques such as non-local means and total variation, followed by binarization using a Canny edge detector. Second, we employ the Harris corner detector for automatic key-point detection and clustering, which allows us to identify the regions of interest within the documents. Third, the patches extracted from these regions are classified through transfer learning, using a deep learning-based model trained specifically on the extracted patches. To obtain the final document-level identification, we improve system accuracy with a majority vote scheme in which the aggregated decisions from multiple patches determine the final classification. We validate our approach on the ICDAR 2017 dataset, which spans different periods and writing styles of historical manuscripts. Experimental results demonstrate the superior performance of our method in identifying the authors of historical documents, surpassing existing techniques. Moreover, our framework remains robust when only limited training data is available.
This work not only contributes to the field of historical manuscript analysis but also highlights the potential of deep learning for solving intricate problems in document analysis and authorship attribution. Our framework offers scholars and historians a promising way to gain deeper insight into the authors of historical texts, opening new directions for historical research and preservation.
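The document-level majority vote mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the writer labels, and the tie-breaking rule (the label seen first wins) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(patch_predictions: Sequence[Hashable]) -> Hashable:
    """Return the writer label assigned to the most patches of one document."""
    if not patch_predictions:
        raise ValueError("at least one patch prediction is required")
    counts = Counter(patch_predictions)
    # most_common orders labels by count; labels with equal counts keep
    # first-seen order, which is an assumed tie-break, not the paper's rule.
    return counts.most_common(1)[0][0]


# Hypothetical patch-level predictions for a single manuscript page.
patch_labels = ["writer_07", "writer_12", "writer_07", "writer_07", "writer_03"]
document_label = majority_vote(patch_labels)
```

In this sketch each patch contributes one equally weighted vote; weighting votes by classifier confidence would be a different scheme and is not something the abstract describes.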
Pages: 80075-80101
Number of pages: 27