KERTAS: dataset for automatic dating of ancient Arabic manuscripts

Cited by: 21
Authors
Adam, Kalthoum [1]
Baig, Asim [1]
Al-Maadeed, Somaya [1]
Bouridane, Ahmed [2]
El-Menshawy, Sherine [3]
Affiliations
[1] Qatar Univ, Coll Engn, Comp Sci & Engn Dept, Doha, Qatar
[2] Northumbria Univ Newcastle, Dept Comp & Informat Sci, Newcastle Upon Tyne, Tyne & Wear, England
[3] Qatar Univ, Coll Arts & Sci, Dept Humanities, Doha, Qatar
Keywords
Historical documents dataset; Image processing; Classification; Feature extraction; Recognition; Features
DOI
10.1007/s10032-018-0312-3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The age of a historical manuscript can be an invaluable source of information for paleographers and historians. The process of automatic manuscript age detection has inherent complexities, which are compounded by the lack of suitable datasets for algorithm testing. This paper presents a dataset of historical handwritten Arabic manuscripts designed specifically to test state-of-the-art authorship and age detection algorithms. The Qatar National Library has been the main source of manuscripts for this dataset, while the remaining manuscripts are open source. The dataset consists of over 2000 images taken from various handwritten Arabic manuscripts spanning fourteen centuries. In addition, a sparse representation-based approach for dating historical Arabic manuscripts is proposed. There is a lack of existing datasets that provide reliable writing dates and author identities as metadata. KERTAS is a new dataset of historical documents that can help researchers, historians and paleographers to automatically date Arabic manuscripts more accurately and efficiently.
Pages: 283-290
Page count: 8