Hadith data mining and classification: a comparative analysis

Cited by: 43
Authors
Saloot, Mohammad Arshi [1]
Idris, Norisma [1]
Mahmud, Rohana [1]
Ja'afar, Salinah [1]
Thorleuchter, Dirk [2]
Gani, Abdullah [1]
Affiliations
[1] Univ Malaya, Kuala Lumpur 50603, Malaysia
[2] Inst Fraunhofer INT, Appelsgarten 2, D-53879 Euskirchen, Germany
Keywords
Review; Comparison; Islamic knowledge; Hadith; Classification; Data mining
DOI
10.1007/s10462-016-9458-x
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hadiths are important textual sources of law, tradition, and teaching in the Islamic world. The unique linguistic features of Hadiths (e.g., ancient Arabic language and story-like text) call for the compilation and use of specific natural language processing methods. No study in the literature focuses solely on Hadith from an artificial intelligence perspective, and many new developments have been overlooked and need to be highlighted. This review therefore analyzes all academic journal and conference publications that apply two main artificial intelligence methods to Hadith text: Hadith classification and Hadith mining. All Hadith-relevant methods and algorithms from the literature are discussed and analyzed in terms of functionality, simplicity, F-score, and accuracy. Because the studies use different Hadith datasets, their evaluation results cannot be compared directly. We have therefore re-implemented and evaluated the methods on a single dataset (3150 Hadiths from the book Sahih Al-Bukhari). The evaluation of the classification methods shows that neural networks classify Hadiths with 94% accuracy, because neural networks can handle complex (high-dimensional) input data. Among the re-evaluated Hadith mining methods, the one that combines a vector space model, cosine similarity, and enriched queries achieves the best accuracy (88%). The most important aspect of Hadith mining methods is query expansion, since the query must be fitted to the Hadith lingo. Knowledge-based methods are evidently lacking in Hadith classification and mining approaches, and future work could fill this gap using knowledge graphs.
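The combination of vector space model, cosine similarity, and query expansion named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed TF-IDF weighting, the toy English documents, and the `SYNONYMS` table are all assumptions made for illustration, and real Hadith text would need Arabic-aware preprocessing.

```python
import math
from collections import Counter

# Toy stand-ins for documents; NOT real Hadith text (illustrative assumption).
DOCS = [
    "charity given in secret",
    "fasting during the month",
    "prayer at night",
]
# Hypothetical expansion table mapping a query term to corpus vocabulary.
SYNONYMS = {"alms": ["charity"]}

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()

def expand_query(tokens, synonyms):
    """Append every listed synonym of each query token."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(synonyms.get(tok, []))
    return expanded

def build_idf(docs_tokens):
    """Smoothed inverse document frequency for each corpus term."""
    n = len(docs_tokens)
    df = Counter()
    for toks in docs_tokens:
        df.update(set(toks))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms outside the corpus vocabulary are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs, synonyms):
    """Return (score, doc_index) pairs, best match first."""
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf(t, idf) for t in docs_tokens]
    q_vec = tfidf(expand_query(tokenize(query), synonyms), idf)
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))

if __name__ == "__main__":
    print(rank("alms", DOCS, SYNONYMS))  # expanded query
    print(rank("alms", DOCS, {}))        # same query without expansion
```

In this toy setup the query word "alms" does not occur in any document, so without expansion every similarity is zero. With the hypothetical synonym table the query also carries "charity" and the first document ranks highest, which is the sense in which the query is fitted to the vocabulary of the collection.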
Pages: 113-128
Page count: 16