Indexing Arabic texts using association rule data mining

Cited by: 7
Authors
Haraty, Ramzi A. [1]
Nasrallah, Rouba [2]
Affiliations
[1] Lebanese Amer Univ, Dept Comp Sci & Math, Beirut, Lebanon
[2] Lebanese Amer Univ, Beirut, Lebanon
Keywords
Precision; Recall; Arabic text; Auto-indexing; Frequent sets; Rule-based data mining; FREQUENCY;
DOI
10.1108/LHT-07-2017-0147
Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work];
Discipline classification code
1205 ; 120501 ;
Abstract
Purpose: The purpose of this paper is to propose a new model to enhance the auto-indexing of Arabic texts. The model extracts new relevant words by using data mining rules to relate the words chosen by previous classical methods to new words.
Design/methodology/approach: The proposed model uses an association rule algorithm that extracts frequent sets of related items. It uses these sets to find relationships between words in the texts to be indexed and words from texts that belong to the same category. The extracted word associations are presented as sets of words that frequently appear together.
Findings: The proposed methodology shows significant enhancement in terms of accuracy, efficiency and reliability when compared to previous works.
Research limitations/implications: The stemming algorithm can be further enhanced. The Arabic language has many grammatical rules, and the more of these rules are integrated into the stemming algorithm, the better the stemming will be. The stop-list can also be enhanced by adding more words that should not be taken into consideration by the indexing mechanism. Numbers should be added to the list as well. A thesaurus system should also be used, because it links different phrases or words with the same meaning to each other, which improves the indexing mechanism. The authors also invite researchers to add more pre-requisite texts to obtain better results.
Originality/value: In this paper, the authors present a full text-based auto-indexing method for Arabic text documents. The auto-indexing method extracts new relevant words by using data mining rules, which has not been investigated before. The method uses an association rule mining algorithm that extracts frequent sets of related items to find relationships between words in the texts to be indexed and words from texts that belong to the same category. The benefits of the method are demonstrated using empirical work involving several Arabic texts.
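The idea described in the abstract, relating already-chosen index words to new words through frequent sets mined from texts of the same category, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`frequent_pairs`, `expand_index_terms`), the thresholds, the restriction to word pairs, and the English placeholder tokens standing in for stemmed Arabic words are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(docs, min_support):
    """Return frequent word pairs and frequent single words with their support.

    Support is the fraction of documents that contain the word (or pair).
    """
    n = len(docs)
    word_counts = Counter(w for doc in docs for w in set(doc))
    # Apriori-style pruning: a pair can be frequent only if both words are.
    frequent_words = {w for w, c in word_counts.items() if c / n >= min_support}
    pair_counts = Counter()
    for doc in docs:
        kept = sorted(set(doc) & frequent_words)
        pair_counts.update(combinations(kept, 2))
    pairs = {p: c / n for p, c in pair_counts.items() if c / n >= min_support}
    word_support = {w: word_counts[w] / n for w in frequent_words}
    return pairs, word_support


def expand_index_terms(seed_terms, docs, min_support=0.5, min_confidence=0.8):
    """Suggest new index words associated with the seed index terms.

    A word rhs is suggested when the rule lhs -> rhs, with lhs a seed term,
    has confidence = support(lhs, rhs) / support(lhs) >= min_confidence.
    """
    pairs, word_support = frequent_pairs(docs, min_support)
    added = set()
    for (a, b), support in pairs.items():
        for lhs, rhs in ((a, b), (b, a)):
            if lhs in seed_terms and rhs not in seed_terms:
                if support / word_support[lhs] >= min_confidence:
                    added.add(rhs)
    return sorted(added)


# Hypothetical pre-processed texts from one category (stop words removed,
# words stemmed); English tokens are placeholders for Arabic stems.
category_docs = [
    ["bank", "loan", "interest", "market"],
    ["bank", "loan", "interest"],
    ["bank", "loan", "market"],
    ["market", "trade", "export"],
]

# "bank" plays the role of a word already chosen by a classical method.
new_terms = expand_index_terms({"bank"}, category_docs)
print(new_terms)  # ['loan']
```

In this toy corpus, "loan" appears in every document that contains "bank", so the rule bank -> loan has confidence 1.0 and "loan" is proposed as an additional index word. Lowering `min_confidence` admits weaker associations, which would typically trade precision for recall.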
Pages: 101-117
Page count: 17