Using Extended Random Set to Find Specific Patterns

Cited by: 5
Authors
Albathan, Mubarak [1, 2]
Li, Yuefeng [1]
Xu, Yue [1]
Affiliations
[1] Queensland Univ Technol, Sch Elect Engn & Comp Sci, Brisbane, Qld 4001, Australia
[2] Al Imam Mohammad Ibn Saud Islamic Univ, Riyadh 11432, Saudi Arabia
Source
2014 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 2 | 2014
Keywords
Specific Closed Sequential Patterns; Select top-k Patterns; Extended Random Set; Text mining; Information retrieval
DOI
10.1109/WI-IAT.2014.77
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the overwhelming increase in the amount of data on the web and in databases, many text mining techniques have been proposed for mining useful patterns in text documents. Extracting closed sequential patterns using the Pattern Taxonomy Model (PTM) is one pruning method for removing noisy, inconsistent, and redundant patterns. However, PTM treats each extracted pattern as a whole without considering the terms it contains, which can affect the quality of the extracted patterns. This paper proposes an innovative and effective method that extends the random set to accurately weight patterns based on their distribution in the documents and on the distribution of their terms in patterns. The proposed approach then finds the specific closed sequential patterns (SCSP) based on the newly calculated weight. Experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms other state-of-the-art methods on several popular measures.
Pages: 30-37
Page count: 8