Parameterless data compression and noise filtering using association rule mining

Cited by: 0
Authors
Woon, YK
Li, X
Ng, WK
Lu, WF
Affiliations
[1] Nanyang Technol Univ, Singapore 639798, Singapore
[2] Singapore Inst Mfg Technol, Singapore 638075, Singapore
[3] Singapore MIT Alliance, Singapore, Singapore
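Source
DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS | 2003 / Vol. 2737
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract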
The explosion of raw data in our information age necessitates the use of unsupervised knowledge discovery techniques to understand mountains of data. Cluster analysis is suitable for this task because of its ability to discover natural groupings of objects without human intervention. However, noise in the data greatly affects clustering results. Existing clustering techniques use density-based, grid-based or resolution-based methods to handle noise, but they require the fine-tuning of complex parameters. Moreover, for high-dimensional data that cannot be visualized by humans, this fine-tuning process is greatly impaired. There are several noise/outlier detection techniques, but they too need suitable parameters. In this paper, we present a novel parameterless method of filtering noise using ideas borrowed from association rule mining. We term our technique FLUID (Filtering Using Itemset Discovery). FLUID automatically discovers representative points in the dataset without any input parameter by mapping the dataset into a form suitable for frequent itemset discovery. After frequent itemsets are discovered, they are mapped back to their original form and become representative points of the original dataset. As such, FLUID accomplishes both data and noise reduction simultaneously, making it an ideal preprocessing step for cluster analysis. Experiments involving a prominent synthetic dataset demonstrate the effectiveness and efficiency of FLUID.
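The general idea in the abstract (map points to itemsets, keep the frequent ones, map them back to representative points) can be illustrated with a toy sketch. This is not the FLUID algorithm: the abstract does not say how FLUID builds its mapping or how it avoids parameters, so the function name `representative_points`, the grid discretization, and the explicit `cell_size` and `min_support` arguments are all assumptions made purely for illustration.

```python
import math
from collections import defaultdict


def representative_points(points, cell_size, min_support):
    """Toy illustration only, NOT the FLUID algorithm.

    Assumed mapping: each point becomes a "transaction" whose items are
    (dimension, bin) pairs. Transactions with identical items form one
    itemset; itemsets seen at least `min_support` times are kept and
    mapped back to the centroid of the points that produced them.
    """
    groups = defaultdict(list)
    for p in points:
        # Discretize every coordinate into a bin index (hypothetical mapping).
        itemset = tuple((d, math.floor(c / cell_size)) for d, c in enumerate(p))
        groups[itemset].append(p)

    reps = []
    for members in groups.values():
        if len(members) >= min_support:  # frequent itemset: keep it
            dim = len(members[0])
            centroid = tuple(
                sum(m[i] for m in members) / len(members) for i in range(dim)
            )
            reps.append(centroid)
        # Infrequent itemsets are treated as noise and dropped.
    return sorted(reps)
```

Calling `representative_points([(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (5.0, 5.0)], cell_size=1.0, min_support=2)` collapses the three nearby points into a single representative and discards the isolated one, which shows data reduction and noise reduction happening in the same pass.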
Pages: 278-287
Page count: 10