Parameterless data compression and noise filtering using association rule mining

Cited by: 0

Authors:

Woon, YK

Li, X

Ng, WK

Lu, WF

Affiliations:

[1] Nanyang Technol Univ, Singapore 639798, Singapore

[2] Singapore Inst Mfg Technol, Singapore 638075, Singapore

[3] Singapore MIT Alliance, Singapore, Singapore

Source:

DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS | 2003 / Volume 2737

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

The explosion of raw data in our information age necessitates the use of unsupervised knowledge discovery techniques to understand mountains of data. Cluster analysis is suitable for this task because it can discover natural groupings of objects without human intervention. However, noise in the data greatly affects clustering results. Existing clustering techniques use density-based, grid-based or resolution-based methods to handle noise, but they require the fine-tuning of complex parameters. Moreover, this fine-tuning is especially difficult for high-dimensional data, which humans cannot visualize. There are several noise/outlier detection techniques, but they too need suitable parameters. In this paper, we present a novel parameterless method of filtering noise using ideas borrowed from association rule mining. We call our technique FLUID (Filtering Using Itemset Discovery). FLUID automatically discovers representative points in the dataset without any input parameter by mapping the dataset into a form suitable for frequent itemset discovery. After frequent itemsets are discovered, they are mapped back to their original form and become representative points of the original dataset. As such, FLUID accomplishes both data reduction and noise reduction simultaneously, making it an ideal preprocessing step for cluster analysis. Experiments involving a prominent synthetic dataset demonstrate the effectiveness and efficiency of FLUID.
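
The map, mine, map-back idea described in the abstract can be illustrated with a minimal sketch. This is not the FLUID algorithm itself: the abstract does not give its details, and FLUID is stated to be parameterless, whereas this sketch assumes two explicit parameters, `bin_width` and `min_support`, purely for illustration. All function names here are hypothetical. Each point becomes a transaction of (dimension, bin) items, itemsets are counted by brute force, and each frequent itemset that covers every dimension is mapped back to the centre of its grid cell.

```python
from collections import Counter
from itertools import combinations


def to_transaction(point, bin_width):
    """Map a numeric point to a set of (dimension, bin index) items."""
    return frozenset((d, int(x // bin_width)) for d, x in enumerate(point))


def frequent_itemsets(transactions, min_support):
    """Brute-force itemset counting; only suitable for tiny, low-dimensional data."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def representative_points(points, bin_width=1.0, min_support=3):
    """Return cell centres of grid cells supported by at least min_support points.

    bin_width and min_support are assumptions of this sketch, not part of FLUID.
    """
    dims = len(points[0])
    transactions = [to_transaction(p, bin_width) for p in points]
    reps = []
    for itemset in frequent_itemsets(transactions, min_support):
        if len(itemset) == dims:  # one item per dimension: a full grid cell
            # Map the itemset back to data space via the centre of its cell.
            reps.append(tuple((b + 0.5) * bin_width for _, b in itemset))
    return sorted(reps)


points = [
    (0.1, 0.2), (0.3, 0.4), (0.8, 0.9),  # dense group near the origin
    (5.1, 5.2), (5.3, 5.9), (5.5, 5.5),  # dense group near (5, 5)
    (9.7, 2.1),                          # isolated point, treated as noise
]
print(representative_points(points))
```

In this toy run the seven input points reduce to two representative points, and the isolated point is dropped because its cell never reaches the assumed support threshold, which mirrors the combined data and noise reduction the abstract describes.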


Pages: 278-287

Page count: 10
