Wavelet-based attribute noise detection

Cited by: 0
Authors
Folleco, Andres [1]
Khoshgoftaar, Taghi [1]
Affiliations
[1] Florida Atlantic Univ, Coll Engn & Comp Sci, Boca Raton, FL 33431 USA
Source
Eleventh ISSAT International Conference Reliability and Quality in Design, Proceedings | 2005
Keywords
wavelet analysis; noise detection; data quality; supervised and unsupervised datasets; near real-time processing
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Data quality is critically important when inferring knowledge from datasets. Identifying noisy attributes, which can easily corrupt and curtail the valuable knowledge and information in a dataset, can be invaluable to analysts. We present a novel method for detecting noisy attributes in software metrics datasets using multi-resolution transformations based on discrete wavelets. The method was applied to supervised, full-scale scientific datasets from NASA's Software Metric Data Program (MDP). The empirical results compare favorably with those obtained on the same datasets by the robust Pairwise Attribute Noise Detection Algorithm (PANDA). The results were verified with several case studies in which known simulated noise was injected into a specific attribute of a dataset with no class noise.
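The abstract does not give the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes instances are ordered so that a clean attribute varies smoothly, applies one level of a Haar discrete wavelet transform to each attribute, and ranks attributes by a robust noise estimate taken from the detail coefficients. The function names (`haar_detail`, `noise_score`, `rank_attributes`), the scoring rule, and the synthetic data are all hypothetical. Mirroring the case studies described above, known simulated noise is injected into one attribute.

```python
import numpy as np

def haar_detail(x):
    """Return the detail (high-frequency) coefficients of a one-level Haar DWT."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:              # drop the last sample so values pair up
        x = x[:-1]
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def noise_score(x):
    """Hypothetical score: robust noise level relative to the attribute's spread.

    The noise level is estimated as median(|detail|) / 0.6745, the usual
    median-absolute-deviation estimate for Gaussian noise.
    """
    sigma = np.median(np.abs(haar_detail(x))) / 0.6745
    spread = np.std(x)
    return sigma / spread if spread > 0 else 0.0

def rank_attributes(data):
    """Return attribute indices sorted from most to least noisy, plus the scores."""
    scores = np.array([noise_score(data[:, j]) for j in range(data.shape[1])])
    return np.argsort(scores)[::-1], scores

# Synthetic stand-in for a dataset: three smooth attributes, 256 instances.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 256)
data = np.column_stack([np.sin(2 * np.pi * t), t ** 2, np.cos(4 * np.pi * t)])

# Inject known simulated noise into attribute 1 only.
data[:, 1] += rng.normal(0.0, 0.2, size=t.size)

order, scores = rank_attributes(data)
print("attributes ranked from most to least noisy:", order)
```

A single Haar level keeps the sketch dependency-free; a multi-resolution analysis of the kind the abstract mentions would repeat the transform on the approximation coefficients and examine the detail coefficients at each scale.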
Pages: 116-121
Page count: 6