A Data Preprocessing Algorithm Based-on SVM in Data Warehouse

Citations: 0
Authors
Wang Jianfen [1]
Shi Changhong [1]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Zhejiang, Peoples R China
Source
ISTM/2009: 8TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-6 | 2009
Keywords
SVM; data preprocessing; SVM-decision tree; data warehouse;
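DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808 ; 0809 ;
Abstract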
Because real-world data tends to be incomplete, noisy, and inconsistent, data preprocessing is an important issue for both data warehousing and data mining. Besides well-structured data, a data warehouse integrates semi-structured data from WWW data sources and unstructured data from external files. This paper presents a preprocessing classification algorithm based on an SVM-decision tree. The multi-category classifier combines SVMs with a binary decision tree and is used for data classification in the data warehouse. It reduces the training scale of the SVM classifier and improves training efficiency. An experiment that classifies Chinese web pages, one kind of semi-structured data, with this algorithm shows that it not only reduces the size of the training set but also achieves very high training efficiency. Its precision and recall are also very good.
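The abstract gives no implementation details, so the following is only a minimal sketch of how a binary decision tree of SVMs can act as a multi-category classifier. Everything in it is an assumption made for illustration: the function names (`train_linear_svm`, `build_tree`, `predict`), the rule of splitting the sorted class list in half at each node, the plain sub-gradient linear SVM, and the four-class toy data. None of these come from the paper. The sketch does show the property the abstract mentions: each node trains only on samples of the classes that reach it, so SVMs deeper in the tree see smaller training sets.

```python
def train_linear_svm(X, y, epochs=50, lr=0.01, lam=0.01):
    """Sub-gradient descent on the L2-regularized hinge loss; labels are -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            w = [wj * (1.0 - lr * lam) for wj in w]  # shrink step from the L2 penalty
            if yi * score < 1.0:  # sample is inside the margin or misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def build_tree(X, labels, classes=None):
    """Recursively split the class set in two and train one binary SVM per split."""
    if classes is None:
        classes = sorted(set(labels))
    if len(classes) == 1:
        return {"leaf": classes[0]}
    # Assumption: halve the sorted class list; the paper's grouping rule is not given.
    mid = len(classes) // 2
    left, right = classes[:mid], classes[mid:]
    # Only samples whose class reaches this node are used for training here.
    rows = [(x, c) for x, c in zip(X, labels) if c in classes]
    Xn = [x for x, _ in rows]
    cn = [c for _, c in rows]
    yn = [-1 if c in left else 1 for c in cn]
    w, b = train_linear_svm(Xn, yn)
    return {
        "w": w, "b": b, "n_train": len(rows),
        "left": build_tree(Xn, cn, left),
        "right": build_tree(Xn, cn, right),
    }


def predict(tree, x):
    """Walk from the root to a leaf, following the sign of each node's SVM score."""
    while "leaf" not in tree:
        score = sum(wj * xj for wj, xj in zip(tree["w"], x)) + tree["b"]
        tree = tree["left"] if score < 0 else tree["right"]
    return tree["leaf"]


# Hypothetical toy data: four well-separated classes in two dimensions.
centers = {"A": (-4.5, -4.5), "B": (-4.5, 4.5), "C": (4.5, -4.5), "D": (4.5, 4.5)}
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5)]
X = [(cx + dx, cy + dy) for cx, cy in centers.values() for dx, dy in offsets]
labels = [name for name in centers for _ in offsets]

tree = build_tree(X, labels)
print(predict(tree, (5, 5)), tree["n_train"], tree["left"]["n_train"])
```

With K classes this structure trains K - 1 binary SVMs. In the toy data the root is trained on all 16 samples and each child on only 8. A real text classifier would replace the two toy coordinates with document feature vectors and would likely use a tuned SVM library instead of this hand-written trainer.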
Pages: 648-650
Page count: 3