Feature and instance reduction for PNN classifiers based on fuzzy rough sets

Cited by: 34
Authors
Tsang, Eric C. C. [1 ]
Hu, Qinghua [2 ]
Chen, Degang [3 ]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Macau, Peoples R China
[2] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300072, Peoples R China
[3] North China Elect Power Univ, Sch Math & Phys, Beijing, Peoples R China
Keywords
Nearest neighbor rule; Feature reduction; Instance reduction; Fuzzy rough set; NEAREST-NEIGHBOR SEARCH; SELECTION; MODEL; ALGORITHMS; GENERATION; RULE;
DOI
10.1007/s13042-014-0232-6
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Instance reduction for the K-nearest-neighbor (KNN) classification rule has attracted much attention in recent years, yet most existing approaches lose the probability semantics of the original data. In this work, we propose a new reduced KNN rule, called FAIR-KNN, which performs feature and instance reduction based on fuzzy rough set theory. First, fuzzy rough sets are used to evaluate candidate features and select the most informative ones. The feature selection algorithm returns the selected features together with the membership values of samples to the lower approximations of their classes. These memberships reflect the distances of the samples to the classification boundary and are used to compute the probabilities with which samples are subsampled. A weighted Parzen window technique is then introduced to estimate the probability distribution from the weighted subsampled data. In this way, we reduce both the features and the samples of the original data without losing the probability semantics. Finally, the memberships of samples to the lower and upper approximations of the decisions are interpreted as the certainty and possibility degrees, respectively, of samples belonging to the corresponding decisions; the probability-weighted averages of these memberships are output as the certainty and possibility degrees of unseen samples belonging to each decision, which enriches the semantics of KNN. Numerical experiments on artificial and real-world data validate the effectiveness of the proposed technique.
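The two core ingredients described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian similarity relation, the Kleene-Dienes implicator used for the lower approximation, the kernel bandwidths, and both function names are illustrative assumptions, since the record does not specify the paper's exact fuzzy similarity or implicator.

```python
import numpy as np

def fuzzy_lower_membership(X, y, sigma=1.0):
    """Membership of each sample to the fuzzy lower approximation of its class.

    Assumes a Gaussian similarity R(x, z) = exp(-||x - z||^2 / sigma) and the
    Kleene-Dienes implicator I(a, b) = max(1 - a, b), so that
    mu_lower(x) = min over z of max(1 - R(x, z), [z has same class as x]).
    Samples far from the class boundary get memberships close to 1.
    """
    n = X.shape[0]
    mu = np.empty(n)
    for i in range(n):
        d2 = np.sum((X - X[i]) ** 2, axis=1)      # squared distances to all samples
        R = np.exp(-d2 / sigma)                   # fuzzy similarity to sample i
        same = (y == y[i]).astype(float)          # class indicator as fuzzy set
        mu[i] = np.min(np.maximum(1.0 - R, same)) # inf of the implicator values
    return mu

def weighted_parzen_density(x, X, weights, h=0.5):
    """Weighted Parzen window density estimate at point x (Gaussian kernel).

    The weights play the role of the subsampling probabilities in the
    abstract: samples with higher weight contribute more to the estimate.
    """
    d = X.shape[1]
    diffs = X - x
    k = np.exp(-np.sum(diffs ** 2, axis=1) / (2 * h ** 2))
    k /= (2 * np.pi) ** (d / 2) * h ** d          # normalize the Gaussian kernel
    w = weights / weights.sum()                   # normalize weights to sum to 1
    return float(np.dot(w, k))
```

In a FAIR-KNN-style pipeline, `fuzzy_lower_membership` would be computed on the reduced feature set, samples would be subsampled with probability proportional to their memberships, and `weighted_parzen_density` would estimate class-conditional densities from the retained, weighted samples.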
Pages: 1-11 (11 pages)