Interpretable interval type-2 fuzzy predicates for data clustering: A new automatic generation method based on self-organizing maps

Cited by: 17
Authors
Comas, Diego S. [1, 2]
Pastore, Juan I. [1, 2]
Bouchet, Agustina [1, 2]
Ballarin, Virginia L. [2]
Meschino, Gustavo J. [3]
Affiliations
[1] Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
[2] Univ Nacl Mar del Plata, CONICET, Digital Image Proc Lab, Inst Invest Cient & Tecnol Elect ICyTE,Fac Ingn, Juan B Justo 4302,B7608FDQ, Mar Del Plata, Buenos Aires, Argentina
[3] Univ Nacl Mar del Plata, CONICET, Bioengn Lab, Inst Invest Cient & Tecnol Elect ICyTE,Fac Ingn, Juan B Justo 4302,B7608FDQ, Mar Del Plata, Buenos Aires, Argentina
Keywords
Fuzzy predicates; Interval type-2 fuzzy logic; Self-organizing maps; Interpretable clustering; Knowledge discovery; CLASSIFICATION; KNOWLEDGE; SYSTEMS; MODELS
DOI
10.1016/j.knosys.2017.07.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In previous works, we proposed two data clustering methods based on automatically discovered fuzzy predicates: SOM-based Fuzzy Predicate Clustering (SFPC) [Meschino et al., Neurocomputing, 147, 47-59 (2015)] and Type-2 Data-based Fuzzy Predicate Clustering (T2-DFPC) [Comas et al., Expert Syst. Appl., 68, 136-150 (2017)]. In these methods, fuzzy predicates enable both data clustering and knowledge discovery about the obtained clusters. This last feature is novel compared with other existing approaches and is a major contribution to the data clustering field. Building on these methods, the present paper proposes a new automatic clustering method based on fuzzy predicates that uses Self-Organizing Maps (SOMs), called Type-2 SOM-based Fuzzy Predicate Clustering (T2-SFPC). The new method requires no prior knowledge about the clustering problem addressed. First, a random partition is defined on the dataset to be clustered, and SOMs are configured and trained on the resulting data subsets. Second, an automatic clustering approach is applied to the SOM codebooks to discover representative data for the different clusters, called cluster prototypes. Third, interval type-2 membership functions formed by Gaussian-shaped sub-functions are defined together with fuzzy predicates, allowing data clustering and its interpretation. The proposed method preserves all the advantages of SFPC and T2-DFPC regarding knowledge extraction and their potential application to distributed clustering and parallel computing. On the several public datasets tested, the clusters defined by T2-SFPC were more compact and better separated, and the method outperformed both the previous methods and the several classical clustering approaches tested, according to internal and external validation indices. Additionally, both clustering interpretation and optimization capabilities are improved by the proposed method compared with SFPC and T2-DFPC. (C) 2017 Elsevier B.V. All rights reserved.
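As a rough illustration of the third step in the abstract, the sketch below shows one common way to build an interval type-2 membership function from Gaussian shapes: a shared center with an uncertain width, giving a lower and an upper membership degree for each value. It is not the formulation used in the paper, which the abstract does not specify. The names `gaussian`, `it2_membership`, `assign_cluster`, and `prototypes`, the uncertain-width construction, and the midpoint ranking rule are all assumptions made for this example.

```python
import math


def gaussian(x, center, sigma):
    """Plain (type-1) Gaussian membership degree in [0, 1]."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def it2_membership(x, center, sigma_lower, sigma_upper):
    """Return the (lower, upper) membership interval of x.

    Assumption: the footprint of uncertainty comes from an uncertain
    width, with sigma_lower <= sigma_upper and a shared center.
    """
    if not 0 < sigma_lower <= sigma_upper:
        raise ValueError("expected 0 < sigma_lower <= sigma_upper")
    return gaussian(x, center, sigma_lower), gaussian(x, center, sigma_upper)


def assign_cluster(x, prototypes):
    """Pick the label whose membership interval has the largest midpoint.

    `prototypes` maps a label to (center, sigma_lower, sigma_upper).
    Ranking by midpoint is a simplification chosen for this sketch.
    """
    def midpoint(label):
        lower, upper = it2_membership(x, *prototypes[label])
        return (lower + upper) / 2.0

    return max(prototypes, key=midpoint)


# Hypothetical one-dimensional prototypes for two clusters.
prototypes = {"low": (0.0, 0.5, 1.0), "high": (1.0, 0.5, 1.0)}
interval = it2_membership(1.0, 0.0, 1.0, 2.0)
label = assign_cluster(0.9, prototypes)
```

The narrower Gaussian always lies at or below the wider one when both share a center, so the returned pair is a valid interval. In a fuller treatment the prototypes would come from clustering the SOM codebooks, and the membership intervals would feed fuzzy predicates rather than a simple midpoint comparison.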
Pages: 234-254
Page count: 21
Related papers
53 entries in total
[1]   Relationship between the accuracy of classifier error estimation and complexity of decision boundary [J].
Atashpaz-Gargari, Esmaeil ;
Sima, Chao ;
Braga-Neto, Ulisses M. ;
Dougherty, Edward R. .
PATTERN RECOGNITION, 2013, 46 (05) :1315-1322
[2]  
Attik M., 2005, SELF ORGANIZING MAP, P357, DOI 10.1007/11550822_56
[3]  
Bache K., 2013, UCI Machine Learning Repository
[4]   QUANTIFYING THE NEIGHBORHOOD PRESERVATION OF SELF-ORGANIZING FEATURE MAPS [J].
BAUER, HU ;
PAWELZIK, KR .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1992, 3 (04) :570-579
[5]  
Bishop C., 2006, Pattern recognition and machine learning, P423
[6]  
Bodenhofer U, 2000, P 6 INT C SOFT COMP, P334
[7]   ARITHMETIC MEAN BASED COMPENSATORY FUZZY LOGIC [J].
Bouchet, Agustina ;
Ignacio Pastore, Juan ;
Espin Andrade, Rafael ;
Brun, Marcel ;
Ballarin, Virginia .
INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2011, 10 (02) :231-243
[8]   Model-based evaluation of clustering validation measures [J].
Brun, Marcel ;
Sima, Chao ;
Hua, Jianping ;
Lowey, James ;
Carroll, Brent ;
Suh, Edward ;
Dougherty, Edward R. .
PATTERN RECOGNITION, 2007, 40 (03) :807-824
[9]   Enhanced fuzzy system models with improved fuzzy clustering algorithm [J].
Celikyilmaz, Asli ;
Turksen, I. Burhan .
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2008, 16 (03) :779-794
[10]  
Charytanowicz M, 2010, ADV INTEL SOFT COMPU, V69, P15