A method for mixed data classification base on RBF-ELM network

Cited by: 30
Authors
Li, Qiude [1 ,2 ,3 ]
Xiong, Qingyu [1 ,2 ]
Ji, Shengfen [4 ]
Yu, Yang [1 ,2 ]
Wu, Chao [1 ,2 ]
Yi, Hualing [1 ,2 ]
Affiliations
[1] Chongqing Univ, Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing, Peoples R China
[2] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 401331, Peoples R China
[3] Guizhou Med Univ, Sch Biol & Engn, Guiyang 550025, Guizhou, Peoples R China
[4] Guizhou Inst Technol, Sch Foreign Language, Guiyang 550003, Guizhou, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Mixed data classification; Distance metric; Density peaks clustering (DPC); Radial basis function (RBF); Extreme learning machine (ELM);
DOI
10.1016/j.neucom.2020.12.032
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The classification tasks for numerical or categorical data have been well developed. However, the data collected in the real world are frequently the mixed type containing numerical and categorical values, and how to classify the mixed data quickly and efficiently is a critical yet challenging task. Existing classification models for mixed data usually treat the mixed data processing and subsequent classification as two independent phases, without considering their compatibility. By fusing the mixed data processing into a classification algorithm, this paper proposes an extended version of RBF-ELM (Radial Basis Function-Extreme Learning Machine), a Mixed Data RBF-ELM method (MD-RBF-ELM for short), which can achieve direct, fast, and efficient classification for mixed data. Specifically, a distance metric method for mixed data is firstly designed to calculate the distances between the input data and the RBF centers, and then these distances are used to train the network structure and weights of MD-RBF-ELM, thereby realizing the fusion of data processing with model learning. In addition, to alleviate the problem of MD-RBF-ELM's unstable performance caused by randomly selecting the RBF centers, we propose an improved density peak clustering algorithm and use it to select the optimal RBF centers automatically and adaptively. Extensive experimental results on 34 data sets demonstrate that MD-RBF-ELM significantly enhances the classification performance (increasing 2.37% for F1-score, up to 14/34 for the number of best results, and reaching 2.4/8 for the averaged ranks), compared with seven state-of-the-art competitors. (C) 2020 Elsevier B.V. All rights reserved.
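The pipeline described in the abstract (mixed-data distances to the RBF centers, then ELM-style output weights) can be illustrated with a minimal sketch. This is not the authors' MD-RBF-ELM. The distance used here (squared Euclidean on numerical columns plus a mismatch count on integer-coded categorical columns) is an assumption, since the abstract does not define the paper's metric. The centers are drawn at random, as in plain RBF-ELM, rather than by the improved density peak clustering step. The names `mixed_distance`, `fit`, `predict`, and `gamma` are hypothetical.

```python
import numpy as np

def mixed_distance(X_num, X_cat, C_num, C_cat):
    """Assumed metric: squared Euclidean on numerical columns plus the
    number of mismatching (integer-coded) categorical columns."""
    d_num = ((X_num[:, None, :] - C_num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (X_cat[:, None, :] != C_cat[None, :, :]).sum(axis=2)
    return d_num + d_cat  # shape: (n_samples, n_centers)

def fit(X_num, X_cat, y, n_centers, gamma=1.0, seed=0):
    """Pick random training samples as RBF centers, then solve the
    output weights in closed form with the pseudoinverse (ELM style)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(y), size=n_centers, replace=False)
    C_num, C_cat = X_num[idx], X_cat[idx]
    # Hidden-layer outputs: Gaussian of the mixed distance to each center.
    H = np.exp(-gamma * mixed_distance(X_num, X_cat, C_num, C_cat))
    T = np.eye(int(y.max()) + 1)[y]  # one-hot class targets
    beta = np.linalg.pinv(H) @ T     # least-squares output weights
    return C_num, C_cat, beta

def predict(X_num, X_cat, model, gamma=1.0):
    """Return the class with the largest output score for each sample."""
    C_num, C_cat, beta = model
    H = np.exp(-gamma * mixed_distance(X_num, X_cat, C_num, C_cat))
    return (H @ beta).argmax(axis=1)

# Toy data: one numerical column and one integer-coded categorical column.
X_num = np.array([[0.0], [0.1], [1.0], [1.1]])
X_cat = np.array([[0], [0], [1], [1]])
y = np.array([0, 0, 1, 1])

model = fit(X_num, X_cat, y, n_centers=4)
pred = predict(X_num, X_cat, model)
```

Because the distances enter the hidden layer directly, no separate encoding step for the categorical columns is needed before training, which is the fusion of data processing and model learning that the abstract refers to.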
Pages: 7-22
Page count: 16