ESTIMATION OF MISSING VALUES USING OPTIMISED HYBRID FUZZY C-MEANS AND MAJORITY VOTE FOR MICROARRAY DATA

Cited by: 0
Authors
Kumaran, Shamini Raja [1]
Othman, Mohd Shahizan [1]
Yusuf, Lizawati Mi [1]
Affiliations
[1] Univ Teknol Malaysia, Sch Comp, Skudai, Johor, Malaysia
Source
JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA | 2020, Vol. 19, No. 04
Keywords
Fuzzy C-means; majority vote; missing values; microarray data; data optimisation; IMPUTATION; ALGORITHM;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Missing values are a major constraint in microarray technologies and hinder the identification of disease-causing genes. Estimating missing values is a problem that field experts routinely face. Imputation is an effective way to supply suitable values so that subsequent steps in microarray analysis can proceed, and it may also increase classification accuracy. Although imputation methods can predict the missing values, it is the resulting classification accuracy that demonstrates their ability to estimate missing values in gene expression data. In this study, a novel method, Optimised Hybrid of Fuzzy C-Means and Majority Vote (opt-FCMMV), was proposed to estimate the missing values in the data. Using the Majority Vote (MV) and optimisation through Particle Swarm Optimisation (PSO), this study predicted missing values to form more informative and robust data. To verify the effectiveness of opt-FCMMV, several experiments were carried out on two publicly available microarray datasets from the biomedical domain (i.e. Ovary and Lung Cancer) under three missing value mechanisms with five different percentages of missing values, using a Support Vector Machine (SVM) classifier. The experimental results showed that the proposed method achieved the highest accuracy rate compared with no imputation, imputation by Fuzzy C-Means (FCM), and imputation by Fuzzy C-Means with Majority Vote (FCMMV). For example, the accuracy rates for Ovary Cancer data with 5% missing values were 64.0% (no imputation), 81.8% (FCM), 90.0% (FCMMV), and 93.7% (opt-FCMMV). Such an outcome indicates that opt-FCMMV may also be applied in other domains to prepare datasets for various data mining tasks.
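The abstract does not spell out the algorithm, so the following is only a minimal sketch of the FCM baseline it mentions: plain fuzzy c-means imputation using partial distances over the observed features. It is not the authors' opt-FCMMV and leaves out both the Majority Vote step and the PSO optimisation. The function name `fcm_impute`, the quantile-based centre initialisation, the fuzzifier `m = 2`, and the membership-weighted fill rule are all assumptions made for illustration.

```python
def fcm_impute(rows, n_clusters=2, m=2.0, n_iter=50, eps=1e-12):
    """Fill None entries with fuzzy c-means estimates (illustrative sketch).

    Assumes n_clusters >= 2 and that every row and every column has at
    least one observed (non-None) value.
    """
    n, p = len(rows), len(rows[0])

    # Column means over observed values, used only to complete initial centres.
    col_mean = []
    for j in range(p):
        obs = [r[j] for r in rows if r[j] is not None]
        col_mean.append(sum(obs) / len(obs))

    def row_mean(r):
        obs = [v for v in r if v is not None]
        return sum(obs) / len(obs)

    # Assumed initialisation: rows evenly spaced along the sorted row means.
    order = sorted(range(n), key=lambda i: row_mean(rows[i]))
    picks = [order[round(k * (n - 1) / (n_clusters - 1))] for k in range(n_clusters)]
    centres = [[rows[i][j] if rows[i][j] is not None else col_mean[j]
                for j in range(p)] for i in picks]

    def memberships():
        u = []
        for r in rows:
            obs = [j for j in range(p) if r[j] is not None]
            # Partial squared distance, rescaled to the full feature count.
            d = [eps + (p / len(obs)) * sum((r[j] - c[j]) ** 2 for j in obs)
                 for c in centres]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            u.append([v / total for v in inv])
        return u

    for _ in range(n_iter):
        u = memberships()
        # Update each centre coordinate from the rows where it is observed.
        for k in range(n_clusters):
            for j in range(p):
                seen = [i for i in range(n) if rows[i][j] is not None]
                num = sum(u[i][k] ** m * rows[i][j] for i in seen)
                den = sum(u[i][k] ** m for i in seen)
                centres[k][j] = num / den

    u = memberships()
    filled = []
    for i, r in enumerate(rows):
        w = [u[i][k] ** m for k in range(n_clusters)]
        filled.append([
            r[j] if r[j] is not None
            else sum(w[k] * centres[k][j] for k in range(n_clusters)) / sum(w)
            for j in range(p)
        ])
    return filled


# Toy data (hypothetical, not from the paper): two well-separated groups.
toy = [
    [1.0, 1.2, 0.9],
    [1.1, None, 1.0],
    [0.9, 1.0, 1.1],
    [10.0, 10.2, 9.9],
    [10.1, 9.8, None],
    [9.9, 10.0, 10.1],
]
for row in fcm_impute(toy):
    print([round(v, 2) for v in row])
```

On the toy data, each missing entry is filled with a value close to the centre of the cluster its row belongs to, while observed entries are returned unchanged. A pipeline like the one described in the abstract would then pass the completed matrix to a classifier such as an SVM and compare accuracy against the unimputed data.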
Pages: 459 - 482
Page count: 24
Related papers
50 in total
  • [31] Extended fuzzy c-means: an analyzing data clustering problems
    S. Ramathilagam
    R. Devi
    S. R. Kannan
    Cluster Computing, 2013, 16 : 389 - 406
  • [32] Generalized fuzzy c-means clustering in the presence of outlying data
    Hathaway, RJ
    Overstreet, DD
    Hu, YK
    Davenport, JW
    APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE II, 1999, 3722 : 509 - 517
  • [33] Strong fuzzy c-means in medical image data analysis
    Kannan, S. R.
    Ramathilagam, S.
    Devi, R.
    Hines, E.
    JOURNAL OF SYSTEMS AND SOFTWARE, 2012, 85 (11) : 2425 - 2438
  • [34] Hyperplane Division in Fuzzy C-Means: Clustering Big Data
    Shen, Yinghua
    Pedrycz, Witold
    Chen, Yuan
    Wang, Xianmin
    Gacek, Adam
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2020, 28 (11) : 3032 - 3046
  • [35] Interval kernel Fuzzy C-Means clustering of incomplete data
    Li, Tianhao
    Zhang, Liyong
    Lu, Wei
    Hou, Hui
    Liu, Xiaodong
    Pedrycz, Witold
    Zhong, Chongquan
    NEUROCOMPUTING, 2017, 237 : 316 - 331
  • [36] Using Fuzzy c-Means for Weighting Different Fuzzy Cognitive Maps
    Obiedat, Mamoon
    Al-yousef, Ali
    Khasawneh, Ahmad
    Hamadneh, Nabhan
    Aljammar, Ashraf
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (05) : 545 - 551
  • [37] Clustering Incomplete Data Using Kernel-Based Fuzzy C-means Algorithm
    Dao-Qiang Zhang
    Song-Can Chen
    Neural Processing Letters, 2003, 18 : 155 - 162
  • [38] Clustering incomplete data using kernel-based fuzzy C-means algorithm
    Zhang, DQ
    Chen, SC
    NEURAL PROCESSING LETTERS, 2003, 18 (03) : 155 - 162
  • [39] Fuzzy c-Means Clustering of Incomplete Data Using Dimension-Wise Fuzzy Variances of Clusters
    Himmelspach, Ludmila
    Conrad, Stefan
    INFORMATION PROCESSING AND MANAGEMENT OF UNCERTAINTY IN KNOWLEDGE-BASED SYSTEMS, IPMU 2016, PT I, 2016, 610 : 699 - 710
  • [40] Tree Cover Mapping Using Hybrid Fuzzy c-Means Method and Multispectral Satellite Images
    Gulbe, Linda
    Kozlovs, Aleksandrs
    Donis, Janis
    Traskovs, Agris
    BALTIC FORESTRY, 2019, 25 (01) : 113 - 123