ESTIMATION OF MISSING VALUES USING OPTIMISED HYBRID FUZZY C-MEANS AND MAJORITY VOTE FOR MICROARRAY DATA

Times cited: 0
Authors
Kumaran, Shamini Raja [1]
Othman, Mohd Shahizan [1]
Yusuf, Lizawati Mi [1]
Affiliations
[1] Univ Teknol Malaysia, Sch Comp, Skudai, Johor, Malaysia
Source
JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA | 2020 / Vol. 19 / No. 04
Keywords
Fuzzy C-means; majority vote; missing values; microarray data; data optimisation; IMPUTATION; ALGORITHM;
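DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract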
Missing values are a major constraint in microarray technologies, hindering the identification of disease-causing genes. Estimating missing values is an unavoidable task for field experts. Imputation is an effective way to fill in suitable values so that the subsequent steps of microarray analysis can proceed. Missing value imputation methods may increase classification accuracy. Although these methods might predict the values, classification accuracy rates demonstrate the ability of the methods to estimate the missing values in gene expression data. In this study, a novel method, Optimised Hybrid of Fuzzy C-Means and Majority Vote (opt-FCMMV), was proposed to estimate the missing values in the data. Using the Majority Vote (MV) and optimisation through Particle Swarm Optimisation (PSO), this study predicted missing values in the data to form more informative and solid data. To verify the effectiveness of opt-FCMMV, several experiments were carried out on two publicly available microarray datasets from the biomedical domain (i.e. Ovary and Lung Cancer) under three missing value mechanisms with five different percentages of missing values, using a Support Vector Machine (SVM) classifier. The experimental results showed that the proposed method achieved the highest accuracy rate compared with no imputation, imputation by Fuzzy C-Means (FCM), and imputation by Fuzzy C-Means with Majority Vote (FCMMV). For example, the accuracy rates for Ovary Cancer data with 5% missing values were 64.0% for no imputation, 81.8% (FCM), 90.0% (FCMMV), and 93.7% (opt-FCMMV). Such an outcome indicates that opt-FCMMV may also be applied in other domains to prepare datasets for various data mining tasks.
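For orientation, the following is a minimal sketch of plain fuzzy c-means imputation, the baseline family the abstract compares against. It is not the authors' opt-FCMMV: the majority vote and PSO steps are omitted, and the function name `fcm_impute`, its parameters, the farthest-first initialisation, and the toy data are all assumptions made for illustration. Missing cells are re-estimated in each iteration as the membership-weighted average of the cluster centres, while observed cells are never changed.

```python
import numpy as np

def fcm_impute(X, n_clusters=2, m=2.0, n_iter=100):
    """Fill NaN cells of X (rows = samples) using fuzzy c-means (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means so that every row is a complete vector.
    filled = np.where(missing, np.nanmean(X, axis=0), X)

    # Deterministic farthest-first initialisation (an assumption, not from the paper).
    centres = [filled[0]]
    for _ in range(1, n_clusters):
        d = np.min([((filled - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(filled[np.argmax(d)])
    centres = np.array(centres)

    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every centre.
        d2 = ((filled[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Standard FCM membership update with fuzzifier m.
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        w = u ** m
        # Centre update: membership-weighted mean of the rows.
        centres = (w.T @ filled) / w.sum(axis=0)[:, None]
        # Re-estimate only the missing cells; observed cells stay as given.
        estimate = (w @ centres) / w.sum(axis=1, keepdims=True)
        filled = np.where(missing, estimate, X)
    return filled

# Toy data (made up): two well-separated groups, one missing cell in the second group.
X = np.array([
    [0.0, 0.1, 0.0],
    [0.2, 0.0, 0.1],
    [0.1, 0.2, 0.2],
    [10.0, 10.1, 10.0],
    [10.2, 10.0, np.nan],
    [10.1, 10.2, 10.2],
])
filled = fcm_impute(X, n_clusters=2)
print(round(float(filled[4, 2]), 2))
```

In the toy data the missing cell belongs to the group near 10, so the estimate is pulled towards that group's centre instead of the overall column mean. A study like the one summarised above would then train a classifier on the imputed matrix and compare accuracy against the unimputed data.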
Pages: 459 - 482
Number of pages: 24
Related papers
50 in total
  • [1] Missing value estimation for microarray data based on fuzzy C-means clustering
    Luo, JiaWei
    Yang, Tao
    Wang, Yan
    Eighth International Conference on High-Performance Computing in Asia-Pacific Region, Proceedings, 2005: 611 - 616
  • [2] A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm
    Aydilek, Ibrahim Berkan
    Arslan, Ahmet
    INFORMATION SCIENCES, 2013, 233 : 25 - 35
  • [3] A hybrid approach to integrate fuzzy C-means based imputation method with genetic algorithm for missing traffic volume data estimation
    Tang, Jinjun
    Zhang, Guohui
    Wang, Yinhai
    Wang, Hua
    Liu, Fang
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2015, 51 : 29 - 40
  • [4] Fuzzy c-means clustering of partially missing data sets
    Hathaway, RJ
    Overstreet, DD
    Bezdek, JC
    APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE III, 2000, 4055 : 159 - 165
  • [5] An Integrated Fuzzy C-Means Method for Missing Data Imputation Using Taxi GPS Data
    Huang, Junsheng
    Mao, Baohua
    Bai, Yun
    Zhang, Tong
    Miao, Changjun
    SENSORS, 2020, 20 (07)
  • [6] Iterative Fuzzy C Means, Fuzzy Silhouette, and Imputation for Missing Values in a Dataset
    Mausor, Farahida Hanim
    Jaafar, Jafreezal
    Taib, Shakirah Mohd
    Razali, Razulaimi
    2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTING (ICOCO), 2021: 382 - 385
  • [7] Clustering of COVID-19 data for knowledge discovery using c-means and fuzzy c-means
    Afzal, Asif
    Ansari, Zahid
    Alshahrani, Saad
    Raj, Arun K.
    Kuruniyan, Mohamed Saheer
    Saleel, C. Ahamed
    Nisar, Kottakkaran Sooppy
    RESULTS IN PHYSICS, 2021, 29
  • [8] A Study of Data Imputation Using Fuzzy C-Means with Particle Swarm Optimization
    Samat, Nurul Ashikin
    Salleh, Mohd Najib Mohd
    RECENT ADVANCES ON SOFT COMPUTING AND DATA MINING, 2017, 549 : 91 - 100
  • [9] Hybrid Fuzzy C-Means Clustering Algorithm Oriented to Big Data Realms
    Perez-Ortega, Joaquin
    Silvia Roblero-Aguilar, Sandra
    Nely Almanza-Ortega, Nelva
    Frausto Solis, Juan
    Zavala-Diaz, Crispin
    Hernandez, Yasmin
    Landero-Najera, Vanesa
    AXIOMS, 2022, 11 (08)
  • [10] Extended fuzzy c-means: an analyzing data clustering problems
    Ramathilagam, S.
    Devi, R.
    Kannan, S. R.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2013, 16 (03): 389 - 406