ESTIMATION OF MISSING VALUES USING OPTIMISED HYBRID FUZZY C-MEANS AND MAJORITY VOTE FOR MICROARRAY DATA

Cited by: 0
Authors
Kumaran, Shamini Raja [1]
Othman, Mohd Shahizan [1]
Yusuf, Lizawati Mi [1]
Affiliations
[1] Univ Teknol Malaysia, Sch Comp, Skudai, Johor, Malaysia
Source
JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGY-MALAYSIA | 2020 / Vol. 19 / No. 04
Keywords
Fuzzy C-means; majority vote; missing values; microarray data; data optimisation; IMPUTATION; ALGORITHM
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Missing values are a major constraint on the use of microarray technologies to identify disease-causing genes, and estimating them is an unavoidable task for field experts. Imputation is an effective way to supply suitable values so that the next stage of microarray analysis can proceed. Missing value imputation methods may increase classification accuracy, and classification accuracy rates in turn demonstrate how well a method estimates the missing values in gene expression data. In this study, a novel method, Optimised Hybrid of Fuzzy C-Means and Majority Vote (opt-FCMMV), was proposed to estimate the missing values in the data. Using Majority Vote (MV) and optimisation through Particle Swarm Optimisation (PSO), this study predicted missing values to produce more informative and reliable data. To verify the effectiveness of opt-FCMMV, experiments were carried out on two publicly available biomedical microarray datasets (Ovary Cancer and Lung Cancer) under three missing value mechanisms and five missing-value percentages, using a Support Vector Machine (SVM) classifier. The experimental results showed that the proposed method achieved the highest accuracy rate compared with no imputation, imputation by Fuzzy C-Means (FCM), and imputation by Fuzzy C-Means with Majority Vote (FCMMV). For example, the accuracy rates for Ovary Cancer data with 5% missing values were 64.0% for no imputation, 81.8% (FCM), 90.0% (FCMMV), and 93.7% (opt-FCMMV). This outcome suggests that opt-FCMMV may also be applied in other domains to prepare datasets for various data mining tasks.
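To make the FCM baseline mentioned in the abstract more concrete, the following is a minimal sketch of fuzzy c-means imputation. It is not the authors' opt-FCMMV: the majority vote step, the PSO tuning, and the SVM evaluation are all omitted, and the abstract does not say how the paper handles missing entries during clustering. This sketch assumes a partial-distance approach, in which distances and cluster centres are computed from observed entries only. The function name `fcm_impute`, its parameters, and the toy matrix are hypothetical and chosen for illustration.

```python
import numpy as np

def fcm_impute(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fill NaNs with membership-weighted cluster centres (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)                    # True where a value is present
    X0 = np.where(observed, X, 0.0)            # zeros stand in for NaNs in sums
    n, d = X.shape
    rng = np.random.default_rng(seed)
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each membership row sums to 1

    for _ in range(n_iter):
        W = U ** m
        # Cluster centres: weighted mean over observed entries only.
        V = (W.T @ X0) / np.maximum(W.T @ observed.astype(float), 1e-12)
        # Squared partial distances, rescaled to the full feature count.
        diff = (X0[:, None, :] - V[None, :, :]) * observed[:, None, :]
        scale = d / np.maximum(observed.sum(axis=1), 1)
        d2 = np.maximum((diff ** 2).sum(axis=2) * scale[:, None], 1e-12)
        # Standard FCM membership update with fuzzifier m.
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    # Estimate each missing entry as a membership-weighted mix of centres.
    W = U ** m
    estimates = (W @ V) / W.sum(axis=1, keepdims=True)
    return np.where(observed, X, estimates)

# Toy expression matrix (rows = samples, columns = genes); values are made up.
X = np.array([
    [0.0, 0.1, 0.2],
    [0.1, np.nan, 0.1],
    [0.2, 0.0, 0.0],
    [10.0, 10.1, 9.9],
    [9.9, 10.0, np.nan],
    [10.1, 9.9, 10.0],
])
filled = fcm_impute(X, n_clusters=2)
print(filled)
```

Observed entries are returned unchanged, and each gap is filled from the centres of the clusters the sample most strongly belongs to. The number of clusters and the fuzzifier `m` are the kind of settings an optimiser such as PSO could tune, although how the paper applies PSO is not stated in the abstract.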
Pages: 459-482
Page count: 24