Comparative Study of Feature Selection and Classification Techniques for High-Throughput DNA Methylation Data

Times cited: 0
Authors
Alkuhlani, Alhasan [1 ]
Nassef, Mohammad [1 ]
Farag, Ibrahim [1 ]
Affiliations
[1] Cairo Univ, Dept Comp Sci, Fac Comp & Informat, Giza, Egypt
Source
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT SYSTEMS AND INFORMATICS 2016 | 2017 / Vol. 533
Keywords
Microarray; DNA Methylation; Feature selection; Classification; Cross-validation; SUPPORT VECTOR MACHINES; GENE SELECTION; CANCER CLASSIFICATION; MICROARRAY DATA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The high dimensionality of data is a common problem in classification. In this work, a small number of significant features is sought to classify data from two sample groups. Various feature selection and classification techniques are applied to a collection of four high-throughput DNA methylation microarray data sets. Using accuracy as the performance metric, repeated 10-fold cross-validation is used to evaluate the proposed techniques. Combining the Signal-to-Noise Ratio (SNR) and Wilcoxon rank-sum test filter methods with Support Vector Machine-Recursive Feature Elimination (SVM-RFE) as an embedded method resulted in perfect performance. In addition, linear classifiers showed excellent results compared to other classifiers when applied to such data sets.
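As an illustration of the SNR filter named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used two-class definition, (mean of group 0 - mean of group 1) / (standard deviation of group 0 + standard deviation of group 1), which the abstract does not spell out. The function names `snr_scores` and `top_k_features` and the toy data are hypothetical.

```python
from statistics import mean, stdev


def snr_scores(samples, labels):
    """Return one signal-to-noise score per feature for two-class data.

    samples: list of equal-length feature vectors, one per sample
    labels:  list of 0/1 class labels, same length as samples
    Assumed definition: (mean0 - mean1) / (std0 + std1).
    """
    group0 = [row for row, y in zip(samples, labels) if y == 0]
    group1 = [row for row, y in zip(samples, labels) if y == 1]
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row in group0]
        b = [row[j] for row in group1]
        spread = stdev(a) + stdev(b)
        # With zero spread in both groups the ratio is undefined;
        # scoring it 0 is a simplifying choice made for this sketch.
        scores.append((mean(a) - mean(b)) / spread if spread > 0 else 0.0)
    return scores


def top_k_features(samples, labels, k):
    """Return the indices of the k features with the largest |SNR|."""
    scores = snr_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
    return order[:k]


# Toy data (made up): features 0 and 2 separate the groups, feature 1 does not.
samples = [
    [0.10, 0.50, 0.90],
    [0.20, 0.40, 0.80],
    [0.15, 0.60, 0.85],
    [0.80, 0.50, 0.20],
    [0.90, 0.45, 0.10],
    [0.85, 0.55, 0.15],
]
labels = [0, 0, 0, 1, 1, 1]
selected = top_k_features(samples, labels, k=2)
```

In a pipeline like the one the abstract describes, a filter of this kind would typically reduce the feature set before a wrapper or embedded method such as SVM-RFE is run, and the selection would be repeated inside each cross-validation fold to avoid leaking test data into the feature ranking.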
Pages: 793-803
Page count: 11
Related papers
50 in total
  • [41] A Comparative Study on Feature Selection Techniques for Multi-cluster Text Data
    Gupta, Ananya
    Begum, Shahin Ara
    HARMONY SEARCH AND NATURE INSPIRED OPTIMIZATION ALGORITHMS, 2019, 741 : 203 - 215
  • [42] Assaying DNA methylation based on high-throughput melting curve approaches
    Akey, DT
    Akey, JM
    Zhang, K
    Jin, L
    GENOMICS, 2002, 80 (04) : 376 - 384
  • [43] On Efficient Feature Ranking Methods for High-Throughput Data Analysis
    Liao, Bo
    Jiang, Yan
    Liang, Wei
    Peng, Lihong
    Peng, Li
    Hanyurwimfura, Damien
    Li, Zejun
    Chen, Min
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, 12 (06) : 1374 - 1384
  • [45] AlphaBeta: computational inference of epimutation rates and spectra from high-throughput DNA methylation data in plants
    Shahryary, Yadollah
    Symeonidi, Aikaterini
    Hazarika, Rashmi R.
    Denkena, Johanna
    Mubeen, Talha
    Hofmeister, Brigitte
    van Gurp, Thomas
    Colomé-Tatché, Maria
    Verhoeven, Koen J. F.
    Tuskan, Gerald
    Schmitz, Robert J.
    Johannes, Frank
    GENOME BIOLOGY, 2020, 21 (01)
  • [47] Robust biomarker discovery for hepatocellular carcinoma from high-throughput data by multiple feature selection methods
    Zhang, Zishuang
    Liu, Zhi-Ping
    BMC MEDICAL GENOMICS, 2021, 14 (SUPPL 1)
  • [48] A Review on Feature Selection Techniques for Gene Expression Data
    Vanjimalar, S.
    Ramyachitra, D.
    Manikandan, P.
    2018 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (IEEE ICCIC 2018), 2018, : 26 - 29
  • [49] Analysis of Feature Selection Techniques for Classification Problems
    Adamov, Abzetdin Z.
    2021 IEEE 15TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT2021), 2021,
  • [50] A novel feature selection approach for biomedical data classification
    Peng, Yonghong
    Wu, Zhiqing
    Jiang, Jianmin
    JOURNAL OF BIOMEDICAL INFORMATICS, 2010, 43 (01) : 15 - 23