Bi-dimensional principal gene feature selection from big gene expression data

Cited by: 4
Authors
Hou, Xiaoqian [1]
Hou, Jingyu [1]
Huang, Guangyan [1]
Affiliations
[1] Deakin Univ, Sch Informat Technol, Melbourne, Vic, Australia
Source
PLOS ONE | 2022 / Volume 17 / Issue 12
DOI
10.1371/journal.pone.0278583
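Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract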
Gene expression sample data, which usually contain massive numbers of gene expression profiles, are commonly used for disease-related gene analysis. Selecting relevant genes from a huge number of candidates is a fundamental step in applications of gene expression data. As more and more genes are detected, gene expression datasets keep growing, which challenges the computational efficiency of extracting the relevant and important genes. In this paper, we propose a novel Bi-dimensional Principal Feature Selection (BPFS) method for efficiently extracting critical genes from big gene expression data. It applies principal component analysis (PCA) to the sample and gene domains successively, aiming to extract the relevant gene features and reduce redundancy while losing less information. Experimental results on four real-world cancer gene expression datasets show that the proposed BPFS method greatly reduces the data size and processes the data nearly twice as fast as the counterpart methods, while maintaining better accuracy and effectiveness.
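The abstract does not spell out the BPFS algorithm, so the following is only a minimal sketch of what "PCA on the sample and gene domains successively" could look like. It is not the authors' implementation. The function names (`top_axes`, `rank_genes_two_stage`), the parameters (`k_samples`, `k_genes`, `n_select`), and the variance-weighted loading score used to rank genes are all assumptions introduced here for illustration.

```python
import numpy as np


def top_axes(X, k):
    """PCA via SVD. Rows of X are observations, columns are variables.

    Returns the column-centered data, the top-k principal axes
    (shape k x n_variables) and the matching singular values.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc, vt[:k], s[:k]


def rank_genes_two_stage(expr, k_samples, k_genes, n_select):
    """Hypothetical two-stage ranking; expr has shape (n_samples, n_genes).

    ASSUMPTION: this is an illustrative reading of the abstract,
    not the published BPFS procedure.
    """
    # Stage 1 (sample domain): treat genes as observations and samples as
    # variables, then compress the samples into k_samples "meta-samples".
    genes_by_samples, sample_axes, _ = top_axes(expr.T, k_samples)
    meta = (genes_by_samples @ sample_axes.T).T  # (k_samples, n_genes)

    # Stage 2 (gene domain): PCA with genes as variables on the reduced data.
    _, gene_axes, s = top_axes(meta, k_genes)

    # Score each gene by its squared loadings, weighted by the squared
    # singular value of each retained component (an assumed criterion).
    scores = ((s[:, None] * gene_axes) ** 2).sum(axis=0)
    return np.argsort(scores)[::-1][:n_select]


if __name__ == "__main__":
    # Synthetic data: 20 samples, 200 genes, 3 genes carrying a shared signal.
    rng = np.random.default_rng(0)
    expr = rng.normal(0.0, 0.1, size=(20, 200))
    signal = rng.normal(0.0, 1.0, size=20)
    expr[:, [3, 50, 120]] += 5.0 * signal[:, None]
    print(sorted(rank_genes_two_stage(expr, 5, 2, 3).tolist()))
```

In this sketch the first stage shrinks the matrix from `n_samples x n_genes` to `k_samples x n_genes`, so the second decomposition runs on far fewer rows. That is one plausible source of the data-size reduction and speed-up the abstract reports, though the paper itself should be consulted for the actual procedure.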
Pages: 15