Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data

Cited by: 43
Authors
Yousef, Malik [1, 2]
Kumar, Abhishek [3, 4]
Bakir-Gungor, Burcu [5]
Affiliations
[1] Zefat Acad Coll, Dept Informat Syst, IL-13206 Safed, Israel
[2] Zefat Acad Coll, Galilee Digital Hlth Res Ctr GDH, IL-13206 Safed, Israel
[3] Inst Bioinformat, Int Technol Pk, Bangalore 560066, Karnataka, India
[4] Manipal Acad Higher Educ MAHE, Manipal 576104, India
[5] Abdullah Gul Univ, Dept Comp Engn, Fac Engn, TR-38080 Kayseri, Turkey
Keywords
feature selection; feature ranking; grouping; clustering; biological knowledge
DOI
10.3390/e23010002
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
Over the last two decades, massive advances in high-throughput technologies have led to exponential growth in public repositories of gene expression datasets for various phenotypes. Biomarkers can be identified by comparing gene expression levels under different conditions, such as disease vs. control, treated vs. untreated, or drug A vs. drug B. This task corresponds to a well-studied problem in machine learning, namely feature selection. In biological data analysis, most computational feature selection methodologies were taken from other fields without considering the nature of biological data. Integrative approaches that use biological knowledge while performing feature selection are therefore necessary for this kind of data. The main idea behind the integrative gene selection process is to generate a ranked list of genes that considers both the statistical metrics applied to the gene expression data and the biological background information provided as external datasets. One of the main goals of this review is to explore existing methods that integrate different types of information in order to improve the identification of biomolecular signatures of diseases and the discovery of new potential targets for treatment. These integrative approaches are expected to aid the prediction, diagnosis, and treatment of diseases, and to shed light on disease state dynamics and the mechanisms of disease onset and progression. Integrating various types of biological information will require the development of novel techniques for integration and data analysis. Another aim of this review is to encourage the bioinformatics community to develop new approaches for searching for and determining significant groups/clusters of features based on one or more biological grouping functions.
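The integrative ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify a scoring scheme, so everything here is an assumption made for illustration: the Welch t-statistic as the statistical metric, the mean absolute t-statistic as the group score, and the names `welch_t`, `integrative_rank`, and `gene_groups` (a hypothetical mapping that stands in for external biological knowledge such as pathway membership). The data are synthetic.

```python
import numpy as np

def welch_t(expr, labels):
    """Per-gene Welch t-statistic; expr is samples x genes, labels is 0/1."""
    a, b = expr[labels == 1], expr[labels == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def integrative_rank(expr, labels, gene_names, gene_groups):
    """Rank genes by (best group score, own |t|), both descending.

    gene_groups is a hypothetical mapping: group name -> list of gene names.
    """
    abs_t = dict(zip(gene_names, np.abs(welch_t(expr, labels))))
    # Score each group by the mean |t| of its member genes present in the data.
    group_score = {}
    for group, members in gene_groups.items():
        present = [m for m in members if m in abs_t]
        if present:
            group_score[group] = float(np.mean([abs_t[m] for m in present]))
    # A gene inherits the best score among its groups (0.0 if it has none).
    best = {name: 0.0 for name in gene_names}
    for group, members in gene_groups.items():
        for m in members:
            if m in best and group in group_score:
                best[m] = max(best[m], group_score[group])
    return sorted(gene_names, key=lambda n: (best[n], abs_t[n]), reverse=True)

# Synthetic example: g1 and g2 are shifted in the "disease" class (label 1).
rng = np.random.default_rng(0)
labels = np.array([1] * 10 + [0] * 10)
gene_names = ["g1", "g2", "g3", "g4"]
expr = rng.normal(size=(20, 4))
expr[labels == 1, 0] += 3.0
expr[labels == 1, 1] += 3.0
gene_groups = {"pathway_A": ["g1", "g2"], "pathway_B": ["g3", "g4"]}
ranking = integrative_rank(expr, labels, gene_names, gene_groups)
print(ranking)
```

In this sketch the group score acts as the primary sort key and the gene's own statistic breaks ties within a group, which is only one of many ways to combine the two sources of information.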
Pages: 1-15
Page count: 15