Combining Mutation and Gene Network Data in a Machine Learning Approach for False-Positive Cancer Driver Gene Discovery

Cited by: 3
Authors
Cutigi, Jorge Francisco [1, 2]
Evangelista, Renato Feijo [2]
Ramos, Rodrigo Henrique [1, 2]
Lage Ferreira, Cynthia de Oliveira [2]
Evangelista, Adriane Feijo [3]
de Carvalho, Andre C. P. L. F. [2]
Simao, Adenilso [2]
Affiliations
[1] Fed Inst Sao Paulo, Sao Carlos, SP, Brazil
[2] Univ Sao Paulo, Sao Carlos, SP, Brazil
[3] Barretos Canc Hosp, Barretos, SP, Brazil
Source
ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2020 | 2020 / Vol. 12558
Keywords
Cancer bioinformatics; Driver genes; False-positive driver; Complex networks; Machine learning; PATHWAYS
DOI
10.1007/978-3-030-65775-8_8
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Interest in cancer genomics research has grown with the advent and widespread use of next-generation sequencing technologies, which have generated a large amount of digital biological data. However, not all of this information contributes to cancer studies. For instance, false-positive driver genes may exhibit characteristics of cancer genes without being relevant to cancer initiation and progression. Including such genes in cancer studies may lead to the identification of unrealistic trends in the data and mislead biomedical decisions. Proper screening to detect this type of gene among genes considered drivers is therefore of utmost importance. This work focuses on developing models dedicated to this task. Support Vector Machine (SVM) and Random Forest (RF) machine learning algorithms were selected to induce predictive models that classify putative driver genes as real drivers or false-positive drivers based on both mutation data and gene network interactions. The results confirmed that combining the two sources of information improves model performance. Moreover, the SVM and RF models achieved classification accuracies of 85.0% and 82.4%, respectively, on labeled data. Finally, a literature-based analysis of the classification of a new set of genes was performed to further validate the concept.
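The abstract does not say which features were derived from each data source. As a rough illustration only, the sketch below shows one way mutation data and gene network interactions might be merged into a single feature vector per gene before training a classifier such as an SVM or RF. The function name `build_features`, the two features chosen (mutation frequency and network degree), the gene names, and all values are assumptions made for this example, not the authors' method or data.

```python
from collections import defaultdict


def build_features(mutations, edges, n_samples):
    """Return {gene: [mutation_frequency, network_degree]}.

    mutations: iterable of (sample_id, gene) pairs (hypothetical input format)
    edges:     iterable of (gene_a, gene_b) undirected interactions
    n_samples: total number of sequenced samples in the cohort
    """
    # Mutation-derived feature: fraction of samples in which the gene is mutated.
    samples_by_gene = defaultdict(set)
    for sample_id, gene in mutations:
        samples_by_gene[gene].add(sample_id)

    # Network-derived feature: number of distinct interaction partners (degree).
    neighbors = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a != gene_b:  # ignore self-loops
            neighbors[gene_a].add(gene_b)
            neighbors[gene_b].add(gene_a)

    # Combine both sources: one row per gene seen in either data set.
    genes = sorted(set(samples_by_gene) | set(neighbors))
    return {
        gene: [len(samples_by_gene[gene]) / n_samples, len(neighbors[gene])]
        for gene in genes
    }


# Toy, made-up inputs: four samples and a tiny interaction network.
mutations = [
    ("s1", "GENE_A"), ("s2", "GENE_A"),
    ("s1", "GENE_B"), ("s3", "GENE_B"),
    ("s3", "GENE_C"),
]
edges = [("GENE_A", "GENE_D"), ("GENE_A", "GENE_C"), ("GENE_C", "GENE_E")]

features = build_features(mutations, edges, n_samples=4)
for gene, vector in features.items():
    print(gene, vector)
```

In this toy data, GENE_A and GENE_B are mutated equally often, but only GENE_A has interaction partners, so the network feature separates two genes that the mutation feature alone cannot. The resulting vectors, paired with driver or false-positive labels, could then be passed to any standard classifier.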
Pages: 81-92
Number of pages: 12