Combining Mutation and Gene Network Data in a Machine Learning Approach for False-Positive Cancer Driver Gene Discovery

Cited by: 3
Authors
Cutigi, Jorge Francisco [1, 2]
Evangelista, Renato Feijo [2]
Ramos, Rodrigo Henrique [1, 2]
Lage Ferreira, Cynthia de Oliveira [2]
Evangelista, Adriane Feijo [3]
de Carvalho, Andre C. P. L. F. [2]
Simao, Adenilso [2]
Affiliations
[1] Fed Inst Sao Paulo, Sao Carlos, SP, Brazil
[2] Univ Sao Paulo, Sao Carlos, SP, Brazil
[3] Barretos Canc Hosp, Barretos, SP, Brazil
Source
ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2020 | 2020, Vol. 12558
Keywords
Cancer bioinformatics; Driver genes; False-positive driver; Complex networks; Machine learning; PATHWAYS
DOI
10.1007/978-3-030-65775-8_8
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
An increasing interest in Cancer Genomics research emerged from the advent and widespread use of next-generation sequencing technologies, which have generated a large amount of digital biological data. However, not all of this information in fact contributes to cancer studies. For instance, false-positive driver genes may contain characteristics of cancer genes but are not actually relevant to cancer initiation and progression. Including this type of gene in cancer studies may lead to identifying unrealistic trends in the data and mislead biomedical decisions. Therefore, proper screening to detect this specific type of gene among genes considered drivers is of utmost importance. This work is focused on the development of models dedicated to this task. Support Vector Machine (SVM) and Random Forest (RF) machine learning algorithms were selected to induce predictive models to classify supposedly driver genes as real drivers or false-positive drivers based on both mutation data and gene network interactions. The results confirmed that the combination of the two sources of information improves the performance of the models. Moreover, SVM and RF models achieved a classification accuracy of 85.0% and 82.4% over labeled data, respectively. Finally, a literature-based analysis was performed over the classification of a new set of genes to further validate the concept.
Pages: 81-92
Page count: 12
Related papers
50 in total
  • [1] Mitigating False-Positive Associations in Rare Disease Gene Discovery
    Akle, Sebastian
    Chun, Sung
    Jordan, Daniel M.
    Cassa, Christopher A.
    HUMAN MUTATION, 2015, 36 (10) : 998 - 1003
  • [2] Inferring Gene Network Rewiring by Combining Gene Expression and Gene Mutation Data
    Tu, Jia-Juan
    Ou-Yang, Le
    Hu, Xiaohua
    Zhang, Xiao-Fei
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2019, 16 (03) : 1042 - 1048
  • [3] False-positive HIV serology, Candida lusitaniae pneumonia, and a novel mutation in the CYBB gene
    Banday, Aaqib Zaffar
    Nataraj, Lokesh
    Jindal, Ankur Kumar
    Kaur, Harsimran
    Gummadi, Anjani
    Sharma, Madhubala
    Pandiarajan, Vignesh
    Rawat, Amit
    IMMUNOBIOLOGY, 2021, 226 (04)
  • [4] MUFFINN: cancer gene discovery via network analysis of somatic mutation data
    Cho, Ara
    Shim, Jung Eun
    Kim, Eiru
    Supek, Fran
    Lehner, Ben
    Lee, Insuk
    GENOME BIOLOGY, 2016, 17
  • [5] MUFFINN: cancer gene discovery via network analysis of somatic mutation data
    Ara Cho
    Jung Eun Shim
    Eiru Kim
    Fran Supek
    Ben Lehner
    Insuk Lee
    Genome Biology, 17
  • [6] Network embedding framework for driver gene discovery by combining functional and structural information
    Xin Chu
    Boxin Guan
    Lingyun Dai
    Jin-xing Liu
    Feng Li
    Junliang Shang
    BMC Genomics, 24
  • [7] Network embedding framework for driver gene discovery by combining functional and structural information
    Chu, Xin
    Guan, Boxin
    Dai, Lingyun
    Liu, Jin-xing
    Li, Feng
    Shang, Junliang
    BMC GENOMICS, 2023, 24 (01)
  • [8] Evaluation of Machine Learning Classification Models for False-Positive Reduction in Prostate Cancer Detection Using MRI Data
    Rippa, Malte
    Schulze, Ruben
    Kenyon, Georgia
    Himstedt, Marian
    Kwiatkowski, Maciej
    Grobholz, Rainer
    Wyler, Stephen
    Cornelius, Alexander
    Schindera, Sebastian
    Burn, Felice
    DIAGNOSTICS, 2024, 14 (15)
  • [9] Combining gene expression data from inflamed tissue and machine learning for blood biomarker discovery
    Milanez-Almeida, Pedro
    Martins, Andrew J.
    Narayanan, Manikandan
    Torabi-Parizi, Parizad
    Franco, Luis
    Tsang, John S.
    Germain, Ronald N.
    JOURNAL OF IMMUNOLOGY, 2017, 198 (01)
  • [10] False-positive rifampicin resistance on Xpert® MTB/RIF caused by a silent mutation in the rpoB gene
    Mathys, V.
    van de Vyvere, M.
    de Droogh, E.
    Soetaert, K.
    Groenen, G.
    INTERNATIONAL JOURNAL OF TUBERCULOSIS AND LUNG DISEASE, 2014, 18 (10) : 1255 - 1257