The use of Gene Ontology terms and KEGG pathways for analysis and prediction of oncogenes

Cited by: 51
Authors
Xing, Zhihao [1]
Chu, Chen [2]
Chen, Lei [3]
Kong, Xiangyin [1]
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Shanghai 200031, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Biochem & Cell Biol, Shanghai 200031, Peoples R China
[3] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
Source
BIOCHIMICA ET BIOPHYSICA ACTA-GENERAL SUBJECTS | 2016 / Volume 1860 / Issue 11
Funding
National Natural Science Foundation of China
Keywords
Oncogenes; Gene Ontology; KEGG pathway; Minimum redundancy maximum relevance; Incremental feature selection; Random forest; PROTEIN INTERACTION NETWORKS; HUMAN CANCER; EXPRESSION; RECEPTOR; IDENTIFICATION; DIFFERENTIATION; TRANSFORMATION; POLYMORPHISMS; ASSOCIATIONS; RELEVANCE;
DOI
10.1016/j.bbagen.2016.01.012
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Background: Oncogenes are genes that have the potential to cause cancer. Most normal cells undergo programmed cell death (apoptosis), but activated oncogenes can help cells evade apoptosis and survive. Studying oncogenes therefore helps clarify how various types of cancer form and develop.
Methods: In this study, we proposed a computational method, called OPM, for investigating oncogenes from the perspective of Gene Ontology (GO) and biological pathways. All investigated genes, comprising validated oncogenes retrieved from public databases and other genes not yet reported to be oncogenes, were encoded as numeric vectors according to the enrichment theory of GO terms and KEGG pathways. Two feature selection methods, minimum redundancy maximum relevance and incremental feature selection, together with a machine learning algorithm, random forest, were used to analyze the numeric vectors and extract key GO terms and KEGG pathways.
Results: GO terms and KEGG pathways were discussed in terms of their relevance to oncogenes. Some important GO terms and KEGG pathways were extracted using the feature selection methods and were confirmed to be highly related to oncogenes. Their importance for predicting oncogenes was further demonstrated by using them to find new putative oncogenes.
Conclusions: This study investigated oncogenes based on GO terms and KEGG pathways. Some important GO terms and KEGG pathways were confirmed to be highly related to oncogenes. We hope that these GO terms and KEGG pathways can provide new insight for the study of oncogenes, particularly for building more effective prediction models to identify novel oncogenes. The program is available upon request.
General significance: We hope that the new findings reported in this study may provide new insight for the investigation of oncogenes.
This article is part of a Special Issue entitled "System Genetics". © 2016 Published by Elsevier B.V.
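The encoding step described under Methods can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes one common formulation of an enrichment score: the negative base-10 logarithm of the hypergeometric tail probability that a gene-associated set overlaps a GO term or KEGG pathway gene set at least as much as observed. The function names (`hypergeom_sf`, `enrichment_score`, `encode_genes`), the toy gene and term identifiers, and the choice of what set is associated with each gene are all hypothetical and are not taken from the paper or the OPM program.

```python
from math import comb, log10


def hypergeom_sf(overlap, population, successes, draws):
    """Return P(X >= overlap) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrichment_score(gene_set, term_set, background_size):
    """Assumed score: -log10 of the hypergeometric tail probability of the overlap."""
    overlap = len(gene_set & term_set)
    p = hypergeom_sf(overlap, background_size, len(term_set), len(gene_set))
    # Guard against floating-point underflow for extremely small probabilities.
    return -log10(p) if p > 0 else float("inf")


def encode_genes(gene_sets, term_sets, background_size):
    """Encode each gene as a vector with one enrichment score per term or pathway."""
    term_ids = sorted(term_sets)
    vectors = {
        gene: [enrichment_score(members, term_sets[t], background_size) for t in term_ids]
        for gene, members in gene_sets.items()
    }
    return term_ids, vectors


# Toy data: integers stand in for gene identifiers (hypothetical, not real annotations).
BACKGROUND_SIZE = 20
toy_terms = {
    "term_A": {1, 2, 3, 4, 5},
    "term_B": {10, 11, 12},
}
toy_genes = {
    "gene_x": {1, 2, 15, 16},   # shares two members with term_A
    "gene_y": {17, 18, 19},     # shares nothing with either term
}

term_ids, vectors = encode_genes(toy_genes, toy_terms, BACKGROUND_SIZE)
for gene, vec in vectors.items():
    print(gene, dict(zip(term_ids, (round(v, 3) for v in vec))))
```

Vectors of this kind would then be the input to a feature ranking such as minimum redundancy maximum relevance, followed by incremental feature selection with a classifier such as random forest, as the abstract outlines. Those steps are omitted here because the abstract gives no parameters for them.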
Pages: 2725-2734
Page count: 10