RGBM: regularized gradient boosting machines for identification of the transcriptional regulators of discrete glioma subtypes

Cited by: 35
Authors
Mall, Raghvendra [1]
Cerulo, Luigi [2, 3]
Garofano, Luciano [2, 3]
Frattini, Veronique [4]
Kunji, Khalid [1]
Bensmail, Halima [1]
Sabedot, Thais S. [5, 6]
Noushmehr, Houtan [5, 6]
Lasorella, Anna [4, 7, 8]
Iavarone, Antonio [4, 7, 9]
Ceccarelli, Michele [2, 3]
Affiliations
[1] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Doha, Qatar
[2] Univ Sannio, Dept Sci & Technol, Benevento, Italy
[3] BIOGEM Ist Ric Genet G Salvatore, Ariano Irpino, Italy
[4] Columbia Univ, Med Ctr, Inst Canc Genet, New York, NY 10032 USA
[5] Henry Ford Hlth Syst, Brain Tumor Ctr, Dept Neurosurg, Detroit, MI USA
[6] Univ Sao Paulo, Ribeirao Preto Med Sch, Dept Surg & Anat, Dept Genet CISBi NAP, Ribeirao Preto, Brazil
[7] Columbia Univ, Med Ctr, Dept Pathol & Cell Biol, New York, NY 10032 USA
[8] Columbia Univ, Med Ctr, Dept Pediat, New York, NY 10032 USA
[9] Columbia Univ, Med Ctr, Dept Neurol, New York, NY 10032 USA
Keywords
NETWORK INFERENCE; L-CURVE; GENE NETWORKS; ALGORITHM; PATHWAYS; FUSIONS; RECONSTRUCTION; EXPANSION; SELECTION; FGFR;
DOI
10.1093/nar/gky015
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
We propose a generic framework for gene regulatory network (GRN) inference approached as a feature selection problem. GRNs obtained using machine learning (ML) techniques are often dense, whereas real GRNs are rather sparse. We use a Tikhonov-regularization-inspired optimal L-curve criterion that utilizes the edge weight distribution for a given target gene to determine the optimal set of transcription factors (TFs) associated with it. Our proposed framework allows the incorporation of a mechanistic active binding network based on cis-regulatory motif analysis. We evaluate our regularization framework in conjunction with two nonlinear ML techniques, namely gradient boosting machines (GBM) and random forests (GENIE), resulting in regularized feature-selection-based methods called RGBM and RGENIE, respectively. RGBM has been used to identify the main transcription factors that are causally involved as master regulators of the gene expression signature activated in FGFR3-TACC3-positive glioblastoma. Here, we illustrate that RGBM identifies the main regulators of the molecular subtypes of brain tumors. Our analysis reveals the identity and corresponding biological activities of the master regulators characterizing the difference between G-CIMP-high and G-CIMP-low subtypes and between PA-like and LGm6-GBM, thus providing a clue to the yet undetermined nature of the transcriptional events among these subtypes.
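The sparsification step described in the abstract can be illustrated with a small sketch. This is not the authors' RGBM implementation: it assumes the edge weights for one target gene are already available as a dictionary, and it substitutes a simple "largest gap below the chord" knee heuristic for the Tikhonov-inspired L-curve criterion of the paper. The function name `select_tfs_by_corner` and the TF labels are hypothetical.

```python
from typing import Dict, List


def select_tfs_by_corner(weights: Dict[str, float]) -> List[str]:
    """Keep the TFs ranked before the corner of the sorted weight curve.

    Assumption: `weights` maps each candidate TF to its importance score
    for a single target gene. The corner rule below is a generic knee
    heuristic, used here only as a stand-in for the paper's criterion.
    """
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    n = len(ranked)
    if n < 3:
        return [tf for tf, _ in ranked]  # too few points to define a corner

    values = [w for _, w in ranked]
    span = values[0] - values[-1]
    if span <= 0:
        return [tf for tf, _ in ranked]  # flat curve: nothing to prune

    best_idx, best_gap = 0, -1.0
    for i, w in enumerate(values):
        # Normalise rank and weight to [0, 1] so the rule is scale-free.
        x = i / (n - 1)
        y = (w - values[-1]) / span
        # Gap below the chord joining (0, 1) and (1, 0); proportional
        # to the perpendicular distance from that chord.
        gap = 1.0 - x - y
        if gap > best_gap:
            best_idx, best_gap = i, gap

    # Keep everything ranked before the corner, and at least one TF.
    return [tf for tf, _ in ranked[:max(best_idx, 1)]]


if __name__ == "__main__":
    scores = {"TF_A": 0.9, "TF_B": 0.8, "TF_C": 0.1, "TF_D": 0.05, "TF_E": 0.02}
    print(select_tfs_by_corner(scores))
```

With the illustrative scores above, the curve drops sharply after the second TF, so only the two highest-weighted TFs are retained and the long tail of weak edges is pruned.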
Pages: 16
Related papers
72 in total
[51]   Integrative random forest for gene regulatory network inference [J].
Petralia, Francesca ;
Wang, Pei ;
Yang, Jialiang ;
Tu, Zhidong .
BIOINFORMATICS, 2015, 31 (12) :197-205
[52]   From Knockouts to Networks: Establishing Direct Cause-Effect Relationships through Graph Analysis [J].
Pinna, Andrea ;
Soranzo, Nicola ;
de la Fuente, Alberto .
PLOS ONE, 2010, 5 (10)
[53]   Causal Mechanistic Regulatory Network for Glioblastoma Deciphered Using Systems Genetics Network Analysis [J].
Plaisier, Christopher L. ;
O'Brien, Sofie ;
Bernard, Brady ;
Reynolds, Sheila ;
Simon, Zac ;
Toledo, Chad M. ;
Ding, Yu ;
Reiss, David J. ;
Paddison, Patrick J. ;
Baliga, Nitin S. .
CELL SYSTEMS, 2016, 3 (02) :172-186
[54]   Towards a Rigorous Assessment of Systems Biology Models: The DREAM3 Challenges [J].
Prill, Robert J. ;
Marbach, Daniel ;
Saez-Rodriguez, Julio ;
Sorger, Peter K. ;
Alexopoulos, Leonidas G. ;
Xue, Xiaowei ;
Clarke, Neil D. ;
Altan-Bonnet, Gregoire ;
Stolovitzky, Gustavo .
PLOS ONE, 2010, 5 (02)
[55]   Context-specific transcriptional regulatory network inference from global gene expression maps using double two-way t-tests [J].
Qi, Jianlong ;
Michoel, Tom .
BIOINFORMATICS, 2012, 28 (18) :2325-2332
[56]   Stability of building gene regulatory networks with sparse autoregressive models [J].
Rajapakse, Jagath C. ;
Mundra, Piyushkumar A. .
BMC BIOINFORMATICS, 2011, 12
[57]   GeneNetWeaver: in silico benchmark generation and performance profiling of network inference methods [J].
Schaffter, Thomas ;
Marbach, Daniel ;
Floreano, Dario .
BIOINFORMATICS, 2011, 27 (16) :2263-2270
[58]   Discovering molecular pathways from protein interaction and gene expression data [J].
Segal, E. ;
Wang, H. ;
Koller, D. .
BIOINFORMATICS, 2003, 19 :i264-i272
[59]   Transforming Fusions of FGFR and TACC Genes in Human Glioblastoma [J].
Singh, Devendra ;
Chan, Joseph Minhow ;
Zoppoli, Pietro ;
Niola, Francesco ;
Sullivan, Ryan ;
Castano, Angelica ;
Liu, Eric Minwei ;
Reichel, Jonathan ;
Porrati, Paola ;
Pellegatta, Serena ;
Qiu, Kunlong ;
Gao, Zhibo ;
Ceccarelli, Michele ;
Riccardi, Riccardo ;
Brat, Daniel J. ;
Guha, Abhijit ;
Aldape, Ken ;
Golfinos, John G. ;
Zagzag, David ;
Mikkelsen, Tom ;
Finocchiaro, Gaetano ;
Lasorella, Anna ;
Rabadan, Raul ;
Iavarone, Antonio .
SCIENCE, 2012, 337 (6099) :1231-1235
[60]   ENNET: inferring large gene regulatory networks from expression data using gradient boosting [J].
Slawek, Janusz ;
Arodz, Tomasz .
BMC SYSTEMS BIOLOGY, 2013, 7