Guidelines for using sigQC for systematic evaluation of gene signatures

Times cited: 21
Authors
Dhawan, Andrew [1, 2]
Barberis, Alessandro [1, 2]
Cheng, Wei-Chen [1, 2]
Domingo, Enric [1, 2]
West, Catharine [3]
Maughan, Tim [1, 2]
Scott, Jacob G. [4]
Harris, Adrian L. [1, 2]
Buffa, Francesca M. [1, 2]
Affiliations
[1] Univ Oxford, CRUK Oxford Inst, MRC, Computat Biol & Integrat Genom Lab, Oxford, England
[2] Univ Oxford, Dept Oncol, Oxford, England
[3] Univ Manchester, Div Canc Studies, Manchester, Lancs, England
[4] Cleveland Clin, Translat Hematol & Oncol Res, Cleveland, OH 44106 USA
Funding
European Research Council; UK Medical Research Council
Keywords
PROGNOSIS; REVEALS; BIOCONDUCTOR; CANCER;
DOI
10.1038/s41596-019-0136-8
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
With the increased use of next-generation sequencing generating large amounts of genomic data, gene expression signatures are becoming critically important tools for the interpretation of these data, and are poised to have a substantial effect on diagnosis, management, and prognosis for a number of diseases. It is becoming crucial to establish whether the expression patterns and statistical properties of sets of genes, or gene signatures, are conserved across independent datasets. Conversely, it is necessary to compare established signatures on the same dataset to better understand how they capture different clinical or biological characteristics. Here we describe how to use sigQC, a tool that enables a streamlined, systematic approach for the evaluation of previously obtained gene signatures across multiple gene expression datasets. We implemented sigQC in an R package, making it accessible to users who have knowledge of file input/output and matrix manipulation in R and a moderate grasp of core statistical principles. SigQC has been adopted in basic biology and translational studies, including, but not limited to, the evaluation of multiple gene signatures for potential clinical use as cancer biomarkers. This protocol uses a previously obtained signature for breast cancer metastasis as an example to illustrate the critical quality control steps involved in evaluating its expression, variability, and structure in breast tumor RNA-sequencing data, a different dataset from that in which the signature was originally derived. We demonstrate how the outputs created from sigQC can be used for the evaluation of gene signatures on large-scale gene expression datasets.
Pages: 1377-1400
Page count: 24