CNARA: reliability assessment for genomic copy number profiles

Cited by: 2
Authors
Ai, Ni [1, 2]
Cai, Haoyang [3]
Solovan, Caius [4]
Baudis, Michael [1, 2]
Affiliations
[1] Univ Zurich, Inst Mol Life Sci, Winterthurerstr 190, CH-8057 Zurich, Switzerland
[2] Univ Zurich, Swiss Inst Bioinformat, Winterthurerstr 190, CH-8057 Zurich, Switzerland
[3] Sichuan Univ, Coll Life Sci, Key Lab Bioresources & Ecoenvironm, Ctr Growth Metab & Aging, Chengdu 610064, Sichuan, Peoples R China
[4] Victor Babes Univ Med & Pharm, Dept Dermatol, Timisoara, Romania
Source
BMC GENOMICS | 2016 / Vol. 17
Keywords
Copy number profile; CNA; Reliability assessment; CIRCULAR BINARY SEGMENTATION; ARRAY; AMPLIFICATIONS; HYBRIDIZATION; EXPRESSION; DELETIONS; SHOW;
DOI
10.1186/s12864-016-3074-7
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts that may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. In addition, a large derivative log ratio spread (DLRS) of the probes, which correlates with poor DNA quality, is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly.
Results: Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform, and may be adapted for those sequencing platforms from which copy number estimates can be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA.
Conclusions: We have developed a method for the assessment of genomic copy number profiling data, and suggest applying it in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and could support the reliable functional attribution of copy number aberrations, especially in cancer research.
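As a rough illustration of the DLRS quantity mentioned in the abstract, the sketch below computes one commonly used robust formulation: the interquartile range of differences between neighbouring probe log ratios, rescaled to a per-probe standard deviation. The function name `dlrs`, the IQR-based estimator, and the constant 1.349 (the IQR of a standard normal distribution) are assumptions made for illustration; the abstract does not state how CNARA itself computes DLRS or its four features.

```python
import math
import statistics


def dlrs(log_ratios):
    """Robust derivative log ratio spread (illustrative formulation).

    `log_ratios` is a sequence of probe log2 ratios ordered by genomic
    position. NaN values are dropped before the calculation.
    """
    values = [v for v in log_ratios if not math.isnan(v)]
    if len(values) < 3:
        raise ValueError("need at least three non-missing probes")

    # Differences between neighbouring probes cancel the piecewise
    # constant copy number signal everywhere except at breakpoints.
    diffs = [b - a for a, b in zip(values, values[1:])]

    # Quartiles of the differences; the IQR is insensitive to the few
    # large jumps that occur at true copy number breakpoints.
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")

    # IQR / 1.349 estimates the standard deviation of the differences
    # under normality; dividing by sqrt(2) converts the spread of a
    # difference of two probes into the spread of a single probe.
    return (q3 - q1) / (1.349 * math.sqrt(2))
```

Because the estimate is based on quartiles, a clean profile with a single copy number step yields a DLRS of zero, while probe-level noise raises the value roughly in proportion to its standard deviation.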
Pages: 11