A Study of Cell-Free DNA Fragmentation Pattern and Its Application in DNA Sample Type Classification

Cited by: 2
Authors
Chen, Shifu [1, 2, 3]
Liu, Ming [2]
Zhang, Xiaoni [2]
Long, Renwen [2]
Wang, Yixing [2]
Han, Yue [2]
Zhang, Shiwei [2]
Xu, Mingyan [2]
Gu, Jia [1]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] HaploX Biotechnol, Shenzhen 518057, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Cell free DNA; liquid biopsy; fragmentation; pattern recognition; CANCER; ORIGIN; FETAL;
DOI
10.1109/TCBB.2017.2723388
CLC number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Plasma cell-free DNA (cfDNA) has characteristic fragmentation patterns, which produce non-random base content curves in the first cycles of its sequencing data. We studied these patterns and found that whether a sample is cfDNA can be determined from the first 10 cycles of its base content curves alone. We analyzed 3,189 FastQ files, comprising 1,442 plasma cfDNA, 1,234 genomic DNA, 507 FFPE tumour DNA, and 6 urinary cfDNA samples. In-depth analysis of these data showed that the patterns are stable enough to distinguish cfDNA from other kinds of DNA samples. Based on this finding, we built classification models that recognize cfDNA samples from their sequencing data, training pattern recognition models with several classification algorithms, including k-nearest neighbors (KNN), random forest, and support vector machine (SVM). The results of 1,000 iterations of .632+ bootstrapping showed that all of these classifiers achieved an average accuracy above 98 percent, indicating that the cfDNA patterns are unique and make the dataset highly separable. The best result was obtained with a random forest classifier, at 99.89 percent average accuracy (sigma = 0.00068). A tool called CfdnaPattern (http://github.com/OpenGene/CfdnaPattern) has been developed to train the model and to predict whether a sample is cfDNA.
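The base content curves mentioned in the abstract can be illustrated with a short sketch. The Python code below is not taken from CfdnaPattern; the function names, the cycle-major feature layout, and the choice to ignore N bases are all assumptions made for illustration. It computes the fraction of A, C, G, and T at each of the first 10 sequencing cycles, giving a 40-value vector that could serve as input to a classifier such as KNN, random forest, or SVM.

```python
from collections import Counter
from typing import Iterable, Iterator, List

BASES = "ACGT"


def read_fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each record in an uncompressed FastQ file.

    Assumes the common 4-lines-per-record layout (hypothetical helper).
    """
    with open(path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 1:  # second line of each record is the sequence
                yield line.strip()


def base_content_features(reads: Iterable[str], cycles: int = 10) -> List[float]:
    """Return per-cycle base fractions for the first `cycles` positions.

    The output is cycle-major: [A1, C1, G1, T1, A2, C2, ...], so its
    length is 4 * cycles. Bases other than A/C/G/T (e.g. N) are ignored.
    """
    counts = [Counter() for _ in range(cycles)]
    for read in reads:
        for i, base in enumerate(read[:cycles].upper()):
            counts[i][base] += 1

    features: List[float] = []
    for cycle_counts in counts:
        total = sum(cycle_counts[b] for b in BASES)
        for b in BASES:
            # A cycle with no called bases contributes zeros.
            features.append(cycle_counts[b] / total if total else 0.0)
    return features
```

For example, `base_content_features(read_fastq_sequences("sample.fq"))` would produce one feature vector per file (the file name is a placeholder); a set of such vectors with known sample types could then be passed to any standard classifier.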
Pages: 1718-1722
Page count: 5
Related papers
30 in total
  • [1] Apoptotic cell-free DNA promotes inflammation in haemodialysis patients
    Atamaniuk, Johanna
    Kopecky, Chantal
    Skoupy, Sonja
    Saeemann, Marcus D.
    Weichhart, Thomas
    [J]. NEPHROLOGY DIALYSIS TRANSPLANTATION, 2012, 27 (03) : 902 - 905
  • [2] Botezatu I, 2000, CLIN CHEM, V46, P1078
  • [3] High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA
    Chandrananda, Dineika
    Thorne, Natalie P.
    Bahlo, Melanie
    [J]. BMC MEDICAL GENOMICS, 2015, 8
  • [4] Investigating and Correcting Plasma DNA Sequencing Coverage Bias to Enhance Aneuploidy Discovery
    Chandrananda, Dineika
    Thorne, Natalie P.
    Ganesamoorthy, Devika
    Bruno, Damien L.
    Benjamini, Yuval
    Speed, Terence P.
    Slater, Howard R.
    Bahlo, Melanie
    [J]. PLOS ONE, 2014, 9 (01):
  • [5] Noninvasive prenatal diagnosis of fetal chromosomal aneuploidy by massively parallel genomic sequencing of DNA in maternal plasma
    Chiu, Rossa W. K.
    Chan, K. C. Allen
    Gao, Yuan
    Lau, Virginia Y. M.
    Zheng, Wenli
    Leung, Tak Y.
    Foo, Chris H. F.
    Xie, Bin
    Tsui, Nancy B. Y.
    Lun, Fiona M. F.
    Zee, Benny C. Y.
    Lau, Tze K.
    Cantor, Charles R.
    Lo, Y. M. Dennis
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2008, 105 (51) : 20458 - 20463
  • [6] Gene selection and classification of microarray data using random forest
    Díaz-Uriarte, R
    de Andrés, SA
    [J]. BMC BIOINFORMATICS, 2006, 7 (1)
  • [7] Circulating mutant DNA to assess tumor dynamics
    Diehl, Frank
    Schmidt, Kerstin
    Choti, Michael A.
    Romans, Katharine
    Goodman, Steven
    Li, Meng
    Thornton, Katherine
    Agrawal, Nishant
    Sokoll, Lori
    Szabo, Steve A.
    Kinzler, Kenneth W.
    Vogelstein, Bert
    Diaz, Luis A., Jr.
    [J]. NATURE MEDICINE, 2008, 14 (09) : 985 - 990
  • [8] Do H., 2013, CLIN CHEM, V59
  • [9] Improvements on cross-validation: The .632+ bootstrap method
    Efron, B
    Tibshirani, R
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1997, 92 (438) : 548 - 560
  • [10] 1977 Rietz Lecture - Bootstrap methods: another look at the jackknife
    Efron, B
    [J]. ANNALS OF STATISTICS, 1979, 7 (01) : 1 - 26