Combinatorial detection algorithm for copy number variations using high-throughput sequencing reads

Cited by: 0
Authors
Yang H. [1]
Zhu D. [1]
Affiliation
[1] School of Computer Science and Technology, Shandong University, Qingdao
Source
International Journal of Pattern Recognition and Artificial Intelligence | 2019 / Vol. 33 / No. 14
Funding
National Natural Science Foundation of China
Keywords
Combinatorial detection algorithm; Copy number variation; Hidden Markov model; High-throughput sequencing; Split read;
DOI
10.1142/S0218001419500228
Abstract
Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as the gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Recent studies have shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability, and their effects on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, read mappability, and the width of the analysis window. We then construct a hidden Markov model that maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of the algorithm. Experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms. © 2019 World Scientific Publishing Company.
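The abstract does not describe the proposed GC correction itself, so the following is only a minimal sketch of a widely used baseline, median normalization per GC bin, to illustrate what correcting read depth for GC bias can look like. It is not the authors' method. The function name `gc_correct`, the `bin_width` parameter, and the toy window data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_counts, gc_fractions, bin_width=0.01):
    """Median-based GC correction (generic baseline, NOT the paper's method).

    Each window's read count is scaled by
        overall_median / median_of_windows_with_similar_GC
    so that windows in every GC bin share the same median depth.
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    overall = median(read_counts)

    # Group window read counts by discretized GC fraction.
    bins = defaultdict(list)
    for rc, gc in zip(read_counts, gc_fractions):
        bins[round(gc / bin_width)].append(rc)
    bin_median = {b: median(v) for b, v in bins.items()}

    corrected = []
    for rc, gc in zip(read_counts, gc_fractions):
        m = bin_median[round(gc / bin_width)]
        # Guard against empty-coverage bins to avoid division by zero.
        corrected.append(rc * overall / m if m > 0 else 0.0)
    return corrected


# Toy data (hypothetical): three GC levels with very different raw depth.
counts = [100, 110, 50, 55, 200, 210]
gc = [0.40, 0.40, 0.30, 0.30, 0.60, 0.60]
corrected = gc_correct(counts, gc)
```

After correction, the windows in each GC bin are centred on the same median depth, so residual deviations are more likely to reflect copy number than GC content. A segmentation step such as the HMM described in the abstract would then operate on the corrected signal.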