Combinatorial detection algorithm for copy number variations using high-throughput sequencing reads

Cited by: 0
Authors
Yang H. [1]
Zhu D. [1]
Affiliations
[1] School of Computer Science and Technology, Shandong University, Qingdao
Source
International Journal of Pattern Recognition and Artificial Intelligence | 2019 / Vol. 33 / No. 14
Funding
National Natural Science Foundation of China
Keywords
Combinatorial detection algorithm; Copy number variation; Hidden Markov model; High-throughput sequencing; Split read
DOI
10.1142/S0218001419500228
Abstract
Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as a gain or loss of DNA segments larger than 1 kb. CNV occurs not only in the human genome but also in plant genomes. Studies have shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability, and their effects on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads, based on a hidden Markov model (CNV-HMM), are proposed. The corrected read depth signals have lower correlation with GC content, read mappability, and the width of the analysis window. We then construct a hidden Markov model that maps the reads onto the reference genome and records the unmapped reads, which are counted and normalized. CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of the algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms. © 2019 World Scientific Publishing Company.
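The abstract does not spell out the paper's new GC-bias correction, so the following is only an illustrative sketch of the widely used median-ratio correction for read depth signals, not the authors' method. The function name `gc_correct`, its parameters, and the 1% GC bin width are assumptions made for this example. Each window's read count is rescaled by the ratio of the overall median count to the median count of windows with similar GC content.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_counts, gc_fractions, bin_width=0.01):
    """Median-ratio GC correction of per-window read counts.

    Hypothetical helper for illustration; not the correction proposed
    in the paper. read_counts[i] is the read count of window i and
    gc_fractions[i] is that window's GC fraction in [0, 1].
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")
    if not read_counts:
        return []

    overall_median = median(read_counts)

    # Group read counts by GC bin (bin index = GC fraction / bin width, rounded).
    counts_by_bin = defaultdict(list)
    for count, gc in zip(read_counts, gc_fractions):
        counts_by_bin[round(gc / bin_width)].append(count)
    bin_median = {b: median(v) for b, v in counts_by_bin.items()}

    corrected = []
    for count, gc in zip(read_counts, gc_fractions):
        m = bin_median[round(gc / bin_width)]
        # Leave the count unchanged when the bin median is zero,
        # since the ratio is undefined there.
        corrected.append(count * overall_median / m if m > 0 else float(count))
    return corrected


if __name__ == "__main__":
    counts = [100, 110, 90, 200, 220, 180]
    gc = [0.40, 0.40, 0.40, 0.60, 0.60, 0.60]
    print(gc_correct(counts, gc))
```

After this kind of correction, windows with different GC content but the same underlying copy number should have comparable read depth, which is the property the abstract describes as lower correlation between the corrected signal and GC content.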