Adaptive Savitzky-Golay Filters for Analysis of Copy Number Variation Peaks from Whole-Exome Sequencing Data

Cited by: 5
Authors
Ochieng, Peter Juma [1]
Maroti, Zoltan [2, 3]
Dombi, Jozsef [1]
Kresz, Miklos [4, 5, 6]
Bekesi, Jozsef [1]
Kalmar, Tibor [2, 3]
Affiliations
[1] Univ Szeged, Inst Informat, 2 Arpad Ter, H-6720 Szeged, Hungary
[2] Univ Szeged, Albert Szent Gyorgy Hlth Ctr, Dept Pediat, H-6725 Szeged, Hungary
[3] Univ Szeged, Pediat Hlth Ctr, H-6725 Szeged, Hungary
[4] InnoRenew CoE, Livade 6, Izola 6310, Slovenia
[5] Univ Primorska, Andrej Marusic Inst, Muzejski Trg 2, Koper 6000, Slovenia
[6] Univ Szeged, Dept Appl Informat, Boldogasszony Sgt 6, H-6725 Szeged, Hungary
Keywords
copy number variation; read depth; adaptive Savitzky-Golay
DOI
10.3390/info14020128
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Copy number variation (CNV) is a form of structural variation in the human genome that provides medical insight into complex human diseases. Although whole-genome sequencing is becoming more affordable, whole-exome sequencing (WES) remains an important tool in clinical diagnostics. Because of the discontinuous nature and unique characteristics of sparse, target-enrichment-based WES data, the analysis and detection of CNV peaks remain difficult tasks. Savitzky-Golay (SG) smoothing is well known as a fast and efficient smoothing method, yet no study has documented its use for CNV peak detection. The effectiveness of the classical SG filter depends on proper selection of the window length and polynomial degree, which should correspond to the scale of the peak; for peaks with a high rate of change, the effectiveness of the filter can be restricted. Based on the Savitzky-Golay algorithm, this paper introduces a novel adaptive method to smooth irregular peak distributions. The proposed method ensures high-precision noise reduction by dynamically modifying the results of the prior smoothing to automatically adjust its parameters. It also offers a feature extraction technique based on density and Euclidean distance. The performance evaluation demonstrates that adaptive Savitzky-Golay filtering performs better than classical Savitzky-Golay filtering and other peer filtering methods. According to the experimental results, the method effectively detects CNV peaks across all genomic segments for both short and long tags, with minimal peak height fidelity values (i.e., low estimation bias). These results demonstrate how well the adaptive Savitzky-Golay filtering method works and how its use in the detection of CNV peaks can complement existing techniques for CNV peak analysis.
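The abstract's point about window length and polynomial degree can be illustrated with a minimal sketch of the classical, fixed-parameter SG filter in plain Python. This is not the adaptive method proposed in the paper. The function names `sg_weights` and `sg_smooth`, the toy read-depth values, and the edge handling are assumptions made only for illustration.

```python
def sg_weights(window, degree):
    """Central-point Savitzky-Golay weights for an odd window length."""
    if window % 2 == 0 or degree >= window:
        raise ValueError("window must be odd and larger than degree")
    half = window // 2
    offsets = range(-half, half + 1)
    n = degree + 1
    # Normal equations (A^T A) c = e0, where A[j][k] = offset_j ** k.
    ata = [[float(sum(j ** (r + c) for j in offsets)) for c in range(n)]
           for r in range(n)]
    rhs = [1.0] + [0.0] * degree
    # Gauss-Jordan elimination with partial pivoting on the small system.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(n):
            if r != col:
                f = ata[r][col] / ata[col][col]
                ata[r] = [a - f * b for a, b in zip(ata[r], ata[col])]
                rhs[r] -= f * rhs[col]
    coef = [rhs[i] / ata[i][i] for i in range(n)]
    # Weight of each sample = value at offset 0 of the least-squares polynomial.
    return [sum(coef[k] * j ** k for k in range(n)) for j in offsets]


def sg_smooth(signal, window, degree):
    """Apply the fixed-parameter SG filter to a list of numbers."""
    weights = sg_weights(window, degree)
    half = window // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        # Assumption of this sketch: edges are padded by repeating end values.
        out.append(sum(w * signal[min(max(i + k - half, 0), last)]
                       for k, w in enumerate(weights)))
    return out


# Toy read-depth profile (hypothetical values): flat baseline, narrow peak.
depth = [1.0] * 21
depth[9:12] = [2.0, 4.0, 2.0]

narrow = sg_smooth(depth, window=5, degree=2)   # window close to peak width
wide = sg_smooth(depth, window=11, degree=2)    # window much wider than peak
```

In this toy profile the wider window flattens the narrow peak more than the narrower one does (`wide[10]` is lower than `narrow[10]`), which is the kind of parameter sensitivity that motivates choosing the window per peak rather than globally.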
Pages: 21
Related papers
50 in total
  • [21] Enhanced copy number variants detection from whole-exome sequencing data using EXCAVATOR2
    D'Aurizio, Romina
    Pippucci, Tommaso
    Tattini, Lorenzo
    Giusti, Betti
    Pellegrini, Marco
    Magi, Alberto
    NUCLEIC ACIDS RESEARCH, 2016, 44 (20)
  • [22] Evaluation of Copy Number Variation (CNV) detection methods in whole exome sequencing data
    Zhang, Peng
    Ling, Hua
    Pugh, Elizabeth
    Hetrick, Kurt
    Witmer, Dane
    Sobreira, Nara
    Valle, David
    Doheny, Kimberly
    GENETIC EPIDEMIOLOGY, 2015, 39 (07) : 597 - 597
  • [23] Platform comparison of detecting copy number variants with microarrays and whole-exome sequencing
    de Ligt, Joep
    Boone, Philip M.
    Pfundt, Rolph
    Vissers, Lisenka E. L. M.
    de Leeuw, Nicole
    Shaw, Christine
    Brunner, Han G.
    Lupski, James R.
    Veltman, Joris A.
    Hehir-Kwa, Jayne Y.
    GENOMICS DATA, 2014, 2 : 144 - 146
  • [24] Gene-based comparative analysis of tools for estimating copy number alterations using whole-exome sequencing data
    Kim, Hyung-Yong
    Choi, Jin-Woo
    Lee, Jeong-Yeon
    Kong, Gu
    ONCOTARGET, 2017, 8 (16) : 27277 - 27285
  • [25] CopyDetective: Detection threshold-aware copy number variant calling in whole-exome sequencing data
    Sandmann, Sarah
    Woeste, Marius
    de Graaf, Aniek O.
    Burkhardt, Birgit
    Jansen, Joop H.
    Dugas, Martin
    GIGASCIENCE, 2020, 9 (11) : 1 - 10
  • [26] Copy number alterations detected by whole-exome and whole-genome sequencing of esophageal adenocarcinoma
    Xiaoyu Wang
    Xiaohong Li
    Yichen Cheng
    Xin Sun
    Xibin Sun
    Steve Self
    Charles Kooperberg
    James Y. Dai
    Human Genomics, 9
  • [27] CloneCNA: detecting subclonal somatic copy number alterations in heterogeneous tumor samples from whole-exome sequencing data
    Yu, Zhenhua
    Li, Ao
    Wang, Minghui
    BMC BIOINFORMATICS, 2016, 17
  • [28] Copy number alterations detected by whole-exome and whole-genome sequencing of esophageal adenocarcinoma
    Wang, Xiaoyu
    Li, Xiaohong
    Cheng, Yichen
    Sun, Xin
    Sun, Xibin
    Self, Steve
    Kooperberg, Charles
    Dai, James Y.
    HUMAN GENOMICS, 2015, 9
  • [29] CloneCNA: detecting subclonal somatic copy number alterations in heterogeneous tumor samples from whole-exome sequencing data
    Zhenhua Yu
    Ao Li
    Minghui Wang
    BMC Bioinformatics, 17
  • [30] Allele-specific copy-number discovery from whole-genome and whole-exome sequencing
    Wang, WeiBo
    Wang, Wei
    Sun, Wei
    Crowley, James J.
    Szatkiewicz, Jin P.
    NUCLEIC ACIDS RESEARCH, 2015, 43 (14)