Adaptive Savitzky-Golay Filters for Analysis of Copy Number Variation Peaks from Whole-Exome Sequencing Data

Cited by: 5
Authors
Ochieng, Peter Juma [1]
Maroti, Zoltan [2, 3]
Dombi, Jozsef [1]
Kresz, Miklos [4, 5, 6]
Bekesi, Jozsef [1]
Kalmar, Tibor [2, 3]
Affiliations
[1] Univ Szeged, Inst Informat, 2 Arpad Ter, H-6720 Szeged, Hungary
[2] Univ Szeged, Albert Szent Gyorgy Hlth Ctr, Dept Pediat, H-6725 Szeged, Hungary
[3] Univ Szeged, Pediat Hlth Ctr, H-6725 Szeged, Hungary
[4] InnoRenew CoE, Livade 6, Izola 6310, Slovenia
[5] Univ Primorska, Andrej Marusic Inst, Muzejski Trg 2, Koper 6000, Slovenia
[6] Univ Szeged, Dept Appl Informat, Boldogasszony Sgt 6, H-6725 Szeged, Hungary
Keywords
copy number variation; read depth; adaptive Savitzky-Golay
DOI
10.3390/info14020128
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Copy number variation (CNV) is a form of structural variation in the human genome that provides medical insight into complex human diseases. While whole-genome sequencing is becoming more affordable, whole-exome sequencing (WES) remains an important tool in clinical diagnostics. Because target-enrichment-based WES data are sparse and discontinuous, the analysis and detection of CNV peaks remain difficult tasks. Savitzky-Golay (SG) smoothing is well known as a fast and efficient smoothing method, but no study has documented its use for CNV peak detection. The effectiveness of the classical SG filter depends on the proper selection of the window length and polynomial degree, which should correspond to the scale of the peak; for peaks with a high rate of change, the effectiveness of the filter can otherwise be restricted. Based on the Savitzky-Golay algorithm, this paper introduces a novel adaptive method to smooth irregular peak distributions. The proposed method ensures high-precision noise reduction by dynamically modifying the results of the prior smoothing to adjust its parameters automatically. Our method also offers a feature extraction technique based on density and Euclidean distance. The performance evaluation shows that adaptive Savitzky-Golay filtering outperforms classical Savitzky-Golay filtering and other peer filtering methods. According to the experimental results, our method effectively detects CNV peaks across all genomic segments for both short and long tags, with minimal peak height fidelity values (i.e., low estimation bias). These results demonstrate the effectiveness of the adaptive Savitzky-Golay filtering method and show that it can complement the existing techniques used in CNV peak analysis.
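The abstract's point that the classical SG filter is sensitive to window length can be illustrated with a small sketch. The code below is not the adaptive method of the paper; it is only the classical filter, written in pure Python using the standard closed-form smoothing weights for a quadratic (equivalently cubic) fit. The function names `sg_coeffs`, `sg_smooth`, and `peak_height`, the mirrored edge handling, and the synthetic Gaussian peak are all assumptions made for illustration.

```python
import math


def sg_coeffs(m):
    """Smoothing weights of a quadratic/cubic SG filter, window length 2*m + 1."""
    if m < 1:
        raise ValueError("half-width m must be at least 1")
    denom = (2 * m + 3) * (2 * m + 1) * (2 * m - 1)
    return [3 * (3 * m * m + 3 * m - 1 - 5 * i * i) / denom
            for i in range(-m, m + 1)]


def sg_smooth(y, m):
    """Filter the list y, mirroring the signal at both ends (an assumed edge rule)."""
    if len(y) <= m:
        raise ValueError("signal must be longer than the half-width m")
    c = sg_coeffs(m)
    # m mirrored samples on each side, so every output point has a full window.
    padded = y[m:0:-1] + y + y[-2:-m - 2:-1]
    return [sum(cj * padded[k + j] for j, cj in enumerate(c))
            for k in range(len(y))]


def peak_height(sigma, m, half_span=50):
    """Height of a unit Gaussian peak (hypothetical test signal) after smoothing."""
    y = [math.exp(-(x * x) / (2.0 * sigma * sigma))
         for x in range(-half_span, half_span + 1)]
    return max(sg_smooth(y, m))


if __name__ == "__main__":
    # Same narrow peak, two window lengths: 7 samples versus 31 samples.
    for m in (3, 15):
        print(f"window {2 * m + 1:2d}: peak height {peak_height(3.0, m):.3f}")
```

With a window comparable to the peak width, the smoothed peak stays close to its true height of 1, whereas a window much wider than the peak removes a large part of that height. This loss of peak height fidelity is the kind of estimation bias that motivates choosing the window to match the scale of each peak.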
Pages: 21