vcfpp: a C++ API for rapid processing of the variant call format

Cited by: 1
Authors
Li, Zilong [1]
Affiliations
[1] Univ Copenhagen, Sect Computat & RNA Biol, DK-2200 Copenhagen, Denmark
Keywords
GENOTYPE IMPUTATION; SEQUENCE
DOI
10.1093/bioinformatics/btae049
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Given the widespread use of the variant call format (VCF/BCF), coupled with the continuous surge in big data, there remains a perpetual demand for fast and flexible methods to manipulate these comprehensive formats across various programming languages.
Results: This work presents vcfpp, a C++ API of HTSlib in a single file, providing an intuitive interface to manipulate VCF/BCF files rapidly and safely, in addition to being portable. Moreover, this work introduces the vcfppR package to demonstrate the development of a high-performance R package with vcfpp, allowing for rapid and straightforward variant analyses.
Availability and implementation: vcfpp is available from https://github.com/Zilong-Li/vcfpp under the MIT license. vcfppR is available from https://cran.r-project.org/web/packages/vcfppR.
Pages: 4