Tabix: fast retrieval of sequence features from generic TAB-delimited files

Cited by: 358
Authors
Li, Heng [1]
Affiliations
[1] Broad Inst Harvard & MIT, Program Med Populat Genet, Cambridge, MA 02142 USA
Keywords
GENOME BROWSER; DATABASE; UCSC
DOI
10.1093/bioinformatics/btq671
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Tabix is the first generic tool that indexes position-sorted files in TAB-delimited formats such as GFF, BED, PSL, SAM and SQL export, and quickly retrieves features overlapping specified regions. Tabix features include few seek function calls per query, data compression with gzip compatibility and direct FTP/HTTP access. Tabix is implemented as a free command-line tool as well as a library in C, Java, Perl and Python. It is particularly useful for manually examining local genomic features on the command line and enables genome viewers to support huge data files and remote custom tracks over networks.
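The idea of indexing a position-sorted file so that a region query touches only a few records can be illustrated with a small in-memory sketch. This is not Tabix's implementation or its on-disk index format: it works on plain BED-like lines (0-based, half-open coordinates) rather than a compressed file, and the names `WINDOW`, `build_index` and `query` are hypothetical, chosen for illustration only.

```python
from collections import defaultdict

WINDOW = 16384  # window size used by this sketch (an assumption)


def build_index(lines):
    """Index position-sorted, TAB-delimited lines: chrom, start, end, ...

    Returns (recs, linear) where recs[chrom] is the list of records in file
    order and linear[chrom][w] is the index of the first record from which a
    scan for window number w has to begin.
    """
    recs = defaultdict(list)
    linear = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\n")
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        idx = len(recs[chrom])
        recs[chrom].append((start, end, line))
        # Because input is sorted by start, windows are filled left to right;
        # every window not yet covered gets the current record as entry point.
        last_window = max(start, end - 1) // WINDOW
        while len(linear[chrom]) <= last_window:
            linear[chrom].append(idx)
    return recs, linear


def query(index, chrom, beg, end):
    """Return the lines whose interval overlaps [beg, end) on chrom."""
    recs, linear = index
    chrom_recs = recs.get(chrom, [])
    windows = linear.get(chrom, [])
    w = beg // WINDOW
    if w >= len(windows):
        return []  # no record reaches this far along the chromosome
    hits = []
    for start, stop, line in chrom_recs[windows[w]:]:
        if start >= end:
            break  # sorted input: no later record can overlap
        if stop > beg:
            hits.append(line)
    return hits


example = [
    "chr1\t100\t200\ta",
    "chr1\t150\t20000\tb",
    "chr1\t30000\t30100\tc",
    "chr2\t5\t10\td",
]
index = build_index(example)
print(query(index, "chr1", 29000, 31000))
```

The per-window entry point lets a query skip records that end before the queried window, and the sort order lets the scan stop at the first record starting past the region, so only a short run of records is examined.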
Pages: 718-719
Page count: 2