PEAT: an intelligent and efficient paired-end sequencing adapter trimming algorithm

Cited by: 38
Authors
Li, Yun-Lung [1]
Weng, Jui-Cheng [1]
Hsiao, Chiung-Chih [1]
Chou, Min-Te [1]
Tseng, Chin-Wen [1]
Hung, Jui-Hung [1,2]
Affiliations
[1] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu, Taiwan
[2] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu, Taiwan
Keywords
INSERT SIZES; FASTQ DATA; IDENTIFICATION;
DOI
10.1186/1471-2105-16-S1-S2
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Background: In modern paired-end sequencing protocols, short DNA fragments lead to adapter-appended reads. Current paired-end adapter removal approaches trim adapters by scanning for adapter fragments at the 3' end of the reads, which is inadequate in some applications. Results: Here, we propose a fast and highly accurate adapter-trimming algorithm, PEAT, designed specifically for paired-end sequencing. PEAT requires no a priori adapter sequence, which is convenient for large-scale meta-analyses. We compared the performance of PEAT with that of many adapter trimmers on both simulated and real-life paired-end sequencing libraries. The importance of adapter trimming is exemplified by its influence on downstream analyses of RNA-seq, ChIP-seq and MNase-seq data. Several useful guidelines for applying adapter trimmers with aligners are suggested. Conclusions: PEAT can be easily included in a routine paired-end sequencing pipeline. The executable binaries and the standalone C++ source code package of PEAT are freely available online.
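The abstract does not describe how PEAT works internally, so the following is only a minimal sketch of the general idea behind adapter-free paired-end trimming, not PEAT's actual algorithm. It assumes that when the insert is shorter than the read length, the first n bases of read 1 equal the last n bases of the reverse complement of read 2, where n is the insert length; everything past n in either read is adapter. The function names, the `min_insert` and `max_mismatch_rate` parameters, and their defaults are hypothetical.

```python
# Complement table; input is assumed to be uppercase A/C/G/T/N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_pair(read1, read2, min_insert=10, max_mismatch_rate=0.1):
    """Trim adapter read-through from a read pair without knowing the adapter.

    Tries candidate insert lengths from longest to shortest. For a candidate
    length n, the first n bases of read1 are compared with the last n bases
    of the reverse complement of read2. The first candidate whose mismatch
    count is within the allowed rate is taken as the insert length, and both
    reads are cut to n bases. If no candidate qualifies, the reads are
    returned unchanged.
    """
    rc2 = reverse_complement(read2)
    longest = min(len(read1), len(read2))
    for n in range(longest, min_insert - 1, -1):
        prefix = read1[:n]
        suffix = rc2[len(rc2) - n:]
        mismatches = sum(a != b for a, b in zip(prefix, suffix))
        if mismatches <= max_mismatch_rate * n:
            return read1[:n], read2[:n]
    return read1, read2


if __name__ == "__main__":
    # Toy pair: a 20-nt insert followed by made-up adapter sequences.
    insert = "ACGTTGCAAGGCTTAGCCTA"
    r1 = insert + "AGATCGGAAG"
    r2 = reverse_complement(insert) + "CTGTCTCTTA"
    print(trim_pair(r1, r2, max_mismatch_rate=0.0))
```

Because the comparison uses only the two reads, no adapter sequence has to be supplied. A production trimmer would also need to handle base qualities and guard against spurious short overlaps, which this sketch addresses only through the `min_insert` cutoff.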
Pages: 11