Porechop_ABI: discovering unknown adapters in Oxford Nanopore Technology sequencing reads for downstream trimming

Cited by: 62
Authors
Bonenfant, Quentin [1]
Noe, Laurent [1]
Touzet, Helene [1]
Affiliations
[1] Univ Lille, CRIStAL Ctr Rech Informat Signal & Automatique Lil, CNRS, Cent Lille,UMR 9189, F-59000 Lille, France
Source
BIOINFORMATICS ADVANCES | 2023 / Vol. 3 / Issue 01
Keywords
TRANSCRIPTION FACTOR; GENE-EXPRESSION; GENOME; MICROGLIA;
DOI
10.1093/bioadv/vbac085
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Motivation: Oxford Nanopore Technologies (ONT) sequencing has become very popular over the past few years and offers a cost-effective solution for many genomic and transcriptomic projects. One distinctive feature of the technology is that the protocol includes the ligation of adapters to both ends of each fragment. These adapters must be removed before downstream analyses, either during the basecalling step or by explicit trimming. This basic task can be tricky when the adapter sequence is not well documented.
Results: We have developed a new method that scans a set of ONT reads to determine whether it contains adapters, without any prior knowledge of the sequence of the potential adapters, and then trims those adapters. The algorithm is based on approximate k-mers and is able to discover adapter sequences based on their frequency alone. The method was successfully tested on a variety of ONT datasets with different flowcells, sequencing kits and basecallers.
Availability and implementation: The resulting software, named Porechop_ABI, is open-source and is available at https://github.com/bonsai-team/Porechop_ABI.
Supplementary information: Supplementary information is available at Bioinformatics Advances online.
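The frequency-based idea in the Results can be illustrated with a minimal sketch. This is not the Porechop_ABI implementation: it swaps the approximate k-mers described in the abstract for plain exact k-mers, and the function name `top_start_kmers`, the parameters `k`, `window` and `top`, and the toy reads are all hypothetical choices made for illustration.

```python
from collections import Counter


def top_start_kmers(reads, k=8, window=40, top=3):
    """Return the most frequent exact k-mers found near the start of the reads.

    Simplifying assumption: exact k-mers are counted, whereas the abstract
    describes approximate k-mers. All parameter names are illustrative only.
    """
    counts = Counter()
    for read in reads:
        prefix = read[:window]
        # Count each k-mer at most once per read, so that a repeat inside
        # a single read cannot dominate the tally.
        seen = {prefix[i:i + k] for i in range(len(prefix) - k + 1)}
        counts.update(seen)
    return counts.most_common(top)


if __name__ == "__main__":
    # Toy reads: a made-up 8-base "adapter" followed by different inserts.
    adapter = "GGCCTTAA"
    reads = [adapter + insert for insert in ("ACGA", "CATG", "TGAC")]
    print(top_start_kmers(reads, k=8, window=12, top=1))
```

A k-mer shared by the start of most reads is a candidate adapter fragment, because genuine biological inserts rarely begin with the same sequence. The same count applied to the last `window` bases of each read would look for end adapters.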
Pages: 4