Accurate long-read de novo assembly evaluation with Inspector

Cited by: 88
Authors
Chen, Yu [1, 2]
Zhang, Yixin [3]
Wang, Amy Y. [2, 4]
Gao, Min [2, 5]
Chong, Zechen [1, 2]
Affiliations
[1] Univ Alabama Birmingham, Heersink Sch Med, Dept Genet, Birmingham, AL 35294 USA
[2] Univ Alabama Birmingham, Heersink Sch Med, Informat Inst, Birmingham, AL 35294 USA
[3] Univ Alabama Birmingham, Coll Arts & Sci, Dept Comp Sci, Birmingham, AL 35294 USA
[4] Univ Alabama Birmingham, Heersink Sch Med, Dept Med, Div Gen Internal Med, Birmingham, AL 35294 USA
[5] Univ Alabama Birmingham, Heersink Sch Med, Dept Med, Div Cardiovasc Dis, Birmingham, AL 35233 USA
Keywords
De novo assembly; Long reads; Assembly evaluation; Assembly error; Genome assembly; Genome assemblies; Sequence
DOI
10.1186/s13059-021-02527-4
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Long-read de novo genome assembly continues to advance rapidly. However, there is a lack of effective tools to accurately evaluate the assembly results, especially for structural errors. We present Inspector, a reference-free long-read de novo assembly evaluator which faithfully reports types of errors and their precise locations. Notably, Inspector can correct the assembly errors based on consensus sequences derived from raw reads covering erroneous regions. Based on in silico and long-read assembly results from multiple long-read data and assemblers, we demonstrate that in addition to providing generic metrics, Inspector can accurately identify both large-scale and small-scale assembly errors.
Pages: 21