The greedy path-merging algorithm for Contig Scaffolding

被引:56
作者
Huson, DH
Reinert, K
Myers, EW
机构
[1] Univ Tubingen, Inst Comp Sci, D-72076 Tubingen, Germany
[2] Free Univ Berlin, Inst Comp Sci, D-14195 Berlin, Germany
[3] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA
关键词
algorithms; experimentation; genome assembly;
D O I
10.1145/585265.585267
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Given a collection of contigs and mate-pairs. The Contig Scaffolding Problem is to order and orientate the given contigs in a manner that is consistent with as many mate-pairs as possible. This paper describes an efficient heuristic called the greedy-path merging algorithm for solving this problem. The method was originally developed as a key component of the compartmentalized assembly strategy developed at Celera Genomics. This interim approach was used at an early stage of the sequencing of the human genome to produce a preliminary assembly based on preliminary whole genome shotgun data produced at Celera and preliminary human contigs produced by the Human Genome Project.
引用
收藏
页码:603 / 615
页数:13
相关论文
共 14 条
[1]  
[Anonymous], 1979, Computers and Intractablity: A Guide to the Theoryof NP-Completeness
[2]  
[Anonymous], BIOINFORMATICS
[3]  
[Anonymous], P 5 ANN INT C COMP M
[4]   GenBank [J].
Benson, DA ;
Karsch-Mizrachi, I ;
Lipman, DJ ;
Ostell, J ;
Rapp, BA ;
Wheeler, DL .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :15-18
[5]  
Bevington R., 1969, DATA REDUCTION ERROR
[6]  
Green P, 1994, DOCUMENTATION PHRAP
[7]  
LANDER E S, 1988, Genomics, V2, P231
[8]   Initial sequencing and analysis of the human genome [J].
Lander, ES ;
Int Human Genome Sequencing Consortium ;
Linton, LM ;
Birren, B ;
Nusbaum, C ;
Zody, MC ;
Baldwin, J ;
Devon, K ;
Dewar, K ;
Doyle, M ;
FitzHugh, W ;
Funke, R ;
Gage, D ;
Harris, K ;
Heaford, A ;
Howland, J ;
Kann, L ;
Lehoczky, J ;
LeVine, R ;
McEwan, P ;
McKernan, K ;
Meldrim, J ;
Mesirov, JP ;
Miranda, C ;
Morris, W ;
Naylor, J ;
Raymond, C ;
Rosetti, M ;
Santos, R ;
Sheridan, A ;
Sougnez, C ;
Stange-Thomann, N ;
Stojanovic, N ;
Subramanian, A ;
Wyman, D ;
Rogers, J ;
Sulston, J ;
Ainscough, R ;
Beck, S ;
Bentley, D ;
Burton, J ;
Clee, C ;
Carter, N ;
Coulson, A ;
Deadman, R ;
Deloukas, P ;
Dunham, A ;
Dunham, I ;
Durbin, R ;
French, L .
NATURE, 2001, 409 (6822) :860-921
[9]   A whole-genome assembly of Drosophila [J].
Myers, EW ;
Sutton, GG ;
Delcher, AL ;
Dew, IM ;
Fasulo, DP ;
Flanigan, MJ ;
Kravitz, SA ;
Mobarry, CM ;
Reinert, KHJ ;
Remington, KA ;
Anson, EL ;
Bolanos, RA ;
Chou, HH ;
Jordan, CM ;
Halpern, AL ;
Lonardi, S ;
Beasley, EM ;
Brandon, RC ;
Chen, L ;
Dunn, PJ ;
Lai, ZW ;
Liang, Y ;
Nusskern, DR ;
Zhan, M ;
Zhang, Q ;
Zheng, XQ ;
Rubin, GM ;
Adams, MD ;
Venter, JC .
SCIENCE, 2000, 287 (5461) :2196-2204
[10]   NUCLEOTIDE-SEQUENCE OF BACTERIOPHAGE-GAMMA DNA [J].
SANGER, F ;
COULSON, AR ;
HONG, GF ;
HILL, DF ;
PETERSEN, GB .
JOURNAL OF MOLECULAR BIOLOGY, 1982, 162 (04) :729-773