EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates

Cited by: 892
Authors
Vilella, Albert J. [1 ]
Severin, Jessica [1 ]
Ureta-Vidal, Abel [1 ]
Li, Heng [2 ]
Durbin, Richard [2 ]
Birney, Ewan [1 ]
Affiliations
[1] EMBL EBI, Cambridge CB10 1SD, England
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1HH, England
Funding
Wellcome Trust (UK)
Keywords
MAXIMUM-LIKELIHOOD; GENOME SEQUENCE; DATABASE; EVOLUTION; INSIGHTS; ALGORITHM; ORTHOLOGS; FAMILIES; PARALOGS;
DOI
10.1101/gr.073585.107
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We have developed a comprehensive gene-oriented phylogenetic resource, EnsemblCompara GeneTrees, based on a computational pipeline that handles clustering, multiple alignment, and tree generation, including the handling of large gene families. We developed two novel non-sequence-based metrics of gene tree correctness and benchmarked a number of tree methods. The TreeBeST method from TreeFam shows the best performance in our hands. We also compared this phylogenetic approach to clustering approaches for ortholog prediction, showing a large increase in coverage using the phylogenetic approach. All data are made available in a number of formats and will be kept up to date with the Ensembl project.
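
To illustrate what ortholog prediction from a duplication-aware gene tree can look like, the following is a minimal Python sketch, not the EnsemblCompara or TreeBeST implementation. It assumes a simple species-overlap heuristic (an internal node is called a duplication when its child subtrees share a species) in place of full reconciliation against a species tree. The `Node` class, the function names, and the gene and species labels are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations, product
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Gene-tree node: leaves carry a gene and species, internal nodes carry children."""
    gene: Optional[str] = None
    species: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[Node]:
    """Return all leaf nodes below (or equal to) the given node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def event(node: Node) -> str:
    """Label an internal node by species overlap (assumed heuristic, not reconciliation)."""
    child_species = [{leaf.species for leaf in leaves(c)} for c in node.children]
    overlap = any(a & b for a, b in combinations(child_species, 2))
    return "duplication" if overlap else "speciation"


def homolog_pairs(root: Node) -> Dict[Tuple[str, str], str]:
    """Classify each gene pair by the event at the pair's last common ancestor."""
    pairs: Dict[Tuple[str, str], str] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children:
            continue
        relation = "paralog" if event(node) == "duplication" else "ortholog"
        # Genes drawn from different child subtrees have this node as their ancestor.
        for left, right in combinations(node.children, 2):
            for a, b in product(leaves(left), leaves(right)):
                pairs[tuple(sorted((a.gene, b.gene)))] = relation
        stack.extend(node.children)
    return pairs


def example_tree() -> Node:
    """Hypothetical family: one duplication followed by a human/mouse speciation in each copy."""
    clade_a = Node(children=[Node("HUMAN_A", "human"), Node("MOUSE_A", "mouse")])
    clade_b = Node(children=[Node("HUMAN_B", "human"), Node("MOUSE_B", "mouse")])
    return Node(children=[clade_a, clade_b])


if __name__ == "__main__":
    for pair, relation in sorted(homolog_pairs(example_tree()).items()):
        print(pair, relation)
```

In this toy family, the pairs within each clade are separated by a speciation and are called orthologs, while every pair that spans the root duplication is called a paralog. Because the tree yields a label for every pair of genes in the family, it hints at why a tree-based approach might report more relationships than a pairwise clustering method.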
Pages: 327-335
Page count: 9