Motivation: We present a new feature of the MAFFT multiple alignment program for suppressing over-alignment (the alignment of unrelated segments). Conventional MAFFT is highly sensitive in aligning conserved regions in remote homologs, but the risk of over-alignment has recently grown as low-quality or noisy sequences accumulate in protein sequence databases, owing, for example, to sequencing errors and the difficulty of gene prediction.

Results: The proposed method uses a variable scoring matrix for different pairs of sequences (or groups) within a single multiple sequence alignment, based on the global similarity of each pair. This method significantly increases the number of correctly gapped sites in real examples and in simulations under various conditions. Regarding sensitivity, the effect of the proposed method is slightly negative in real protein-based benchmarks and mostly neutral in simulation-based benchmarks. The approach is based on natural biological reasoning and should be compatible with many dynamic-programming-based methods for multiple sequence alignment.
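The idea of choosing a scoring matrix per pair from that pair's global similarity can be illustrated with a minimal sketch. This is not MAFFT's implementation: the function names (`global_identity`, `choose_matrix`, `matrices_for_pairs`), the identity thresholds, and the mapping to named BLOSUM matrices are all assumptions made for illustration, and the similarity measure is a crude ungapped identity rather than whatever the program actually computes.

```python
from itertools import combinations


def global_identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shorter sequence.

    A crude, ungapped stand-in for "global similarity" (assumption).
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def choose_matrix(identity: float) -> str:
    """Map a similarity value to a matrix name.

    The thresholds are hypothetical; the only point is that the
    matrix depends on the pair, not on the alignment as a whole.
    """
    if identity >= 0.8:
        return "BLOSUM80"
    if identity >= 0.4:
        return "BLOSUM62"
    return "BLOSUM45"


def matrices_for_pairs(seqs: dict[str, str]) -> dict[tuple[str, str], str]:
    """Assign one matrix name to every unordered pair of sequences."""
    return {
        (p, q): choose_matrix(global_identity(seqs[p], seqs[q]))
        for p, q in combinations(sorted(seqs), 2)
    }


if __name__ == "__main__":
    example = {
        "a": "ACDEFGHIKL",
        "b": "ACDEFGHIKV",
        "c": "WWWWWWWWWW",
    }
    for pair, matrix in matrices_for_pairs(example).items():
        print(pair, matrix)
```

In a progressive aligner, a table like this would be consulted each time two sequences or groups are aligned by dynamic programming, so that a single multiple alignment uses several matrices rather than one.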