A parallel Expressed Sequence Tag (EST) clustering program

Authors
Pedretti, K [1]
Scheetz, T
Braun, T
Roberts, C
Robinson, N
Casavant, T
Affiliations
[1] Univ Iowa, Parallel Proc Lab, Dept Elect & Comp Engn, Iowa City, IA 52242 USA
[2] Univ Iowa, Coordinated Lab Computat Genomics, Dept Elect & Comp Engn, Iowa City, IA 52242 USA
DOI: not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
This paper describes the UIcluster software tool, which partitions Expressed Sequence Tag (EST) sequences and other genetic sequences into "clusters" based on sequence similarity. Ideally, each cluster contains sequences that all represent the same gene. A naive approach such as an N × N comparison (where N is the number of input sequences) is feasible only for very small data sets. UIcluster has been developed over the course of four years to solve this problem efficiently and accurately for large data sets of tens or hundreds of thousands of EST sequences. The latest version of the application has been parallelized using the MPI (Message Passing Interface) standard. Both the computation and memory requirements of the program can be distributed among multiple (possibly distributed) UNIX processes.
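To make the cost of the naive approach concrete, the following is a minimal sketch of all-pairs clustering. It is not the method used by the tool described in the abstract: the similarity test (a count of shared k-mers), the function names, and the parameters `k` and `min_shared` are all assumptions chosen for illustration. Sequences judged similar are merged with a union-find structure, and the function also returns the number of pairwise comparisons, which is N(N-1)/2.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similar(a, b, k, min_shared):
    """Hypothetical similarity test: a and b share at least min_shared k-mers."""
    return len(kmers(a, k) & kmers(b, k)) >= min_shared


def naive_cluster(seqs, k=8, min_shared=3):
    """Cluster sequences by comparing every pair (quadratic in len(seqs)).

    Returns (clusters, comparisons), where clusters is a sorted list of
    lists of indices into seqs.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    comparisons = 0
    for i, j in combinations(range(len(seqs)), 2):
        comparisons += 1
        if similar(seqs[i], seqs[j], k, min_shared):
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), comparisons
```

At 100,000 sequences this loop would perform roughly 5 × 10^9 comparisons, which illustrates why the abstract calls the all-pairs approach feasible only for very small data sets.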
Pages: 490-497
Number of pages: 8