EBIC: an open source software for high-dimensional and big data analyses

Cited by: 8
Authors
Orzechowski, Patryk [1, 2]
Moore, Jason H. [1]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] AGH Univ Sci & Technol, Dept Automat & Robot, PL-30059 Krakow, Poland
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/btz027
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: In this paper, we present an open source package with the latest release of Evolutionary-based BIClustering (EBIC), a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is full support for multiple graphics processing units (GPUs), which makes it possible to run large genomic data mining analyses efficiently. Enhancements over the first release of the algorithm include integration with R and Bioconductor and an option to exclude missing values from the analysis. Results: EBIC was applied to datasets of different sizes, including a large DNA methylation dataset with 436 444 rows. For the largest dataset, we observed an over 6.6-fold speedup in computation time on a cluster of eight GPUs compared to running the method on a single GPU. This demonstrates the high scalability of the method.
Pages: 3181 - 3183
Number of pages: 3
Related papers
50 in total
  • [41] On Criticality in High-Dimensional Data
    Saremi, Saeed
    Sejnowski, Terrence J.
    NEURAL COMPUTATION, 2014, 26 (07) : 1329 - 1339
  • [42] High-Dimensional Data Bootstrap
    Chernozhukov, Victor
    Chetverikov, Denis
    Kato, Kengo
    Koike, Yuta
    ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, 2023, 10 : 427 - 449
  • [43] High-dimensional data clustering
    Bouveyron, C.
    Girard, S.
    Schmid, C.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2007, 52 (01) : 502 - 519
  • [45] High-dimensional data visualization
    Tang, Lin
    NATURE METHODS, 2020, 17 (02) : 129 - 129
  • [47] Haery: A Hadoop Based Query System on Accumulative and High-Dimensional Data Model for Big Data
    Song, Jie
    He, HongYan
    Thomas, Richard
    Bao, Yubin
    Yu, Ge
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (07) : 1362 - 1377
  • [48] High-dimensional Data Cubes
    John, Sachin Basil
    Koch, Christoph
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2022, 15 (13) : 3828 - 3840
  • [49] Modeling High-Dimensional Data
    Vempala, Santosh S.
    COMMUNICATIONS OF THE ACM, 2012, 55 (02) : 112 - 112
  • [50] A telescope for high-dimensional data
    Shneiderman, B
    COMPUTING IN SCIENCE & ENGINEERING, 2006, 8 (02) : 48 - 53