EBIC: an open source software for high-dimensional and big data analyses

Cited by: 8
Authors
Orzechowski, Patryk [1, 2]
Moore, Jason H. [1]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] AGH Univ Sci & Technol, Dept Automat & Robot, PL-30059 Krakow, Poland
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/btz027
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: We present an open source package containing the latest release of Evolutionary-based BIClustering (EBIC), a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is full support for multiple graphics processing units (GPUs), which makes it possible to run large genomic data mining analyses efficiently. Further enhancements over the first release of the algorithm include integration with R and Bioconductor and an option to exclude missing values from the analysis.
Results: EBIC was applied to datasets of different sizes, including a large DNA methylation dataset with 436 444 rows. For the largest dataset, we observed an over 6.6-fold speedup in computation time on a cluster of eight GPUs compared with running the method on a single GPU, which demonstrates the high scalability of the method.
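To put the reported speedup in context, it can be expressed as a parallel efficiency, that is, speedup divided by the number of devices. The snippet below is only an illustrative sketch: the function name `parallel_efficiency` is hypothetical and not part of EBIC, and the only inputs taken from the abstract are the 6.6-fold speedup and the eight GPUs.

```python
def parallel_efficiency(speedup: float, n_devices: int) -> float:
    """Return speedup divided by device count (1.0 means perfectly linear scaling)."""
    if n_devices <= 0:
        raise ValueError("n_devices must be positive")
    return speedup / n_devices


# Figures from the abstract: over 6.6-fold speedup on eight GPUs versus one GPU.
# Because the abstract says "over 6.6", this value is a lower bound.
efficiency = parallel_efficiency(6.6, 8)
print(f"Parallel efficiency: at least {efficiency:.1%}")
```

Under these figures, each of the eight GPUs contributes a little over 80% of what ideal linear scaling would give.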
Pages: 3181-3183
Number of pages: 3