Design Exploration of Geometric Biclustering for Microarray Data Analysis in Data Mining

Cited by: 8
Authors
Wang, Doris Z. [1]
Cheung, Ray C. C. [1]
Yan, Hong [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Hong Kong, Peoples R China
Keywords
Geometric biclustering (GBC); microarray data; graphics processing unit (GPU); field-programmable gate array (FPGA);
DOI
10.1109/TPDS.2013.204
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202;
Abstract
Biclustering is an important data mining technique for finding similar patterns. The geometric biclustering (GBC) method reduces the complexity of the NP-complete biclustering problem. This paper studies three commonly used modern platforms, the multi-core CPU, the GPU, and the FPGA, for accelerating the GBC algorithm. By analyzing the parallelism of the GBC algorithm, we design 1) multi-threaded software running on a server-grade multi-core CPU system, 2) a CUDA program that accelerates the GBC algorithm on a GPU, and 3) a novel parameterizable and scalable hardware architecture implemented on an FPGA. Gene microarray pattern analysis is used as an example to compare performance across the platforms. In particular, we compare the speed and energy efficiency of the three proposed methods. We found that 1) the GPU achieves the highest average speedup, 48x over the single-threaded GBC program, 2) our FPGA design can achieve a higher speedup of 4x for computation on large microarrays, and 3) the FPGA consumes the least energy and is about 3.53x more energy-efficient than the single-threaded GBC program.
Pages: 2540-2550
Number of pages: 11