Data mining in bioinformatics using Weka

Cited by: 704
Authors
Frank, E
Hall, M
Trigg, L
Holmes, G
Witten, IH
Affiliations
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
[2] Reel Two, Hamilton, New Zealand
DOI
10.1093/bioinformatics/bth261
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The Weka machine learning workbench provides a general-purpose environment for automatic classification, regression, clustering and feature selection, all common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data pre-processing methods, complemented by graphical user interfaces for data exploration and for the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it.
Pages: 2479-2481
Page count: 3