Virulence factor prediction in Streptococcus pyogenes using classification and clustering based on microarray data

Cited by: 6
Authors
Lopez-Kleine, Liliana [1]
Torres-Aviles, Francisco [2]
Tejedor, Fabio H. [1]
Gordillo, Luz A. [1]
Affiliations
[1] Univ Nacl Colombia, Dept Estadist, Bogota, Colombia
[2] Univ Santiago Chile, Dept Matemat & Ciencia Computac, Santiago 8330111, Chile
Keywords
Classification; Microarray data; Protein function; Statistical genomics; Support vector machines; Virulence factor; NETWORKS;
DOI
10.1007/s00253-012-3917-3
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Useful biological information, such as gene expression data from microarrays, can be extracted from publicly available genomic data. To narrow down the large number of possible wet-lab experiments, global high-throughput data and existing knowledge should first be used to infer biological knowledge and generate biological hypotheses. Here, based on microarray data, we propose using widely adopted clustering and classification methods, implemented in freely available software, to predict whether proteins encoded by genes of the pathogen Streptococcus pyogenes participate in virulence mechanisms. Confidence in the predictions is based on the classification errors for known genes and on repeated prediction by more than three methods. Special emphasis is placed on the nonlinear kernel classification methods used. We propose a list of candidates that could be virulence factors or could participate in the virulence process of S. pyogenes. Biological validation should start from this list of candidates, as they behave similarly to known virulence factors.
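The abstract's consensus criterion (a gene is retained when more than three methods predict it) can be illustrated with a minimal sketch. This is not the authors' code: the method names, the gene labels, and the helper `consensus_candidates` are hypothetical placeholders, and the sketch covers only the vote-counting step, not the classification-error component of the confidence measure.

```python
from collections import Counter


def consensus_candidates(predictions, min_methods=4):
    """Return genes predicted by at least `min_methods` methods.

    `predictions` maps a method name to the set of genes that method
    flags as virulence-related. "More than three methods" in the
    abstract is read here as min_methods=4 (an assumption).
    """
    votes = Counter()
    for genes in predictions.values():
        votes.update(set(genes))  # each method votes at most once per gene
    return sorted(gene for gene, n in votes.items() if n >= min_methods)


# Hypothetical outputs of five methods; names and genes are illustrative only.
predictions = {
    "svm_gaussian": {"geneA", "geneB", "geneC"},
    "svm_polynomial": {"geneA", "geneB"},
    "discriminant_analysis": {"geneA", "geneC"},
    "kmeans": {"geneA", "geneB", "geneD"},
    "pam": {"geneA", "geneB"},
}

candidates = consensus_candidates(predictions)
print(candidates)
```

Raising `min_methods` makes the candidate list shorter and stricter; lowering it admits genes supported by fewer methods.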
Pages: 2091-2098
Page count: 8
Related papers
20 in total
[1]  
[Anonymous], 2011, R: A Language and Environment for Statistical Computing
[2]  
[Anonymous], 2000, Springer Series in Information Sciences
[3]   Molecular basis of group A streptococcal virulence [J].
Bisno, AL ;
Brito, MO ;
Collins, CM .
LANCET INFECTIOUS DISEASES, 2003, 3 (04) :191-200
[4]   Supervised reconstruction of biological networks with local models [J].
Bleakley, Kevin ;
Biau, Gerard ;
Vert, Jean-Philippe .
BIOINFORMATICS, 2007, 23 (13) :I57-I65
[5]  
Clarke B, 2009, SPRINGER SER STAT, P1, DOI 10.1007/978-0-387-98135-2
[6]   Inactivation of DltA Modulates Virulence Factor Expression in Streptococcus pyogenes [J].
Cox, Kathleen H. ;
Ruiz-Bustos, Eduardo ;
Courtney, Harry S. ;
Dale, James B. ;
Pence, Morgan A. ;
Nizet, Victor ;
Aziz, Ramy K. ;
Gerling, Ivan ;
Price, Susan M. ;
Hasty, David L. .
PLOS ONE, 2009, 4 (04)
[7]   REGULARIZED DISCRIMINANT-ANALYSIS [J].
FRIEDMAN, JH .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1989, 84 (405) :165-175
[8]  
Hartigan J. A., 1979, Applied Statistics, V28, P100, DOI 10.2307/2346830
[9]  
Kaufman L., 2009, Finding Groups in Data: An Introduction to Cluster Analysis
[10]   Role of bacterial peptidase F inferred by statistical analysis and further experimental validation [J].
Kleine, Liliana Lopez ;
Monnet, Veronique ;
Pechoux, Christine ;
Trubuil, Alain .
HFSP JOURNAL, 2008, 2 (01) :29-41