BLKnn: A K-Nearest Neighbors Method For Predicting Bioluminescent Proteins

Times cited: 0
Author
Hu, Jing [1]
Affiliation
[1] Franklin & Marshall Coll, Dept Math & Comp Sci, Lancaster, PA 17604 USA
Source
2014 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY | 2014
Keywords
K-nearest neighbors method; bit-score weighted Euclidean distance; pseudo-amino acid composition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bioluminescence is a chemical process in which a living organism produces and emits light. Recent biotechnological applications include the use of bioluminescent proteins in gene expression analysis, bioluminescent imaging, the study of protein-protein interactions and disease progression, drug discovery, and toxicity determination. Accurate and efficient identification of bioluminescent proteins is therefore of great medical and commercial significance. In this study, we present BLKnn, a K-nearest neighbors method for predicting bioluminescent proteins. The method is based on the bit-score weighted Euclidean distance, which is calculated from the compositions of selected amino acids and pseudo-amino acids. On a balanced training dataset, BLKnn achieved 74.9% sensitivity, 95.5% specificity, 85.2% accuracy, and 0.919 AUC (area under the ROC curve) in 10-fold cross-validation. When tested on a much larger independent test dataset, the method performed consistently, with 88.0% overall accuracy and 0.989 AUC. Comparisons showed that BLKnn outperformed previously published methods. The method is available at https://edisk.fandm.edu/jing.hu/blknn/blknn.html.
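The general idea of a composition-based K-nearest neighbors classifier can be illustrated with a minimal sketch. This is not the BLKnn implementation: the abstract does not define the bit-score weighting, the feature selection, or the pseudo-amino acid terms, so the sketch assumes plain (unweighted) Euclidean distance over the 20 standard amino acid fractions. The function names, the toy sequences, and the labels are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(u, v):
    """Plain Euclidean distance (assumption: no bit-score weighting)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_predict(query, training, k=3):
    """Majority label among the k training sequences nearest to query.

    training is a list of (sequence, label) pairs.
    """
    q = composition(query)
    ranked = sorted(training, key=lambda item: euclidean(q, composition(item[0])))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only (not real proteins).
training = [
    ("AAAAAG", "pos"), ("AAAGAA", "pos"), ("AAGGAA", "pos"),
    ("WWWWWY", "neg"), ("WWYWWW", "neg"), ("YYWWWW", "neg"),
]
print(knn_predict("AAAAGA", training, k=3))
```

A separate consistency check on the reported figures: on a balanced dataset, accuracy is the mean of sensitivity and specificity, and (74.9 + 95.5) / 2 = 85.2, which matches the reported cross-validation accuracy.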
Pages: 6