Predicting single nucleotide polymorphisms (SNP) from DNA sequence by support vector machine

Cited by: 8
Authors
Kong, Waiming [1]
Choo, Keng Wah [1]
Affiliations
[1] Nanyang Polytech, Bioinformat Grp, Singapore 569830, Singapore
Source
FRONTIERS IN BIOSCIENCE-LANDMARK | 2007 / Vol. 12
Keywords
SVM; SNP; SNP prediction;
DOI
10.2741/2173
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Recently, SNPs have gained substantial attention as genetic markers and are recognized as a key element in the development of personalized medicine. Computational prediction of SNPs can guide SNP discovery and so reduce the cost and time needed for the development of personalized medicine. We have developed a method for SNP prediction based on support vector machines (SVMs) using different features extracted from the SNP data. Prediction rates of 60.9% were achieved with the sequence feature, 59.1% with the free-energy feature, 58.1% with the GC content feature, 58.0% with the melting temperature feature, 56.2% with the enthalpy feature, 55.1% with the entropy feature and 54.3% with the gene, exon and intron feature. We introduced a new feature, the SNP distribution score, which achieved a prediction rate of 77.3%. Thus, the proposed SNP prediction algorithm can be used in SNP discovery.
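As an illustration of what "features extracted from the SNP data" could look like in practice, the following is a minimal sketch of two of the simpler features named in the abstract: a sequence feature and a GC content feature, computed over a DNA window around a candidate position. The one-hot encoding, the window handling, and the function names (`gc_content`, `one_hot`, `feature_vector`) are assumptions made for this example; the record does not describe the authors' actual encoding, and the SVM training step is omitted.

```python
from typing import List

# Hypothetical one-hot encoding of bases; unknown symbols (e.g. N) map to all zeros.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def gc_content(window: str) -> float:
    """Return the fraction of G or C bases in a DNA window (case-insensitive)."""
    if not window:
        raise ValueError("window must not be empty")
    window = window.upper()
    return sum(base in "GC" for base in window) / len(window)


def one_hot(window: str) -> List[int]:
    """Encode a DNA window as a flat one-hot vector, four values per base."""
    vector: List[int] = []
    for base in window.upper():
        vector.extend(ONE_HOT.get(base, (0, 0, 0, 0)))
    return vector


def feature_vector(window: str) -> List[float]:
    """Concatenate the one-hot sequence feature with the GC content feature."""
    return [float(x) for x in one_hot(window)] + [gc_content(window)]


# Example: a 6-base window yields 6 * 4 one-hot values plus 1 GC value.
features = feature_vector("ACGTGC")
print(len(features))
```

Vectors of this kind, labeled as SNP or non-SNP sites, could then be passed to any SVM implementation; the choice of kernel and window length would need to be tuned and is not specified in this record.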
Pages: 1610-1614
Number of pages: 5