Effective DNA Encoding for Splice Site Prediction Using SVM

Cited by: 1
Authors
Bari, A. T. M. Golam [1]
Reaz, M. Rokeya [1]
Jeong, Byeong-Soo [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Yongin 446701, Gyeonggi Do, South Korea
Keywords
SUPPORT VECTOR MACHINES; PRE-MESSENGER-RNA; GRAPHICAL REPRESENTATION; SEQUENCES;
DOI
Not available
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Splice site prediction in pre-mRNA is an important task for understanding gene structure and function. SVM (support vector machine)-based classification is frequently used to predict splice sites because of its classification accuracy. The performance of an SVM depends largely on the DNA encoding method. However, existing encoding approaches do not capture the characteristics of DNA sequences well enough to convey all the information the sequences contain. In this paper, we propose a new, effective DNA encoding method for feature extraction that conveys more information about a DNA sequence. Our encoding method provides density information for each nucleotide along with positional information and chemical properties. An extensive performance study shows that the proposed method outperforms existing encoding methods on several performance criteria, including classification accuracy, sensitivity, specificity, and auROC (area under the receiver operating characteristic curve).
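The abstract names three ingredients of the encoding (chemical property, position, and nucleotide density) but does not define them. The sketch below is one plausible reading and not the authors' actual scheme. The three chemical-property bits (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding), the running-frequency definition of density, and the names CHEMICAL_BITS and encode are all assumptions made for illustration.

```python
# Hypothetical sketch: the exact encoding is NOT given in the abstract.
# Assumed chemical-property bits per nucleotide:
#   (purine=1 / pyrimidine=0, amino=1 / keto=0, weak H-bond=1 / strong=0)
CHEMICAL_BITS = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}


def encode(seq):
    """Return a flat feature list with four values per position.

    Position i (1-based) contributes its three chemical bits plus an
    assumed density: occurrences of that nucleotide in seq[:i], divided by i.
    Positional information is carried by the order of the feature slots.
    """
    counts = {}
    features = []
    for i, nt in enumerate(seq.upper(), start=1):
        if nt not in CHEMICAL_BITS:
            raise ValueError(f"unsupported nucleotide {nt!r} at position {i}")
        counts[nt] = counts.get(nt, 0) + 1
        features.extend([*CHEMICAL_BITS[nt], counts[nt] / i])
    return features


if __name__ == "__main__":
    print(encode("AAC"))
```

A vector of this kind could be passed as one input row to an SVM library such as LIBSVM, with a fixed-length window around each candidate splice site so that every row has the same number of features.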
Pages: 241-258
Page count: 18