Protein Sequence Comparison Based on Physicochemical Properties and the Position-Feature Energy Matrix

Cited by: 30
Authors
Yu, Lulu [1]
Zhang, Yusen [2]
Gutman, Ivan [2]
Shi, Yongtang [3,4]
Dehmer, Matthias [5,6]
Affiliations
[1] Shandong Univ Weihai, Sch Math & Stat, Weihai 264209, Peoples R China
[2] Univ Kragujevac, Fac Sci, POB 60, Kragujevac 34000, Serbia
[3] Nankai Univ, Ctr Combinator, Tianjin 300071, Peoples R China
[4] Nankai Univ, LPMC, Tianjin 300071, Peoples R China
[5] UMIT, Dept Mechatron & Biomed Comp Sci, Hall In Tirol, Austria
[6] Nankai Univ, Coll Comp & Control Engn, Tianjin 300071, Peoples R China
Funding
Austrian Science Fund; National Natural Science Foundation of China;
Keywords
GRAPHICAL REPRESENTATION; ANTIFREEZE PROTEIN; ALIGNMENT; SIMILARITY/DISSIMILARITY; PREDICTION; MAP;
DOI
10.1038/srep46237
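Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract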
We develop a novel position-feature-based model for protein sequences by employing the physicochemical properties of the 20 amino acids and the measure of graph energy. The method emphasizes sequence-order information and describes the local dynamic distributions of sequences, from which one obtains a characteristic B-vector. We then apply relative entropy to the B-vectors representing the sequences to measure their similarity/dissimilarity. The numerical results obtained in this study show that the proposed method leads to meaningful results compared with competitors such as Clustal W.
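The comparison step named in the abstract, relative entropy between feature vectors, can be illustrated with a minimal sketch. This record does not describe how the B-vector is built, so the input vectors below are hypothetical placeholders. The normalization to unit sum and the symmetrized form are assumptions made for illustration, not details confirmed by the abstract.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Both vectors are first normalized to sum to 1 (an assumption; the
    abstract does not say how B-vectors are scaled).
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    if any(x <= 0 for x in p) or any(x <= 0 for x in q):
        raise ValueError("all entries must be strictly positive")
    sp, sq = sum(p), sum(q)
    return sum((a / sp) * math.log2((a / sp) / (b / sq)) for a, b in zip(p, q))


def symmetric_dissimilarity(p, q):
    """Symmetrized relative entropy, D(p || q) + D(q || p).

    The symmetrization is an illustrative choice, not taken from the abstract.
    """
    return relative_entropy(p, q) + relative_entropy(q, p)


# Hypothetical feature vectors standing in for the B-vectors of two sequences.
b_first = [0.8, 1.2, 2.5, 0.5]
b_second = [1.0, 1.0, 2.0, 1.0]
dissimilarity = symmetric_dissimilarity(b_first, b_second)
```

A value of zero means the two normalized vectors are identical; larger values indicate more dissimilar sequences under this measure.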
Pages: 8