Identifying Antifreeze Proteins Based on Key Evolutionary Information

Cited by: 9
Authors
Sun, Shanwen [1]
Ding, Hui [2]
Wang, Donghua [3]
Han, Shuguang [2]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Ctr Informat Biol, Chengdu, Peoples R China
[3] Heilongjiang Prov Land Reclamat Headquarters Gen, Dept Gen Surg, Harbin, Peoples R China
Source
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY | 2020 / Vol. 8
Keywords
antifreeze proteins; support vector machine; evolution; machine learning; position-specific scoring matrix; EXTRACTION; MACHINE; LARVAE
DOI
10.3389/fbioe.2020.00244
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Antifreeze proteins are important antifreeze materials that have been widely used in industry, including in cryopreservation, de-icing, and food storage applications. However, the quantity of some commercially produced antifreeze proteins is insufficient for large-scale industrial applications. Further, many antifreeze proteins have properties, such as cytotoxicity, that severely hinder their application. Understanding the mechanisms underlying protein-ice interactions and identifying novel antifreeze proteins are, therefore, urgently needed. In this study, to uncover these mechanisms and to provide an efficient and accurate tool for identifying antifreeze proteins, we assessed various evolutionary features based on position-specific scoring matrices (PSSMs) and evaluated their importance for discriminating between antifreeze and non-antifreeze proteins. We then parsimoniously selected the seven key features with the highest importance. The selected features showed opposite tendencies (regarding the conservation of certain amino acids) between antifreeze and non-antifreeze proteins. A principal component analysis revealed that five of the seven features contributed relatively strongly to the discrimination of antifreeze and non-antifreeze proteins, namely the conservation of the replacement of Cys, Trp, and Gly in antifreeze proteins by Ala, Met, and Ala, respectively, in the related proteins, and the conservation of the replacement of Arg in non-antifreeze proteins by Ser and Arg in the related proteins. Based on the seven selected key features, we established a classifier using a support vector machine, which outperformed the state-of-the-art tools. These results suggest that understanding evolutionary information is crucial to designing accurate automated methods for discriminating between antifreeze and non-antifreeze proteins. Our classifier, therefore, is an efficient tool for annotating new proteins with antifreeze functions based on sequence information and can facilitate their application in industry.
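The abstract describes PSSM-based features of the form "replacement of residue X by residue Y" but does not give the exact formula. The following Python sketch shows one plausible way such features could be computed; it assumes that each feature is the mean PSSM score for substitute Y over all positions where the query residue is X. The function names, the averaging rule, and the use of 0.0 for absent residues are assumptions made for illustration, not the authors' implementation. Only the five residue pairs named in the abstract are listed; the remaining two selected features are not identified there.

```python
# Illustrative sketch only. The averaging rule below is an assumption;
# the abstract does not specify how the PSSM features were computed.

# Column order used in NCBI PSI-BLAST ASCII PSSM files.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# The five (query residue, substitute residue) pairs named in the abstract.
NAMED_PAIRS = [("C", "A"), ("W", "M"), ("G", "A"), ("R", "S"), ("R", "R")]


def replacement_features(sequence, pssm):
    """Map each (query, substitute) pair to its mean PSSM score.

    sequence: protein sequence of length L (one-letter codes)
    pssm: list of L rows, each holding 20 scores in AMINO_ACIDS order
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    sums, counts = {}, {}
    for residue, row in zip(sequence, pssm):
        if residue not in AMINO_ACIDS:
            continue  # skip non-standard residues such as X or B
        if len(row) != len(AMINO_ACIDS):
            raise ValueError("each PSSM row must contain 20 scores")
        counts[residue] = counts.get(residue, 0) + 1
        for substitute, score in zip(AMINO_ACIDS, row):
            key = (residue, substitute)
            sums[key] = sums.get(key, 0.0) + score
    # Average over the number of positions occupied by the query residue.
    return {key: total / counts[key[0]] for key, total in sums.items()}


def select_features(features, pairs=NAMED_PAIRS):
    """Return the chosen features as a vector; absent pairs become 0.0."""
    return [features.get(pair, 0.0) for pair in pairs]
```

A vector built this way for each protein could then be passed to any support vector machine implementation for training; the choice of kernel and hyperparameters is not stated in the abstract and is left open here.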
Pages: 8