Predicting protein phosphorylation sites in soybean using interpretable deep tabular learning network

Cited by: 17
Authors
Khalili, Elham [1]
Ramazi, Shahin [2]
Ghanati, Faezeh [1]
Kouchaki, Samaneh [3]
Affiliations
[1] Tarbiat Modares Univ TMU, Fac Sci, Dept Plant Sci, Tehran, Iran
[2] Tarbiat Modares Univ TMU, Fac Biol Sci, Dept Biophys, Tehran, Iran
[3] Univ Surrey, Fac Engn & Phys Sci, Ctr Vis Speech & Signal Prc, Guildford, Surrey, England
Keywords
protein phosphorylation; soybean; machine learning; computational prediction; interpretable deep tabular learning network (TabNet); POSTTRANSLATIONAL MODIFICATIONS; COMPUTATIONAL PREDICTION; IN-VIVO; PLANT; PHOSPHOPROTEOMICS; CLASSIFIERS; PERFORMANCE; ROC;
DOI
10.1093/bib/bbac015
CLC number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Phosphorylation of proteins is one of the most significant post-translational modifications (PTMs) and plays a crucial role in plant functionality due to its impact on signaling, gene expression, enzyme kinetics, protein stability and interactions. Accurate prediction of plant phosphorylation sites (p-sites) is vital, as abnormal regulation of phosphorylation usually leads to plant diseases. However, current experimental methods for PTM prediction suffer from high computational cost and are error-prone. The present study develops machine learning-based prediction techniques, including a high-performance interpretable deep tabular learning network (TabNet), to improve the prediction of protein p-sites in soybean. Moreover, we use a hybrid feature set of sequence-based features, physicochemical properties and position-specific scoring matrices to predict serine (Ser/S), threonine (Thr/T) and tyrosine (Tyr/Y) p-sites in soybean for the first time. The experimentally verified p-site data of soybean proteins are collected from the Eukaryotic Phosphorylation Sites Database and the database of post-translational modifications (dbPTM). We then remove redundancy from the positive and negative sample sets by dropping protein sequences with >40% similarity. The developed techniques achieve accuracy above 70%. The results demonstrate that the TabNet model is the best-performing classifier with hybrid features and a window size of 13, achieving 78.96% sensitivity and 77.24% specificity. The results indicate that the TabNet method has advantages in terms of high performance and interpretability. The proposed technique can automatically analyze the data without any measurement errors or human intervention. Furthermore, it can be used to predict putative protein p-sites in plants effectively.
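The abstract mentions a window size of 13 around candidate Ser, Thr and Tyr residues. The following is a minimal sketch of how such fixed-length windows might be cut from a protein sequence; it is not the authors' pipeline. The function name `extract_windows`, the padding character `X`, and the choice to pad at sequence ends are all assumptions made for illustration.

```python
from typing import List, Tuple

# Residues that can carry a phosphate group in this setting (Ser, Thr, Tyr).
PHOSPHO_RESIDUES = {"S", "T", "Y"}


def extract_windows(sequence: str, window_size: int = 13,
                    pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every S/T/Y in `sequence`.

    Each window is centred on the candidate residue. Sequence ends are
    padded with `pad` (an assumed convention) so all windows have the
    same length.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the site is centred")
    half = window_size // 2
    padded = pad * half + sequence + pad * half
    windows = []
    for i, residue in enumerate(sequence):
        if residue in PHOSPHO_RESIDUES:
            # Index i in `sequence` sits at index i + half in `padded`,
            # so the window starts at index i in `padded`.
            windows.append((i + 1, padded[i:i + window_size]))
    return windows
```

Each window would then be encoded into features (for example sequence-based descriptors or physicochemical properties) before being passed to a classifier; that encoding step is not shown here.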
Pages: 21