Tree based models for classification of membrane and secreted proteins in heart

Cited by: 0
Authors
Sona Charles [1]
A. Subeesh [2]
Jeyakumar Natarajan [3]
Affiliations
[1] Bharathiar University, Data Mining and Text Mining Laboratory, Department of Bioinformatics
[2] ICAR-Indian Institute of Spices Research, Division of Crop Improvement and Biotechnology
[3] ICAR-Central Institute of Agricultural Engineering, Agricultural Mechanization Division
Keywords
Machine learning; Membrane proteins; Mutual information; Secretory proteins; Tree-based algorithms
DOI
10.1007/s42485-024-00131-1
Abstract
Computational differentiation of membrane and secreted proteins is a challenging and interesting topic in bioinformatics. Differentiating the two experimentally is laborious and time-consuming. In this study, we used tree-based classifiers, namely decision trees, random forest (RF), light gradient boosting machine, gradient boosting decision tree and extreme gradient boosting trees (XGBoost), with sequence-based descriptors, viz. amino acid composition, dipeptide composition, conjoint triads, composition/transition/distribution (CTD) and pseudo amino acid composition, in the prediction scheme to enhance the predictive power of the algorithms. RF on CTD was able to better discriminate the classes in the classification problems secreted versus non-secreted and secreted versus membrane proteins while in membrane versus non-membrane. Feature selection using mutual information considerably increased the prediction accuracy of the models. Multiclass models to distinguish membrane proteins and secreted proteins from other proteins in the heart were enhanced by the addition of protein interaction network-based features, with XGBoost showing the highest accuracy. It is expected that the models developed using tree-based algorithms will be useful for the classification and annotation of proteins with known sequences.
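Two of the ingredients named in the abstract, composition-style sequence descriptors and mutual-information feature scoring, can be illustrated with a minimal standard-library sketch. This record does not describe the authors' implementation, so everything here is an assumption made for illustration: the function names, the restriction to the 20 standard residues, and the median split used to discretize a continuous feature before scoring it.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in seq (20 values)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair in seq (400 values)."""
    seq = seq.upper()
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    keys = [a + b for a in AMINO_ACIDS for b in AMINO_ACIDS]
    total = sum(pairs[k] for k in keys)
    if total == 0:
        return [0.0] * len(keys)
    return [pairs[k] / total for k in keys]


def binarize(values):
    """Split a continuous feature at its (upper) median.

    Assumption: a median split is only one simple way to discretize;
    it is not taken from the paper.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return [int(v >= median) for v in values]


def mutual_information(feature, labels):
    """Mutual information, in bits, between a discrete feature and labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    p_feature = Counter(feature)
    p_label = Counter(labels)
    return sum(
        (c / n) * log2(c * n / (p_feature[x] * p_label[y]))
        for (x, y), c in joint.items()
    )
```

In a workflow of the kind the abstract describes, each descriptor column would be scored against the class labels (for example secreted versus non-secreted), and only the highest-scoring columns would be passed on to a tree-based classifier.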
Pages: 147-157
Page count: 10