Protein Secondary Structural Class Prediction using Effective Feature Modeling and Machine Learning Techniques

Cited by: 8
Authors
Bankapur, Sanjay [1]
Patil, Nagamma [1]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Informat Technol, Mangalore, India
Source
PROCEEDINGS 2018 IEEE 18TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE) | 2018
Keywords
amino acid sequence; bi-gram; character embedding; machine learning; protein secondary structural sequence; skip-gram; AMINO-ACID-COMPOSITION
DOI
10.1109/BIBE.2018.00012
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Protein Secondary Structural Class (PSSC) prediction is an important step toward identifying a protein's folds, tertiary structure, and functions, which in turn have potential applications in drug discovery. Various computational methods have been developed to predict the PSSC; however, predicting it on the basis of protein sequences remains a challenging task. In this study, we propose an approach that extracts features using two techniques: (i) SkipXGram bi-gram, in which skipped bi-gram features are extracted, and (ii) character-embedded features, in which features are extracted using a word-embedding approach. The combined feature sets from the proposed feature modeling approach are explored using various machine learning classifiers. The best-performing classifier (Random Forest) is benchmarked against state-of-the-art PSSC prediction models. The proposed model was assessed on two low-sequence-similarity benchmark datasets, 25PDB and FC699. The performance analysis demonstrates that the proposed model consistently outperformed state-of-the-art models, by 3% to 23% on 25PDB and by 4% to 6% on FC699.
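The skipped bi-gram idea named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact SkipXGram definition, so the code below assumes that a skip-k bi-gram is a pair of residues separated by k intervening positions, counted over the 20 standard amino acids and normalized by the number of valid pairs. The names `skip_bigram_features`, `combined_features`, and `max_skip` are hypothetical and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def skip_bigram_features(sequence, skip=0):
    """Return normalized counts of residue pairs with `skip` residues between them.

    skip=0 gives ordinary adjacent bi-grams. This is an assumed definition,
    not necessarily the paper's exact SkipXGram formulation.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    gap = skip + 1
    total = 0
    for i in range(len(sequence) - gap):
        pair = sequence[i] + sequence[i + gap]
        if pair in counts:  # ignore non-standard residues such as 'X'
            counts[pair] += 1
            total += 1
    # Fixed-length vector of 400 frequencies, in a stable order.
    return [counts[p] / total if total else 0.0 for p in pairs]


def combined_features(sequence, max_skip=2):
    """Concatenate the skip-0 .. skip-max_skip vectors into one feature vector."""
    features = []
    for k in range(max_skip + 1):
        features.extend(skip_bigram_features(sequence, skip=k))
    return features
```

Under these assumptions, every sequence maps to a vector of 400 x (max_skip + 1) values regardless of its length, which is the kind of fixed-size input a classifier such as Random Forest requires. The character-embedding features mentioned in the abstract are not sketched here.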
Pages: 18-21
Number of pages: 4
Related papers
50 items in total
  • [31] PROTEIN SECONDARY STRUCTURE PREDICTION USING LOGIC-BASED MACHINE LEARNING
    MUGGLETON, S
    KING, RD
    STERNBERG, MJE
    PROTEIN ENGINEERING, 1993, 6 (05): : 549 - 549
  • [32] ACCURATE PREDICTION OF PROTEIN SECONDARY STRUCTURAL CLASS WITH FUZZY STRUCTURAL VECTORS
    BOBERG, J
    SALAKOSKI, T
    VIHINEN, M
    PROTEIN ENGINEERING, 1995, 8 (06): : 505 - 512
  • [33] MACHINE LEARNING APPROACH FOR THE PREDICTION OF PROTEIN SECONDARY STRUCTURE
    KING, RD
    STERNBERG, MJE
    JOURNAL OF MOLECULAR BIOLOGY, 1990, 216 (02) : 441 - 457
  • [34] Class Result Prediction using Machine Learning
    Pushpa, S. K.
    Manjunath, T. N.
    Mrunal, T., V
    Singh, Amartya
    Suhas, C.
    PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON SMART TECHNOLOGIES FOR SMART NATION (SMARTTECHCON), 2017, : 1208 - 1212
  • [35] Prediction of hypercholesterolemia using machine learning techniques
    Pooyan Moradifar
    Mohammad Meskarpour Amiri
    Journal of Diabetes & Metabolic Disorders, 2023, 22 : 255 - 265
  • [36] Bankruptcy Prediction Using Machine Learning Techniques
    Shetty, Shekar
    Musa, Mohamed
    Bredart, Xavier
    JOURNAL OF RISK AND FINANCIAL MANAGEMENT, 2022, 15 (01)
  • [37] Emotion Prediction using Machine Learning Techniques
    Shamsi, Areeba
    Nasir, Sabika
    Hajiani, Mishaal Amin
    Ejaz, Afshan
    Ali, Syed Asim
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2019, 19 (06): : 166 - 172
  • [38] Diabetes Prediction using Machine Learning Techniques
    Obulesu, O.
    Suresh, K.
    Ramudu, B. Venkata
    HELIX, 2020, 10 (02): : 136 - 142
  • [39] Prediction of hypercholesterolemia using machine learning techniques
    Moradifar, Pooyan
    Amiri, Mohammad Meskarpour
    JOURNAL OF DIABETES AND METABOLIC DISORDERS, 2023, 22 (01) : 255 - 265
  • [40] Efficient prediction of coronary artery disease using machine learning algorithms with feature selection techniques
    Hassan, Md. Mehedi
    Zaman, Sadika
    Rahman, Md. Mushfiqur
    Bairagi, Anupam Kumar
    El-Shafai, Walid
    Rathore, Rajkumar Singh
    Gupta, Deepak
    COMPUTERS & ELECTRICAL ENGINEERING, 2024, 115