A novel alignment-free method to classify protein folding types by combining spectral graph clustering with Chou's pseudo amino acid composition

Cited by: 37
Authors
Tripathi, Pooja [1, 3]
Pandey, Paras N. [2]
Affiliations
[1] Univ Allahabad, Ctr Bioinformat, Allahabad, Uttar Pradesh, India
[2] Univ Allahabad, Dept Math, Allahabad, Uttar Pradesh, India
[3] Weizmann Inst Sci, Dept Plant & Environm Sci, Rehovot, Israel
Keywords
Protein sequence analysis; Spectral graph method; Machine learning; SEQUENCE-BASED PREDICTOR; WEB-SERVER; CLASSIFICATION; SITES; MODES; RNA; BIOINFORMATICS; RESIDUES; STEADY; PSEAAC;
DOI
10.1016/j.jtbi.2017.04.027
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09
Abstract
The present work employs pseudo amino acid composition (PseAAC) to encode protein sequences in numeric form. The encoded sequences are then arranged in a similarity matrix, which serves as input to a spectral graph clustering method. Spectral methods have previously been used for clustering protein sequences, but they use pairwise alignment scores of the sequences in the similarity matrix. The alignment score depends on sequence length, so clustering short and long sequences together may not be a good idea. This motivated combining PseAAC with the spectral clustering algorithm. We extensively tested our method and compared its performance with other existing machine learning methods. We consistently observed that the number of clusters obtained for a given set of proteins is close to the number of superfamilies in that set, and that PseAAC combined with spectral graph clustering gives the best classification results. (C) 2017 Elsevier Ltd. All rights reserved.
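The pipeline described in the abstract (PseAAC encoding, similarity matrix, spectral clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a simplified PseAAC that uses a single property (Kyte-Doolittle hydropathy), a Gaussian kernel for the similarity matrix, and the common eigengap heuristic for choosing the number of clusters. The function names and the parameters `lam`, `w`, `sigma`, and `max_k` are hypothetical and chosen for illustration only.

```python
import numpy as np

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
# Kyte-Doolittle hydropathy values, in the order of AMINO_ACIDS.
# Assumption: one property only; the full PseAAC definition may use several.
KD = [1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
      3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2]
_kd = np.array(KD)
# Standardize the property to zero mean and unit variance over the 20 residues.
H = dict(zip(AMINO_ACIDS, (_kd - _kd.mean()) / _kd.std()))


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional PseAAC-style vector for one sequence."""
    residues = [r for r in seq.upper() if r in H]  # drop non-standard letters
    n = len(residues)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    # First 20 components: normalized amino acid frequencies.
    freq = np.array([residues.count(a) for a in AMINO_ACIDS], dtype=float) / n
    # Next lam components: sequence-order correlation factors, tiers 1..lam.
    h = np.array([H[r] for r in residues])
    theta = np.array([np.mean((h[:-k] - h[k:]) ** 2) for k in range(1, lam + 1)])
    denom = freq.sum() + w * theta.sum()
    return np.concatenate([freq, w * theta]) / denom


def similarity_matrix(X, sigma=None):
    """Gaussian-kernel similarity between the rows of X (an assumed kernel)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    if sigma is None:
        # Heuristic bandwidth: square root of the median squared distance.
        sigma = np.sqrt(np.median(d2[d2 > 0]))
    return np.exp(-d2 / (2 * sigma ** 2))


def laplacian_spectrum(S):
    """Eigenvalues (ascending) and eigenvectors of the normalized Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    return np.linalg.eigh(L)


def estimate_num_clusters(vals, max_k):
    """Eigengap heuristic: choose k at the largest gap among the first eigenvalues."""
    gaps = np.diff(vals[:max_k + 1])
    return int(np.argmax(gaps)) + 1
```

Because every sequence maps to a vector of the same length (20 + `lam`) regardless of how long it is, the similarity matrix does not depend on pairwise alignment scores, which is the property the abstract highlights. To assign cluster labels, the rows of the first k eigenvectors could be passed to any standard clustering routine such as k-means.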
Pages: 49-54
Page count: 6
Related papers
41 in total
  • [1] An alignment-free method to find similarity among protein sequences via the general form of Chou's pseudo amino acid composition
    Gupta, M. K.
    Niyogi, R.
    Misra, M.
    SAR AND QSAR IN ENVIRONMENTAL RESEARCH, 2013, 24 (07) : 597 - 609
  • [2] Prediction of Protein Submitochondrial Locations by Incorporating Dipeptide Composition into Chou's General Pseudo Amino Acid Composition
    Ahmad, Khurshid
    Waris, Muhammad
    Hayat, Maqsood
    JOURNAL OF MEMBRANE BIOLOGY, 2016, 249 (03) : 293 - 304
  • [3] Classification of membrane protein types using Voting Feature Interval in combination with Chou's Pseudo Amino Acid Composition
    Ali, Farman
    Hayat, Maqsood
    JOURNAL OF THEORETICAL BIOLOGY, 2015, 384 : 78 - 83
  • [4] Numerical Characterization of Protein Sequences Based on the Generalized Chou's Pseudo Amino Acid Composition
    Li, Chun
    Li, Xueqin
    Lin, Yan-Xia
    APPLIED SCIENCES-BASEL, 2016, 6 (12):
  • [5] Using Chou's General Pseudo Amino Acid Composition to Classify Laccases from Bacterial and Fungal Sources via Chou's Five-Step Rule
    Behbahani, Mandana
    Nosrati, Mokhtar
    Moradi, Mohammad
    Mohabatkar, Hassan
    APPLIED BIOCHEMISTRY AND BIOTECHNOLOGY, 2020, 190 (03) : 1035 - 1048
  • [6] Predicting Lipase Types by Improved Chou's Pseudo-Amino Acid Composition
    Zhang, Guang-Ya
    Li, Hong-Chun
    Gao, Jia-Qiang
    Fang, Bai-Shan
    PROTEIN AND PEPTIDE LETTERS, 2008, 15 (10) : 1132 - 1137
  • [7] Wavelet images and Chou’s pseudo amino acid composition for protein classification
    Loris Nanni
    Sheryl Brahnam
    Alessandra Lumini
    Amino Acids, 2012, 43 : 657 - 665
  • [8] Wavelet images and Chou's pseudo amino acid composition for protein classification
    Nanni, Loris
    Brahnam, Sheryl
    Lumini, Alessandra
    AMINO ACIDS, 2012, 43 (02) : 657 - 665
  • [9] Prediction of Subcellular Localization of Apoptosis Protein Using Chou's Pseudo Amino Acid Composition
    Lin, Hao
    Wang, Hao
    Ding, Hui
    Chen, Ying-Li
    Li, Qian-Zhong
    ACTA BIOTHEORETICA, 2009, 57 (03) : 321 - 330
  • [10] Using Chou's Five-steps Rule to Classify and Predict Glutathione S-transferases with Different Machine Learning Algorithms and Pseudo Amino Acid Composition
    Mohabatkar, Hassan
    Ebrahimi, Samira
    Moradi, Mohammad
    INTERNATIONAL JOURNAL OF PEPTIDE RESEARCH AND THERAPEUTICS, 2021, 27 (01) : 309 - 316