Data Science for QSAR for Protease activity

Authors
Ueda, Hideki [1]
Fukumori, Akio [2]
Koge, Daiki [1]
Ono, Naoaki [1]
Altaf-Ul-Amin, Md. [1]
Kanaya, Shigehiko [1]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Sci & Technol, Div Sci & Technol, Ikoma, Nara 6300192, Japan
[2] Osaka Univ, Grad Sch Med, Dept Mental Hlth Promot, Machikaneyama Cho 1-17, Toyonaka 5600043, Japan
Source
JOURNAL OF COMPUTER AIDED CHEMISTRY | 2023 / Vol. 23
Keywords
QSAR; proteolysis; peptides; data-science
DOI
Not available
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Proteolytic cleavage is influenced by the physicochemical properties of amino acids surrounding the cleavage site. Among these properties are 553 amino acid indices, and we considered that combining these indices with machine learning could create QSAR models for protease activity. In this study, we focused on gamma-secretase, an enzyme known to be involved in the pathogenesis of Alzheimer's disease. We created 10,680 regression models for the protease activity of gamma-secretase by using 10 amino acid indices compressed from the 553 amino acid indices through principal component analysis, 12 pocket models of protease binding sites, and 89 machine learning models. We used these regression models to predict cleavage sites for 23 substrates where the cleavage sites were known and examined the amino acid property information used in the model with the highest prediction accuracy (87.0%). We found that the amino acid property information used in this model was related to the secondary structure of proteins, which may imply that it contains important information on the transmembrane cleavage of gamma-secretase.
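The figures in the abstract can be cross-checked with a short sketch. The model count of 10,680 is consistent with one regression model per combination of compressed index, pocket model, and learner (10 x 12 x 89), and the reported 87.0% accuracy on 23 substrates is consistent with 20 correct predictions, although the abstract states neither of these explicitly. The function `encode_window`, the toy index values, and the window sizes below are hypothetical illustrations of how residues around a candidate cleavage site might be mapped to index values; they are not the authors' implementation.

```python
from itertools import product

# Counts taken from the abstract.
N_INDICES = 10    # amino acid indices compressed from 553 by PCA
N_POCKETS = 12    # pocket models of the protease binding site
N_LEARNERS = 89   # machine learning models

# Assumption: one regression model per (index, pocket, learner) combination.
configs = list(product(range(N_INDICES), range(N_POCKETS), range(N_LEARNERS)))
print(len(configs))  # 10680

# Assumption: 87.0% on 23 substrates corresponds to 20 correct predictions.
accuracy = round(20 / 23 * 100, 1)
print(accuracy)  # 87.0


def encode_window(sequence, site, index, left, right):
    """Hypothetical encoder: map the residues around a candidate cleavage
    site to the values of one amino acid index.

    `site` is the position of the scissile bond, between sequence[site - 1]
    and sequence[site]; `left` and `right` are the numbers of residues taken
    on each side (a stand-in for a "pocket model").
    """
    if site - left < 0 or site + right > len(sequence):
        raise ValueError("window extends beyond the sequence")
    window = sequence[site - left:site + right]
    return [index[residue] for residue in window]


# Toy index with made-up values, for illustration only.
toy_index = {"A": 0.1, "L": 0.5, "V": 0.3, "G": -0.2}
features = encode_window("GALVAG", site=3, index=toy_index, left=2, right=2)
print(features)  # [0.1, 0.5, 0.3, 0.1]
```

In a sketch like this, each feature vector would be passed to one of the regression learners, and the candidate site with the highest predicted activity would be taken as the predicted cleavage site for that substrate.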
Pages: 43-49
Page count: 7