Improved QSAR models for PARP-1 inhibition using data balancing, interpretable machine learning, and matched molecular pair analysis

Cited by: 0
Authors
Gomatam, Anish [1]
Hirlekar, Bhakti Umesh [1]
Singh, Krishan Dev [1]
Murty, Upadhyayula Suryanarayana [1]
Dixit, Vaibhav A. [1]
Affiliations
[1] Govt India, Natl Inst Pharmaceut Educ & Res, Dept Med Chem,NIPER Guwahati, Dept Pharmaceut,Minist Chem & Fertilizers, PO Changsari, Gauhati 781101, Assam, India
Keywords
PARP-1; inhibitor; Machine learning; QSAR; Data balancing; MMPA; IN-SILICO; PREDICTION; OPTIMIZATION; SOLUBILITY; REGRESSION; ENSEMBLES; SELECTION; DOCKING; DESIGN;
DOI
10.1007/s11030-024-10809-9
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The poly(ADP-ribose) polymerase-1 (PARP-1) enzyme is an important target in the treatment of breast cancer. Current treatment options include the drugs olaparib, niraparib, rucaparib, and talazoparib; however, these drugs can cause severe side effects, including hematological toxicity and cardiotoxicity. Although in silico models for predicting PARP-1 activity have been developed, they suffer from low specificity, a narrow applicability domain, and a lack of interpretability. To address these issues, a comprehensive machine learning (ML)-based quantitative structure-activity relationship (QSAR) approach for the informed prediction of PARP-1 activity is presented. Classification models built with the K-nearest neighbor algorithm, using the Synthetic Minority Oversampling Technique (SMOTE) for data balancing, were robust and predictive (accuracy 0.86, sensitivity 0.88, specificity 0.80). Regression models were built on structurally congeneric datasets, with the models for the phthalazinone class and the fused cyclic compounds performing best. In accordance with the Organisation for Economic Co-operation and Development (OECD) guidelines, a mechanistic interpretation is proposed using Shapley Additive Explanations (SHAP) to identify the important topological features that differentiate PARP-1 actives from inactives. Moreover, an analysis of the PARP-1 dataset revealed a prevalence of activity cliffs, which may degrade the model's predictive performance. Finally, a set of chemical transformation rules was extracted using matched molecular pair analysis (MMPA); these rules provide mechanistic insights and can guide medicinal chemists in the design of novel PARP-1 inhibitors.
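The data-balancing and classification step mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' pipeline: the two-dimensional "descriptors", the class sizes, the neighbor count `k=3`, and all function names below are hypothetical, and a real study would more likely rely on an established library implementation of SMOTE and K-nearest neighbors. The sketch only shows the general idea: synthetic minority samples are interpolated between existing minority samples, a K-nearest neighbor vote is taken on the balanced set, and accuracy, sensitivity, and specificity are computed from the confusion counts.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples, each interpolated between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: euclidean(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training samples closest to x (labels 0/1)."""
    nearest = sorted(zip(train_X, train_y), key=lambda t: euclidean(t[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on class 1), specificity (recall on class 0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: 40 majority samples (label 1), 8 minority samples (label 0).
rng = random.Random(42)
majority = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
minority = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(8)]

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, n_new=len(majority) - len(minority))
X_bal = majority + minority + synthetic
y_bal = [1] * len(majority) + [0] * (len(minority) + len(synthetic))

# Small hand-made hold-out set (also hypothetical).
test_X = [[0.2, -0.1], [-1.0, 0.5], [0.8, 0.9], [3.1, 2.8], [2.5, 3.4], [4.0, 3.0]]
test_y = [1, 1, 1, 0, 0, 0]
pred_y = [knn_predict(X_bal, y_bal, x) for x in test_X]
metrics = classification_metrics(test_y, pred_y)
print(metrics)
```

In practice, oversampling should be applied only to the training split, so that synthetic samples never leak into the data used to report sensitivity and specificity.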
Pages: 2135-2152
Page count: 18